
Showing papers in "Artificial Intelligence Review in 2019"


Journal ArticleDOI
TL;DR: A survey of metaheuristic research in the literature, consisting of 1222 publications from 1983 to 2016, is performed to highlight potential open questions and critical issues raised in the literature, providing guidance for conducting future research more meaningfully.
Abstract: Because of successful implementations and high intensity, metaheuristic research has been extensively reported in the literature, covering algorithms, applications, comparisons, and analysis. However, little insightful analysis of metaheuristic performance issues has been reported, and it remains a “black box” why certain metaheuristics perform better on specific optimization problems and not as well on others. The performance-related analyses performed on algorithms are mostly quantitative, via performance validation metrics like mean error, standard deviation, and correlations. Moreover, the performance tests are often performed on specific benchmark functions—few studies involve real data from scientific or engineering optimization problems. In order to draw a comprehensive picture of metaheuristic research, this paper performs a survey of the metaheuristic literature consisting of 1222 publications from 1983 to 2016 (33 years). Based on the collected evidence, this paper addresses four dimensions of metaheuristic research: introduction of new algorithms, modifications and hybrids, comparisons and analysis, and research gaps and future directions. The objective is to highlight potential open questions and critical issues raised in the literature. The work provides guidance for conducting future research more meaningfully, for the good of this area of research.

467 citations


Journal ArticleDOI
TL;DR: The heuristic and hybrid approaches utilized in ANFIS training are examined in order to guide researchers in their studies, and it has been observed that there has recently been a trend toward heuristic-based ANFIS training algorithms for better performance.
Abstract: In the structure of ANFIS, there are two different parameter groups: premise and consequence. Training ANFIS means determining these parameters using an optimization algorithm. In the first ANFIS model developed by Jang, a hybrid learning approach was proposed for training. In this approach, while premise parameters are determined by using gradient descent (GD), consequence parameters are found with the least squares estimation (LSE) method. Since ANFIS was developed, it has been used in the modelling and identification of numerous systems, and successful results have been achieved. The selection of the optimization method utilized in training is very important for getting effective results with ANFIS. It is seen that both derivative-based (GD, LSE, etc.) and non-derivative-based (heuristic algorithms such as GA, PSO, ABC, etc.) algorithms are used in ANFIS training. Nevertheless, there has recently been a trend toward heuristic-based ANFIS training algorithms for better performance. At the same time, hybrid algorithms combining derivative-based and heuristic methods have also been proposed. Within the scope of this study, the heuristic and hybrid approaches utilized in ANFIS training are examined in order to guide researchers in their studies. In addition, the current state of ANFIS training is evaluated, with the aim of shedding light on further studies related to ANFIS training.

454 citations
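To make the hybrid scheme concrete, here is a minimal toy sketch in the spirit of Jang's approach, assuming a single-input, two-rule first-order Sugeno model: the consequence parameters are solved by least squares at each epoch while the premise (Gaussian membership) parameters are nudged by numerically estimated gradients. All data, rule counts, and settings are illustrative, not the paper's.

```python
import numpy as np

# Toy 1-input, 2-rule Sugeno ANFIS: premise = Gaussian membership params, consequence = linear rule params.
def gauss(x, c, s):
    return np.exp(-0.5 * ((x - c) / s) ** 2)

def forward(x, premise, conseq):
    c1, s1, c2, s2 = premise
    w1, w2 = gauss(x, c1, s1), gauss(x, c2, s2)
    f1 = conseq[0] * x + conseq[1]          # rule 1 output
    f2 = conseq[2] * x + conseq[3]          # rule 2 output
    return (w1 * f1 + w2 * f2) / (w1 + w2)  # firing-strength-weighted average

def lse_consequent(x, y, premise):
    # LSE step: with premise fixed, the output is linear in the consequence params.
    c1, s1, c2, s2 = premise
    w1, w2 = gauss(x, c1, s1), gauss(x, c2, s2)
    n1, n2 = w1 / (w1 + w2), w2 / (w1 + w2)
    A = np.column_stack([n1 * x, n1, n2 * x, n2])
    conseq, *_ = np.linalg.lstsq(A, y, rcond=None)
    return conseq

def train(x, y, epochs=200, lr=0.01, eps=1e-5):
    premise = np.array([x.min(), x.std(), x.max(), x.std()])
    for _ in range(epochs):
        conseq = lse_consequent(x, y, premise)          # LSE for consequence params
        base = np.mean((forward(x, premise, conseq) - y) ** 2)
        grad = np.zeros_like(premise)
        for i in range(4):                               # numerical gradient for GD step
            p = premise.copy()
            p[i] += eps
            grad[i] = (np.mean((forward(x, p, conseq) - y) ** 2) - base) / eps
        premise -= lr * grad
    return premise, lse_consequent(x, y, premise)

x = np.linspace(-2, 2, 100)
y = np.sin(x)
premise, conseq = train(x, y)
print(np.mean((forward(x, premise, conseq) - y) ** 2))   # training MSE
```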


Journal ArticleDOI
TL;DR: This survey presents a recent time-slide comprehensive overview with comparisons as well as trends in development and usage of cutting-edge Artificial Intelligence software that is capable of scaling computation effectively and efficiently in the era of Big Data.
Abstract: The combined impact of new computing resources and techniques with an increasing avalanche of large datasets is transforming many research areas and may lead to technological breakthroughs that can be used by billions of people. In recent years, Machine Learning and especially its subfield Deep Learning have seen impressive advances. Techniques developed within these two fields are now able to analyze and learn from huge amounts of real world examples in disparate formats. While the number of Machine Learning algorithms is extensive and growing, their implementations through frameworks and libraries are also extensive and growing. The software development in this field is fast paced, with a large number of open-source software packages coming from academia, industry, start-ups and wider open-source communities. This survey presents a recent time-slide comprehensive overview with comparisons as well as trends in development and usage of cutting-edge Artificial Intelligence software. It also provides an overview of massive parallelism support that is capable of scaling computation effectively and efficiently in the era of Big Data.

443 citations


Journal ArticleDOI
TL;DR: This study provides a comprehensive review of deep learning-based recommendation approaches to enlighten and guide newbie researchers interested in the subject.
Abstract: Recommender systems are effective tools of information filtering that are prevalent due to increasing access to the Internet, personalization trends, and changing habits of computer users. Although existing recommender systems are successful in producing decent recommendations, they still suffer from challenges such as accuracy, scalability, and cold-start. In the last few years, deep learning, the state-of-the-art machine learning technique utilized in many complex tasks, has been employed in recommender systems to improve the quality of recommendations. In this study, we provide a comprehensive review of deep learning-based recommendation approaches to enlighten and guide newbie researchers interested in the subject. We analyze compiled studies within four dimensions which are deep learning models utilized in recommender systems, remedies for the challenges of recommender systems, awareness and prevalence over recommendation domains, and the purposive properties. We also provide a comprehensive quantitative assessment of publications in the field and conclude by discussing gained insights and possible future work on the subject.

294 citations


Journal ArticleDOI
TL;DR: In this article, the authors divide semantic image segmentation methods into two categories, traditional methods and recent DNN-based methods, and comprehensively investigate the recent DNN-based methods from eight aspects: fully convolutional networks, up-sampling approaches, FCN joint with CRF methods, dilated convolution approaches, progress in backbone networks, pyramid methods, multi-level feature and multi-stage methods, and supervised, weakly-supervised and unsupervised methods.
Abstract: Semantic image segmentation, which has become one of the key applications in the image processing and computer vision domain, has been used in multiple areas such as the medical field and intelligent transportation. Many benchmark datasets have been released for researchers to verify their algorithms. Semantic segmentation has been studied for many years. Since the emergence of Deep Neural Networks (DNNs), segmentation has made tremendous progress. In this paper, we divide semantic image segmentation methods into two categories: traditional methods and recent DNN-based methods. Firstly, we briefly summarize the traditional methods as well as datasets released for segmentation; then we comprehensively investigate recent DNN-based methods, described from eight aspects: fully convolutional networks, up-sampling approaches, FCN joint with CRF methods, dilated convolution approaches, progress in backbone networks, pyramid methods, multi-level feature and multi-stage methods, and supervised, weakly-supervised and unsupervised methods. Finally, a conclusion in this area is drawn.

257 citations
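Among the DNN techniques listed, dilated convolution is the easiest to demystify in a few lines: spacing out the kernel taps enlarges the receptive field without adding parameters. A minimal NumPy sketch (toy single-channel data, valid padding only; real segmentation networks apply this per channel inside a framework):

```python
import numpy as np

# Toy 2D dilated convolution: dilation inserts gaps between kernel taps,
# widening the receptive field while keeping the parameter count fixed.
def dilated_conv2d(img, kernel, dilation=1):
    kh, kw = kernel.shape
    eff_h = (kh - 1) * dilation + 1          # effective receptive field height
    eff_w = (kw - 1) * dilation + 1
    H, W = img.shape
    out = np.zeros((H - eff_h + 1, W - eff_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = img[i:i + eff_h:dilation, j:j + eff_w:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

img = np.random.rand(16, 16)
k = np.ones((3, 3)) / 9.0
print(dilated_conv2d(img, k, dilation=1).shape)  # (14, 14) - 3x3 field
print(dilated_conv2d(img, k, dilation=2).shape)  # (12, 12) - 5x5 field, same 9 weights
```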


Journal ArticleDOI
TL;DR: An overview of the Pythagorean fuzzy set is presented with the aim of offering a clear perspective on the different concepts, tools and trends related to its extensions, and two novel algorithms for decision making problems under a Pythagorean fuzzy environment are provided.
Abstract: The Pythagorean fuzzy set, generalized by Yager, is a new tool to deal with vagueness, considering the membership grade $$\mu$$ and non-membership grade $$\nu$$ satisfying the condition $$\mu^2 + \nu^2 \le 1$$. It can be used to characterize uncertain information more sufficiently and accurately than the intuitionistic fuzzy set. The Pythagorean fuzzy set has attracted great attention from many scholars; it has been extended to new types, and these extensions have been used in many areas such as decision making, aggregation operators, and information measures. Because of such growth, we present an overview of the Pythagorean fuzzy set with the aim of offering a clear perspective on the different concepts, tools and trends related to its extensions. In particular, we provide two novel algorithms for decision making problems under a Pythagorean fuzzy environment. It may serve as a foundation for developing more algorithms in decision making.

245 citations
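To make the defining condition tangible, the sketch below checks validity of a Pythagorean fuzzy number and computes the commonly used score function $$s = \mu^2 - \nu^2$$; operator definitions vary across the extensions the survey covers, so treat this as one common convention rather than the paper's own algorithms:

```python
# Pythagorean fuzzy number (PFN): membership mu and non-membership nu
# must satisfy mu^2 + nu^2 <= 1 (weaker than the intuitionistic mu + nu <= 1).
def is_pfn(mu, nu):
    return mu**2 + nu**2 <= 1.0

def score(mu, nu):           # a common score function for ranking PFNs
    return mu**2 - nu**2

# (0.8, 0.6) is a valid PFN (0.64 + 0.36 <= 1) but NOT a valid intuitionistic
# fuzzy number (0.8 + 0.6 > 1) - exactly why PFS models a wider range of judgments.
print(is_pfn(0.8, 0.6), 0.8 + 0.6 <= 1.0)   # True False
print(score(0.8, 0.6))                       # 0.28
```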


Journal ArticleDOI
TL;DR: An extensive survey of existing methods for selecting SVM training data from large datasets is provided, helping readers understand the underlying ideas behind these algorithms, which may be useful in designing new methods to deal with this important problem.
Abstract: Support vector machines (SVMs) are supervised classifiers successfully applied in a plethora of real-life applications. However, they suffer from important shortcomings: their high time and memory training complexities, which depend on the training set size. This issue is especially challenging nowadays, since the amount of data generated every second becomes tremendously large in many domains. This review provides an extensive survey of existing methods for selecting SVM training data from large datasets. We divide the state-of-the-art techniques into several categories. These categories help in understanding the underlying ideas behind these algorithms, which may be useful in designing new methods to deal with this important problem. The review is complemented with a discussion of future research pathways which can make SVMs easier to exploit in practice.

240 citations
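As a concrete instance of one family of such selection methods, the sketch below keeps only the samples nearest to per-class k-means centroids before training the SVM; the dataset, subset size, and kernel are illustrative stand-ins, and this is just one simple strategy among the many the review categorizes:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, n_features=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

def select_subset(X, y, per_class=200):
    # Keep a small, diverse subset: the real sample nearest to each k-means centroid.
    idx = []
    for c in np.unique(y):
        cls = np.where(y == c)[0]
        km = KMeans(n_clusters=per_class, n_init=3, random_state=0).fit(X[cls])
        for centre in km.cluster_centers_:
            idx.append(cls[np.argmin(np.linalg.norm(X[cls] - centre, axis=1))])
    return np.unique(idx)

sub = select_subset(Xtr, ytr)
svm = SVC(kernel="rbf").fit(Xtr[sub], ytr[sub])   # train on ~400 points, not 15000
print(len(sub), svm.score(Xte, yte))
```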


Journal ArticleDOI
TL;DR: A review of the evolution of linear support vector machine classification, its solvers, strategies to improve solvers, experimental results, current challenges and research directions is presented.
Abstract: The support vector machine (SVM) is an optimal margin based classification technique in machine learning. SVM is a binary linear classifier which has been extended to non-linear data using kernels and to multi-class data using various techniques like one-versus-one, one-versus-rest, Crammer Singer SVM, Weston Watkins SVM and directed acyclic graph SVM (DAGSVM), etc. An SVM with a linear kernel is called a linear SVM and one with a non-linear kernel is called a non-linear SVM. Linear SVM is an efficient technique for high dimensional data applications like document classification, word-sense disambiguation, drug design, etc., because under such data applications the test accuracy of linear SVM is close to that of non-linear SVM while its training is much faster than non-linear SVM. SVM has been continuously evolving since its inception, and researchers have proposed many problem formulations, solvers and strategies for solving SVM. Moreover, due to advancements in technology, data has taken the form of ‘Big Data’, which poses a challenge for machine learning in training a classifier on this large-scale data. In this paper, we present a review of the evolution of linear support vector machine classification, its solvers, strategies to improve solvers, experimental results, current challenges and research directions.

240 citations
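The speed/accuracy trade-off described above is easy to reproduce; the sketch below times a dedicated linear solver against an RBF-kernel SVM on synthetic high-dimensional data (sample sizes and parameters are arbitrary stand-ins for the document-classification setting):

```python
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC, SVC

# Synthetic high-dimensional data standing in for, e.g., document vectors.
X, y = make_classification(n_samples=5000, n_features=2000,
                           n_informative=50, random_state=0)

for clf in (LinearSVC(dual=True, max_iter=5000), SVC(kernel="rbf")):
    t0 = time.time()
    clf.fit(X[:4000], y[:4000])
    print(type(clf).__name__,
          round(time.time() - t0, 2), "s,",
          round(clf.score(X[4000:], y[4000:]), 3))
# Expected pattern: LinearSVC trains far faster at comparable test accuracy.
```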


Journal ArticleDOI
TL;DR: This paper presents a complete, multilateral and systematic review of opinion mining and sentiment analysis to classify available methods and compare their advantages and drawbacks, in order to have a better understanding of available challenges and solutions and to clarify the future direction.
Abstract: Opinion mining is considered a subfield of natural language processing, information retrieval and text mining. Opinion mining is the process of extracting human thoughts and perceptions from unstructured texts, which, with the emergence of online social media and the massive volume of users’ comments, has become a useful, attractive and also challenging issue. There is a variety of research with different trends and approaches in this area, but the lack of a comprehensive study investigating them from all aspects is tangible. In this paper we present a complete, multilateral and systematic review of opinion mining and sentiment analysis to classify available methods and compare their advantages and drawbacks, in order to have a better understanding of available challenges and solutions and to clarify the future direction. For this purpose, we present a proper framework of opinion mining together with its steps and levels, and then we thoroughly monitor, classify, summarize and compare proposed techniques for aspect extraction, opinion classification, summary production and evaluation, based on the major validated scientific works. In order to have a better comparison, we also propose some factors in each category, which help to give a better understanding of the advantages and disadvantages of different methods.

231 citations
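Of the framework steps listed (aspect extraction, opinion classification, summary production, evaluation), opinion classification is the simplest to illustrate; the lexicon-based scorer below is a deliberately minimal instance, with word lists and tie-breaking invented for the example rather than taken from any surveyed system:

```python
# Tiny lexicon-based polarity classifier: count positive vs negative cue words.
POS = {"good", "great", "excellent", "love", "useful"}
NEG = {"bad", "poor", "terrible", "hate", "useless"}

def polarity(text):
    tokens = text.lower().split()
    score = sum(t in POS for t in tokens) - sum(t in NEG for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# One positive cue ("great") and one negative cue ("poor") cancel out.
print(polarity("the camera is great but battery life is poor"))  # neutral
```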


Journal ArticleDOI
TL;DR: A comprehensive survey of various techniques explored for face detection in digital images is presented in this paper, where the practical aspects towards the development of a robust face detection system and several promising directions for future research are discussed.
Abstract: With the marvelous increase in video and image databases, there is an incredible need for automatic understanding and examination of information by intelligent systems, as manual processing is becoming infeasible. The face plays a major role in social interaction, conveying a person’s identity and feelings. Unlike machines, human beings have a tremendous ability to identify different faces. So, an automatic face detection system plays an important role in face recognition, facial expression recognition, head-pose estimation, human–computer interaction etc. Face detection is a computer technology that determines the location and size of a human face in a digital image. Face detection has been a standout topic in the computer vision literature. This paper presents a comprehensive survey of various techniques explored for face detection in digital images. Different challenges and applications of face detection are also presented in this paper. At the end, different standard databases for face detection are also given with their features. Furthermore, we organize special discussions on the practical aspects of developing a robust face detection system and conclude this paper with several promising directions for future research.

227 citations
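Among the classical techniques such a survey typically covers, the Viola–Jones style cascade shipped with OpenCV remains a standard baseline; a minimal sketch, assuming an input file photo.jpg exists (the cascade file itself is bundled with OpenCV):

```python
import cv2

# Load OpenCV's stock frontal-face Haar cascade (Viola-Jones style detector).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("photo.jpg")            # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Each detection is (x, y, w, h): top-left corner plus width and height.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", img)
```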


Journal ArticleDOI
TL;DR: A survey of text classification, the process of different term weighting methods, and a comparison between different classification techniques.
Abstract: Supervised machine learning studies have gained more significance recently because of the availability of an increasing number of electronic documents from different sources. Text classification can be defined as the task of automatically categorizing a group of documents into one or more predefined classes according to their subjects. Thereby, the major objective of text classification is to enable users to extract information from textual resources, dealing with processes such as retrieval, classification, and machine learning techniques together in order to classify different patterns. In text classification, term weighting methods assign suitable weights to specific terms to enhance classification performance. This paper surveys text classification, the process of different term weighting methods, and comparisons between different classification techniques.
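Term weighting is easiest to see with TF-IDF, one widely compared scheme: terms frequent within a document but rare across the corpus receive the largest weights. A minimal sketch with an invented four-document corpus (labels and queries are illustrative only):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# TF-IDF weighting feeding a simple classifier - the canonical weighting+classifier pairing.
docs = ["cheap flights and hotel deals",
        "football match results today",
        "book flights to paris",
        "league table and match highlights"]
labels = ["travel", "sport", "travel", "sport"]

clf = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(docs, labels)
print(clf.predict(["hotel and flight booking"]))   # -> ['travel']
```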

Journal ArticleDOI
TL;DR: A goal-driven overview of numerous theoretical developments recently reported in this area, and an overview of the existing software tools enabling the implementation of both existing FCM schemes as well as prospective theoretical and/or practical contributions.
Abstract: Fuzzy cognitive maps (FCMs) keep growing in popularity within the scientific community. However, despite substantial advances in the theory and applications of FCMs, there is a lack of an up-to-date, comprehensive presentation of the state-of-the-art in this domain. In this review study we are filling that gap. First, we present basic FCM concepts and analyze their static and dynamic properties, and next we elaborate on existing algorithms used for learning the FCM structure. Second, we provide a goal-driven overview of numerous theoretical developments recently reported in this area. Moreover, we consider the application of FCMs to time series forecasting and classification. Finally, in order to support the readers in their own research, we provide an overview of the existing software tools enabling the implementation of both existing FCM schemes as well as prospective theoretical and/or practical contributions.
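The dynamics mentioned above can be stated in a few lines: concept activations are repeatedly multiplied by the weight matrix and squashed. The sketch below uses one common update variant, $$A(t+1) = f(W A(t))$$ with a sigmoid $$f$$; the three-concept weight matrix is invented for illustration, and published FCM variants differ in the exact update rule:

```python
import numpy as np

# Invented 3-concept FCM: W[i, j] is the causal influence of concept j on concept i.
W = np.array([[0.0,  0.6, -0.4],
              [0.3,  0.0,  0.5],
              [0.0, -0.7,  0.0]])

def step(a, lam=1.0):
    # One FCM iteration: weighted aggregation then sigmoid squashing.
    return 1.0 / (1.0 + np.exp(-lam * (W @ a)))

a = np.array([0.8, 0.2, 0.5])     # initial concept activations
for _ in range(20):               # iterate toward a fixed point (or cycle/chaos)
    a = step(a)
print(a)
```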

Journal ArticleDOI
TL;DR: A comprehensive review of different versions of the KH algorithm and their engineering applications is presented and specific features of KH and future directions are discussed.
Abstract: Krill herd (KH) is a novel swarm-based metaheuristic optimization algorithm inspired by the krill herding behavior. The objective function in the KH optimization process is based on the least distance between the food location and position of a krill. The KH method has been proven to outperform several state-of-the-art metaheuristic algorithms on many benchmarks and engineering cases. This paper presents a comprehensive review of different versions of the KH algorithm and their engineering applications. The study is divided into the following general parts: KH variants, engineering optimization/application, and theoretical analysis. In addition, specific features of KH and future directions are discussed.

Journal ArticleDOI
TL;DR: A novel type of soft rough covering is introduced by means of soft neighborhoods, and then it is used to improve decision making in a multicriteria group environment.
Abstract: In this paper, we contribute to a recent and successful modelization of uncertainty, which the practitioner often encounters in the formulation of multicriteria group decision making problems. To be precise, in order to approach the uncertainty issue we introduce a novel type of soft rough covering by means of soft neighborhoods, and then we use it to improve decision making in a multicriteria group environment. Our research method is as follows. Firstly we introduce the soft covering upper and lower approximation operators of soft rough coverings. Then its relationships with well-established types of soft rough coverings are analyzed. Secondly, we define and investigate the measure degree of our novel soft rough covering. With this tool we produce a new class of soft rough sets. Finally, we propose an application of such soft rough covering model to multicriteria group decision making by means of an algorithmic solution. A fully developed example supports the implementability of this decision making method.
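The flavour of the covering approximation operators can be conveyed with plain sets: an object belongs to the lower approximation when its neighbourhood lies wholly inside the target set, and to the upper approximation when its neighbourhood merely meets it. The universe and neighbourhoods below are invented, and the paper's soft-neighbourhood construction is richer than this generic sketch:

```python
# Generic neighbourhood-based rough approximations of a target set X.
U = {1, 2, 3, 4, 5, 6}
nbhd = {1: {1, 2}, 2: {1, 2}, 3: {3, 4}, 4: {3, 4, 5}, 5: {5}, 6: {5, 6}}
X = {1, 2, 5}

lower = {u for u in U if nbhd[u] <= X}    # neighbourhood entirely inside X
upper = {u for u in U if nbhd[u] & X}     # neighbourhood intersects X
print(lower, upper)   # {1, 2, 5} and {1, 2, 4, 5, 6}; X is "rough" since they differ
```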

Journal ArticleDOI
TL;DR: This paper provides a comprehensive review of all issues related to FPA: biological inspiration, fundamentals, previous studies and comparisons, implementation, variants, hybrids, and applications, and a comparison between FPA and six different metaheuristics on solving a constrained engineering optimization problem.
Abstract: The flower pollination algorithm (FPA) is a computational intelligence metaheuristic that takes its metaphor from the proliferation role of flowers in plants. This paper provides a comprehensive review of all issues related to FPA: biological inspiration, fundamentals, previous studies and comparisons, implementation, variants, hybrids, and applications. Besides, it makes a comparison between FPA and six different metaheuristics, such as the genetic algorithm, cuckoo search, and the grasshopper optimization algorithm, on solving a constrained engineering optimization problem. The experimental results are statistically analyzed with the non-parametric Friedman test, which indicates that FPA is superior to the other competitors in solving the given problem.
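For orientation, the core FPA loop alternates global pollination (a heavy-tailed move toward the current best flower) with local pollination (mixing two random flowers). The sketch below minimizes the sphere function and substitutes a scaled Gaussian step for the usual Lévy flight, so it is a loose sketch of the scheme rather than the canonical algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sum(x**2)            # sphere function, minimum 0 at the origin

n, dim, p = 20, 5, 0.8                # population, dimension, switch probability
X = rng.uniform(-5, 5, (n, dim))
best = X[np.argmin([f(x) for x in X])].copy()

for _ in range(500):
    for i in range(n):
        if rng.random() < p:          # global pollination: move toward the best
            cand = X[i] + 0.3 * rng.standard_normal(dim) * (best - X[i])
        else:                         # local pollination: mix two random flowers
            j, k = rng.choice(n, 2, replace=False)
            cand = X[i] + rng.random() * (X[j] - X[k])
        if f(cand) < f(X[i]):         # greedy acceptance
            X[i] = cand
            if f(cand) < f(best):
                best = cand.copy()
print(f(best))
```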

Journal ArticleDOI
TL;DR: This study analyzed the data logged by a technology-enhanced learning (TEL) system called digital electronics education and design suite (DEEDS) using machine learning algorithms to predict the difficulties that students will encounter in a subsequent digital design course session.
Abstract: Student performance prediction is an important research topic because it can help teachers prevent students from dropping out before final exams and identify students that need additional assistance. The objective of this study is to predict the difficulties that students will encounter in a subsequent digital design course session. We analyzed the data logged by a technology-enhanced learning (TEL) system called the digital electronics education and design suite (DEEDS) using machine learning algorithms. The machine learning algorithms included artificial neural networks (ANNs), support vector machines (SVMs), logistic regression, Naive Bayes classifiers and decision trees. The DEEDS system allows students to solve digital design exercises with different levels of difficulty while logging input data. The input variables of the current study were average time, total number of activities, average idle time, average number of keystrokes and total related activity for each exercise during individual sessions in the digital design course; the output variables were the students’ grades for each session. We then trained machine learning algorithms on the data from the previous session and tested the algorithms on the data from the upcoming session. We performed k-fold cross-validation and computed the receiver operating characteristic and root mean square error metrics to evaluate the models’ performances. The results show that ANNs and SVMs achieve higher accuracy than other algorithms. ANNs and SVMs can easily be integrated into the TEL system; thus, we would expect instructors to report improved student performance during the subsequent session.
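The evaluation protocol (k-fold cross-validation with ROC-based scoring over the two best model families) looks roughly like the sketch below; the features and labels are random stand-ins for the five DEEDS session variables, so the reported AUCs hover around chance rather than the paper's results:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))        # stand-ins for time/activities/idle/keystrokes/related
y = rng.integers(0, 2, size=300)     # stand-in pass/struggle label per session

for model in (MLPClassifier(max_iter=1000), SVC()):
    clf = make_pipeline(StandardScaler(), model)   # scaling matters for both models
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(type(model).__name__, auc.mean().round(3))
```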

Journal ArticleDOI
TL;DR: This paper discusses different datasets and metrics used in summarization and compare performances of different approaches, first in general and then focused to legal text, and briefly covers a few software tools used in legal text summarization.
Abstract: The enormous amount of online information available in the legal domain has made legal text processing an important area of research. In this paper, we attempt to survey different text summarization techniques that have emerged in the recent past. We put special emphasis on the issue of legal text summarization, as it is one of the most important areas in the legal domain. We start with a general introduction to text summarization, briefly touch on the recent advances in single- and multi-document summarization, and then delve into extraction-based legal text summarization. We discuss different datasets and metrics used in summarization and compare performances of different approaches, first in general and then focused on legal text. We also mention highlights of different summarization techniques. We briefly cover a few software tools used in legal text summarization. We finally conclude with some future research directions.
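Extraction-based summarization, the family the survey dwells on, can be reduced to its simplest form: score each sentence by the corpus frequency of its words and keep the top-k in original order. A toy sketch (frequency scoring is the most basic baseline, far simpler than the surveyed legal-domain systems):

```python
import re
from collections import Counter

def summarize(text, k=2):
    # Split into sentences, score each by summed word frequency, keep the top-k.
    sents = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    scored = sorted(sents, key=lambda s: -sum(freq[w] for w in re.findall(r"[a-z']+", s.lower())))
    top = set(scored[:k])
    return " ".join(s for s in sents if s in top)   # preserve original order

doc = ("Legal judgments are long. Long judgments burden lawyers. "
       "Summaries of judgments help lawyers. The canteen serves coffee.")
print(summarize(doc, k=2))   # the off-topic coffee sentence is dropped
```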

Journal ArticleDOI
TL;DR: This review aims to help with the understanding of various elements associated with fault prediction process and to explore various issues involved in the software fault prediction.
Abstract: Software fault prediction aims to identify fault-prone software modules by using some underlying properties of the software project before the actual testing process begins. It helps in obtaining d

Journal ArticleDOI
TL;DR: The review highlights the exceptional performance of AI methods in optimization of various objective functions essential for industrial decision making including minimum miscibility pressure, oil production rate, and volume of CO2 sequestration.
Abstract: In recent years, artificial intelligence (AI) has been widely applied to optimization problems in the petroleum exploration and production industry. This survey offers a detailed literature review based on different types of AI algorithms, their application areas in the petroleum industry, publication year, and geographical regions of their development. For this purpose, we classify AI methods into four main categories including evolutionary algorithms, swarm intelligence, fuzzy logic, and artificial neural networks. Additionally, we examine these types of algorithms with respect to their applications in petroleum engineering. The review highlights the exceptional performance of AI methods in optimization of various objective functions essential for industrial decision making including minimum miscibility pressure, oil production rate, and volume of $$\mathrm{CO}_{2}$$ sequestration. Furthermore, hybridization and/or combination of various AI techniques can be successfully applied to solve important optimization problems and obtain better solutions. The detailed descriptions provided in this review serve as a comprehensive reference of AI optimization techniques for further studies and research in this area.

Journal ArticleDOI
TL;DR: Some different algorithms of parameter reduction based on some types of (fuzzy) soft sets are reviewed to emphasize their respective advantages and disadvantages, and give some examples to illustrate their differences.
Abstract: As is well known, soft set theory can have a bearing on making decisions in many fields. Particularly important is parameter reduction of soft sets, an essential topic both for information sciences and artificial intelligence. Parameter reduction studies the largest pruning of the amount of parameters that define a given soft set without changing its original choice objects. Therefore it can spare computationally costly tests in the decision making process. In the present article, we review some different algorithms of parameter reduction based on some types of (fuzzy) soft sets. Finally, we compare these algorithms to emphasize their respective advantages and disadvantages, and give some examples to illustrate their differences.
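The notion of pruning parameters "without changing the original choice objects" can be sketched directly: treat the soft set as a Boolean object-by-parameter table and test which parameter subsets leave the set of optimal choice objects intact. The table below is invented, and published reductions (e.g., normal parameter reduction) impose stronger conditions than this toy check:

```python
import numpy as np
from itertools import combinations

# Boolean soft set: rows are objects, columns are parameters, 1 = "object has property".
table = np.array([[1, 0, 1, 1],
                  [1, 1, 0, 1],
                  [0, 1, 1, 0]])

def best_objects(t):
    cv = t.sum(axis=1)                       # choice value of each object
    return set(np.flatnonzero(cv == cv.max()))

full = best_objects(table)
for r in range(1, table.shape[1]):
    for drop in combinations(range(table.shape[1]), r):
        keep = [c for c in range(table.shape[1]) if c not in drop]
        if best_objects(table[:, keep]) == full:
            print("dropping parameters", drop, "keeps choice objects", full)
```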

Journal ArticleDOI
TL;DR: This paper provides a comprehensive overview of the NILM method and presents a comparative review of modern approaches, finding that many obstacles make an objective comparison almost impossible.
Abstract: Non-intrusive load monitoring (NILM) is the prevailing method used to monitor the energy profile of a domestic building and disaggregate the total power consumption into consumption signals by appliance. Whilst the most popular disaggregation algorithms are based on Hidden Markov Models, solutions based on deep neural networks have attracted interest from researchers. The objective of this paper is to provide a comprehensive overview of the NILM method and present a comparative review of modern approaches. In this effort, many obstacles are identified. The plethora of metrics, the variety of datasets and the diversity of methodologies make an objective comparison almost impossible. An extensive analysis is made in order to scrutinize these problems. Possible solutions and improvements are suggested, while future research directions are discussed.

Journal ArticleDOI
TL;DR: A survey on the state-of-the-art spectrum allocation algorithms based on reinforcement learning techniques in cognitive radio networks and the advantages and disadvantages of each algorithm are analyzed in their specific practical application scenarios.
Abstract: Cognitive radio is an emerging technology that is considered to be an evolution of software-defined radio, in which cognition and decision-making components are included. The main function of cognitive radio is to exploit "spectrum holes" or "white spaces" to address the challenge of the low utilization of radio resources. Dynamic spectrum allocation, whose significant functions are to ensure that cognitive users access the available frequency and bandwidth to communicate in an opportunistic manner and to minimize the interference between primary and secondary users, is a key mechanism in cognitive radio networks. Reinforcement learning, which rapidly analyzes large amounts of data in a model-free manner, dramatically facilitates the performance of dynamic spectrum allocation in real application scenarios. This paper presents a survey on the state-of-the-art spectrum allocation algorithms based on reinforcement learning techniques in cognitive radio networks. The advantages and disadvantages of each algorithm are analyzed in their specific practical application scenarios. Finally, we discuss open issues in dynamic spectrum allocation that can be topics of future research.
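A minimal illustration of the reinforcement-learning angle: a secondary user learning which channel is most often idle via a tabular, single-state (bandit-style) Q update. The channel statistics and hyperparameters are invented, and real dynamic spectrum allocation involves richer state and interference models:

```python
import numpy as np

rng = np.random.default_rng(1)
idle_prob = np.array([0.2, 0.9, 0.5, 0.6])   # hypothetical per-channel idle rates
Q = np.zeros(4)                               # value estimate per channel
alpha, eps = 0.1, 0.1                         # learning rate, exploration rate

for t in range(5000):
    # epsilon-greedy channel selection
    a = rng.integers(4) if rng.random() < eps else int(np.argmax(Q))
    # reward: successful opportunistic access vs collision with the primary user
    reward = 1.0 if rng.random() < idle_prob[a] else -1.0
    Q[a] += alpha * (reward - Q[a])           # single-state Q-learning update

print(Q.round(2), "-> best channel:", int(np.argmax(Q)))
```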

Journal ArticleDOI
TL;DR: A methodology for applying k-nearest neighbor regression on a time series forecasting context is developed and it resolves the selection of important modeling parameters, such as k or the input variables, combining several models with different parameters.
Abstract: In this paper a methodology for applying k-nearest neighbor regression on a time series forecasting context is developed. The goal is to devise an automatic tool, i.e., a tool that can work without human intervention; furthermore, the methodology should be effective and efficient, so that it can be applied to accurately forecast a great number of time series. In order to be incorporated into our methodology, several modeling and preprocessing techniques are analyzed and assessed using the N3 competition data set. One interesting feature of the proposed methodology is that it resolves the selection of important modeling parameters, such as k or the input variables, combining several models with different parameters. In spite of the simplicity of k-NN regression, our methodology seems to be quite effective.
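The methodology is straightforward to sketch: embed the series into lag-vector/next-value pairs, then combine forecasts from several k values, which mirrors the paper's trick of sidestepping the choice of a single k by combining models. The data and lag settings below are illustrative:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Toy series: noisy sine wave standing in for a real forecasting target.
series = np.sin(np.arange(300) * 0.1) + np.random.default_rng(0).normal(0, 0.05, 300)

lags = 5
X = np.array([series[i:i + lags] for i in range(len(series) - lags)])  # lag vectors
y = series[lags:]                                                      # next values

last_window = series[-lags:].reshape(1, -1)
preds = [KNeighborsRegressor(n_neighbors=k).fit(X, y).predict(last_window)[0]
         for k in (3, 5, 7)]          # several k values instead of picking one
print(np.mean(preds))                 # combined one-step-ahead forecast
```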

Journal ArticleDOI
TL;DR: A critical review of empirically grounded agent-based models of innovation diffusion can be found in this paper, where the authors identify four major issues in model calibration and validation, and suggest potential solutions.
Abstract: Innovation diffusion has been studied extensively in a variety of disciplines, including sociology, economics, marketing, ecology, and computer science. Traditional literature on innovation diffusion has been dominated by models of aggregate behavior and trends. However, the agent-based modeling (ABM) paradigm is gaining popularity as it captures agent heterogeneity and enables fine-grained modeling of interactions mediated by social and geographic networks. While most ABM work on innovation diffusion is theoretical, empirically grounded models are increasingly important, particularly in guiding policy decisions. We present a critical review of empirically grounded agent-based models of innovation diffusion, developing a categorization of this research based on types of agent models as well as applications. By connecting the modeling methodologies in the fields of information and innovation diffusion, we suggest that the maximum likelihood estimation framework widely used in the former is a promising paradigm for calibration of agent-based models for innovation diffusion. Although many advances have been made to standardize ABM methodology, we identify four major issues in model calibration and validation, and suggest potential solutions.
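For readers new to the paradigm, an agent-based diffusion model can be tiny; the sketch below is an independent-cascade-flavoured toy on a random contact network. The network density, seed set, and adoption probability are invented, and an empirically grounded model would calibrate such parameters from data, e.g. via the maximum likelihood framework the review advocates:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 0.05                                 # agents, per-contact adoption prob.
A = rng.random((n, n)) < 0.03                    # random directed contact network
adopted = np.zeros(n, bool)
adopted[:5] = True                               # seed adopters

for step in range(30):
    exposures = A[adopted].sum(axis=0)           # adopter contacts per agent
    # each exposure is an independent chance to convert a susceptible agent
    new = (~adopted) & (rng.random(n) < 1 - (1 - p) ** exposures)
    adopted |= new

print(adopted.sum(), "adopters after 30 steps")
```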

Journal ArticleDOI
TL;DR: This paper presents the results from an extensive study of 83 published papers from previous studies related to GWO in various applications such as parameter tuning, the economic dispatch problem, and cost estimating, to name a few.
Abstract: Today, finding a viable solution for any real world problem focusing on combinatorial problems is a crucial task. However, using optimisation techniques, a viable best solution for a specific problem can be obtained, developed and solved despite the existing limitations of the implemented technique. Furthermore, population based optimisation techniques are now of current interest and have spawned many new and improved techniques for rectifying many engineering problems. One of these methods is the Grey Wolf Optimiser (GWO), which resembles the grey wolf’s leadership hierarchy and its hunting behavior in nature. The GWO adopts the hierarchical nature of grey wolves and lists the best solution as alpha, followed by beta and delta in descending order. Additionally, its hunting technique of tracking, encircling and attacking is also modeled mathematically to find the best optimised solution. This paper presents the results from an extensive study of 83 published papers from previous studies related to GWO in various applications such as parameter tuning, the economic dispatch problem, and cost estimating, to name a few. A discussion on the properties of the GWO algorithm and how it minimises the different problems in the different applications is presented, as well as an analysis of the research trend of the GWO optimisation technique in various applications from 2014 to 2017. Based on the literature, it was observed that GWO has the ability to solve single and multi-objective problems efficiently due to its good local search criteria that perform exceptionally well for different problems and solutions.
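The hunting equations referenced above are compact enough to show in full; the sketch below runs the standard alpha/beta/delta position update on the sphere function, with the exploration coefficient a decaying linearly from 2 to 0 (population size and iteration budget are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sum(x**2)               # sphere function to minimise
n, dim, iters = 20, 5, 200
X = rng.uniform(-5, 5, (n, dim))

for t in range(iters):
    a = 2 - 2 * t / iters                # exploration coefficient decays 2 -> 0
    leaders = X[np.argsort([f(x) for x in X])[:3]]   # alpha, beta, delta
    for i in range(n):
        new = np.zeros(dim)
        for leader in leaders:           # standard GWO encircling update per leader
            A = 2 * a * rng.random(dim) - a
            C = 2 * rng.random(dim)
            D = np.abs(C * leader - X[i])
            new += (leader - A * D) / 3.0   # average the three leader-guided moves
        X[i] = new

print(min(f(x) for x in X))
```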

Journal ArticleDOI
TL;DR: A novel approach is introduced, which relies on the fusion of visual words of scale-invariant feature transform (SIFT) and binary robust invariant scalable keypoints (BRISK) descriptors based on the visual-bag-of-words approach to lessen the semantic gap between high-level semantics and local attributes of the image.
Abstract: Despite broad investigation in content-based image retrieval (CBIR), lessening the semantic gap between high-level semantics and local attributes of the image is still an important issue. The local attributes of an image such as shape, color, and texture are not sufficient for effective CBIR. Visual similarity is a principal step in CBIR and in the baseline approach. In this article, we introduce a novel approach which relies on the fusion of visual words of scale-invariant feature transform (SIFT) and binary robust invariant scalable keypoints (BRISK) descriptors, based on the visual-bag-of-words approach. The two local feature descriptors are chosen as their fusion adds complementary improvement to CBIR. The SIFT descriptor is capable of detecting objects robustly under clutter due to its invariance to scale, rotation, noise, and illumination variance. However, the SIFT descriptor does not perform well at low illumination or on poorly localized keypoints within an image. For this reason, the discriminative power of the SIFT descriptor is lost during the quantization process, which also reduces the performance of CBIR. The BRISK descriptor, in contrast, provides a scale- and rotation-invariant scale-space and high-quality, adaptive performance in classification-based applications. It also performs better on poorly localized keypoints along the edges of an object within an image as compared to the SIFT descriptor. The suggested approach based on the fusion of visual words achieves effective results on the Corel-1K, Corel-1.5K, Corel-5K, and Caltech-256 image repositories as compared to the feature fusion of both descriptors and the latest CBIR approaches, with the added benefits of scalability and fast indexing.
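A rough sketch of the fused visual-bag-of-words idea: build one vocabulary per descriptor type, then concatenate the two per-image histograms. File names and vocabulary sizes are illustrative; clustering binary BRISK descriptors with Euclidean k-means is a simplification (Hamming-space clustering would be more principled), and images are assumed to yield keypoints:

```python
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

sift, brisk = cv2.SIFT_create(), cv2.BRISK_create()

def describe(path, detector):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, des = detector.detectAndCompute(gray, None)
    return des.astype(np.float32)         # assumes every image yields descriptors

paths = ["a.jpg", "b.jpg", "c.jpg"]       # hypothetical image files
sift_des = [describe(p, sift) for p in paths]
brisk_des = [describe(p, brisk) for p in paths]

# One visual vocabulary per descriptor type.
vocab_s = MiniBatchKMeans(n_clusters=50).fit(np.vstack(sift_des))
vocab_b = MiniBatchKMeans(n_clusters=50).fit(np.vstack(brisk_des))

def histogram(des, vocab):
    words = vocab.predict(des)            # quantize descriptors to visual words
    return np.bincount(words, minlength=vocab.n_clusters) / len(words)

# Fused 100-dimensional signature per image, ready for indexing or retrieval.
features = [np.hstack([histogram(s, vocab_s), histogram(b, vocab_b)])
            for s, b in zip(sift_des, brisk_des)]
```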

Journal ArticleDOI
TL;DR: This paper presents a comprehensive survey of sentiment analysis of scientific citations, including the typical process: citation context extraction, public data sources, and feature selection.
Abstract: Sentiment analysis of scientific citations has received much attention in recent years because of the increased availability of scientific publications. Scholarly databases are valuable sources for publications and citation information where researchers can publish their ideas and results. Sentiment analysis of scientific citations aims to analyze the authors’ sentiments within scientific citations. During the last decade, some review papers have been published in the field of sentiment analysis. Despite the growth in the size of scholarly databases and researchers’ interests, no one, as far as we know, has carried out an in-depth survey of the specific area of sentiment analysis of scientific citations. This paper presents a comprehensive survey of sentiment analysis of scientific citations. In this review, the process of scientific citation sentiment analysis is introduced and recently proposed methods with the main challenges are presented, analyzed and discussed. Further, we present related fields such as citation function classification and citation recommendation that have recently gained enormous attention. Our contributions include identifying the most important challenges as well as the analysis and classification of recent methods used in scientific citation sentiment analysis. Moreover, it presents the typical process, including citation context extraction, public data sources, and feature selection. We found that most of the papers use classical machine learning methods. However, due to the limitations of performance and manual feature selection in machine learning, we believe that in the future hybrid and deep learning methods can possibly handle the problems of scientific citation sentiment analysis more efficiently and reliably.

Journal ArticleDOI
TL;DR: A criterion is proposed to assess the association between a cluster and a partition which is called Edited Normalized Mutual Information, ENMI criterion and it is shown that the proposed method outperforms other well-known ensembles.
Abstract: It is highly likely that a partition judged by a stability measure to be a bad one nevertheless contains one or more high-quality clusters, which are then totally neglected. So, taking inspiration from the evaluation of partitions, researchers have turned to defining measures for the evaluation of clusters. Many stability measures, such as Normalized Mutual Information (NMI), have been proposed to validate a partition, and the defined cluster-level measures are based on NMI. The drawback of the commonly used approach is discussed in this paper, and a criterion, called Edited Normalized Mutual Information (ENMI), is proposed to assess the association between a cluster and a partition. The ENMI criterion compensates for the drawback of the common NMI measure. Also, a clustering ensemble method that is based on aggregating a subset of primary clusters is proposed. The proposed method uses the average ENMI as a fitness measure to select a number of clusters. The clusters that satisfy a predefined threshold of the mentioned measure are selected to participate in the final ensemble. To combine the chosen clusters, a set of consensus function methods are employed. One class of the used consensus functions is the co-association based consensus functions. Since the Evidence Accumulation Clustering (EAC) method cannot derive the co-association matrix from a subset of clusters, Extended EAC (EEAC) is employed to construct the co-association matrix from the chosen subset of clusters. The second class of the used consensus functions is based on hypergraph partitioning algorithms. The other class of the used consensus functions considers the chosen clusters as a new feature space and uses a simple clustering algorithm to extract the consensus partitioning. The empirical studies show that the proposed method outperforms other well-known ensembles.
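The drawback being compensated is easy to demonstrate with plain NMI: a partition that recovers one cluster perfectly but scrambles the rest still receives a single mediocre partition-level score, hiding the good cluster. A sketch using scikit-learn's NMI, with invented labels:

```python
from sklearn.metrics import normalized_mutual_info_score

truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]      # ground-truth clustering
part  = [0, 0, 0, 1, 2, 1, 2, 1, 2]      # cluster 0 recovered perfectly, rest is noise

# One aggregate score for the whole partition - the perfect cluster is invisible in it.
print(normalized_mutual_info_score(truth, part))
```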

Journal ArticleDOI
TL;DR: The result shows that some of the modified versions of firefly algorithm produce superior results with a tradeoff of high computational time, which will help practitioners to decide which modified version to apply based on the computational resource available and the sensitivity of the problem.
Abstract: The firefly algorithm is a swarm based metaheuristic algorithm designed for continuous optimization problems. It works by following better solutions and also with a random search mechanism. It has been successfully used in different problems arising in different disciplines and has also been modified for discrete problems. Despite its ease of understanding and implementation, its effectiveness is highly affected by the parameter values. In addition, modifying the search mechanism may give better performance. Hence, different modified versions have been introduced to overcome its limitations and increase its performance. In this paper, the modifications made to the firefly algorithm for continuous optimization problems will be reviewed with a critical analysis. A detailed discussion on the modifications with possible future works will also be presented. In addition, a comparative study will be conducted using forty benchmark problems with different dimensions based on ten base functions. The result shows that some of the modified versions produce superior results with a tradeoff of high computational time. Hence, this result will help practitioners to decide which modified version to apply based on the computational resources available and the sensitivity of the problem.
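The parameter sensitivity discussed above involves three knobs; the sketch below implements the standard update, where firefly i moves toward each brighter firefly j with attractiveness $$\beta_0 e^{-\gamma r^2}$$ plus a random perturbation scaled by $$\alpha$$. All settings are illustrative, minimizing the sphere function:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sum(x**2)                 # lower objective = brighter firefly
n, dim, iters = 15, 4, 300
beta0, gamma, alpha = 1.0, 1.0, 0.1        # the sensitive parameters in question
X = rng.uniform(-5, 5, (n, dim))

for _ in range(iters):
    bright = np.array([f(x) for x in X])
    for i in range(n):
        for j in range(n):
            if bright[j] < bright[i]:      # move i toward every brighter j
                r2 = np.sum((X[i] - X[j]) ** 2)
                beta = beta0 * np.exp(-gamma * r2)       # distance-decayed attraction
                X[i] += beta * (X[j] - X[i]) + alpha * (rng.random(dim) - 0.5)
        bright[i] = f(X[i])

print(min(f(x) for x in X))
```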

Journal ArticleDOI
TL;DR: This paper discusses important design decisions of a data analyst working on food sales prediction, such as the temporal granularity of sales data, the input variables to use for predicting sales and the representation of the sales output variable.
Abstract: Food sales prediction is concerned with estimating future sales of companies in the food industry, such as supermarkets, groceries, restaurants, bakeries and patisseries. Accurate short-term sales prediction allows companies to minimize stocked and expired products inside stores and at the same time avoid missing sales. This paper reviews existing machine learning approaches for food sales prediction. It discusses important design decisions of a data analyst working on food sales prediction, such as the temporal granularity of sales data, the input variables to use for predicting sales and the representation of the sales output variable. In addition, it reviews machine learning algorithms that have been applied to food sales prediction and appropriate measures for evaluating their accuracy. Finally, it discusses the main challenges and opportunities for applied machine learning in the domain of food sales prediction.
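Those design decisions translate directly into code; the sketch below picks daily granularity, seven lagged sales values plus a day-of-week input, and raw sales as the output variable, then reports holdout RMSE. The data are synthetic stand-ins (a weekend uplift pattern) rather than real store sales:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
days = np.arange(400)
sales = 100 + 20 * (days % 7 >= 5) + rng.normal(0, 5, 400)   # synthetic weekend uplift

lags = 7
# Each row: the previous 7 days of sales plus the target day's day-of-week index.
X = np.array([np.r_[sales[i:i + lags], (i + lags) % 7]
              for i in range(len(sales) - lags)])
y = sales[lags:]                                             # raw sales as the output

model = GradientBoostingRegressor().fit(X[:-30], y[:-30])    # hold out the last 30 days
rmse = np.sqrt(np.mean((model.predict(X[-30:]) - y[-30:]) ** 2))
print(round(rmse, 2))
```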