
Showing papers in "Artificial Intelligence Review in 2016"


Journal ArticleDOI
TL;DR: A comprehensive comparative analysis is conducted among different approaches to aspect extraction, which not only elaborates the performance of each technique but also guides the reader in comparing its accuracy with other state-of-the-art and most recent approaches.
Abstract: Sentiment analysis (SA) has become one of the most active and progressively popular areas in information retrieval and text mining due to the expansion of the World Wide Web (WWW). SA deals with the computational treatment or classification of users' sentiments, opinions and emotions hidden within text. Aspect extraction is the most vital and extensively explored phase of SA for carrying out the classification of sentiments in a precise manner. During the last decade, an enormous amount of research has focused on identifying and extracting aspects. Therefore, in this survey, a comprehensive overview of different aspect extraction techniques and approaches is attempted. These techniques have been categorized in accordance with the adopted approach. Despite being a traditional survey, it includes a comprehensive comparative analysis among different approaches to aspect extraction, which not only elaborates the performance of each technique but also guides the reader in comparing its accuracy with other state-of-the-art and most recent approaches.

162 citations


Journal ArticleDOI
TL;DR: This survey introduces the basic concepts of the qualities of labels and learning models, and introduces openly accessible real-world data sets collected from crowdsourcing systems and open source libraries and tools.
Abstract: With the rapid growth of crowdsourcing systems, quite a few applications based on a supervised learning paradigm can easily obtain massive labeled data at a relatively low cost. However, due to the variable uncertainty of crowdsourced labelers, learning procedures face great challenges. Thus, improving the qualities of labels and learning models plays a key role in learning from crowdsourced labeled data. In this survey, we first introduce the basic concepts of the qualities of labels and learning models. Then, by reviewing recently proposed models and algorithms for ground truth inference and learning models, we analyze connections and distinctions among these techniques and clarify the progress of related research. In order to facilitate studies in this field, we also introduce openly accessible real-world data sets collected from crowdsourcing systems and open source libraries and tools. Finally, some potential issues for future studies are discussed.

148 citations
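As a concrete starting point for the ground truth inference methods such surveys review, the sketch below implements majority voting, the usual baseline that more elaborate inference models improve on. The tuple format and toy data are assumptions for illustration, not an interface from the paper.

```python
from collections import Counter, defaultdict

def majority_vote(labels):
    """Infer ground truth by majority voting.

    labels: list of (item_id, worker_id, label) tuples collected from a
    crowdsourcing system; ties are broken arbitrarily.
    """
    by_item = defaultdict(list)
    for item, _worker, label in labels:
        by_item[item].append(label)
    return {item: Counter(ls).most_common(1)[0][0]
            for item, ls in by_item.items()}

# toy example: three workers label two items
crowd = [(1, "w1", "pos"), (1, "w2", "pos"), (1, "w3", "neg"),
         (2, "w1", "neg"), (2, "w2", "neg"), (2, "w3", "neg")]
print(majority_vote(crowd))  # {1: 'pos', 2: 'neg'}
```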


Journal ArticleDOI
TL;DR: In this paper, the history, development, and state of the art of the BSO algorithm are reviewed, and the convergent and divergent operations in the BSO algorithm are discussed from the data analysis perspective.
Abstract: For swarm intelligence algorithms, each individual in the swarm represents a solution in the search space, and it can also be seen as a data sample from the search space. Based on the analyses of these data, more effective algorithms and search strategies could be proposed. The brain storm optimization (BSO) algorithm is a new and promising swarm intelligence algorithm, which simulates the human brainstorming process. Through the convergent operation and divergent operation, individuals in BSO are grouped and diverged in the search space/objective space. In this paper, the history, development, and state of the art of the BSO algorithm are reviewed. In addition, the convergent and divergent operations in the BSO algorithm are discussed from the data analysis perspective. Every individual in the BSO algorithm is not only a solution to the problem to be optimized, but also a data point revealing the landscape of the problem. Swarm intelligence and data mining techniques can be combined to produce benefits above and beyond what either method could achieve alone.

146 citations
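The convergent/divergent cycle described above can be made concrete with a minimal sketch of one BSO generation, assuming a continuous minimization problem. Real BSO uses k-means for the convergent (grouping) operation; here groups are formed by fitness rank for brevity, so this is a simplified illustration rather than the full algorithm.

```python
import numpy as np

def bso_step(pop, fitness, rng, n_clusters=3, sigma=0.1):
    """One brain storm optimization generation (simplified sketch).
    Convergent operation: group individuals and keep a center per group
    (groups here are formed by fitness rank instead of k-means).
    Divergent operation: create a new idea by Gaussian perturbation of a
    group center or of the individual itself, keeping the better one."""
    order = np.argsort([fitness(x) for x in pop])
    groups = np.array_split(pop[order], n_clusters)
    centers = [g[0] for g in groups]              # best individual per group
    out = pop.copy()
    for i, x in enumerate(pop):
        base = centers[rng.integers(n_clusters)] if rng.random() < 0.5 else x
        cand = base + rng.normal(0.0, sigma, size=x.shape)
        if fitness(cand) < fitness(x):
            out[i] = cand
    return out

sphere = lambda z: float(np.sum(z ** 2))
rng = np.random.default_rng(1)
pop = rng.uniform(-5, 5, size=(12, 2))
for _ in range(200):
    pop = bso_step(pop, sphere, rng)
print(min(sphere(x) for x in pop))                # close to 0
```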


Journal ArticleDOI
TL;DR: The traditional statistical models and state-of-the-art intelligent methods for financial distress forecasting are summarized, with the emphasis on the most recent achievements as the promising trend in this area.
Abstract: The assessment of financial credit risk is an important and challenging research topic in the area of accounting and finance. Numerous efforts have been devoted to this field since the first attempt last century. Today the study of financial credit risk assessment attracts increasing attention in the face of one of the most severe financial crises ever observed in the world. The accurate assessment of financial credit risk and prediction of business failure play an essential role in both the economy and society. For this reason, more and more methods and algorithms have been proposed in recent years. From this point, it is of crucial importance to review the methods currently applied to financial credit risk assessment. In this paper, we summarize the traditional statistical models and state-of-the-art intelligent methods for financial distress forecasting, with the emphasis on the most recent achievements as the promising trend in this area.

128 citations


Journal ArticleDOI
TL;DR: Different aspects of novelty detection in data streams, like the offline and online phases, the number of classes considered at each phase, the use of ensemble versus a single classifier, supervised and unsupervised approaches for the learning task, and how to deal with recurring classes are presented.
Abstract: In massive data analysis, data usually come in streams. In recent years, several studies have investigated novelty detection in these data streams. Different approaches have been proposed and validated in many application domains. A review of the main aspects of these studies can provide useful information to improve the performance of existing approaches, allow their adaptation to new applications and help to identify new important issues to be addressed in future studies. This article presents and analyses different aspects of novelty detection in data streams, such as the offline and online phases, the number of classes considered at each phase, the use of an ensemble versus a single classifier, supervised and unsupervised approaches for the learning task, information used for decision model update, forgetting mechanisms for outdated concepts, concept drift treatment, how to distinguish noise and outliers from novelty concepts, classification strategies for data with unknown labels, and how to deal with recurring classes. This article also describes several applications of novelty detection in data streams investigated in the literature and discusses important challenges and future research directions.

88 citations
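To make the decision-model vocabulary above concrete, here is a minimal, hypothetical centroid-based detector, not any specific published method: known concepts are kept as centroids, unexplained examples accumulate in a short-term buffer, and a full buffer is promoted to a novel concept. Thresholds and buffer size are illustrative assumptions.

```python
import numpy as np

class StreamNoveltyDetector:
    """Toy centroid-based novelty detector for data streams (a sketch)."""

    def __init__(self, threshold=2.0, buffer_size=20):
        self.centroids = []        # decision model: one centroid per known concept
        self.buffer = []           # short-term memory of unexplained examples
        self.threshold = threshold
        self.buffer_size = buffer_size

    def process(self, x):
        if self.centroids and min(np.linalg.norm(x - c)
                                  for c in self.centroids) < self.threshold:
            return "known"
        self.buffer.append(x)
        if len(self.buffer) >= self.buffer_size:
            # enough cohesive unexplained examples: declare a novel concept
            self.centroids.append(np.mean(self.buffer, axis=0))
            self.buffer.clear()
            return "novelty"
        return "unknown"

det = StreamNoveltyDetector()
rng = np.random.default_rng(0)
stream = np.vstack([rng.normal(0, 0.5, (30, 2)),   # first concept
                    rng.normal(8, 0.5, (30, 2))])  # drift to a new concept
print([det.process(x) for x in stream].count("novelty"))  # 2 concepts found
```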


Journal ArticleDOI
TL;DR: A study on the existing approaches for the detection of unsafe driving patterns of a vehicle used to predict accidents and some of the critical open questions that need to be addressed for road safety using AI techniques are identified.
Abstract: Accident prediction is one of the most critical aspects of road safety, whereby an accident can be predicted before it actually occurs and precautionary measures taken to avoid it. For this purpose, accident prediction models are popular in road safety analysis. Artificial intelligence (AI) is used in many real world applications, especially where outcomes and data are not always the same and are influenced by the occurrence of random changes. This paper presents a study of the existing approaches for detecting unsafe driving patterns of a vehicle, used to predict accidents. The literature covered in this paper is from the past 10 years, from 2004 to 2014. AI techniques are surveyed for the detection of unsafe driving style and crash prediction. A number of statistical methods which are used to predict accidents based on different vehicle and driving features are also covered in this paper. The approaches studied in this paper are compared in terms of datasets and prediction performance. We also provide a list of datasets and simulators available for the scientific community to conduct research in the subject domain. The paper also identifies some of the critical open questions that need to be addressed for road safety using AI techniques.

65 citations


Journal ArticleDOI
TL;DR: This survey investigates several research studies that have been conducted in the field of Arabic text summarization, and addresses summarization and evaluation methods, as well as the corpora used in those studies.
Abstract: This survey investigates several research studies that have been conducted in the field of Arabic text summarization. Specifically, it addresses summarization and evaluation methods, as well as the corpora used in those studies. The literature in this field is fairly limited and relatively new compared to the available literature on other languages, such as English. Therefore, there exists a great opportunity for further research in Arabic text summarization. In addition, one of the largest problems in Arabic summarization has been the absence of Arabic gold standard summaries, although this situation is beginning to change, especially with the inclusion of the Arabic language as a part of the corpora and tasks in the TAC 2011 MultiLing Pilot and ACL 2013 MultiLing Workshop. Finally, providing the required corpora and adopting them in Arabic summarization studies remains an essential need.

62 citations


Journal ArticleDOI
TL;DR: Recent developments in human motion analysis and biometric recognition suggest that both can be combined to develop a fully automated system, with a special focus on surveillance scenarios.
Abstract: Interest in the security of individuals has increased in recent years. This increase has in turn led to much wider deployment of surveillance cameras worldwide, and consequently, automated surveillance systems research has received more attention from the scientific community than before. Concurrently, biometrics research has become more popular as well, and it is supported by the increasing number of approaches devised to address specific degradation factors of unconstrained environments. Despite these recent efforts, no automated surveillance system that performs reliable biometric recognition in such an environment has become available. Nevertheless, recent developments in human motion analysis and biometric recognition suggest that both can be combined to develop a fully automated system. As such, this paper reviews recent advances in both areas, with a special focus on surveillance scenarios. When compared to previous studies, we highlight two distinct features, i.e., (1) our emphasis is on approaches that are devised to work in unconstrained environments and surveillance scenarios; and (2) biometric recognition is the final goal of the surveillance system, as opposed to behavior analysis, anomaly detection or action recognition.

60 citations


Journal ArticleDOI
TL;DR: Several important aspects of FPN’s background, history and formalisms are discussed, including the reasoning algorithm and relevant industrial applications; after which the conclusions and suggestions for future research are presented.
Abstract: Fuzzy Petri net (FPN) provides an extremely competent basis for the implementation of computing reasoning processes and the modeling of systems with uncertainty. This paper reviews recent developments of the FPN and its industrial applications. Several important aspects of FPN's background, history and formalisms are discussed, including the reasoning algorithm and relevant industrial applications; after which we present our conclusions and suggestions for future research.

60 citations
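A minimal sketch of the kind of reasoning algorithm the review covers: places carry truth degrees, each transition encodes a rule with a certainty factor and a firing threshold, and firing propagates min(inputs) times the certainty factor, aggregated by max at the output place. The net, thresholds, and values below are hypothetical; published FPN formalisms differ in details.

```python
def fpn_reason(degrees, rules, passes=3):
    """Forward reasoning over a fuzzy Petri net (generic sketch).

    degrees: {place: truth degree in [0, 1]}
    rules:   list of (input_places, output_place, certainty, threshold)
    A rule fires when min(input degrees) >= threshold; it propagates
    min(inputs) * certainty to the output place, aggregated by max.
    """
    for _ in range(passes):
        for inputs, out, cf, th in rules:
            a = min(degrees.get(p, 0.0) for p in inputs)
            if a >= th:
                degrees[out] = max(degrees.get(out, 0.0), a * cf)
    return degrees

# toy diagnosis net: symptoms s1, s2 imply fault f1; f1 implies an alarm
degrees = {"s1": 0.9, "s2": 0.7}
rules = [(["s1", "s2"], "f1", 0.9, 0.3), (["f1"], "alarm", 0.8, 0.3)]
print(fpn_reason(degrees, rules))  # f1 = 0.63, alarm = 0.504
```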


Journal ArticleDOI
TL;DR: All Hadith-relevant methods and algorithms from the literature are discussed and analyzed in terms of functionality, simplicity, F-score and accuracy, and it is revealed that neural networks classify the Hadith with 94% accuracy.
Abstract: Hadiths are important textual sources of law, tradition, and teaching in the Islamic world. Analyzing the unique linguistic features of Hadiths (e.g. ancient Arabic language and story-like text) requires compiling and utilizing specific natural language processing methods. In the literature, no study is solely focused on Hadith from an artificial intelligence perspective, while many new developments have been overlooked and need to be highlighted. Therefore, this review analyzes all academic journal and conference publications that use the two main artificial intelligence methods for Hadith text: Hadith classification and Hadith mining. All relevant methods and algorithms from the literature are discussed and analyzed in terms of functionality, simplicity, F-score and accuracy. The use of various different Hadith datasets makes a direct comparison between the evaluation results impossible. Therefore, we have re-implemented and evaluated the methods using a single dataset (i.e. 3150 Hadiths from the Sahih Al-Bukhari book). The evaluation of the classification methods reveals that neural networks classify the Hadith with 94% accuracy. This is because neural networks are capable of handling complex (high dimensional) input data. The Hadith mining method that combines the vector space model, cosine similarity, and enriched queries obtains the best accuracy (i.e. 88%) among the re-evaluated Hadith mining methods. The most important aspect of Hadith mining methods is query expansion, since the query must be fitted to the Hadith language. The lack of knowledge-based methods is evident in Hadith classification and mining approaches, and this absence can be addressed in future work using knowledge graphs.

57 citations
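The best-performing mining pipeline above (vector space model, cosine similarity, and enriched queries) can be sketched in a few lines with scikit-learn. The English placeholder corpus and hand-picked expansion terms are assumptions; a real system would use Arabic tokenization, normalization, and a principled expansion source.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# placeholder corpus standing in for Hadith texts
docs = ["charity purifies wealth",
        "prayer at dawn is rewarded",
        "giving charity in secret"]

vec = TfidfVectorizer()
doc_vecs = vec.fit_transform(docs)

# naive query expansion: append hand-picked related terms (hypothetical)
query = "charity"
expanded = query + " giving alms"
scores = cosine_similarity(vec.transform([expanded]), doc_vecs)[0]
ranking = scores.argsort()[::-1]           # highest cosine similarity first
print([docs[i] for i in ranking])
```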


Journal ArticleDOI
TL;DR: The aim of this paper is to categorize multicue tracking methods into single-modal and multi-modal and to list new trends in this field via an investigation of representative work, giving a detailed overview of the latest advancements.
Abstract: The performance of single cue object tracking algorithms may degrade due to the complex nature of the visual world and environmental challenges. In the recent past, multicue object tracking methods using single or multiple sensors such as vision, thermal, infrared, laser, radar, audio, and RFID have been explored to a great extent. It has been acknowledged that combining multiple orthogonal cues enhances tracking performance over single cue methods. The aim of this paper is to categorize multicue tracking methods into single-modal and multi-modal and to list new trends in this field via an investigation of representative work. The categorized works are also tabulated in order to give a detailed overview of the latest advancements. The person tracking datasets are analyzed and their statistical parameters are tabulated. The tracking performance measures are also categorized depending upon the availability of ground truth data. Our review gauges the gap between reported work and future demands for object tracking.

Journal ArticleDOI
TL;DR: This paper provides an extensive empirical analysis on a large benchmark of minimization problems, with the objective to identify those crossover PSO algorithms that perform best with respect to accuracy, success rate, and efficiency.
Abstract: Since its inception in 1995, many improvements to the original particle swarm optimization (PSO) algorithm have been developed. This paper reviews one class of such PSO variations, i.e. PSO algorithms that make use of crossover operators. The review is supplemented with a more extensive sensitivity analysis of the crossover PSO algorithms than provided in the original publications. Two adaptations of a parent-centric crossover PSO algorithm are provided, resulting in improvements with respect to solution accuracy compared to the original parent-centric PSO algorithms. The paper then provides an extensive empirical analysis on a large benchmark of minimization problems, with the objective to identify those crossover PSO algorithms that perform best with respect to accuracy, success rate, and efficiency.
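For readers unfamiliar with this class of variants, the sketch below adds a simple arithmetic crossover between personal and global bests to a standard PSO step. It is one generic member of the crossover-PSO family reviewed, not the paper's parent-centric operator; all parameter values are illustrative assumptions.

```python
import numpy as np

def crossover_pso_step(x, v, pbest, gbest, f, rng,
                       w=0.7, c1=1.4, c2=1.4, p_cross=0.3):
    """One PSO iteration with an optional arithmetic crossover that
    blends personal and global bests (simplified sketch)."""
    n, dim = x.shape
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x = x + v
    # arithmetic crossover: some particles jump to a blend of the bests
    mask = rng.random(n) < p_cross
    a = rng.random((mask.sum(), dim))
    x[mask] = a * pbest[mask] + (1 - a) * gbest
    # update memories
    better = np.array([f(xi) < f(pi) for xi, pi in zip(x, pbest)])
    pbest[better] = x[better]
    gbest = min(pbest, key=f).copy()
    return x, v, pbest, gbest

rng = np.random.default_rng(0)
f = lambda z: float(np.sum(z ** 2))                 # sphere benchmark
x = rng.uniform(-5, 5, (20, 2)); v = np.zeros_like(x)
pbest, gbest = x.copy(), min(x, key=f).copy()
for _ in range(200):
    x, v, pbest, gbest = crossover_pso_step(x, v, pbest, gbest, f, rng)
print(f(gbest))                                     # close to 0
```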

Journal ArticleDOI
TL;DR: A self-contained exposition of various decision-theoretic and learning techniques from the field of AI and machine-learning that are relevant to the problem of cognitive routing in CRNs and their application in the context of CRNs in general and for the routing problem in particular are presented.
Abstract: Cognitive radio networks (CRNs) are networks of nodes equipped with cognitive radios that can optimize performance by adapting to network conditions. Although various routing protocols incorporating varying degrees of adaptiveness and cognition have been proposed for CRNs, these works have mostly been limited by their system-level focus (that emphasizes optimization at the level of an individual cognitive radio system). The vision of CRNs as cognitive networks, however, requires that the research focus progresses from its current system-level fixation to a network-wide optimization focus. This motivates the development of cognitive routing protocols, envisioned as routing protocols that fully and seamlessly incorporate artificial intelligence (AI)-based techniques into their design. In this paper, we provide a self-contained exposition of various decision-theoretic and learning techniques from the fields of AI and machine learning that are relevant to the problem of cognitive routing in CRNs. Apart from providing the necessary background, we present for each technique discussed in this paper its application in the context of CRNs in general and the routing problem in particular. We also highlight challenges associated with these techniques and common pitfalls. Finally, open research issues and future directions of work are identified.
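One of the reviewed learning techniques, Q-learning, maps naturally onto route selection: each node keeps Q-values estimating the remaining delay through each neighbor and updates them from experience. The toy topology and link delays below are assumptions, in the spirit of classic Q-routing rather than a CRN-specific protocol.

```python
import random

# toy topology: node -> neighbors (a stand-in for a CRN graph)
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
link_delay = {("A", "B"): 1, ("A", "C"): 5, ("B", "D"): 1, ("C", "D"): 1}

Q = {(n, nb): 0.0 for n in graph for nb in graph[n]}  # estimated delay-to-go
alpha, eps = 0.5, 0.2

def route_episode(src="A", dst="D"):
    node = src
    while node != dst:
        nbrs = graph[node]
        # epsilon-greedy: usually pick the neighbor with lowest estimated delay
        nb = (random.choice(nbrs) if random.random() < eps
              else min(nbrs, key=lambda m: Q[(node, m)]))
        rest = min((Q[(nb, m)] for m in graph[nb]), default=0.0)
        Q[(node, nb)] += alpha * (link_delay[(node, nb)] + rest - Q[(node, nb)])
        node = nb

random.seed(0)
for _ in range(200):
    route_episode()
print(min(graph["A"], key=lambda m: Q[("A", m)]))  # learned next hop: 'B'
```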

Journal ArticleDOI
TL;DR: This paper is the first academic systematic literature review of the CF technique along with implicit data from user behaviors and activities, aggregating existing evidence as a synthesis of the best quality scientific studies.
Abstract: User profiles in the collaborative filtering (CF) recommendation technique are built based on ratings given by users on a set of items. The most eminent shortcoming of the CF technique is the sparsity problem. This problem refers to the low ratio of rated items by users to the total number of available items; hence the quality of recommendation will be affected. Most researchers use implicit data as a solution to the sparsity problem, to decrease the dependency of the CF technique on users' ratings. The aim of this research is to aggregate evidence on the state of research and practice of CF and implicit data by applying a systematic literature review (SLR), a method of evidence-based software engineering (EBSE). EBSE has the potential value of synthesizing evidence and making it available to practitioners and researchers, providing the best references and appropriate software engineering solutions for the sparsity problem. We executed the standard systematic literature review method using a manual search in 5 prestigious databases, and 38 studies were finally included for analysis. This paper follows Kitchenham's SLR guidelines and describes in great detail the process of selecting and analyzing research papers. This paper is the first academic systematic literature review of the CF technique along with implicit data from user behaviors and activities, aggregating existing evidence as a synthesis of the best quality scientific studies. The 38 research papers are categorized into eleven application fields (movies, shopping, books, social systems, music and others) and six data mining techniques (dimensionality reduction, association rules, heuristic methods and others). According to the review results, neighborhood formation is a relevant aspect of CF and it can be improved with the use of a user-item preference matrix as an implicit feedback mechanism; the most common domains of CF are e-commerce and movie applications.
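The review's conclusion, that neighborhood formation improves when a user-item preference matrix is built from implicit feedback, can be illustrated with a small sketch. The binary matrix and cosine-weighted neighbor voting below are a generic baseline assumed for illustration, not a method from any single reviewed study.

```python
import numpy as np

# binary user-item preference matrix from implicit feedback
# (rows: users, columns: items; 1 = user interacted with the item)
R = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]])

def recommend(u, R, k=1):
    """Rank items for user u using its k most similar neighbors."""
    norms = np.linalg.norm(R, axis=1)
    sims = R @ R[u] / (norms * norms[u] + 1e-12)   # cosine similarity
    sims[u] = -1                                    # exclude the user itself
    neighbors = np.argsort(sims)[::-1][:k]          # neighborhood formation
    scores = sims[neighbors] @ R[neighbors]         # weighted neighbor votes
    scores = scores.astype(float)
    scores[R[u] > 0] = -1                           # drop already-seen items
    return np.argsort(scores)[::-1]

print(recommend(0, R))  # item 2 ranks first (liked by user 0's neighbor)
```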

Journal ArticleDOI
Sengul Dogan
TL;DR: Chaotic maps are used to improve the data hiding technique based on the genetic algorithm, and it is observed that the Gauss, logistic and tent maps are faster than the random function for the proposed data hiding method.
Abstract: Data hiding algorithms, for which many methods are described in the literature, are widely used in information security. In data hiding applications, optimization techniques are utilized in order to improve the success of algorithms. The genetic algorithm is one of the most widely used heuristic optimization techniques in these applications. Long running time is a disadvantage of the genetic algorithm. In this paper, chaotic maps are used to improve the data hiding technique based on the genetic algorithm. Peak signal to noise ratio (PSNR) is chosen as the fitness function. Secret data of different sizes are embedded into the cover object using the random function of MATLAB and chaotic maps. The randomness of the genetic algorithm is provided by different chaotic maps. The success of the proposed method is presented with comparative results. It is observed that the Gauss, logistic and tent maps are faster than the random function for the proposed data hiding method.
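A minimal sketch of the core idea: replace the pseudo-random position generator with a chaotic (logistic) map and score the result with PSNR, the fitness function used in the paper. The LSB embedding scheme, map parameters, and toy cover image are assumptions; the paper's full genetic algorithm loop is omitted.

```python
import numpy as np

def logistic_positions(n, size, x0=0.7, r=3.99):
    """Generate n distinct embedding positions with the logistic map
    x <- r*x*(1-x), used here in place of a pseudo-random function."""
    x, pos, seen = x0, [], set()
    while len(pos) < n:
        x = r * x * (1 - x)                          # chaotic iteration
        p = int(x * size) % size
        if p not in seen:
            seen.add(p)
            pos.append(p)
    return pos

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, 64 * 64, dtype=np.uint8)    # toy cover image
bits = rng.integers(0, 2, 200)                           # secret data bits
stego = cover.copy()
for p, b in zip(logistic_positions(len(bits), cover.size), bits):
    stego[p] = (stego[p] & 0xFE) | b                     # LSB embedding

# PSNR of stego vs. cover serves as the fitness value
mse = np.mean((cover.astype(float) - stego.astype(float)) ** 2)
print(10 * np.log10(255.0 ** 2 / mse))
```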

Journal ArticleDOI
TL;DR: The historical context and the conducive environment that accelerated this particular trend of drawing inspiring analogies or metaphors from various natural phenomena are analysed, and it is observed that stochastic implementations show greater resistance to changes in parameter values, still obtaining near optimal solutions.
Abstract: The application of metaheuristic algorithms to combinatorial optimization problems is on the rise and is growing more rapidly now than ever before. In this paper the historical context and the conducive environment that accelerated this particular trend of drawing inspiring analogies or metaphors from various natural phenomena are analysed. We have implemented the Ant System Model and the other variants of ACO including the 3-Opt, Max–Min, Elitist and the Rank Based Systems as described in their original works, and we discuss the missing pieces of Dorigo's Ant System Model. Extensive analysis of the variants on the Travelling Salesman Problem and the Job Shop Scheduling Problem shows how much they really contribute towards obtaining better solutions. The stochastic nature of these algorithms has been preserved to the maximum extent to keep the implementations as generic as possible. We observe that stochastic implementations show greater resistance to changes in parameter values, still obtaining near optimal solutions. We report how Polynomial Turing Reduction helps us to solve the Job Shop Scheduling Problem without making considerable changes to the implementation of the Travelling Salesman Problem, which could be extended to solve other NP-Hard problems. We elaborate on the various parallelization options based on the constraints enforced by strong scaling (fixed size problem) and weak scaling (fixed time problem). We also elaborate on how probabilistic behaviour helps us to strike a balance between intensification and diversification of the search space.
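The Ant System core that the paper re-implements reduces to two rules: construct tours with probability proportional to pheromone^alpha times (1/distance)^beta, then evaporate and deposit Q/length pheromone along each ant's tour. Below is a compact sketch on a random TSP instance; the parameter values are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
pts = rng.random((n, 2))                                   # toy TSP instance
d = np.linalg.norm(pts[:, None] - pts[None], axis=2) + np.eye(n)
tau = np.ones((n, n))                                      # pheromone matrix
alpha, beta, rho, Q = 1.0, 2.0, 0.5, 1.0

def construct_tour():
    tour, unvisited = [0], set(range(1, n))
    while unvisited:
        i, cand = tour[-1], list(unvisited)
        w = tau[i, cand] ** alpha * (1.0 / d[i, cand]) ** beta
        nxt = cand[rng.choice(len(cand), p=w / w.sum())]   # roulette wheel
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

best = float("inf")
for _ in range(100):
    tours = [construct_tour() for _ in range(10)]          # 10 ants
    lengths = [sum(d[t[k], t[(k + 1) % n]] for k in range(n)) for t in tours]
    tau *= 1.0 - rho                                       # evaporation
    for t, L in zip(tours, lengths):
        for k in range(n):                                 # deposit Q / length
            tau[t[k], t[(k + 1) % n]] += Q / L
    best = min(best, min(lengths))
print(round(best, 3))
```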

Journal ArticleDOI
TL;DR: A survey of human action retrieval studies is presented in which the methodologies are analyzed from action representation and retrieval perspectives; limitations and common datasets of human action retrieval are introduced before describing the state-of-the-art methodologies.
Abstract: Today, the number of available videos on the Internet has increased significantly. Content-based video retrieval is used for finding users' desired items among these big video data. Memorizing details of the videos and the intricate relations between objects included in videos can be considered the major challenges of this big data topic. A large portion of video data relates to humans. Thus, human action retrieval has been introduced as a new big data topic that seeks to find video objects based on the included human action. Human action retrieval has been applied in different domains such as video search, intelligent human–computer interaction, robotics, video surveillance and human behavior analysis. There are challenges such as variations in rotation, scale and style, in addition to the above-mentioned big video data challenges, that can affect retrieval accuracy. In this paper, a survey of human action retrieval studies is presented in which the methodologies are analyzed from action representation and retrieval perspectives. Moreover, limitations and common datasets of human action retrieval are introduced before describing the state-of-the-art methodologies.

Journal ArticleDOI
TL;DR: This work presents the main approaches mixing DE with global algorithms, DE with local algorithms, and DE with global and local algorithms, with special attention given to situations in which DE is employed as a local search procedure or DE principles are included in other global search methods.
Abstract: Improving the performance of optimization algorithms is a trend with continuous growth, powerful and stable algorithms being always in demand, especially nowadays when, in the majority of cases, computational power is not an issue. In this context, differential evolution (DE) is optimized by employing different approaches belonging to different research directions. The focus of the current review is on two main directions: (a) the replacement of manual control parameter setting with adaptive and self-adaptive methods; and (b) hybridization with other algorithms. The control parameters have a big influence on an algorithm's performance, their correct setting being a crucial aspect when striving to obtain optimal solutions. Since their values are problem dependent, setting them is not an easy task. The trial and error method initially used is time and resource consuming and, at the same time, does not guarantee optimal results. Therefore, new approaches were proposed, automatic control being one of the best solutions developed by researchers. Concerning hybridization, the aim was to combine two or more algorithms in order to eliminate or reduce the drawbacks of each individual algorithm. In this manner, different combinations at different levels were proposed. This work presents the main approaches mixing DE with global algorithms, DE with local algorithms, and DE with global and local algorithms. In addition, special attention is given to situations in which DE is employed as a local search procedure or DE principles are included in other global search methods.
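Direction (a), replacing manual parameter setting with self-adaptation, is easiest to see in the well-known jDE scheme: each individual carries its own F and CR, occasionally resampled and kept only when the trial vector wins. A compact sketch, assuming DE/rand/1/bin and a sphere objective; it illustrates the mechanism, not any specific paper's implementation.

```python
import numpy as np

def jde(f, lo, hi, pop_size=20, gens=300, seed=0):
    """DE/rand/1/bin with jDE-style self-adaptive F and CR (sketch)."""
    rng = np.random.default_rng(seed)
    dim = len(lo)
    X = rng.uniform(lo, hi, (pop_size, dim))
    F = np.full(pop_size, 0.5); CR = np.full(pop_size, 0.9)
    fit = np.array([f(x) for x in X])
    for _ in range(gens):
        for i in range(pop_size):
            # self-adaptation: occasionally resample this individual's F, CR
            Fi = rng.uniform(0.1, 1.0) if rng.random() < 0.1 else F[i]
            CRi = rng.uniform(0.0, 1.0) if rng.random() < 0.1 else CR[i]
            a, b, c = rng.choice([j for j in range(pop_size) if j != i],
                                 3, replace=False)
            mutant = X[a] + Fi * (X[b] - X[c])            # DE/rand/1
            cross = rng.random(dim) < CRi                  # binomial crossover
            cross[rng.integers(dim)] = True
            trial = np.clip(np.where(cross, mutant, X[i]), lo, hi)
            ft = f(trial)
            if ft <= fit[i]:      # parameters survive only with the winner
                X[i], fit[i], F[i], CR[i] = trial, ft, Fi, CRi
    return X[fit.argmin()], fit.min()

sphere = lambda x: float(np.sum(x ** 2))
_, v = jde(sphere, np.full(5, -5.0), np.full(5, 5.0))
print(round(v, 8))    # close to 0
```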

Journal ArticleDOI
TL;DR: This work reviews recent years' research and applications of multi agent systems in healthcare published in different research journals and international conferences or implemented practically, and provides recommendations for multi agent systems applied in the healthcare domain.
Abstract: The successful use of intelligent agents in healthcare has attracted researchers to apply this emerging software engineering paradigm in more advanced and complex applications. The main success factor is the natural mapping of real world medical problems into the cyber world. Multi-agent architecture can easily model heterogeneous, distributed and autonomous health care systems. Multi agent systems have been applied from single healthcare activities, like knowledge based medical systems, to complex, multi-component systems like complete healthcare units. The use of multi agent systems in the health care domain has also opened the way to new applications like personalized and socialized health care systems. This versatile use of multi agent systems has also posed new problems for researchers, such as security, communication, and various social issues. This work reviews recent years' research and applications of multi agent systems in healthcare published in different research journals and international conferences or implemented practically. We reviewed five subdomains and three systems in each subdomain. A set of common parameters of these systems has been extracted and compared to analyze the systems' merits and deficiencies. Based on our analysis, we have provided recommendations for multi agent systems applied in the healthcare domain. Future research directions for interested researchers and practitioners are also discussed. As our own future research work, we intend to study healthcare and multi agent systems in e-commerce.

Journal ArticleDOI
TL;DR: It is concluded that models such as the Five Factor Model, the Myers–Briggs Type Indicator personality model, and Felder–Silverman learning styles model have the two most important features, which are simplicity and comprehensiveness, which have made these psychology models the most favorable in the virtual world.
Abstract: Today, one of the most important and challenging issues in artificial intelligence is modeling human behavior in virtual environments. Furthermore, studying e-learning environments, which requires understanding human behaviors, is in great demand in computer science. Thus, considering human behavior factors, such as personality, mood, and emotion, and modeling them in e-learning environments is a challenging issue in artificial intelligence. The purpose of this paper is to review the psychological models of personality used in computer science. In addition, the most important applications of personality models and their directly related topics in learning, i.e. learning style issues in e-learning environments, are presented. The study shows that researchers tend to use models that are simple to implement in the virtual world and are as comprehensive as possible to cover all the features of human behavior. Finally, we concluded that models such as the Five Factor Model, the Myers–Briggs Type Indicator personality model, and the Felder–Silverman learning styles model have the two most important features, which are simplicity and comprehensiveness. These two features have made these psychology models the most favorable in the virtual world.

Journal ArticleDOI
TL;DR: The aim of this article is to show that the firefly algorithm is able to find multiple solutions in multimodal problems, and it is shown that the proposed algorithm has a high ability to find the multimodal optimal points.
Abstract: Optimization has been one of the most significant research fields in the past few decades, and most real-world problems are multimodal optimization problems. The prime target of multimodal optimization is to find multiple global and local optima of a problem in one single run. Multimodal optimization problems have drawn the attention of the evolutionary algorithms community. The firefly algorithm is a recently proposed stochastic optimization technique and a global search algorithm. On the other hand, because this algorithm has multimodal characteristics, it has the capacity and capability to be turned into a multimodal optimization method. The aim of this article is to show that the firefly algorithm is able to find multiple solutions in multimodal problems. Therefore, in this study, a new technique is introduced for multimodal optimization. In the proposed algorithm, the multimodal optima are detected through separately evolving sub-populations. A stability criterion is used to determine the stability or instability of a sub-population. If a sub-population is regarded as stable, its optimum is stored in an external memory called the archive. After some iterations, the archive includes all of the optima. The proposed algorithm utilizes a simulated annealing local optimization algorithm to increase the search power, accuracy and speed of the algorithm. The proposed algorithm is tested on a set of criterion functions. The results show that the proposed algorithm has a high ability to find the multimodal optimal points.
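The underlying firefly movement rule that the proposed multimodal variant builds on is shown below; the sub-population stability test, archive, and simulated annealing stages of the paper are omitted, and all parameters are illustrative assumptions. On a two-optimum toy function, distance-decayed attractiveness already lets separate clusters form, which is the property the multimodal extension exploits.

```python
import numpy as np

def firefly_step(X, f, rng, beta0=1.0, gamma=1.0, alpha=0.05):
    """Basic firefly movement (sketch): each firefly moves toward every
    brighter one with attractiveness beta0*exp(-gamma*r^2), which decays
    with distance, plus a small random walk."""
    bright = np.array([f(x) for x in X])
    Xn = X.copy()
    for i in range(len(X)):
        for j in range(len(X)):
            if bright[j] < bright[i]:          # firefly j is brighter (minimizing)
                beta = beta0 * np.exp(-gamma * np.sum((X[i] - X[j]) ** 2))
                Xn[i] += beta * (X[j] - X[i]) + alpha * rng.normal(size=X.shape[1])
    return Xn

f = lambda x: float((x[0] ** 2 - 1.0) ** 2)    # two optima, at x = -1 and x = +1
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, (15, 1))
for _ in range(100):
    X = firefly_step(X, f, rng)
print(sorted(round(float(x[0]), 2) for x in X))  # values gather near -1 and +1
```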

Journal ArticleDOI
TL;DR: This paper provides a critical review of visual descriptors used for scene categorization, from both methodological and experimental perspectives, and presents an empirical study conducted on four benchmark data sets assessing the classification accuracy and class separability of state-of-the-art visual descriptors.
Abstract: Humans are endowed with the ability to grasp the overall meaning or the gist of a complex visual scene at a glance. We need only a fraction of a second to decide if a scene is indoors, outdoors, on a busy street, or on a clear beach. In recent years, computational gist recognition or scene categorization has been actively pursued, given its numerous applications in image and video search, surveillance, and assistive navigation. Many visual descriptors have been developed to address the challenges in scene categorization, including the large number of semantic categories and the tremendous variations caused by imaging conditions. This paper provides a critical review of visual descriptors used for scene categorization, from both methodological and experimental perspectives. We present an empirical study conducted on four benchmark data sets assessing the classification accuracy and class separability of state-of-the-art visual descriptors.

Journal ArticleDOI
TL;DR: In practice, selecting the less biased extrapolation of the Species Accumulation Curve allows forecasting the supplementary sampling effort necessary to reach a given increase of sampling completeness more accurately than the usual procedures, which involve arbitrarily chosen empirical models.
Abstract: Under-sampling has become the common situation for an increasing share of biodiversity surveys, as more and more speciose assemblages and increasingly complex taxonomic groups are progressively addressed. Accordingly, (i) extrapolating the Species Accumulation Curve and (ii) estimating the total species richness of partially sampled species assemblages (or taxonomic groups) both become major issues for many naturalists nowadays. Numerous different solutions have been proposed to address these issues. Yet, no general consensus has been reached regarding which particular solution should be preferred in each case. This unsatisfactory situation follows from the empirical nature of traditional approaches, especially regarding the extrapolation of the Species Accumulation Curve. Fortunately, reconsidering the problem on a decidedly more theoretical basis, including the consideration of general mathematical relationships universally constraining the expression of any theoretical (or rarefied) Species Accumulation Curve, allows a more relevant modeling of the extrapolation of species accumulation. In turn, this theoretical approach provides a rational key to select the more appropriate, less biased type of species-richness estimator and the associated, less biased expression for the extrapolation of the Species Accumulation Curve, according to the context of sampling. In particular, the wide relevance of the series of 'Jackknife-type' estimators is highlighted (as had already been argued for specific cases, on a semi-empirical basis). In practice, selecting the less biased extrapolation of the Species Accumulation Curve allows forecasting the supplementary sampling effort necessary to reach a given increase of sampling completeness more accurately than the usual procedures, which involve arbitrarily chosen empirical models.
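The highlighted Jackknife-type estimators have simple closed forms in terms of the numbers of species found in exactly one (f1) or two (f2) sampling units among m units (the classic incidence-based Burnham-Overton forms). A sketch with illustrative numbers; the example values are assumptions, not data from the paper.

```python
def jackknife_richness(s_obs, f1, f2, m, order=1):
    """First- and second-order Jackknife species-richness estimators
    (incidence-based forms; m = number of sampling units, f1/f2 =
    numbers of species found in exactly one/two units).

    Jack1 = S_obs + f1*(m-1)/m
    Jack2 = S_obs + f1*(2m-3)/m - f2*(m-2)^2 / (m*(m-1))
    """
    if order == 1:
        return s_obs + f1 * (m - 1) / m
    return s_obs + f1 * (2 * m - 3) / m - f2 * (m - 2) ** 2 / (m * (m - 1))

# 50 observed species across 10 quadrats, 12 singletons, 6 doubletons
print(jackknife_richness(50, 12, 6, 10))            # Jack1 = 60.8
print(jackknife_richness(50, 12, 6, 10, order=2))   # Jack2 ~ 66.1
```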

Journal ArticleDOI
TL;DR: In this paper, a detailed assessment is presented of the feature selection techniques squared Pearson's correlation (R²), principal component analysis (PCA), kernel principal component analysis (kPCA) and fast correlation-based filter (FCBF), and the learning algorithms linear discriminant analysis (LDA), support vector machines (SVM), and feed forward neural network (NN).
Abstract: Recognition of motor imagery tasks (MI) from electroencephalographic (EEG) signals is crucial for developing rehabilitation and motor assisted devices based on brain-computer interfaces (BCI). Here we consider the challenge of learning a classifier based on relevant patterns of the EEG signals; this learning step typically involves both feature selection as well as a base learning algorithm. However, in many cases it is not clear what combination of these methods will yield the best classifier. This paper contributes a detailed assessment of feature selection techniques, viz. squared Pearson's correlation (R²), principal component analysis (PCA), kernel principal component analysis (kPCA) and fast correlation-based filter (FCBF), and the learning algorithms linear discriminant analysis (LDA), support vector machines (SVM), and feed forward neural network (NN). A systematic evaluation of the combinations of these methods was performed in three two-class classification scenarios: rest vs. movement, upper vs. lower limb movement, and right vs. left hand movement. FCBF in combination with SVM achieved the best results, with classification accuracies of 81.45%, 77.23% and 68.71% in the three scenarios, respectively. Importantly, FCBF determines, based on the feature set, whether a classifier can be learned, and if so, automatically identifies the subset of relevant and non-correlated features. This suggests that FCBF is a powerful method for BCI systems based on MI. Knowledge gained here about procedural combinations has the potential to produce useful BCI tools that can provide effective motor control for the users.

Journal ArticleDOI
TL;DR: This paper reviews image denoising algorithms based on wavelet, ridgelet, curvelet and contourlet transforms, benchmarks them using published results, and introduces a new robust parameter, Performance measure `P'.
Abstract: Digital images always inherit some extent of noise. This noise affects the information content of the image, so its removal is very important for extracting useful information from an image. However, noise cannot be eliminated entirely; it can only be minimized, due to the overlap between signal and noise characteristics. This paper reviews image denoising algorithms based on wavelet, ridgelet, curvelet and contourlet transforms and benchmarks them using published results. This article presents the techniques, the parameters used for benchmarking, denoising performance on standard images, and a comparative analysis of the same. This paper highlights various trends in denoising techniques, based on which it is concluded that the single parameter Peak Signal to Noise Ratio (PSNR) cannot fully represent denoising performance unless other parameters are consistent. A new robust parameter, Performance measure `P', is presented as a measure of denoising performance on the basis of a new concept named the Noise Improvement Rectangle, followed by its analysis. The results of the published algorithms are presented in tabular format in terms of PSNR and P, which gives readers a bird's eye view of the research work in the field of image denoising and restoration.
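A minimal sketch of the transform-domain approach being benchmarked: soft-threshold wavelet denoising scored by PSNR, assuming the PyWavelets (pywt) package. The universal threshold and toy image are illustrative assumptions, and the paper's proposed measure P is not reproduced here.

```python
import numpy as np
import pywt  # PyWavelets

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE)."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def wavelet_denoise(img, wavelet="db4", level=2, sigma=10.0):
    """Soft-threshold the detail coefficients, keep the approximation."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    thr = sigma * np.sqrt(2 * np.log(img.size))        # universal threshold
    den = [coeffs[0]] + [tuple(pywt.threshold(c, thr, mode="soft") for c in d)
                         for d in coeffs[1:]]
    return np.clip(pywt.waverec2(den, wavelet), 0, 255)

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0, 255, 64), (64, 1))      # smooth toy image
noisy = clean + rng.normal(0, 10, clean.shape)
print(round(psnr(clean, noisy), 1), round(psnr(clean, wavelet_denoise(noisy)), 1))
```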

Journal ArticleDOI
TL;DR: In this paper, Dennis et al. used chitosan and silica nanocomposites (CISNC) as carriers for biological control agents protecting tomato seedlings against R. solanacearum wilt disease.
Abstract: Background: Biological control holds promise in managing bacterial wilt disease. However, its efficacy is limited by harsh environmental conditions when applied without suitable carrier materials. Aim: The study entailed the synthesis of nanocarrier materials for biological control agents (BCAs) using chitosan and silica nanocomposites. Site and Duration: The experiments were carried out at Jomo Kenyatta University of Agriculture and Technology for a period of two years, June 2013 to June 2015. Methodology: The experiments were conducted using a completely randomized design with three replications. Deacetylation, functionalization and immobilization of chitin on mesoporous silica nanoparticles (MSN) to form chitosan immobilized silica nanocomposite (CISNC) gel was done. Results: This resulted in the formation of chitosan nanoparticles and CISNC with crystallite sizes of 2.8 and 4.4 nm respectively. BCAs were adsorbed on CISNC gel. Characterization of the bionanocomposites showed that they had physisorption properties and were thus ideal carriers for BCAs. CISNC gel had the highest significant (P=.05) sorption properties, with 75% adsorption and 65% desorption of BCAs. Efficacy trials were done by in vitro pathogen inhibition and greenhouse bioassays using tomato seedlings. Adsorption of BCAs on CISNC gel significantly (P=.05) increased the inhibition efficacy of BCAs on R. solanacearum from 50 to 70%. This was attributed to the antibacterial effect of the individual substances and the overall synergy acquired. Further, BCA-CISNC gel forms a film around root hairs, initiates a fast wound healing mechanism and induces a prophylactic effect on tomato seedlings challenged with the R. solanacearum pathogen, decreasing wilting incidence from 45 to 25%. Additionally, the BCA-CISNC complex significantly (P=.05) increased tomato seed germination from 70 to 80% and growth rate from 12 to 15% due to enhanced water utilization efficiency, induced phytohormones and nutritional benefit. BCAs also aided faster nutrient release, absorption and utilization by tomato plants. Conclusion: Adsorption of bacterial, fungal and phage biocontrol agents on CISNC gel, a complex hitherto not reported in R. solanacearum disease control, enhanced microbial efficacy against the pathogen and increased tomato productivity.

Journal ArticleDOI
TL;DR: This paper proposes a selection hyper-heuristic process with the intention of raising the level of generality and solving consistently well a wide range of constraint satisfaction problems, and confirms the robustness of the proposed approach and how high-level heuristics trained for some specific classes of instances can also be applied to unseen classes without significant loss of efficiency.
Abstract: Selection hyper-heuristics are a technology for optimization in which a high-level mechanism controls low-level heuristics, so as to be capable of solving a wide range of problem instances efficiently. Hyper-heuristics are used to generate a solution process rather than producing an immediate solution to a given problem. This process is a re-usable mechanism that can be applied both to seen and unseen problem instances. In this paper, we propose a selection hyper-heuristic process with the intention of raising the level of generality and solving consistently well a wide range of constraint satisfaction problems. The hyper-heuristic technique is based on a messy genetic algorithm that generates high-level heuristics formed by rules (condition → heuristic). The high-level heuristics produced are seen to be good at solving instances from certain parts of the parameterized space of problems, producing results using effort comparable to the best single heuristic per instance. This is beneficial, as the choice of best heuristic varies from instance to instance, so the high-level heuristics are definitely preferable to selecting any one low-level heuristic for all instances. The results confirm the robustness of the proposed approach and show how high-level heuristics trained for some specific classes of instances can also be applied to unseen classes without significant loss of efficiency. This paper contributes to the understanding of heuristics and the way they can be used in a collaborative way to benefit from their combined strengths.

Journal ArticleDOI
TL;DR: A multilayered fault estimation classifier based on the dominance-based rough set is proposed, which provides an effective solution to the system's operator for the proper diagnosis of faults and intrusions by classifying the state of the system.
Abstract: Nowadays, power grid monitoring systems are shifting towards more disseminated and distributed operations. The diagnosis of faults using knowledge discovery techniques has become an essential component of the process, given the challenges faced by electrical power monitoring systems. The system's operators employ various energy management techniques which play important roles in the overall management, reliability and operational sustainability of smart grid utilities. Sometimes, these systems are disrupted by events like a short circuit in the system or an external incursion that could pose a threat to public safety as well as to the critical infrastructure of the grid. This paper aims at providing a robust system for fault diagnosis using the status of intelligent electronic devices and circuit breakers, which can be tripped by any kind of fault. To protect the system from vulnerabilities and different kinds of faults, a multilayered fault estimation classifier based on the dominance-based rough set is proposed. This technique provides an effective solution to the system's operator for the proper diagnosis of faults and intrusions by classifying the state of the system. In addition, the operator can take preventative measures to protect the system from further damage.

Journal ArticleDOI
TL;DR: This work presents an industrial application that fills this research gap and thus provides a solution with high practical impact for surviving the tough competition of the automotive industry.
Abstract: The automotive industry faces the strongest competition ever, as this sector is being disrupted by newly arising competitors. Providing services with maximum customer satisfaction will be one of the most crucial competitive advantages in the future. Around 1 terabyte of objective data is created every hour today. This volume will grow significantly in the future with the increasing number of connected services within the automotive industry. However, customer satisfaction determination is today based solely on subjective questionnaires, without taking the vast amount of objective sensor and service process data into account. This work presents an industrial application that fills this research gap and thus provides a solution with high practical impact for surviving the tough competition of the automotive industry. The work addresses these fundamental business questions: 1) Can dissatisfied customers be classified based on data that is produced during every service visit? 2) Can dissatisfaction indicators be derived from service process data? A machine learning problem was set up that compared 5 classifiers and analyzed data from 19,008 real service visits at an automotive company. The 105 extracted features were drawn from the most significant available sources: warranty, diagnostic, dealer system and general vehicle data. The best result for customer dissatisfaction classification was 88.8%, achieved with the SVM classifier (RBF kernel). Furthermore, the 46 most promising indicators of dissatisfaction were identified by evolutionary feature selection. Our system was capable of classifying customer dissatisfaction solely based on the objective data that is generated by almost every service visit. As the amount of these data is continuously growing, we expect that the presented data-driven approach can achieve even better results in the future with a higher amount of data.

Journal ArticleDOI
TL;DR: A generic framework for data preprocessing is proposed, based on a survey of data mining experts as well as a literature and software review; the framework enables pipelining preprocessing algorithms and methods, facilitating further automated preprocessing design and the selection of a suitable preprocessing stream.
Abstract: Clustering is among the most popular data mining algorithm families. Before applying clustering algorithms to datasets, it is usually necessary to preprocess the data properly. Data preprocessing is a crucial, yet often neglected, step in data mining. Although preprocessing techniques and algorithms are well known, the preprocessing process is very complex and usually takes a lot of time. Instead of handling preprocessing more systematically, it is usually undervalued, i.e. more emphasis is put on choosing the appropriate clustering algorithm and setting its parameters. In our opinion, this is not because preprocessing is less important, but because it is difficult to choose the best sequence of preprocessing algorithms. We argue that it is important to better standardize this process so that it is performed efficiently. Therefore, this paper proposes a generic framework for data preprocessing. It is based on a survey of data mining experts, as well as a literature and software review. The framework enables pipelining preprocessing algorithms and methods, which facilitates further automated preprocessing design and the selection of a suitable preprocessing stream. The proposed framework is easily extensible, so it can be applied to other data mining algorithm families that have their own idiosyncrasies.
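The pipelining idea maps naturally onto, for example, scikit-learn's Pipeline. The stream below (imputation, scaling, PCA, then k-means) is one illustrative preprocessing sequence feeding a clustering algorithm, not the framework proposed in the paper.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# one possible preprocessing stream feeding a clustering algorithm
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),   # handle missing values
    ("scale", StandardScaler()),                  # normalize feature ranges
    ("reduce", PCA(n_components=2)),              # dimensionality reduction
    ("cluster", KMeans(n_clusters=3, n_init=10, random_state=0)),
])

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[rng.random(X.shape) < 0.05] = np.nan            # inject missing entries
labels = pipe.fit_predict(X)
print(np.bincount(labels))                        # cluster sizes
```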