
Showing papers in "Expert Systems With Applications in 2014"


Journal ArticleDOI
TL;DR: The purpose is to review the most up-to-date state of the art of GVRP, discuss how the traditional VRP variants can interact with GVRP, and offer insight into the next wave of research into GVRP.
Abstract: Green Logistics has emerged as the new agenda item in supply chain management. The traditional objective of distribution management has been upgraded to minimizing system-wide costs related to economic and environmental issues. Reflecting the environmental sensitivity of vehicle routing problems (VRP), an extensive literature review of Green Vehicle Routing Problems (GVRP) is presented. We provide a classification that categorizes GVRP into Green-VRP, the Pollution Routing Problem, and VRP in Reverse Logistics, and suggest research gaps between its current state and richer models describing the complexity of real-world cases. The purpose is to review the most up-to-date state of the art of GVRP, discuss how the traditional VRP variants can interact with GVRP, and offer insight into the next wave of research into GVRP. It is hoped that OR/MS researchers, together with logistics practitioners, can be inspired to cooperate and contribute to a sustainable industry.

741 citations


Journal ArticleDOI
TL;DR: This paper proposes a model where widely known classification algorithms in combination with similarity techniques and prediction mechanisms provide the necessary means for retrieving recommendations in RSs, and adopts the widely known dataset provided by the GroupLens research group.
Abstract: A recommender system (RS) aims to provide personalized recommendations to users for specific items (e.g., music, books). Popular techniques involve content-based (CB) models and collaborative filtering (CF) approaches. In this paper, we deal with a very important problem in RSs: the cold start problem. This problem is related to recommendations for novel users or new items. In the case of new users, the system does not have information about their preferences in order to make recommendations. We propose a model where widely known classification algorithms, in combination with similarity techniques and prediction mechanisms, provide the necessary means for retrieving recommendations. The proposed approach incorporates classification methods in a pure CF system, while the use of demographic data helps identify other users with similar behavior. We demonstrate the performance of the proposed system through a large number of experiments, adopting the widely known dataset provided by the GroupLens research group. We reveal the advantages of the proposed solution by providing satisfactory numerical results in different experimental scenarios.

515 citations
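
For readers who want a concrete picture of the idea described above, the following is a minimal sketch (not the authors' exact pipeline): demographic attributes are used to find similar users, whose ratings are then aggregated for a brand-new user. The data layout, attribute set, and neighbour count are all illustrative assumptions.

```python
# Minimal sketch: alleviating the new-user cold-start problem by finding
# demographically similar users and averaging their ratings.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Hypothetical data: 100 users x 3 demographic attributes (age band, gender, occupation id)
demographics = rng.integers(0, 10, size=(100, 3)).astype(float)
# Sparse rating matrix: 100 users x 20 items, 0 = unrated
ratings = rng.choice([0, 1, 2, 3, 4, 5], p=[0.7, 0.06, 0.06, 0.06, 0.06, 0.06],
                     size=(100, 20)).astype(float)

knn = NearestNeighbors(n_neighbors=10).fit(demographics)

def predict_for_new_user(new_demographics, item):
    """Predict a rating for a brand-new user from demographic neighbours."""
    _, idx = knn.kneighbors(new_demographics.reshape(1, -1))
    neighbour_ratings = ratings[idx[0], item]
    rated = neighbour_ratings[neighbour_ratings > 0]      # ignore unrated entries
    # Fall back to the item's global mean if no neighbour rated it.
    return rated.mean() if rated.size else ratings[:, item][ratings[:, item] > 0].mean()

print(predict_for_new_user(np.array([3.0, 1.0, 7.0]), item=5))
```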


Journal ArticleDOI
TL;DR: A hybrid intelligent machine learning technique for computer-aided detection system for automatic detection of brain tumor through magnetic resonance images is proposed and demonstrates its effectiveness compared with the other machine learning recently published techniques.
Abstract: Computer-aided detection/diagnosis (CAD) systems can enhance the diagnostic capabilities of physicians and reduce the time required for accurate diagnosis. The objective of this paper is to review recently published segmentation and classification techniques and their state of the art for human brain magnetic resonance images (MRI). The review reveals that CAD systems for human brain MRI are still an open problem. In light of this review, we propose a hybrid intelligent machine learning technique for a computer-aided detection system for automatic detection of brain tumors in magnetic resonance images. The proposed technique is based on the following computational methods: the feedback pulse-coupled neural network for image segmentation, the discrete wavelet transform for feature extraction, principal component analysis for reducing the dimensionality of the wavelet coefficients, and the feed-forward back-propagation neural network to classify inputs into normal or abnormal. The experiments were carried out on 101 images consisting of 14 normal and 87 abnormal (malignant and benign tumors) from a real human brain MRI dataset. The classification accuracy on both training and test images is 99%, which is significantly good. Moreover, the proposed technique demonstrates its effectiveness compared with other recently published machine learning techniques. The results revealed that the proposed hybrid approach is accurate, fast, and robust. Finally, possible future directions are suggested.

482 citations
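
As a rough illustration of the feature-extraction and classification stages named in the abstract, here is a hedged sketch on synthetic data: DWT approximation coefficients feed PCA, then a feed-forward neural network. The pulse-coupled-neural-network segmentation stage is omitted, and the image sizes, wavelet choice, and network shape are assumptions.

```python
# Sketch of the DWT -> PCA -> neural-network stages on synthetic stand-in data.
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
images = rng.random((101, 64, 64))         # stand-in for segmented MRI slices
labels = rng.integers(0, 2, size=101)      # 0 = normal, 1 = abnormal (synthetic)

def dwt_features(img, level=3):
    """Approximation coefficients of a level-3 2-D discrete wavelet transform."""
    coeffs = pywt.wavedec2(img, "haar", level=level)
    return coeffs[0].ravel()               # keep the low-frequency sub-band

X = np.array([dwt_features(img) for img in images])

model = make_pipeline(PCA(n_components=10),
                      MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                    random_state=0))
model.fit(X, labels)
print(model.score(X, labels))
```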


Journal ArticleDOI
TL;DR: A comparative analysis of market-prediction systems based on online text mining that expands on the theoretical and technical foundations behind each; it should help the research community structure this emerging field and identify the exact aspects that require further research and are of special significance.
Abstract: The quality of the interpretation of the sentiment in the online buzz in the social media and the online news can determine the predictability of financial markets and cause huge gains or losses. That is why a number of researchers have turned their full attention to the different aspects of this problem lately. However, to the best of our knowledge, there is no well-rounded theoretical and technical framework for approaching the problem. We believe the existing lack of clarity on the topic is due to its interdisciplinary nature, which involves at its core both behavioral-economic topics and artificial intelligence. We dive deeper into this interdisciplinary nature and contribute to the formation of a clear frame of discussion. We review the related works on market prediction based on online text mining and produce a picture of the generic components that they all have. Furthermore, we compare each system with the rest and identify their main differentiating factors. Our comparative analysis of the systems extends to the theoretical and technical foundations behind each. This work should help the research community to structure this emerging field and identify the exact aspects which require further research and are of special significance.

476 citations


Journal ArticleDOI
TL;DR: A hybrid of K-means and support vector machine (K-SVM) algorithms is developed to diagnose breast cancer based on the extracted tumor features and shows time savings during the training phase.
Abstract: With the development of clinical technologies, different tumor features have been collected for breast cancer diagnosis. Filtering all the pertinent feature information to support clinical disease diagnosis is a challenging and time-consuming task. The objective of this research is to diagnose breast cancer based on the extracted tumor features. Feature extraction and selection are critical to the quality of classifiers constructed through data mining methods. To extract useful information and diagnose the tumor, a hybrid of K-means and support vector machine (K-SVM) algorithms is developed. The K-means algorithm is utilized to recognize the hidden patterns of the benign and malignant tumors separately. The membership of each tumor to these patterns is calculated and treated as a new feature in the training model. Then, a support vector machine (SVM) is used to obtain the new classifier to differentiate the incoming tumors. Based on 10-fold cross validation, the proposed methodology improves the accuracy to 97.38% when tested on the Wisconsin Diagnostic Breast Cancer (WDBC) data set from the University of California, Irvine machine learning repository. Six abstract tumor features are extracted from the 32 original features for the training phase. The results not only illustrate the capability of the proposed approach for breast cancer diagnosis, but also show time savings during the training phase. Physicians can also benefit from the mined abstract tumor features by better understanding the properties of different types of tumors.

440 citations
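
A minimal sketch of the K-SVM idea, under the assumption that cluster distances stand in for the paper's pattern memberships: K-means models are fitted per class, distances to all centres become new features, and an SVM is trained on them. The synthetic dataset and cluster counts are illustrative.

```python
# Sketch of K-SVM: per-class K-means patterns, cluster distances as features, then SVM.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=30, random_state=0)

# Learn hidden patterns of each class separately (3 clusters per class here).
km_benign = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X[y == 0])
km_malign = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X[y == 1])

def membership_features(X):
    """Distances to all benign and malignant cluster centres."""
    return np.hstack([km_benign.transform(X), km_malign.transform(X)])

clf = SVC(kernel="rbf").fit(membership_features(X), y)
print(clf.score(membership_features(X), y))
```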


Journal ArticleDOI
TL;DR: Based on a new effective and feasible representation of uncertain information, called D numbers, a D-AHP method is proposed for the supplier selection problem, which extends the classical analytic hierarchy process (AHP) method.
Abstract: Supplier selection is an important issue in supply chain management (SCM) and is essentially a multi-criteria decision-making problem. Supplier selection depends heavily on experts' assessments. This process inevitably involves various types of uncertainty, such as imprecision, fuzziness and incompleteness, due to the limitations of human subjective judgment. However, the existing methods cannot adequately handle these types of uncertainty. In this paper, based on a new, effective and feasible representation of uncertain information called D numbers, a D-AHP method is proposed for the supplier selection problem, which extends the classical analytic hierarchy process (AHP) method. Within the proposed method, a D-numbers-extended fuzzy preference relation is used to represent the decision matrix of pairwise comparisons given by experts. An illustrative example is presented to demonstrate the effectiveness of the proposed method.

419 citations


Journal ArticleDOI
TL;DR: This review pursues a twofold goal: to preserve and enhance the chronicle of recent educational data mining (EDM) advances, and to provide an analysis of EDM strengths, weaknesses, opportunities, and threats, whose factors represent, in a sense, future work to be fulfilled.
Abstract: This review pursues a twofold goal: the first is to preserve and enhance the chronicle of recent educational data mining (EDM) advances; the second is to organize, analyze, and discuss the content of the review based on the outcomes produced by a data mining (DM) approach. Thus, as a result of the selection and analysis of 240 EDM works, an EDM work profile was compiled to describe 222 EDM approaches and 18 tools. The profile of the EDM works was organized as a raw database, which was transformed into an ad-hoc database suitable for mining. As a result of the execution of statistical and clustering processes, a set of educational functionalities was found, a realistic pattern of EDM approaches was discovered, and two patterns of value-instances to depict EDM approaches based on descriptive and predictive models were identified. One key finding is that most of the EDM approaches are grounded on a basic set composed of three kinds each of educational systems, disciplines, tasks, methods, and algorithms. The review concludes with a snapshot of the surveyed EDM works and provides an analysis of EDM strengths, weaknesses, opportunities, and threats, whose factors represent, in a sense, future work to be fulfilled.

414 citations


Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed hybrid intrusion detection method is better than the conventional methods in terms of the detection rate for both unknown and known attacks while it maintains a low false positive rate.
Abstract: In this paper, a new hybrid intrusion detection method that hierarchically integrates a misuse detection model and an anomaly detection model in a decomposition structure is proposed. First, a misuse detection model is built based on the C4.5 decision tree algorithm, and then the normal training data is decomposed into smaller subsets using the model. Next, multiple one-class SVM models are created for the decomposed subsets. As a result, each anomaly detection model not only uses the known attack information indirectly, but also builds the profiles of normal behavior very precisely. The proposed hybrid intrusion detection method was evaluated by conducting experiments with the NSL-KDD data set, a modified version of the well-known KDD Cup 99 data set. The experimental results demonstrate that the proposed method is better than the conventional methods in terms of the detection rate for both unknown and known attacks, while it maintains a low false positive rate. In addition, the proposed method significantly reduces the high time complexity of the training and testing processes. Experimentally, the training and testing time of the anomaly detection model is shown to be only 50% and 60%, respectively, of the time required for the conventional models.

414 citations
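
The hierarchical decomposition described above can be sketched as follows, with scikit-learn's CART decision tree standing in for C4.5 and synthetic data standing in for NSL-KDD; the tree size and the one-class SVM's nu are assumed values.

```python
# Sketch of the hybrid: a decision tree decomposes normal data into leaf
# subsets, and a one-class SVM anomaly model is fitted to each subset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import OneClassSVM
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)  # 1 = attack

tree = DecisionTreeClassifier(max_leaf_nodes=4, random_state=0).fit(X, y)

# Decompose normal traffic by the tree leaf it falls into.
normal = X[y == 0]
leaves = tree.apply(normal)
ocsvm_per_leaf = {leaf: OneClassSVM(nu=0.1).fit(normal[leaves == leaf])
                  for leaf in np.unique(leaves)}

def is_anomaly(x):
    """Route a sample to its leaf's one-class model; -1 means anomalous."""
    leaf = tree.apply(x.reshape(1, -1))[0]
    model = ocsvm_per_leaf.get(leaf)
    # Leaves with no normal training data are treated as anomalous outright.
    return True if model is None else model.predict(x.reshape(1, -1))[0] == -1

print(is_anomaly(X[0]))
```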


Journal ArticleDOI
TL;DR: A robust system based on the concepts of Mutual Direction Symmetry (MDS), Mutual Magnitude Symmetry (MMS) and Gradient Vector Symmetry (GVS) properties to identify text pixel candidates regardless of orientation, including curves, from natural scene images is presented.
Abstract: Text detection in real-world images captured in unconstrained environments is an important yet challenging computer vision problem due to the great variety of appearances, cluttered backgrounds, and character orientations. In this paper, we present a robust system based on the concepts of Mutual Direction Symmetry (MDS), Mutual Magnitude Symmetry (MMS) and Gradient Vector Symmetry (GVS) properties to identify text pixel candidates regardless of orientation, including curves (e.g., circles, arc shapes), from natural scene images. The method works based on the fact that the text patterns in both Sobel and Canny edge maps of the input images exhibit similar behavior. For each text pixel candidate, the method explores SIFT features to refine the candidates, which results in text representatives. Next, an ellipse growing process is introduced based on a nearest neighbor criterion to extract the text components. The text is verified and restored based on text direction and a spatial study of the pixel distribution of components to filter out non-text components. The proposed method is evaluated on three benchmark datasets, namely ICDAR2005 and ICDAR2011 for horizontal text evaluation and MSRA-TD500 for non-horizontal straight text evaluation, as well as on our own dataset (CUTE80), which consists of 80 images for curved text evaluation, to show its effectiveness and superiority over existing methods.

413 citations


Journal ArticleDOI
TL;DR: A detailed and up-to-date survey of the field, considering the different kinds of interfaces, the diversity of recommendation algorithms, the functionalities offered by these systems and their use of Artificial Intelligence techniques.
Abstract: Recommender systems are currently being applied in many different domains. This paper focuses on their application in tourism. A comprehensive and thorough search of the smart e-Tourism recommenders reported in the Artificial Intelligence journals and conferences since 2008 has been made. The paper provides a detailed and up-to-date survey of the field, considering the different kinds of interfaces, the diversity of recommendation algorithms, the functionalities offered by these systems and their use of Artificial Intelligence techniques. The survey also provides some guidelines for the construction of tourism recommenders and outlines the most promising areas of work in the field for the next years.

402 citations


Journal ArticleDOI
TL;DR: The results based on Kapur's entropy reveal that the CS, ELR-CS and WDO methods can be accurately and efficiently used for the multilevel thresholding problem.
Abstract: The objective of image segmentation is to extract meaningful objects. A meaningful segmentation selects the proper threshold values to optimize a criterion using entropy. The conventional multilevel thresholding methods are efficient for bi-level thresholding. However, they are computationally expensive when extended to multilevel thresholding, since they exhaustively search for the optimal thresholds to optimize the objective functions. To overcome this problem, two successful swarm-intelligence-based global optimization algorithms, the cuckoo search (CS) algorithm and wind driven optimization (WDO), have been employed for multilevel thresholding using Kapur's entropy. For this purpose, Kapur's entropy serves as the fitness function, and the CS and WDO algorithms search for the optimal threshold values starting from initial random thresholds; a correlation function is used to evaluate the quality of a solution. Experimental results have been examined on a standard set of satellite images using various numbers of thresholds. The results based on Kapur's entropy reveal that the CS, ELR-CS and WDO methods can be accurately and efficiently used for the multilevel thresholding problem.
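
To make the objective concrete, below is a sketch of Kapur's-entropy multilevel thresholding on a synthetic image, with a plain random search standing in for the CS/WDO metaheuristics (which explore the threshold space far more cleverly).

```python
# Kapur's-entropy objective for multilevel thresholding; random search stands
# in for the cuckoo-search / wind-driven-optimisation step.
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128))              # synthetic grey image
hist = np.bincount(image.ravel(), minlength=256) / image.size

def kapur_entropy(thresholds):
    """Sum of the entropies of the classes induced by the thresholds."""
    bounds = [0, *sorted(thresholds), 256]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        p = hist[lo:hi]
        w = p.sum()
        if w <= 0:
            return -np.inf                                 # empty class: invalid
        q = p[p > 0] / w
        total -= (q * np.log(q)).sum()
    return total

# Random search over 3 thresholds; a metaheuristic would search more cleverly.
candidates = (tuple(sorted(rng.choice(np.arange(1, 256), 3, replace=False)))
              for _ in range(5000))
best = max(candidates, key=kapur_entropy)
print(best, kapur_entropy(best))
```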

Journal ArticleDOI
TL;DR: By dividing the research into four main groups based on the problem-solving approaches and identifying the investigated quality of service parameters, intended objectives, and developing environments, beneficial results and statistics are obtained that can contribute to future research.
Abstract: The increasing tendency of network service users to use cloud computing encourages web service vendors to supply services that have different functional and nonfunctional (quality of service) features and provide them in a service pool. Based on supply and demand rules and because of the exuberant growth of the services that are offered, cloud service brokers face tough competition against each other in providing quality of service enhancements. Such competition leads to a difficult and complicated process to provide simple service selection and composition in supplying composite services in the cloud, which should be considered an NP-hard problem. How to select appropriate services from the service pool, overcome composition restrictions, determine the importance of different quality of service parameters, focus on the dynamic characteristics of the problem, and address rapid changes in the properties of the services and network appear to be among the most important issues that must be investigated and addressed. In this paper, utilizing a systematic literature review, important questions that can be raised about the research performed in addressing the above-mentioned problem have been extracted and put forth. Then, by dividing the research into four main groups based on the problem-solving approaches and identifying the investigated quality of service parameters, intended objectives, and developing environments, beneficial results and statistics are obtained that can contribute to future research.

Journal ArticleDOI
TL;DR: A hybrid prediction algorithm comprised of Support Vector Regression and Modified Firefly Algorithm is proposed to provide the short term electrical load forecast and the experimental results affirm that the proposed algorithm outperforms other techniques.
Abstract: Precise forecast of the electrical load plays a highly significant role in the electricity industry and market. It provides economic operations and effective future plans for the utilities and power system operators. Due to the intermittent and uncertain characteristic of the electrical load, many research studies have been directed to nonlinear prediction methods. In this paper, a hybrid prediction algorithm comprised of Support Vector Regression (SVR) and Modified Firefly Algorithm (MFA) is proposed to provide the short term electrical load forecast. The SVR models utilize the nonlinear mapping feature to deal with nonlinear regressions. However, such models suffer from a methodical algorithm for obtaining the appropriate model parameters. Therefore, in the proposed method the MFA is employed to obtain the SVR parameters accurately and effectively. In order to evaluate the efficiency of the proposed methodology, it is applied to the electrical load demand in Fars, Iran. The obtained results are compared with those obtained from the ARMA model, ANN, SVR-GA, SVR-HBMO, SVR-PSO and SVR-FA. The experimental results affirm that the proposed algorithm outperforms other techniques.
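
A hedged sketch of the hybrid's overall shape: SVR hyperparameters (C, gamma, epsilon) are chosen by maximising a cross-validated fitness, with plain random search standing in for the paper's Modified Firefly Algorithm. The toy load series and the 24-hour lag window are assumptions.

```python
# Sketch: SVR for load forecasting with metaheuristic-style hyperparameter
# tuning (random search as a stand-in for the Modified Firefly Algorithm).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(0)
t = np.arange(500)
load = 100 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 1, 500)  # toy demand

# Lagged loads as inputs: predict the next hour from the previous 24.
X = np.array([load[i : i + 24] for i in range(len(load) - 24)])
y = load[24:]

def fitness(params):
    C, gamma, epsilon = params
    model = SVR(C=C, gamma=gamma, epsilon=epsilon)
    return cross_val_score(model, X, y, cv=3, scoring="neg_mean_squared_error").mean()

candidates = [(10 ** rng.uniform(-1, 3), 10 ** rng.uniform(-4, 0), 10 ** rng.uniform(-3, 0))
              for _ in range(30)]
best = max(candidates, key=fitness)
print("best (C, gamma, epsilon):", best)
```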

Journal ArticleDOI
TL;DR: This paper provides some answers from the practitioner’s perspective by focusing on three crucial issues: unbalancedness, non-stationarity and assessment in fraud detection algorithms.
Abstract: Billions of dollars of loss are caused every year due to fraudulent credit card transactions. The design of efficient fraud detection algorithms is key for reducing these losses, and more algorithms rely on advanced machine learning techniques to assist fraud investigators. The design of fraud detection algorithms is however particularly challenging due to non-stationary distribution of the data, highly imbalanced classes distributions and continuous streams of transactions. At the same time public data are scarcely available for confidentiality issues, leaving unanswered many questions about which is the best strategy to deal with them. In this paper we provide some answers from the practitioner’s perspective by focusing on three crucial issues: unbalancedness, non-stationarity and assessment. The analysis is made possible by a real credit card dataset provided by our industrial partner.

Journal ArticleDOI
TL;DR: A framework based on fuzzy analytical hierarchy process and fuzzy technique for order performance by similarity to ideal solution (TOPSIS) to identify and rank the solutions of KM adoption in SC and overcome its barriers is proposed.
Abstract: The aim of this study is to identify and prioritize the solutions of Knowledge Management (KM) adoption in the Supply Chain (SC) to overcome its barriers. It helps organizations to concentrate on high-rank solutions and develop strategies to implement them on priority. This paper proposes a framework based on the fuzzy analytical hierarchy process (AHP) and the fuzzy technique for order performance by similarity to ideal solution (TOPSIS) to identify and rank the solutions of KM adoption in SC and overcome its barriers. The AHP is used to determine the weights of the barriers as criteria, and the fuzzy TOPSIS method is used to obtain the final ranking of the solutions of KM adoption in SC. An empirical case study of an Indian hydraulic valve manufacturing organization is conducted to illustrate the use of the proposed framework for ranking the solutions of KM adoption in SC to overcome its barriers. The proposed framework provides a more accurate, effective and systematic decision support tool for stepwise implementation of the solutions of KM adoption in SC to increase its success rate.
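
For orientation, here is the crisp core of a TOPSIS ranking on a made-up decision matrix (the paper's fuzzy variant replaces the crisp scores with triangular fuzzy numbers and derives the weights via fuzzy AHP); all numbers below are illustrative.

```python
# Crisp TOPSIS: rank alternatives by closeness to the ideal solution.
import numpy as np

# Rows = candidate solutions, columns = barrier criteria (benefit criteria).
decision = np.array([[7., 9., 6.],
                     [8., 6., 8.],
                     [6., 8., 7.]])
weights = np.array([0.5, 0.3, 0.2])          # criterion weights, e.g. from AHP

norm = decision / np.linalg.norm(decision, axis=0)   # vector normalisation
weighted = norm * weights

ideal, anti_ideal = weighted.max(axis=0), weighted.min(axis=0)
d_plus = np.linalg.norm(weighted - ideal, axis=1)    # distance to ideal
d_minus = np.linalg.norm(weighted - anti_ideal, axis=1)

closeness = d_minus / (d_plus + d_minus)     # 1 = best possible
print(np.argsort(closeness)[::-1] + 1)       # ranking of the solutions
```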

Journal ArticleDOI
TL;DR: Experimental results achieved using the proposed novel HGA-NN classifier are promising for feature selection and classification in retail credit risk assessment and indicate that the HGA-NN classifier is a promising addition to existing data mining techniques.
Abstract: In this paper, an advanced novel heuristic algorithm is presented, the hybrid genetic algorithm with neural networks (HGA-NN), which is used to identify an optimum feature subset and to increase the classification accuracy and scalability in credit risk assessment. This algorithm is based on the following basic hypothesis: the high-dimensional input feature space can be preliminarily restricted to only the important features. In this preliminary restriction, fast algorithms for feature ranking and earlier experience are used. Additionally, enhancements are made in the creation of the initial population, as well as by introducing an incremental stage in the genetic algorithm. The performance of the proposed HGA-NN classifier is evaluated using a real-world credit dataset collected at a Croatian bank, and the findings are further validated on another real-world credit dataset selected from the UCI database. The classification accuracy is compared with that presented in the literature. Experimental results achieved using the proposed novel HGA-NN classifier are promising for feature selection and classification in retail credit risk assessment and indicate that the HGA-NN classifier is a promising addition to existing data mining techniques.
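
A compact, hedged sketch of GA-based feature selection with a neural-network fitness function, in the spirit of (but much simpler than) HGA-NN; the paper's heuristic initial population and incremental stage are omitted, and the population size, rates, and dataset are assumptions.

```python
# Sketch: genetic search over binary feature masks, fitness = CV accuracy of a
# small neural network on the masked features.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)
n_features = X.shape[1]

def fitness(mask):
    if not mask.any():
        return 0.0
    clf = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(8,), max_iter=300,
                                      random_state=0))
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

pop = rng.random((10, n_features)) < 0.5              # random feature masks
for generation in range(5):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-4:]]            # keep the fittest
    children = []
    while len(children) < len(pop) - len(parents):
        a, b = parents[rng.integers(4)], parents[rng.integers(4)]
        cut = rng.integers(1, n_features)             # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(n_features) < 0.02          # mutation
        children.append(child ^ flip)
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected features:", np.flatnonzero(best))
```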

Journal ArticleDOI
TL;DR: Two independent hybrid mining algorithms to improve the classification accuracy rates of decision tree (DT) and naive Bayes (NB) classifiers for the classification of multi-class problems are introduced.
Abstract: In this paper, we introduce two independent hybrid mining algorithms to improve the classification accuracy rates of decision tree (DT) and naive Bayes (NB) classifiers for the classification of multi-class problems. Both DT and NB classifiers are useful, efficient and commonly used for solving classification problems in data mining. Since the presence of noisy contradictory instances in the training set may cause the generated decision tree to suffer from overfitting and reduced accuracy, in our first proposed hybrid DT algorithm we employ a naive Bayes (NB) classifier to remove the noisy, troublesome instances from the training set before DT induction. Moreover, it is extremely computationally expensive for an NB classifier to compute class conditional independence for a dataset with high-dimensional attributes. Thus, in the second proposed hybrid NB classifier, we employ DT induction to select a comparatively more important subset of attributes on which the naive assumption of class conditional independence is applied. We tested the performances of the two proposed hybrid algorithms against those of the existing DT and NB classifiers, respectively, using classification accuracy, precision, sensitivity-specificity analysis, and 10-fold cross validation on 10 real benchmark datasets from the UCI (University of California, Irvine) machine learning repository. The experimental results indicate that the proposed methods have produced impressive results in the classification of real-life challenging multi-class problems. They are also able to automatically extract the most valuable training datasets and identify the most effective attributes for the description of instances from noisy complex training databases with large dimensions of attributes.
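
The first hybrid's filtering step is simple enough to sketch directly, assuming GaussianNB as the NB model and scikit-learn's CART tree as a stand-in for the paper's DT induction:

```python
# Sketch of the first hybrid: naive Bayes flags likely-noisy training
# instances (those it misclassifies), which are removed before DT induction.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

nb = GaussianNB().fit(X, y)
keep = nb.predict(X) == y                 # drop instances NB misclassifies

dt = DecisionTreeClassifier(random_state=0).fit(X[keep], y[keep])
print(f"kept {keep.sum()}/{len(y)} instances, accuracy {dt.score(X, y):.3f}")
```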

Journal ArticleDOI
TL;DR: An approach to implement vibration, pressure, and current signals for fault diagnosis of the valves in reciprocating compressors is presented, and the superiority of DBN in fault classification is shown by comparison with the relevance vector machine and back-propagation neural networks.
Abstract: This paper presents an approach to implement vibration, pressure, and current signals for fault diagnosis of the valves in reciprocating compressors. Due to the complexity of the structure and motion of such compressors, the acquired vibration signal normally involves transient impacts and noise. This corrupts the useful information and makes it difficult to accurately diagnose the faults with traditional methods. To reveal the fault patterns contained in this signal, the Teager-Kaiser energy operation (TKEO) is proposed to estimate the amplitude envelopes. In the case of pressure and current, the random noise is removed by using a denoising method based on the wavelet transform. Subsequently, statistical measures are extracted from all signals to represent the characteristics of the valve conditions. In order to classify the faults of compressor valves, a new type of learning architecture for deep generative models called deep belief networks (DBNs) is applied. A DBN employs a hierarchical structure with multiple stacked restricted Boltzmann machines (RBMs) and works through a greedy layer-by-layer learning algorithm. In pattern recognition research areas, DBN has proved to be very effective and to provide high performance for binary-valued data. However, to apply DBN to fault diagnosis, where most signals are real-valued, an RBM with Bernoulli hidden units and Gaussian visible units is considered in this study. The proposed approach is validated with the signals from a two-stage reciprocating air compressor under different valve conditions. To confirm the superiority of DBN in fault classification, its performance is compared with that of the relevance vector machine and back-propagation neural networks. The achieved accuracy indicates that the proposed approach is highly reliable and applicable in fault diagnosis of industrial reciprocating machinery.
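
A rough sketch of the DBN-style architecture on synthetic data: greedy layer-wise pretraining with stacked RBMs and a supervised read-out. scikit-learn only provides Bernoulli-Bernoulli RBMs, so inputs are rescaled to [0, 1] instead of using the paper's Gaussian visible units; the layer sizes and learning rates are assumptions.

```python
# DBN-like stack: two RBMs pretrained layer by layer inside a pipeline,
# topped with a logistic-regression classifier.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=400, n_features=20, random_state=0)

dbn_like = make_pipeline(
    MinMaxScaler(),                                   # RBM inputs in [0, 1]
    BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0),
    BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0),
    LogisticRegression(max_iter=1000),                # supervised top layer
)
dbn_like.fit(X, y)
print(dbn_like.score(X, y))
```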

Journal ArticleDOI
TL;DR: A greedy feature selection method using mutual information that combines both feature–feature mutual information and feature–class mutual information to find an optimal subset of features to minimize redundancy and to maximize relevance among features is introduced.
Abstract: Feature selection is used to choose a subset of relevant features for effective classification of data. In high-dimensional data classification, the performance of a classifier often depends on the feature subset used for classification. In this paper, we introduce a greedy feature selection method using mutual information. This method combines both feature–feature mutual information and feature–class mutual information to find an optimal subset of features that minimizes redundancy and maximizes relevance among features. The effectiveness of the selected feature subset is evaluated using multiple classifiers on multiple datasets. The performance of our method, in terms of both classification accuracy and execution time, has been found to be significantly high for twelve real-life datasets of varied dimensionality and number of instances when compared with several competing feature selection techniques.
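
The greedy relevance-minus-redundancy selection can be sketched as follows (an mRMR-style reading of the description above, with quartile discretisation assumed for the feature-feature MI estimates):

```python
# Greedy MI feature selection: at each step pick the feature with the highest
# feature-class MI minus its average feature-feature MI with selected features.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import mutual_info_score

X, y = load_breast_cancer(return_X_y=True)

# Discretise each feature into quartiles so feature-feature MI is well defined.
Xd = np.column_stack([np.digitize(X[:, j], np.quantile(X[:, j], [0.25, 0.5, 0.75]))
                      for j in range(X.shape[1])])

relevance = mutual_info_classif(Xd, y, discrete_features=True, random_state=0)

selected, remaining = [], list(range(X.shape[1]))
for _ in range(8):                        # choose 8 features greedily
    def score(j):
        """Feature-class MI minus average MI with already-selected features."""
        if not selected:
            return relevance[j]
        redundancy = np.mean([mutual_info_score(Xd[:, j], Xd[:, s]) for s in selected])
        return relevance[j] - redundancy
    best = max(remaining, key=score)
    selected.append(best)
    remaining.remove(best)

print("selected feature indices:", selected)
```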

Journal ArticleDOI
TL;DR: Experimental results show that AC, particularly MCAC, detects phishing websites with higher accuracy than other intelligent algorithms and generates new hidden knowledge (rules) that other algorithms are unable to find, improving its classifiers' predictive performance.
Abstract: Website phishing is considered one of the crucial security challenges for the online community due to the massive numbers of online transactions performed on a daily basis. Website phishing can be described as mimicking a trusted website to obtain sensitive information from online users such as usernames and passwords. Black lists, white lists and the utilisation of search methods are examples of solutions to minimise the risk of this problem. One intelligent approach based on data mining, called Associative Classification (AC), seems a potential solution that may effectively detect phishing websites with high accuracy. According to experimental studies, AC often extracts classifiers containing simple “If-Then” rules with a high degree of predictive accuracy. In this paper, we investigate the problem of website phishing using a developed AC method called Multi-label Classifier based Associative Classification (MCAC) to assess its applicability to the phishing problem. We also want to identify features that distinguish phishing websites from legitimate ones. In addition, we survey intelligent approaches used to handle the phishing problem. Experimental results using real data collected from different sources show that AC, particularly MCAC, detects phishing websites with higher accuracy than other intelligent algorithms. Further, MCAC generates new hidden knowledge (rules) that other algorithms are unable to find, and this has improved its classifiers' predictive performance.

Journal ArticleDOI
TL;DR: Sn-grams can be applied in any natural language processing (NLP) task where traditional n-grams are used; the paper describes how sn-grams were applied to authorship attribution.
Abstract: In this paper we introduce and discuss the concept of syntactic n-grams (sn-grams). Sn-grams differ from traditional n-grams in the manner in which they are constructed, i.e., in which elements are considered neighbors. In the case of sn-grams, neighbors are determined by following syntactic relations in syntactic trees, not by taking words as they appear in a text; i.e., sn-grams are constructed by following paths in syntactic trees. In this manner, sn-grams allow syntactic knowledge to be brought into machine learning methods; still, prior parsing is necessary for their construction. Sn-grams can be applied in any natural language processing (NLP) task where traditional n-grams are used. We describe how sn-grams were applied to authorship attribution. As baselines we used traditional n-grams of words, part-of-speech (POS) tags and characters; three classifiers were applied: support vector machines (SVM), naive Bayes (NB), and the tree classifier J48. Sn-grams give better results with the SVM classifier.
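
A toy construction of sn-grams may help: neighbours follow dependency arcs rather than linear word order. The hand-written tree below stands in for real parser output.

```python
# Toy sn-gram extraction: n-grams are paths in a dependency tree, not
# contiguous word sequences.
# Dependency tree for "John gave Mary an old book": head -> dependents.
tree = {"gave": ["John", "Mary", "book"], "book": ["an", "old"]}

def sn_grams(tree, node, n, prefix=()):
    """All head-to-descendant paths of length n starting at `node`."""
    path = prefix + (node,)
    if len(path) == n:
        yield path
        return
    for child in tree.get(node, []):
        yield from sn_grams(tree, child, n, path)

print(list(sn_grams(tree, "gave", 3)))
# [('gave', 'book', 'an'), ('gave', 'book', 'old')]
```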

Journal ArticleDOI
TL;DR: It is concluded that the embedding and extraction stages of the proposed algorithm are well optimized and robust, and show an improvement over other similar reported methods.
Abstract: This paper presents an optimized watermarking scheme based on the discrete wavelet transform (DWT) and singular value decomposition (SVD). The singular values of a binary watermark are embedded in the singular values of the LL3 sub-band coefficients of the host image by making use of multiple scaling factors (MSFs). The MSFs are optimized using a newly proposed Firefly Algorithm with an objective function that is a linear combination of imperceptibility and robustness. The PSNR values indicate that the visual quality of the signed and attacked images is good. The embedding algorithm is robust against common image processing operations. It is concluded that the embedding and extraction stages of the proposed algorithm are well optimized and robust, and show an improvement over other similar reported methods.
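
A minimal sketch of the DWT-SVD embedding step with a single scaling factor (the paper optimises multiple scaling factors with the Firefly Algorithm, which is omitted here); the wavelet, decomposition level, and alpha are assumed values.

```python
# Embed watermark singular values into the LL3 sub-band's singular values.
import numpy as np
import pywt

rng = np.random.default_rng(0)
host = rng.random((256, 256))
watermark = (rng.random((32, 32)) > 0.5).astype(float)   # binary watermark

coeffs = pywt.wavedec2(host, "haar", level=3)
LL3 = coeffs[0]                                          # 32x32 sub-band

U, S, Vt = np.linalg.svd(LL3)
Uw, Sw, Vwt = np.linalg.svd(watermark)

alpha = 0.05                                             # single scaling factor
S_marked = S + alpha * Sw                                # embed singular values

coeffs[0] = U @ np.diag(S_marked) @ Vt
signed = pywt.waverec2(coeffs, "haar")
print(np.abs(signed - host).max())                       # distortion stays small
```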

Journal ArticleDOI
TL;DR: A hybrid intelligent system that consists of the Fuzzy Min-Max neural network, the Classification and Regression Tree, and the Random Forest model is proposed, and its efficacy as a decision support tool for medical data classification is examined.
Abstract: In this paper, a hybrid intelligent system that consists of the Fuzzy Min-Max neural network, the Classification and Regression Tree, and the Random Forest model is proposed, and its efficacy as a decision support tool for medical data classification is examined. The hybrid intelligent system aims to exploit the advantages of the constituent models and, at the same time, alleviate their limitations. It is able to learn incrementally from data samples (owing to Fuzzy Min-Max neural network), explain its predicted outputs (owing to the Classification and Regression Tree), and achieve high classification performances (owing to Random Forest). To evaluate the effectiveness of the hybrid intelligent system, three benchmark medical data sets, viz., Breast Cancer Wisconsin, Pima Indians Diabetes, and Liver Disorders from the UCI Repository of Machine Learning, are used for evaluation. A number of useful performance metrics in medical applications which include accuracy, sensitivity, specificity, as well as the area under the Receiver Operating Characteristic curve are computed. The results are analyzed and compared with those from other methods published in the literature. The experimental outcomes positively demonstrate that the hybrid intelligent system is effective in undertaking medical data classification tasks. More importantly, the hybrid intelligent system not only is able to produce good results but also to elucidate its knowledge base with a decision tree. As a result, domain users (i.e., medical practitioners) are able to comprehend the prediction given by the hybrid intelligent system; hence accepting its role as a useful medical decision support tool.

Journal ArticleDOI
TL;DR: A new finger vein recognition algorithm based on Band Limited Phase Only Correlation (BLPOC) and a new type of geometrical features called Width-Centroid Contour Distance (WCCD) which can improve the accuracy of finger geometry recognition.
Abstract: A new finger vein recognition algorithm based on Band Limited Phase Only Correlation. Finger width and Centroid Contour Distance for finger geometry recognition. The fusion of vein and geometry for a finger-based bimodal biometrics system. A new infrared finger image database made publicly available on the web. In this paper, a new approach to multimodal finger biometrics based on the fusion of finger vein and finger geometry recognition is presented. In the proposed method, Band Limited Phase Only Correlation (BLPOC) is utilized to measure the similarity of finger vein images. Unlike previous methods, BLPOC is resilient to noise, occlusions and rescaling factors, and thus can enhance the performance of finger vein recognition. As for finger geometry recognition, a new type of geometrical feature called Width-Centroid Contour Distance (WCCD) is proposed, which combines the finger width with the Centroid Contour Distance (CCD). As compared with a single type of feature, the fusion of width (W) and CCD can improve the accuracy of finger geometry recognition. Finally, we integrate the finger vein and finger geometry recognition by a score-level fusion method based on the weighted SUM rule. Experimental evaluation using our own database, which was collected from 123 volunteers, resulted in an efficient recognition performance where the equal error rate (EER) was 1.78% with a total processing time of 24.22 ms.
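
A compact sketch of band-limited phase-only correlation itself, assuming a simple square central band; the similarity score is the peak of the correlation surface.

```python
# Band-limited phase-only correlation: correlate only the phase spectra,
# restricted to a central low-frequency band.
import numpy as np

def blpoc(f, g, band=0.5):
    """Peak of the band-limited phase-only correlation surface (1.0 = identical)."""
    cross = np.fft.fft2(f) * np.conj(np.fft.fft2(g))
    phase = cross / (np.abs(cross) + 1e-12)            # keep phase, drop magnitude
    phase = np.fft.fftshift(phase)                     # centre the low frequencies
    h, w = phase.shape
    kh, kw = int(h * band / 2), int(w * band / 2)
    limited = phase[h // 2 - kh : h // 2 + kh, w // 2 - kw : w // 2 + kw]
    return np.real(np.fft.ifft2(np.fft.ifftshift(limited))).max()

rng = np.random.default_rng(0)
img = rng.random((64, 64))
print(blpoc(img, img))                    # ~1.0 for identical images
print(blpoc(img, rng.random((64, 64))))   # near 0 for unrelated images
```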

Journal ArticleDOI
TL;DR: A novel hybrid MCDM model that combines fuzzy Decision Making Trial and Evaluation Laboratory Model (DEMATEL), fuzzy Analytical Network Process (ANP) and fuzzy Visekriterijumska Optimizacija i kompromisno Resenje (VIKOR) methods is developed and successfully performed in this paper for the City of Belgrade.
Abstract: City logistics (CL) tends to increase efficiency and mitigate the negative effects of logistics processes and activities while supporting the sustainable development of urban areas. Accordingly, various measures and initiatives are being applied and various conceptual solutions are being defined. The effects vary depending on the characteristics of the city. This paper proposes a framework for the selection of the CL concept that would be most appropriate for different participants and stakeholders, and that would comply with attributes of the surroundings. CL participants have different, usually conflicting goals and interests, so it is necessary to define a large number of criteria for concept evaluation. On the other hand, the importance of the criteria depends on the specific situation, i.e., on a large number of factors describing the surroundings. In situations like this, selecting the best alternative is a complex multi-criteria decision-making (MCDM) problem consisting of conflicting and uncertain elements. A novel hybrid MCDM model that combines fuzzy Decision Making Trial and Evaluation Laboratory Model (DEMATEL), fuzzy Analytical Network Process (ANP) and fuzzy Visekriterijumska Optimizacija i kompromisno Resenje (VIKOR) methods is developed in this paper. The model provides support to decision makers (planners, city administration, logistics service providers, users, etc.) when selecting the CL concept, and it is successfully applied in this paper to the City of Belgrade.

Journal ArticleDOI
TL;DR: It was found that the Fuzzy Logic and LS-SVR approaches can be employed successfully in modeling the daily evaporation process from the available climatic data, and results showed that the machine learning models outperform the traditional HGS and SS empirical methods.
Abstract: This paper investigates the abilities of Artificial Neural Networks (ANN), Least Squares – Support Vector Regression (LS-SVR), Fuzzy Logic, and Adaptive Neuro-Fuzzy Inference System (ANFIS) techniques to improve the accuracy of daily pan evaporation estimation in sub-tropical climates. Meteorological data from the Karso watershed in India (consisting of 3801 daily records from the year 2000 to 2010) were used to develop and test the models for daily pan evaporation estimation. The measured meteorological variables include daily observations of rainfall, minimum and maximum air temperatures, minimum and maximum humidity, and sunshine hours. Prior to model development, the Gamma Test (GT) was used to derive estimates of the noise variance for each input–output set in order to identify the most useful predictors for use in the machine learning approaches used in this study. The ANN models consisted of feed forward backpropagation (FFBP) models with Bayesian Regularization (BR), along with the Levenberg–Marquardt (LM) algorithm. A comparison was made between the estimates provided by the ANN, LS-SVR, Fuzzy Logic, and ANFIS models. The empirical Hargreaves and Samani method (HGS), as well as the Stephens–Stewart (SS) method, were also considered for comparison with the newer machine learning methods. The Root Mean Square Error (RMSE) and Correlation Coefficient (CORR) were the statistical performance indices that were used to evaluate the accuracy of the various models. Based on the comparison, it was found that the Fuzzy Logic and LS-SVR approaches can be employed successfully in modeling the daily evaporation process from the available climatic data. In addition, results showed that the machine learning models outperform the traditional HGS and SS empirical methods.

Journal ArticleDOI
TL;DR: The proposed mode ensemble operator is found to produce the most accurate forecasts, followed by the median, while the mean has relatively poor performance, suggesting that the mode operator should be considered as an alternative to the mean and median operators in forecasting applications.
Abstract: The combination of forecasts resulting from an ensemble of neural networks has been shown to outperform the use of a single "best" network model. This is supported by an extensive body of literature, which shows that combining generally leads to improvements in forecasting accuracy and robustness, and that using the mean operator often outperforms more complex methods of combining forecasts. This paper proposes a mode ensemble operator based on kernel density estimation, which unlike the mean operator is insensitive to outliers and deviations from normality, and unlike the median operator does not require symmetric distributions. The three operators are compared empirically and the proposed mode ensemble operator is found to produce the most accurate forecasts, followed by the median, while the mean has relatively poor performance. The findings suggest that the mode operator should be considered as an alternative to the mean and median operators in forecasting applications. Experiments indicate that mode ensembles are useful in automating neural network models across a large number of time series, overcoming issues of uncertainty associated with data sampling, the stochasticity of neural network training, and the distribution of the forecasts.
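
The mode operator is easy to sketch with a kernel density estimate over a toy set of network forecasts containing one outlier:

```python
# Mode ensemble operator: the combined forecast is the maximiser of a kernel
# density estimate over the individual networks' forecasts.
import numpy as np
from scipy.stats import gaussian_kde

forecasts = np.array([102.0, 101.5, 103.2, 99.8, 101.9, 150.0])  # one outlier

kde = gaussian_kde(forecasts)
grid = np.linspace(forecasts.min(), forecasts.max(), 2001)
mode = grid[np.argmax(kde(grid))]

print(f"mean   {forecasts.mean():.2f}")    # dragged up by the outlier
print(f"median {np.median(forecasts):.2f}")
print(f"mode   {mode:.2f}")                # insensitive to the outlier
```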

Journal ArticleDOI
TL;DR: The main idea in this paper is to describe key papers and provide some guidelines to help medical practitioners to explore previous works and identify interesting areas for future research.
Abstract: Data mining is a powerful method to extract knowledge from data. Raw data faces various challenges that make traditional methods improper for knowledge extraction. Data mining is supposed to be able to handle various data types in all formats. The relevance of this paper is emphasized by the fact that data mining is an object of research in different areas. In this paper, we review previous works in the context of knowledge extraction from medical data. The main idea is to describe key papers and provide some guidelines to help medical practitioners. Medical data mining is a multidisciplinary field with contributions from medicine and data mining. Due to this fact, previous works should be classified to cover all users' requirements from various fields. Because of this, we have studied papers published between 1999 and 2013 with the aim of extracting knowledge from structural medical data. We clarify medical data mining and its main goals. Each paper is studied based on six medical tasks: screening, diagnosis, treatment, prognosis, monitoring and management. In each task, five data mining approaches are considered: classification, regression, clustering, association and hybrid. At the end of each task, a brief summarization and discussion are stated. A standard framework according to CRISP-DM is additionally adapted to manage all activities. As a discussion, current issues and future trends are mentioned. The amount of work published in this scope is substantial, and it is impossible to discuss all of it in a single work. We hope this paper will make it possible to explore previous works and identify interesting areas for future research.

Journal ArticleDOI
TL;DR: Dendroid, a system based on text mining and information retrieval techniques for malware analysis, is introduced, suggesting that the approach is remarkably accurate and deals efficiently with large databases of malware instances.
Abstract: The rapid proliferation of smartphones over the last few years has come hand in hand with an impressive growth in the number and sophistication of malicious apps targeting smartphone users. The availability of reuse-oriented development methodologies and automated malware production tools makes it exceedingly easy to produce new specimens. As a result, market operators and malware analysts are increasingly overwhelmed by the amount of newly discovered samples that must be analyzed. This situation has stimulated research on intelligent instruments to automate parts of the malware analysis process. In this paper, we introduce Dendroid, a system based on text mining and information retrieval techniques for this task. Our approach is motivated by a statistical analysis of the code structures found in a dataset of Android OS malware families, which reveals some parallelisms with classical problems in those domains. We then adapt the standard Vector Space Model and reformulate the modelling process followed in text mining applications. This enables us to measure similarity between malware samples, which is then used to automatically classify them into families. We also investigate the application of hierarchical clustering over the feature vectors obtained for each malware family. The resulting dendrograms resemble the so-called phylogenetic trees for biological species, allowing us to conjecture about evolutionary relationships among families. Our experimental results suggest that the approach is remarkably accurate and deals efficiently with large databases of malware instances.
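
A hedged sketch of the vector-space-plus-clustering pipeline, with hypothetical code-structure strings standing in for the code chunks Dendroid extracts from real malware:

```python
# Map code-structure "documents" into a vector space model, then cluster
# hierarchically; the dendrogram hints at family relationships.
from scipy.cluster.hierarchy import linkage
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical code-structure strings standing in for extracted code chunks.
samples = [
    "entry invoke_static invoke_virtual return",
    "entry invoke_static invoke_virtual goto return",
    "entry new_instance invoke_direct return",
    "entry new_instance invoke_direct throw",
]

vsm = TfidfVectorizer().fit_transform(samples)           # vector space model
Z = linkage(vsm.toarray(), method="average", metric="cosine")
print(Z)   # feed Z to scipy.cluster.hierarchy.dendrogram() to draw the "phylogeny"
```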

Journal ArticleDOI
TL;DR: A new framework for measuring customer satisfaction with mobile services is developed by combining VIKOR (in Serbian: ViseKriterijumska Optimizacija I Kompromisno Resenje) and sentiment analysis; the proposed customer-review-based approach not only saves time and effort in measuring customer satisfaction, but also captures the real voices of customers.
Abstract: With the rapid growth and dissemination of mobile services, enhancement of customer satisfaction has emerged as a core issue. Customer reviews are recognized as fruitful information sources for monitoring and enhancing customer satisfaction levels, particularly as they convey the real voices of actual customers expressing relatively unambiguous opinions. As a methodological means of customer review analysis, sentiment analysis has come to the fore. Although several sentiment analysis approaches have proposed extracting emotional information from customer reviews, a lacuna remains as to how to effectively analyze customer reviews for the purpose of monitoring customer satisfaction with mobile services. In response, the present study developed a new framework for measuring customer satisfaction with mobile services by combining VIKOR (in Serbian: ViseKriterijumska Optimizacija I Kompromisno Resenje) and sentiment analysis. With VIKOR, a compromise ranking method of the multicriteria decision making (MCDM) approach, customer satisfaction with mobile services can be accurately measured by a sentiment-analysis scheme that simultaneously considers maximum group utility and individual regret. The suggested framework consists mainly of two stages: data collection and preprocessing, and measurement of customer satisfaction. In the first, data collection and preprocessing stage, text mining is utilized to compile customer-review-based dictionaries of attributes and sentiment words. Then, using sentiment analysis, sentiment scores for attributes are calculated for each mobile service. In the second stage, levels of customer satisfaction are measured using VIKOR. For the purpose of illustration, an empirical case study was conducted on customer reviews of mobile application services. We believe that the proposed customer-review-based approach not only saves time and effort in measuring customer satisfaction, but also captures the real voices of customers.
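
The VIKOR step in the second stage can be sketched directly; the sentiment scores, weights, and strategy weight v below are illustrative assumptions.

```python
# VIKOR compromise ranking: S (group utility), R (individual regret) and Q
# combine per-attribute sentiment scores into a satisfaction ranking.
import numpy as np

# Rows = mobile services, columns = attribute sentiment scores (higher = better).
scores = np.array([[0.8, 0.6, 0.7],
                   [0.5, 0.9, 0.6],
                   [0.7, 0.7, 0.8]])
weights = np.array([0.4, 0.35, 0.25])
v = 0.5                                   # weight of the "majority rule" strategy

f_best, f_worst = scores.max(axis=0), scores.min(axis=0)
norm_gap = weights * (f_best - scores) / (f_best - f_worst)

S = norm_gap.sum(axis=1)                  # group utility
R = norm_gap.max(axis=1)                  # individual regret
Q = v * (S - S.min()) / (S.max() - S.min()) + (1 - v) * (R - R.min()) / (R.max() - R.min())

print(np.argsort(Q) + 1)                  # lower Q = higher satisfaction rank
```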