
Showing papers in "Expert Systems in 2014"


Journal ArticleDOI
TL;DR: A new association measure, weighted support count (WSC), based on association rule mining, is proposed to represent both the intensity and nature of the relationships between products in a distribution centre and to facilitate efficient order picking through storage assignment heuristics that apply WSC.
Abstract: Among the warehousing activities in distribution centres, order picking is the most time-consuming and labour-intensive. As a result, order picking may become a bottleneck preventing distribution centres from maximizing the effectiveness of their warehousing activities. Although storage location assignment or product allocation is a tactical decision, it is especially influential on the effectiveness of order picking. In previous studies, most storage assignment approaches considered the order frequency of individual products rather than that of product groups, which are often purchased together. This paper proposes a new association measure, weighted support count (WSC), based on association rule mining, to represent both the intensity and nature of the relationships between products in a distribution centre. This paper presents two storage assignment heuristics, the modified class-based heuristic (MCBH) and the association seed based heuristic (ASBH), designed to facilitate efficient order picking by applying WSC. A real-world data set from a grocery distribution centre is used to verify the effectiveness of the proposed approaches. According to the computational results, MCBH reduces the monthly travel distance for order picking by up to 4% compared with the traditional class-based approach, while ASBH achieves up to a 13% reduction in travel distance.

52 citations
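The abstract does not spell out the exact WSC formula, so the following Python sketch only illustrates the general idea: count how often product pairs appear in the same picking order, with an assumed quantity-based weighting (the orders, products and weighting scheme are all illustrative).

```python
# Hypothetical sketch: pairwise co-occurrence support from picking orders.
from collections import Counter
from itertools import combinations

# Each order is a dict mapping product id -> picked quantity (assumed format).
orders = [
    {"A": 2, "B": 1, "C": 4},
    {"A": 1, "C": 2},
    {"B": 3, "C": 1, "D": 2},
]

def weighted_support_counts(orders):
    """Count how often each product pair is picked together, weighted by
    the smaller picked quantity in the order (an assumed weighting)."""
    wsc = Counter()
    for order in orders:
        for p, q in combinations(sorted(order), 2):
            wsc[(p, q)] += min(order[p], order[q])
    return wsc

print(weighted_support_counts(orders).most_common(3))
```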


Journal ArticleDOI
TL;DR: This work analyzed a distributed software project from the information flow perspective, and developed specific techniques to improve information flow in distributed software development according to the FLOW Method.
Abstract: Communication is a key success factor of distributed software projects. Poor communication has been identified as a main obstacle to successful collaboration. Global projects are especially endangered by information gaps between collaborating sites. Different communication styles, technical equipment, and missing awareness of each other can cause severe problems. Knowledge about actual and desired channels, paths, and modes of communication is required for improving communication in a globally distributed project. However, many project participants know little about communication and information flow in their projects. In this contribution, we focus on knowledge about communication and information flow. It is acquired by modelling on-going and desired flows of information, including documented and non-documented channels of information flow. We analyzed a distributed software project from the information flow perspective. Based on the findings, we developed specific techniques to improve information flow in distributed software development according to the FLOW Method. In a second distributed project, we evaluated one of the techniques. We found the FLOW mapping technique to be suitable for effectively spreading knowledge about communication and information flow in global software projects.

50 citations


Journal ArticleDOI
TL;DR: The purpose of this paper is to describe the use of source separation for sEMG using ICA and to demonstrate its actual use in practical sEMG experiments as the number of recording channels for electrical muscle activities is varied.
Abstract: Surface electromyography (sEMG) is a technique in which electrodes are placed on the skin overlying a muscle to detect the electrical activity. Multiple electrical sensors are essential for extracting intrinsic physiological and contextual information from the corresponding sEMG signals. More than one sEMG recording channel is needed because the human body acts as an electrical conductor: owing to signal propagation through the tissue, there cannot be a one-to-one mapping of activities between muscle fibre groups and the corresponding sEMG sensing electrodes. Each electrode instead records a composition of many, largely activity-independent signals, and such raw recordings cannot be used efficiently for pattern matching because of their linear dependency. Independent component analysis (ICA), on the other hand, can separate the skin-surface recordings into a set of independent muscle actions. Hence, there is a need for a method that indicates the quality of the sensor placements in sEMG. The purpose of this paper is to describe the use of source separation for sEMG using ICA. Its actual use in practical sEMG experiments is demonstrated as the number of recording channels for electrical muscle activities is varied.

46 citations
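As an illustration of the source-separation step, the following minimal sketch applies scikit-learn's FastICA to synthetic multi-electrode recordings; the sources and mixing matrix are invented stand-ins for real sEMG data.

```python
# Minimal sketch: blind source separation of multi-channel "sEMG" with FastICA.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 2, 4000)
# Two synthetic "muscle" sources standing in for real sEMG activity.
sources = np.c_[np.sign(np.sin(40 * np.pi * t)), rng.laplace(size=t.size)]
mixing = np.array([[1.0, 0.6], [0.4, 1.0], [0.8, 0.3]])   # 3 electrodes
recordings = sources @ mixing.T + 0.05 * rng.normal(size=(t.size, 3))

ica = FastICA(n_components=2, random_state=0)
estimated_sources = ica.fit_transform(recordings)   # shape (n_samples, 2)
print(estimated_sources.shape, ica.mixing_.shape)
```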


Journal ArticleDOI
TL;DR: The proposed system shows great promise in automatic classification of normal and abnormal breast thermograms without the need for subjective interpretation.
Abstract: Breast cancer is a leading cancer affecting women worldwide. Mammography is a scanning procedure involving X-rays of the breast. It causes discomfort and may produce a high incidence of false negatives. Breast thermography is a new breast screening method that helps in the early detection of cancer. It is a non-invasive imaging procedure that captures the infrared heat radiating off the breast surface using an infrared camera. The main objective of this work is to evaluate the use of higher order spectral features extracted from thermograms in classifying normal and abnormal thermograms. For this purpose, we extracted five higher order spectral features and used them in a feed-forward artificial neural network (ANN) classifier and a support vector machine (SVM). Fifty thermograms (25 each of normal and abnormal) were used for analysis. The SVM presented a good sensitivity of 76% and specificity of 84%, and the ANN classifier demonstrated higher values of sensitivity (92%) and specificity (88%). The proposed system, therefore, shows great promise in automatic classification of normal and abnormal breast thermograms without the need for subjective interpretation.

45 citations
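A minimal sketch of the classification stage, assuming the five higher-order-spectral features have already been extracted per thermogram (the feature values below are synthetic); an SVM and a small feed-forward network are compared with cross-validation.

```python
# Minimal sketch: classifying precomputed feature vectors with an SVM and an ANN.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))          # 50 thermograms, 5 HOS features (assumed)
y = np.repeat([0, 1], 25)             # 0 = normal, 1 = abnormal
X[y == 1] += 0.8                      # make the synthetic classes separable

for name, clf in [("SVM", SVC(kernel="rbf")),
                  ("ANN", MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                                        random_state=0))]:
    scores = cross_val_score(make_pipeline(StandardScaler(), clf), X, y, cv=5)
    print(name, scores.mean())
```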


Journal ArticleDOI
TL;DR: A catalog of evaluated solutions and associated recommendations, mapped to the identified problem areas, is presented to help minimise knowledge-transfer problems in offshore outsourcing software development projects.
Abstract: Knowledge transfer is a critical factor in ensuring the success of offshore outsourcing software development projects and is, in many cases, neglected. Compared to in-house or co-located projects, however, such globally distributed projects feature far greater complexity. In addition to language barriers, factors such as cultural differences, time zone variance, distinct methods and practices, as well as unique equipment and infrastructure can all lead to problems that negatively impact knowledge transfer, and as a result, a project's overall success. In order to help minimise such risks to knowledge transfer, we conducted a research study based on expert interviews in six projects. Our study used German clients and focused on offshore outsourcing software development projects. We first identified known problems in knowledge transfer that can occur with offshore outsourcing projects. Then we collected best-practice solutions proven to overcome the types of problems described. Afterward, we conducted a follow-up study to evaluate our findings. In this subsequent stage, we presented our findings to a different group of experts in five projects and asked them to evaluate these solutions and recommendations in terms of our original goal, namely to find ways to minimise knowledge-transfer problems in offshore outsourcing software development projects. Thus, the result of our study is a catalog of evaluated solutions and associated recommendations mapped to the identified problem areas.

39 citations


Journal ArticleDOI
TL;DR: This work reveals the possibility of developing a system that could identify the six emotional states in a user‐independent manner using electrocardiogram signals and investigates the power and entropy of the individual IMFs.
Abstract: Emotion recognition using physiological signals has gained momentum in the field of human-computer interaction. This work focuses on developing a user-independent emotion recognition system that classifies five emotions (happiness, sadness, fear, surprise and disgust) and the neutral state. The various stages, such as the design of the emotion elicitation protocol, data acquisition, pre-processing, feature extraction and classification, are discussed. Emotional data were obtained from 30 undergraduate students by using emotional video clips. Power and entropy features were obtained in three ways: by decomposing and reconstructing the signal using empirical mode decomposition, by using a Hilbert-Huang transform and by applying a discrete Fourier transform to the intrinsic mode functions (IMFs). Statistical analysis using analysis of variance indicates significant differences among the six emotional states (p < 0.001). Classification results indicate that applying the discrete Fourier transform instead of the Hilbert transform to the IMFs provides comparatively better accuracy for all six classes, with an overall accuracy of 52%. Although the accuracy is modest, it reveals the possibility of developing a system that could identify the six emotional states in a user-independent manner using electrocardiogram signals. The accuracy of the system can be improved by investigating the power and entropy of the individual IMFs.

38 citations
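A minimal sketch of the feature-extraction step on precomputed IMFs: per-IMF power and a DFT-based spectral entropy. The EMD decomposition itself and the paper's exact feature definitions are assumed to be provided elsewhere; the IMFs below are synthetic stand-ins.

```python
# Minimal sketch: power and spectral-entropy features of precomputed IMFs.
import numpy as np

def imf_features(imfs):
    """Return (power, spectral entropy) per IMF using the DFT magnitude spectrum."""
    feats = []
    for imf in imfs:
        spectrum = np.abs(np.fft.rfft(imf)) ** 2
        p = spectrum / spectrum.sum()              # normalise to a distribution
        entropy = -np.sum(p * np.log2(p + 1e-12))  # spectral entropy
        feats.append((imf.var(), entropy))         # signal power, entropy
    return np.array(feats)

rng = np.random.default_rng(2)
fake_imfs = rng.normal(size=(4, 1024))             # stand-ins for real ECG IMFs
print(imf_features(fake_imfs))
```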


Journal ArticleDOI
TL;DR: A fuzzy multi-attribute decision making (MADM) algorithm based on fuzzy analytic hierarchy process (FAHP) and Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) methods is proposed for the selection of candidates eligible to become basketball players in Mugla, Turkey.
Abstract: The selection of skilful players is a complicated process because the problem criteria consist of both qualitative and quantitative attributes as well as vague linguistic terms. This study seeks to develop a decision support framework for the selection of candidates eligible to become basketball players through the use of a fuzzy multi-attribute decision making (MADM) algorithm. The proposed model is based on the fuzzy analytic hierarchy process (FAHP) and the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS). The model was employed in the Youth and Sports Center of Mugla, Turkey, with the participation of seven junior basketball players aged between 7 and 14. In the present study, physical fitness measurement values and observation values of technical skills were utilized. FAHP was used to determine the weights of the criteria and the observation values of technical skills by decision makers. Physical fitness measurement values were converted to fuzzy values by using a fuzzy set approach. Subsequently, the overall ranking of the candidate players was determined by the TOPSIS method. Results were compared with human experts' opinions, and the developed model was observed to be more reliable for use in decision making. The model architecture and experimental results, along with illustrative examples, are further demonstrated in the study.

38 citations
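The TOPSIS ranking step can be sketched in a few lines of Python; the decision matrix and criteria weights below are illustrative, with the weights assumed to come from the FAHP stage.

```python
# Minimal sketch of the TOPSIS ranking step.
import numpy as np

def topsis(decision_matrix, weights, benefit=None):
    """Rank alternatives (rows) against criteria (columns) with TOPSIS."""
    X = np.asarray(decision_matrix, dtype=float)
    benefit = np.ones(X.shape[1], bool) if benefit is None else np.asarray(benefit)
    norm = X / np.linalg.norm(X, axis=0)           # vector normalisation
    V = norm * weights                             # weighted normalised matrix
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    return d_neg / (d_pos + d_neg)                 # closeness coefficient

scores = topsis([[7, 9, 8], [6, 7, 9], [8, 6, 7]], weights=[0.5, 0.3, 0.2])
print(np.argsort(-scores))                         # best candidate first
```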


Journal ArticleDOI
TL;DR: Tests showed that the classifiers developed by CDKML have better performance than machine‐learning classifiers generated on a training dataset that does not adequately represent all real‐life cases of the learned concept.
Abstract: This paper presents a method for combining domain knowledge and machine learning (CDKML) for classifier generation and online adaptation. The method exploits the advantages of domain knowledge and machine learning as complementary information sources. Whereas machine learning may discover patterns in interest domains that are too subtle for humans to detect, domain knowledge may contain information on a domain not present in the available domain dataset. CDKML has three steps. First, prior domain knowledge is enriched with relevant patterns obtained by machine learning to create an initial classifier. Second, genetic algorithms refine the classifier. Third, the classifier is adapted online on the basis of user feedback using a Markov decision process. CDKML was applied in fall detection. Tests showed that the classifiers developed by CDKML have better performance than machine-learning classifiers generated on a training dataset that does not adequately represent all real-life cases of the learned concept. The accuracy of the initial classifier was 10 percentage points higher than that of the best machine-learning classifier, and the refinement added 3 percentage points. The online adaptation improved the accuracy of the refined classifier by an additional 15 percentage points.

34 citations


Journal ArticleDOI
TL;DR: A general‐purposed, fast and adaptive automatic disease diagnosis system, using the information generated by the newly‐designed classifier, which makes decisions with a simple rule base comprising the rules in ‘if‐then' form, based on the support vector machine (SVM), a powerful classification algorithm.
Abstract: Automatic disease diagnosis systems have been used to treat diseases for many years. The data used in the construction of these systems require correct classification, and previous literature has therefore proposed a variety of methods. This paper develops a general-purpose, fast and adaptive automatic disease diagnosis system that uses the information generated by a newly designed classifier, which makes decisions with a simple rule base comprising rules in 'if-then' form. The proposed methodology is based on the support vector machine (SVM), a powerful classification algorithm. In the proposed method, we add a feature of adaptivity to the SVM. In order to increase the success rate and decrease the decision-making time, the bias value of the standard SVM is changed in an adaptive structure. This process introduces a new kind of SVM, the 'adaptive SVM', which seeks to diagnose diseases more successfully. During the training and test operations of this newly designed system, we used diabetes and breast cancer datasets acquired from the medical database of California University. The proposed methodology achieves 100% correct classification rates on both the diabetes and breast cancer datasets.

30 citations
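The abstract does not give the exact bias-adaptation rule, so the sketch below only illustrates the underlying idea: train a standard SVM, then shift its decision threshold to the value that maximises accuracy on held-out data. The dataset and search grid are illustrative, not the paper's setup.

```python
# Minimal sketch of adapting an SVM's decision bias on validation data.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

svm = SVC(kernel="linear").fit(X_tr, y_tr)
scores = svm.decision_function(X_val)

# Search over candidate bias shifts and keep the one with the best accuracy.
shifts = np.linspace(scores.min(), scores.max(), 200)
best_shift = max(shifts, key=lambda b: np.mean((scores > b) == y_val))
print("chosen bias shift:", best_shift,
      "validation accuracy:", np.mean((scores > best_shift) == y_val))
```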


Journal ArticleDOI
TL;DR: From this hybrid two-stage evaluation process, cloud service providers can obtain improvement suggestions from intermediate information derived from the gap measurement, which is the main advantage of the approach.
Abstract: In this paper, we address the cloud service trustworthiness evaluation problem, which in essence is a multi-attribute decision-making problem, by proposing a novel evaluation model based on the fuzzy gap measurement and the evidential reasoning approach. There are many sources of uncertainties in the process of cloud service trustworthiness evaluation. In addition to the intrinsic uncertainties, cloud service providers face the problem of discrepant evaluation information given by different users from different perspectives. To address these problems, we develop a novel fuzzy gap evaluation approach to assess cloud service trustworthiness and to provide evaluation values from different perspectives. From the evaluation values, the perception-importance, delivery-importance, and perception-delivery gaps are generated. These three gaps reflect the discrepancy evaluation of cloud service trustworthiness in terms of perception utility, delivery utility, and importance utility, respectively. Finally, the gap measurement of each perspective is represented by a belief structure and aggregated using the evidential reasoning approach to generate final evaluation results for informative and robust decision making. From this hybrid two-stage evaluation process, cloud service providers can get improvement suggestions from intermediate information derived from the gap measurement, which is the main advantage of this evaluation process.

29 citations


Journal ArticleDOI
TL;DR: The aim of this paper is to automate the process of depression grading using a neurofuzzy model (NFM) and concludes that NFM‐1 with GMF is the best model with average predicting accuracy of 94.4% and robustness.
Abstract: Manual grading of depression is sometimes difficult due to subjective signs and symptoms. The aim of this paper is to automate the process of depression grading using a neurofuzzy model (NFM). Two hundred and seventy real-world depression cases are considered in this work. Each case has seven symptoms, which are obtained according to DSM-IV-TR, and is graded as 'mild' or 'moderate'. In practice, however, the boundaries between the 'mild' and 'moderate' grades are fuzzy in nature, and the paper attempts to handle this fuzzy overlapping zone between the grades. To reduce the number of symptoms, significantly correlated symptoms are mined using a paired t-test. Two NFMs have then been developed: NFM-1 uses all seven symptoms, while only the significantly correlated symptoms are used to construct the NFM-2 model. Two fuzzy membership functions, the triangular membership function (TRMF) and the Gaussian membership function (GMF), have been considered to determine which achieves better fuzzification. The paper concludes that NFM-1 with GMF is the best model, with an average prediction accuracy of 94.4% and good robustness.
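A minimal sketch of the fuzzification step with the two membership functions compared in the paper; the 'mild'/'moderate' parameters and the rating scale are illustrative, not those fitted to the 270 cases.

```python
# Minimal sketch: fuzzifying a symptom score with triangular (TRMF) and
# Gaussian (GMF) membership functions; parameters are illustrative only.
import numpy as np

def trmf(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0, 1)

def gmf(x, centre, sigma):
    """Gaussian membership centred on `centre` with spread `sigma`."""
    return np.exp(-0.5 * ((x - centre) / sigma) ** 2)

score = 5.5                                   # a symptom rating on an assumed 0-10 scale
print("mild (TRMF):", trmf(score, 0, 3, 6), "moderate (TRMF):", trmf(score, 4, 7, 10))
print("mild (GMF):", gmf(score, 3, 1.5), "moderate (GMF):", gmf(score, 7, 1.5))
```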

Journal ArticleDOI
TL;DR: Results show that consensus is a reasonable indicator of calibration, and it was discovered that a subject who perceives themselves as more knowledgeable than others is also likely to be more experienced.
Abstract: In situations when data collection through observations is difficult to perform, the use of expert judgement can be justified. A challenge with this approach, however, is to value the credibility of different experts. A natural and state-of-the-art approach is to weight the experts' judgements according to their calibration, that is, on the basis of how well their estimates of a studied event agree with actual observations of that event. However, when data collection through observations is difficult to perform, it is often also difficult to estimate the calibration of experts. As a consequence, variables thought to indicate calibration are generally used as a substitute for it in practice. This study evaluates the value of three such indicative variables: consensus, experience and self-proclamation. The significance of these variables is analysed in four surveys covering different domains in cyber security, involving a total of 271 subjects. Results show that consensus is a reasonable indicator of calibration; the mean Pearson correlation between these two variables across the four studies was 0.407. No significant correlations were found between calibration and experience or between calibration and self-proclamation. However, as a side result, it was discovered that a subject who perceives themselves as more knowledgeable than others is also likely to be more experienced.
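The consensus-calibration analysis boils down to a Pearson correlation; a minimal sketch with synthetic per-subject scores standing in for the survey data:

```python
# Minimal sketch: correlating an indicator variable (e.g. consensus) with calibration.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
calibration = rng.uniform(0, 1, size=40)                 # per-subject calibration
consensus = 0.6 * calibration + 0.4 * rng.uniform(0, 1, size=40)

r, p_value = pearsonr(consensus, calibration)
print(f"Pearson r = {r:.3f}, p = {p_value:.4f}")
```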

Journal ArticleDOI
TL;DR: It is concluded that through understanding both the nature of GSD and the KE challenges in depth, it will be possible for organizations to make their distributed operations successful.
Abstract: A number of knowledge-related challenges may complicate the work in global software development (GSD) projects. In practice, even a small amount of missing knowledge may cause an activity to fail to create and transfer information that is critical to later functions, causing these later functions to fail. Thus, knowledge engineering holds a central role in succeeding with globally distributed product development. Furthermore, examining the challenges faced in GSD from a cognitive perspective helps to find solutions that take into account the knowledge needs of different stakeholders in GSD and thus helps to establish conditions for successful GSD projects. In this paper, we discuss these challenges and solutions based on an extensive literature study and practical experience gained in several international projects over the last decade. Altogether, over 50 case studies were analysed. We analysed the challenges identified in the cases from a cognitive perspective for bridging and avoiding knowledge gaps and, based on this analysis, we present example solutions to address the challenges during GSD projects. We conclude that, through understanding both the nature of GSD and the knowledge engineering challenges in depth, it will be possible for organizations to make their distributed operations successful.

Journal ArticleDOI
TL;DR: An enhanced version of the Tango attack, named the Genetic Tango attack, that uses Genetic Programming to design approximations, easing the generation of automatic cryptanalysis and improving its power compared with a manually designed attack, is presented.
Abstract: Radio frequency identification (RFID) is a powerful technology that enables wireless information storage and control in an economical way. These properties have generated a wide range of applications in different areas. Due to economic and technological constraints, RFID devices are seriously limited, having small or even tiny computational capabilities. This issue is particularly challenging from the security point of view. Security protocols in RFID environments have to deal with strong computational limitations, and classical protocols cannot be used in this context. There have been several attempts to overcome these limitations in the form of new lightweight security protocols designed to be used in very constrained (sometimes called ultra-lightweight) RFID environments. One of these proposals is the David-Prasad ultra-lightweight authentication protocol. This protocol was successfully attacked using a cryptanalysis technique named the Tango attack. The capacity of the attack depends on a set of Boolean approximations. In this paper, we present an enhanced version of the Tango attack, named the Genetic Tango attack, that uses Genetic Programming to design those approximations, easing the generation of automatic cryptanalysis and improving its power compared with a manually designed attack. Experimental results are given to illustrate the effectiveness of this new attack.

Journal ArticleDOI
TL;DR: A new generalization of classical real-valued information systems, namely interval-valued information systems, is proposed; by defining an interval-valued dominance relation on a condition attribute, a rough set model and attribute reduction are established over interval-valued information systems.
Abstract: This paper proposes a new generalization of classical real-valued information systems, that is, interval-valued information systems. By defining an interval-valued dominance relation on a condition attribute, a rough set model and attribute reduction are established over interval-valued information systems. Moreover, several interesting properties are investigated by a constructive approach. Furthermore, knowledge reductions of consistent and inconsistent interval-valued dominance decision information systems are studied, respectively. Subsequently, some descriptive theorems of knowledge reduction are presented for interval-valued dominance decision information systems. Finally, the validity of the model and conclusions is verified by a numerical example.
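A minimal sketch of an interval-valued dominance relation; the abstract does not give the paper's exact definition, so the common componentwise convention that [a-, a+] dominates [b-, b+] iff a- >= b- and a+ >= b+ is assumed here.

```python
# Minimal sketch: componentwise interval-valued dominance (assumed definition).
def dominates(x, y):
    """x, y: lists of (low, high) intervals, one per condition attribute."""
    return all(xl >= yl and xu >= yu for (xl, xu), (yl, yu) in zip(x, y))

objects = {
    "o1": [(2, 4), (5, 7)],
    "o2": [(1, 3), (4, 6)],
    "o3": [(3, 5), (3, 8)],
}

# Dominating set of each object: all objects that dominate it.
for name, x in objects.items():
    dom = [m for m, y in objects.items() if dominates(y, x)]
    print(name, "is dominated by", dom)
```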

Journal ArticleDOI
TL;DR: In this work, topic models are employed to learn the latent structure and dynamics of sensor network data and have shown the ability to find routines of activity over sensor network data in office environments.
Abstract: Recent advances in sensor network technology provide the infrastructure to create intelligent environments in physical places. One of the main issues with sensor networks is the large amount of data they generate. Therefore, good data analysis techniques are necessary for learning and discovering what is happening in the monitored environment. The problem becomes even more challenging if this process is performed in an unsupervised way, without any a priori information, and applied over a long-term timeline with many sensors. In this work, topic models are employed to learn the latent structure and dynamics of sensor network data. Experimental results using two realistic datasets, containing over 50 weeks of data, have shown the ability to find routines of activity over sensor network data in office environments.
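A minimal sketch of the modelling idea: time windows are treated as documents and sensor activations as words, and scikit-learn's LDA recovers routine-like topics; the activation counts below are synthetic.

```python
# Minimal sketch: topic modelling over sensor-activation counts.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(4)
n_windows, n_sensors = 200, 20
# Synthetic counts of sensor firings per time window (stand-in for real logs).
counts = rng.poisson(lam=2.0, size=(n_windows, n_sensors))

lda = LatentDirichletAllocation(n_components=5, random_state=0)
window_topics = lda.fit_transform(counts)      # per-window routine mixture
print(window_topics.shape, lda.components_.shape)
```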

Journal ArticleDOI
TL;DR: A new approach to recognizing human actions in 2D sequences, based on real-time visual tracking and simple feature extraction of human activities in video sequences, that is competitive against other state-of-the-art methods.
Abstract: This paper proposes a new approach to recognize human actions in 2D sequences, based on real-time visual tracking and simple feature extraction of human activities in video sequences. The proposed method emphasizes the simplicity of the strategies used, in an attempt to describe human actions as precisely as other more sophisticated and more computationally demanding methods in the literature. Specifically, we propose three complementary modules for the following: (a) tracking; (b) feature extraction; and (c) action recognition. The first module is based on the hybridization of a particle filter and a local search procedure and makes use of a reduced integral image to speed up the weight computation. The feature extraction module characterizes the silhouette of the tracked person by dividing it into rectangular boxes. Then, the system computes statistics on the evolution of these rectangular boxes over time. Finally, the action recognition module passes these statistics to a support vector machine to classify the actions. Experimental results show that the proposed method works in real-time, and its performance is competitive against other state-of-the-art methods.
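A minimal sketch of the feature-extraction module only (tracking and classification are omitted): the silhouette's bounding box is split into rectangular cells, per-cell occupancy is computed, and temporal statistics of these values form the descriptor passed to an SVM. The grid size and toy silhouette are assumptions.

```python
# Minimal sketch: occupancy statistics of rectangular cells over a silhouette.
import numpy as np

def box_features(mask, rows=4, cols=3):
    """Fraction of foreground pixels in each cell of the silhouette's bounding box."""
    ys, xs = np.nonzero(mask)
    crop = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    h, w = crop.shape
    feats = [crop[r * h // rows:(r + 1) * h // rows,
                  c * w // cols:(c + 1) * w // cols].mean()
             for r in range(rows) for c in range(cols)]
    return np.array(feats)

frame = np.zeros((60, 40), dtype=bool)
frame[10:50, 15:25] = True                                # a toy "person" silhouette
per_frame = np.stack([box_features(frame) for _ in range(30)])
descriptor = np.r_[per_frame.mean(0), per_frame.std(0)]   # stats over time -> SVM input
print(descriptor.shape)
```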

Journal ArticleDOI
TL;DR: The development of an ensemble learner using a member of the Prism family as the base classifier to reduce the overfitting of Prism algorithms on noisy datasets is described.
Abstract: Ensemble learning can be used to increase the overall classification accuracy of a classifier by generating multiple base classifiers and combining their classification results. A frequently used family of base classifiers for ensemble learning are decision trees. However, alternative approaches can potentially be used, such as the Prism family of algorithms that also induces classification rules. Compared with decision trees, Prism algorithms generate modular classification rules that cannot necessarily be represented in the form of a decision tree. Prism algorithms produce a similar classification accuracy compared with decision trees. However, in some cases, for example, if there is noise in the training and test data, Prism algorithms can outperform decision trees by achieving a higher classification accuracy. However, Prism still tends to overfit on noisy data; hence, ensemble learners have been adopted in this work to reduce the overfitting. This paper describes the development of an ensemble learner using a member of the Prism family as the base classifier to reduce the overfitting of Prism algorithms on noisy datasets. The developed ensemble classifier is compared with a stand-alone Prism classifier in terms of classification accuracy and resistance to noise.
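A minimal sketch of the ensemble idea under stated assumptions: base classifiers trained on bootstrap samples and combined by majority vote. A decision tree stands in for the Prism base classifier, which scikit-learn does not provide.

```python
# Minimal sketch: bagging-style ensemble with majority voting.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(5)
members = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))        # bootstrap sample
    members.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

votes = np.array([m.predict(X) for m in members])      # (n_members, n_samples)
majority = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("ensemble training accuracy:", np.mean(majority == y))
```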

Journal ArticleDOI
TL;DR: Alert intelligent device is an ambient assisted living system that allows the evaluation of potentially dangerous situations for elderly people living alone at home; it works in conjunction with an ambient intelligence layer embedded in a personal computer that learns from user behaviour patterns and warns when a detected pattern differs significantly from previously acquired normal patterns.
Abstract: Alert intelligent device is an ambient assisted living (AAL) system that allows the evaluation of potentially dangerous situations for elderly people living alone at home. This evaluation is obtained by an ad hoc network of sensor nodes, working in conjunction with an ambient intelligence layer embedded in a personal computer that learns from user behaviour patterns and warns when a detected pattern differs significantly from previously acquired normal patterns. Each new datum read from sensors is processed in the ambient intelligence layer through three processing levels: shallow, intermediate and deep. The shallow processing level focuses on physical data and sensory features. The intermediate level covers information interpretation and its translation into the form required by the third level: the reasoning processing or deep level.

Journal ArticleDOI
TL;DR: A sparse sequence classifier based on L1 regularization is proposed to avoid the problem of having to choose the proper number of dimensions of the common parameterization of 2D action descriptors computed for each one of the available viewpoints.
Abstract: Employing multiple camera viewpoints in the recognition of human actions increases performance. This paper presents a feature fusion approach to efficiently combine 2D observations extracted from different camera viewpoints. Multiple-view dimensionality reduction is employed to learn a common parameterization of the 2D action descriptors computed for each of the available viewpoints. Canonical correlation analysis and its variants are employed to obtain such parameterizations. A sparse sequence classifier based on L1 regularization is proposed to avoid the problem of having to choose the proper number of dimensions of the common parameterization. The proposed system is employed in the classification of the Inria Xmas Motion Acquisition Sequences (IXMAS) data set with successful results.
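A minimal sketch of the fusion pipeline: CCA learns a common parameterization of two synthetic views, and an L1-regularised logistic regression stands in for the paper's sparse sequence classifier; the views, labels and dimensionalities are all invented.

```python
# Minimal sketch: CCA-based multi-view fusion followed by an L1-regularised classifier.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n, d1, d2 = 120, 30, 25
latent = rng.normal(size=(n, 5))
view1 = latent @ rng.normal(size=(5, d1)) + 0.1 * rng.normal(size=(n, d1))
view2 = latent @ rng.normal(size=(5, d2)) + 0.1 * rng.normal(size=(n, d2))
labels = (latent[:, 0] > 0).astype(int)

cca = CCA(n_components=5).fit(view1, view2)
z1, z2 = cca.transform(view1, view2)              # common parameterization
clf = LogisticRegression(penalty="l1", solver="liblinear").fit(np.hstack([z1, z2]), labels)
print("training accuracy:", clf.score(np.hstack([z1, z2]), labels))
```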

Journal ArticleDOI
TL;DR: An effective prediction method is required for improving predictive accuracy, and the task of learning an accurate classifier from instances raises a number of new issues, some of which have not been properly addressed by transportation research.
Abstract: Motivation: Road traffic accidents are among the top leading causes of deaths and injuries of various levels in South Africa. With the wealth and huge amount of data generated from road traffic accidents, the issue of traffic accident prediction has become a central challenge in the field of transportation data analysis. Such accident prediction is designed to detect patterns involved in dangerous crashes and thus help decision making and planning before casualty and loss occur. Recently, numerous researchers have presented a wide range of prediction techniques. Most of these methods are based on statistical studies but usually fail to explain the insights of the prediction results. This has led to the development and application of supervised learning algorithms (classifiers) in an attempt to provide more accurate accident prediction in terms of injury severity (fatal/serious/slight/property damage with no injury). Even then, the task of learning an accurate classifier from instances raises a number of new issues, some of which have not been properly addressed by transportation research. Thus, an effective prediction method is required for improving predictive accuracy. Results: The essence of the paper is the proposal that prediction of accidents given poor data quality (in terms of incomplete data) can be improved by using a classifier based on grey relational analysis, a similarity-based method. We evaluate the grey relational classifier against other state-of-the-art classifiers, including artificial neural networks, classification and regression trees, k-nearest neighbour, linear discriminant analysis, the naive Bayes classifier, algorithm quasi-optimal and support vector machines. A real-world road traffic accident dataset is utilized for this task. Experimental results are provided to illustrate the efficiency and the robustness of the grey relational classifier in terms of road traffic accident predictive accuracy.
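A minimal sketch of a grey-relational-analysis classifier under stated assumptions (the generic grey relational coefficient with distinguishing coefficient 0.5, nearest-record assignment, synthetic normalised data); the paper's exact treatment of incomplete data is not reproduced.

```python
# Minimal sketch: assign a query the class of the training record with the
# highest grey relational grade.
import numpy as np

def grey_relational_grades(query, candidates, rho=0.5):
    delta = np.abs(candidates - query)             # deviation sequences
    d_min, d_max = delta.min(), delta.max()
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff.mean(axis=1)                      # grade per candidate

rng = np.random.default_rng(7)
X_train = rng.uniform(size=(100, 6))               # normalised accident attributes
y_train = rng.integers(0, 4, size=100)             # injury-severity classes
query = rng.uniform(size=6)

grades = grey_relational_grades(query, X_train)
print("predicted severity class:", y_train[grades.argmax()])
```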

Journal ArticleDOI
TL;DR: It is concluded that knowledge exchange in global software outsourcing is a by‐product of efforts to enhance communication and coordination, rather than specific technical solutions.
Abstract: Global outsourcing is a growing trend among independent software vendors. In these projects, as in other distributed work, distances have negative effects on communication and coordination, directly impacting performance. We present a normative model designed to address this issue by improving communication and knowledge exchange. The model consists of six distinct practices and a tool blueprint, each coming with practical guidelines. It is based in part on two case studies of Dutch software vendors who have successfully outsourced part of their activities to an Eastern European outsourcing vendor, and it was validated by a panel of six experts from industry and the scientific community. It is concluded that knowledge exchange in global software outsourcing is a by-product of efforts to enhance communication and coordination, rather than specific technical solutions. By committing to sharing knowledge, emphasizing transparency and integrating the outsourcing team into their organizations, customers from the product software business can realize the benefits of global outsourcing.

Journal ArticleDOI
TL;DR: VigilAgent is a methodology for the development of agent‐oriented monitoring applications that uses agents as the key abstraction elements of the involved models and uses fragments from Prometheus and INGENIAS methodologies for modelling tasks and the ICARO framework for implementation purposes.
Abstract: VigilAgent is a methodology for the development of agent-oriented monitoring applications that uses agents as the key abstraction elements of the involved models. It has not been developed from scratch; rather, it reuses fragments from the Prometheus and INGENIAS methodologies for modelling tasks and the ICARO framework for implementation purposes. As VigilAgent intends to automate the development process as much as possible, it exploits.

Journal ArticleDOI
TL;DR: A slacks-based Data Envelopment Analysis framework for assessing bank efficiency and soundness in a risk regulation setting is introduced, which has been missing from the banking performance literature.
Abstract: Despite increasing deregulation and globalization in financial markets worldwide, banking is still one of the most regulated industries in many countries. The contribution of the present article is to introduce a slacks-based Data Envelopment Analysis framework for assessing bank efficiency and soundness in a risk regulation setting, which is missing from the banking performance literature. Two main sub-processes within the service flow of a typical bank are considered: the primary banking business of making profit, and dealing with the compliance requirements of risk regulations. A Data Envelopment Analysis model is applied to measure the performance of the two sub-processes, that is, profit-making efficiency and risk-controlling efficiency. The research framework and models are applied to an empirical study of the banking sector in Taiwan covering the period 2007-2010. We demonstrate how to use the empirical results to monitor the efficiency status of individual banks from one year to the next, providing an early warning for those with low efficiency. Our empirical results show that there is considerable potential for efficiency improvement in Taiwan's banking industry, and the room for risk-controlling efficiency improvement is even larger. The two efficiency estimates are positively correlated with each other, and both have improved year by year. However, an economic recession can lower efficiency estimates.
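For illustration only, the sketch below solves the basic input-oriented CCR (radial) DEA model with scipy's linprog; the paper's slacks-based model is more involved, and the bank inputs and outputs shown are invented.

```python
# Minimal sketch: input-oriented CCR DEA efficiency via linear programming.
import numpy as np
from scipy.optimize import linprog

X = np.array([[20, 300], [30, 200], [40, 100], [20, 200]], float)  # inputs per bank
Y = np.array([[100], [80], [60], [70]], float)                     # outputs per bank

def ccr_efficiency(o):
    n = len(X)
    c = np.r_[1.0, np.zeros(n)]                       # minimise theta
    A_in = np.c_[-X[o][:, None], X.T]                 # sum λ_j x_j <= theta * x_o
    A_out = np.c_[np.zeros((Y.shape[1], 1)), -Y.T]    # sum λ_j y_j >= y_o
    A = np.vstack([A_in, A_out])
    b = np.r_[np.zeros(X.shape[1]), -Y[o]]
    res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * (n + 1))
    return res.fun                                    # efficiency score theta*

print([round(ccr_efficiency(o), 3) for o in range(len(X))])
```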

Journal ArticleDOI
TL;DR: The proposed method, here called NMF_GA, can find a near-optimal solution for initializing the NMF components; applied to the JAFFE facial expression dataset, it achieves superior results to a wide variety of NMF initialization methods.
Abstract: Nonnegative matrix factorization (NMF) algorithms have been utilized in a wide range of real applications; however, the performance of NMF is highly dependent on three factors: (1) choosing a problem-dependent cost function; (2) using an effective initialization method to start the updating procedure from a near-optimal point; and (3) determining the rank of the factorized matrices prior to decomposition. Due to the nonconvex nature of the NMF cost function, finding an analytical optimal solution is impossible. This paper aims to propose an efficient initialization method to improve NMF performance. To widely explore the search space when initializing the factorized matrices in NMF, the island genetic algorithm (IGA) is employed as a diverse multiagent search scheme. To adapt IGA for NMF initialization, we present a specific mutation operator. To assess how the proposed IGA initialization method enhances NMF performance, we implemented state-of-the-art initialization methods and applied them to the Japanese Female Facial Expression (JAFFE) dataset to recognize facial expression states. Experimental results demonstrate the superiority of the proposed approach over the compared methods in terms of relative error and fast convergence.
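A minimal sketch of the initialization idea: generate several candidate non-negative starting points, keep the one with the lowest reconstruction error, and hand it to scikit-learn's NMF via init='custom'. A plain random multi-start stands in here for the island genetic algorithm, and the data matrix is synthetic.

```python
# Minimal sketch: choosing a good starting point for NMF and passing it as a custom init.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(8)
X = np.abs(rng.normal(size=(60, 40)))              # stand-in for image data
k = 8

def candidate():
    W = np.abs(rng.normal(size=(X.shape[0], k)))
    H = np.abs(rng.normal(size=(k, X.shape[1])))
    return W, H

W0, H0 = min((candidate() for _ in range(20)),
             key=lambda wh: np.linalg.norm(X - wh[0] @ wh[1]))

model = NMF(n_components=k, init="custom", max_iter=500)
W = model.fit_transform(X, W=W0, H=H0)
print("final reconstruction error:", model.reconstruction_err_)
```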

Journal ArticleDOI
TL;DR: The aim of this article is to evaluate a boosting-based ensemble approach, forward stage-wise additive modelling (FSAM), to improve the prediction ability of some widely used base regressors; it is empirically found that FSAM generally enhances the accuracy of base regressors or at least maintains their performance.
Abstract: Analysis of scientific data requires accurate regression algorithms to decrease prediction errors. Many machine learning algorithms, such as neural networks, rule-based algorithms, regression trees and some kinds of lazy learners, are used to meet this need. In recent years, different ensemble regression strategies have been developed to obtain enhanced predictors with lower forecasting errors. Ensemble algorithms combine good models that make errors in different parts of the analyzed data. There are mainly two approaches to generating ensemble regression algorithms: boosting and bagging. The aim of this article is to evaluate a boosting-based ensemble approach, forward stage-wise additive modelling (FSAM), to improve the prediction ability of some widely used base regressors. We used 10 regression algorithms of four different types to make predictions on 10 diverse datasets from different scientific areas, and we compared the experimental results in terms of correlation coefficient, mean absolute error and root mean squared error. Furthermore, we made use of scatter plots to demonstrate the effect of ensemble modelling on the prediction accuracies of the evaluated algorithms. We empirically found that, in general, FSAM enhances the accuracy of the base regressors or at least maintains their performance.
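Forward stage-wise additive modelling for regression can be sketched as repeatedly fitting a base regressor to the current residuals and adding a shrunken copy of it to the ensemble; the base learner, learning rate and synthetic data below are illustrative choices, not the paper's configuration.

```python
# Minimal sketch: forward stage-wise additive modelling with squared loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(9)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

stages, nu = [], 0.1                          # learning rate (shrinkage)
residual = y.copy()
for _ in range(100):
    stage = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    stages.append(stage)
    residual -= nu * stage.predict(X)          # move on to what is left unexplained

prediction = nu * sum(s.predict(X) for s in stages)
print("training RMSE:", np.sqrt(np.mean((y - prediction) ** 2)))
```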

Journal ArticleDOI
TL;DR: A prototype, the Knowledge Extraction Workbench (KEWo), is presented, which supports the knowledge engineer in this task and is integrated into the open‐source case‐based reasoning tool myCBR Workbench.
Abstract: Web communities and the Web 2.0 provide a huge amount of experiences, and there has been a growing availability of Linked Open Data. Making experiences and data available as knowledge to be used in case-based reasoning (CBR) systems is a current research effort. The process of extracting such knowledge from the diverse data types used in web communities, transforming data obtained from Linked Data sources, and then formalising it for CBR is not an easy task. In this paper, we present a prototype, the Knowledge Extraction Workbench (KEWo), which supports the knowledge engineer in this task. We integrated the KEWo into the open-source case-based reasoning tool myCBR Workbench. We provide details on the abilities of the KEWo to extract vocabularies from Linked Data sources and generate taxonomies from Linked Data as well as from web community data in the form of semi-structured texts.

Journal ArticleDOI
TL;DR: Fuzzy logic techniques are adopted to calculate the attractiveness of crime attractors in three suburban cities in the Metro Vancouver region of British Columbia, Canada, and provide results comparable with real-life expectations that offenders do not necessarily commit significant crimes in the immediate neighbourhood of the attractors, but travel towards them and commit crimes on the way.
Abstract: Crime attractors are locations (e.g., shopping malls) that attract criminally motivated offenders because of the presence of known criminal opportunities. Although there have been many studies that explore the patterns of crime in and around these locations, many questions still linger. In recent years, there has been a growing interest in developing mathematical models to help answer questions about various criminological phenomena. In this paper, we are interested in applying a formal methodology to model the relative attractiveness of crime attractor locations based on characteristics of offenders and the crimes they committed. To accomplish this task, we adopt fuzzy logic techniques to calculate the attractiveness of crime attractors in three suburban cities in the Metro Vancouver region of British Columbia, Canada. The fuzzy logic techniques provide results comparable with our real-life expectations that offenders do not necessarily commit significant crimes in the immediate neighbourhood of the attractors, but travel towards them and commit crimes on the way. The results of this study could lead to a variety of crime prevention benefits and urban planning strategies.

Journal ArticleDOI
TL;DR: The DenGraph‐HO algorithm is applied to the real‐world datasets obtained from the online music platform Last.fm and from the former US company Enron.
Abstract: DenGraph-HO is an extension of the density-based graph clustering algorithm DenGraph. It is able to detect dense groups of nodes in a given graph and produces a hierarchy of clusters, which can be efficiently computed. The generated hierarchy can be used to investigate the structure and the characteristics of social networks. Each hierarchy level provides a different level of detail and can be used as the basis for interactive visual social network analysis. After a short introduction of the original DenGraph algorithm, we present DenGraph-HO and its top-down and bottom-up approaches. We describe the data structures and memory requirements and analyse the run-time complexity. Finally, we apply the DenGraph-HO algorithm to the real-world datasets obtained from the online music platform Last.fm and from the former US company Enron.

Journal ArticleDOI
TL;DR: An integral framework is proposed for the optimization of different ANN classifiers based on statistical hypothesis testing and results show the relevance of this framework, proving that its application improves the performance and efficiency of multiple classifiers.
Abstract: Artificial neural networks (ANNs) are flexible computing tools that have been applied to a wide range of domains with a notable level of accuracy. However, there are multiple choices of ANN classifiers in the literature that produce dissimilar results, and as a consequence the selection of the classifier is crucial for the overall performance of the system. In this work, an integral framework is proposed for the optimization of different ANN classifiers based on statistical hypothesis testing. The framework is tested in a real ballistic scenario. The new quality measures introduced, based on the Student t-test and employed throughout the framework, ensure the validity of the results from a statistical standpoint; they reduce the influence of experimental errors and of possible randomness. Results show the relevance of this framework, proving that its application improves the performance and efficiency of multiple classifiers.
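A minimal sketch of the kind of hypothesis test such a framework builds on: a paired Student t-test over cross-validation scores of two ANN configurations (the configurations and dataset are illustrative, not the paper's ballistic data).

```python
# Minimal sketch: paired t-test comparing two ANN classifiers across CV folds.
from scipy.stats import ttest_rel
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
ann_a = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
ann_b = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)

scores_a = cross_val_score(ann_a, X, y, cv=10)
scores_b = cross_val_score(ann_b, X, y, cv=10)
t_stat, p_value = ttest_rel(scores_a, scores_b)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```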