
Showing papers in "Progress in Artificial Intelligence in 2020"


Journal ArticleDOI
TL;DR: This paper mainly focuses on the application of deep learning architectures to three major applications, namely (i) wild animal detection, (ii) small arms detection and (iii) human being detection.
Abstract: Deep learning has developed as an effective machine learning method that takes in numerous layers of features or representations of the data and provides state-of-the-art results. The application of deep learning has shown impressive performance in various application areas, particularly in image classification, segmentation and object detection. Recent advances in deep learning techniques bring encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we provide a detailed review of various deep architectures and models, highlighting the characteristics of each model. First, we describe the functioning of CNN architectures and their components, followed by a detailed description of various CNN models, from the classical LeNet model to AlexNet, ZFNet, GoogLeNet, VGGNet, ResNet, ResNeXt, SENet, DenseNet, Xception and PNAS/ENAS. We mainly focus on the application of deep learning architectures to three major applications, namely (i) wild animal detection, (ii) small arms detection and (iii) human being detection. A detailed review summary, including the systems, databases, applications and accuracy claimed, is also provided for each model to serve as a guideline for future work in the above application areas.

435 citations
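
As a companion to the CNN components this review covers, here is a minimal LeNet-style classifier sketch in PyTorch; the layer sizes, 32x32 grayscale input and 10-class output are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class LeNetStyle(nn.Module):
    """Minimal LeNet-style CNN: conv -> pool -> conv -> pool -> fully connected."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # 32x32 -> 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                  # 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),  # 14x14 -> 10x10
            nn.ReLU(),
            nn.MaxPool2d(2),                  # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Example: a batch of 8 grayscale 32x32 images.
logits = LeNetStyle()(torch.randn(8, 1, 32, 32))
print(logits.shape)  # torch.Size([8, 10])
```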


Journal ArticleDOI
TL;DR: A hybrid approach that combines the synthetic minority oversampling technique with ensemble methods is proposed; experiments show that it can serve as an efficient alternative for highly imbalanced datasets.
Abstract: Bankruptcy is one of the most critical financial problems that reflects a company’s failure. From a machine learning perspective, the problem of bankruptcy prediction is considered a challenging one, mainly because of the highly imbalanced distribution of the classes in the datasets. Therefore, developing an efficient prediction model that is able to detect the risky situation of a company is a challenging and complex task. To tackle this problem, in this paper, we propose a hybrid approach that combines the synthetic minority oversampling technique with ensemble methods. Moreover, we apply five different feature selection methods to find out which attributes are most dominant for bankruptcy prediction. The proposed approach is evaluated on a real dataset collected from Spanish companies. The conducted experiments show promising results, proving that the proposed approach can be used as an efficient alternative in the case of highly imbalanced datasets.

61 citations
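
A minimal sketch of the SMOTE-plus-ensemble idea using scikit-learn and imbalanced-learn; the synthetic data and the choice of a random forest as the ensemble are illustrative assumptions, not the authors' exact pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

# Toy, highly imbalanced "bankruptcy" dataset (about 5% positive class).
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.95, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample only the training split, then fit the ensemble.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_res, y_res)

print(classification_report(y_te, clf.predict(X_te), digits=3))
```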


Journal ArticleDOI
TL;DR: A modification of the backpropagation algorithm for training sigmoid neurons is proposed; on most datasets, the modified derivative produces the same accuracy in fewer training steps.
Abstract: The vanishing gradient problem (VGP) is an important issue at training time in multilayer neural networks using the backpropagation algorithm. This problem is worse when sigmoid transfer functions are used in a network with many hidden layers. However, the sigmoid function is very important in several architectures, such as recurrent neural networks and autoencoders, where the VGP might also appear. In this article, we propose a modification of the backpropagation algorithm for training sigmoid neurons. It consists of adding a small constant to the calculation of the sigmoid’s derivative, so that the proposed training direction differs slightly from the gradient while keeping the original sigmoid function in the network. Experiments suggest that the derivative’s modification produces the same accuracy in fewer training steps on most datasets. Moreover, due to the VGP, the original derivative does not converge when sigmoid functions are used in more than five hidden layers. However, the modification allows backpropagation to train two extra hidden layers in feedforward neural networks.

46 citations
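
The core idea is to add a small constant to the sigmoid's derivative during the backward pass while leaving the forward pass unchanged. Below is a minimal NumPy sketch of a single-layer gradient step; the constant 0.1 and the toy data are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_deriv(a, epsilon=0.0):
    # Standard derivative a*(1-a); the proposed variant adds a small constant
    # so the training direction never vanishes when a saturates near 0 or 1.
    return a * (1.0 - a) + epsilon

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))          # batch of 32 inputs, 4 features
y = rng.integers(0, 2, size=(32, 1))  # binary targets
W, b = rng.normal(size=(4, 1)), np.zeros((1,))

for step in range(100):
    a = sigmoid(X @ W + b)                   # forward pass (unchanged sigmoid)
    delta = (a - y) * sigmoid_deriv(a, 0.1)  # modified derivative in the backward pass
    W -= 0.1 * X.T @ delta / len(X)
    b -= 0.1 * delta.mean(axis=0)

print("training loss:", float(np.mean((sigmoid(X @ W + b) - y) ** 2)))
```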


Journal ArticleDOI
TL;DR: The main motivation for this work is to automate the construction of similarity measures using machine learning while keeping training time as low as possible, and to investigate how to apply machine learning to effectively learn a similarity measure.
Abstract: Defining similarity measures is a requirement for some machine learning methods. One such method is case-based reasoning (CBR), where the similarity measure is used to retrieve the stored case or set of cases most similar to the query case. Describing a similarity measure analytically is challenging, even for domain experts working with CBR experts. However, datasets are typically gathered as part of constructing a CBR or machine learning system. These datasets are assumed to contain the features that correctly identify the solution from the problem features; thus, they may also contain the knowledge to construct or learn such a similarity measure. The main motivation for this work is to automate the construction of similarity measures using machine learning. Additionally, we would like to do this while keeping training time as low as possible. Working toward this, our objective is to investigate how to apply machine learning to effectively learn a similarity measure. Such a learned similarity measure could be used for CBR systems, but also for clustering data in semi-supervised learning, or for one-shot learning tasks. Recent work has advanced toward this goal, but relies either on very long training times or on manually modeling parts of the similarity measure. We created a framework to help us analyze current methods for learning similarity measures. This analysis resulted in two novel similarity measure designs: the first uses a pre-trained classifier as the basis for a similarity measure, and the second uses as little modeling as possible while learning the similarity measure from data and keeping training time low. Both similarity measures were evaluated on 14 different datasets. The evaluation shows that using a classifier as the basis for a similarity measure gives state-of-the-art performance. Finally, the evaluation shows that our fully data-driven similarity measure design outperforms state-of-the-art methods while keeping training time low.

34 citations
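
One plausible reading of the first design (a pre-trained classifier as the basis for a similarity measure) is to compare the class-probability vectors the classifier assigns to two cases; the sketch below follows that assumption and is not the authors' exact formulation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)

def similarity(a, b):
    """Similarity of two cases as closeness of the classifier's probability outputs."""
    pa, pb = clf.predict_proba([a])[0], clf.predict_proba([b])[0]
    return 1.0 - 0.5 * np.abs(pa - pb).sum()   # 1 = identical class beliefs, 0 = disjoint

query = X[0]
# Retrieve the most similar stored case (excluding the query itself), CBR-style.
scores = [similarity(query, c) for c in X[1:]]
print("best match index:", 1 + int(np.argmax(scores)))
```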


Journal ArticleDOI
TL;DR: Cost-sensitive random forests outperformed other approaches in predicting bankruptcy, achieving a geometric mean of 90.7% and type I and type II errors of 0.094 and 0.088, respectively.
Abstract: Bankruptcy has been an issue of interest in the business world for decades. Predicting this phenomenon is a crucial endeavor for survival in periods of economic turmoil and recession. In fact, bankruptcy modeling is challenging due to the complexity of contributing factors and the highly imbalanced distribution of available data sets. This work aims at improving the prediction power of bankruptcy modeling by applying cost-sensitive ensemble methods to a real-world Spanish bankruptcy data set to generate prediction models. The performance of the prediction models is highly competitive in comparison with related research in the field. Cost-sensitive random forests outperformed other approaches in predicting bankruptcy, achieving a geometric mean of 90.7% and type I and type II errors of 0.094 and 0.088, respectively.

22 citations
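
A hedged sketch of the cost-sensitive random forest idea in scikit-learn, using class_weight to make errors on the minority (bankrupt) class more costly, together with the reported evaluation measures (type I/II errors and their geometricic mean counterpart, the G-mean). The data and weights are illustrative, not the Spanish data set used in the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=15, weights=[0.93, 0.07],
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

# Cost-sensitive forest: errors on the rare "bankrupt" class (1) weigh 10x more.
clf = RandomForestClassifier(n_estimators=300, class_weight={0: 1, 1: 10},
                             random_state=1).fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
# Here: type_i = false-alarm rate, type_ii = miss rate (naming conventions vary).
type_i, type_ii = fp / (fp + tn), fn / (fn + tp)
g_mean = np.sqrt((1 - type_i) * (1 - type_ii))   # geometric mean of the two class recalls
print(f"type I={type_i:.3f}  type II={type_ii:.3f}  G-mean={g_mean:.3f}")
```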


Journal ArticleDOI
TL;DR: This paper aims to provide an in-depth presentation of the contributions of multi-attribute value-based and outranking relations methods to a group of relevant financial applications in the period 2000–2018, putting the emphasis on the state-of-the-art developments and identifying open questions and critical challenges that deserve further research efforts.
Abstract: Over the last decades, the academic and professional communities have paid much attention toward the use of multi-criteria decision-making methods in a range of business and financial problems due to the variety and complexity of their decisions. Within this branch of operations research, the value-based and outranking relations approaches stand as two of the most powerful methodologies for decision-makers and analysts to produce accurate predictions and consistent evaluations in financial decision-making problems. This paper aims to provide an in-depth presentation of the contributions of multi-attribute value-based and outranking relations methods to a group of relevant financial applications in the period 2000–2018, putting the emphasis on the state-of-the-art developments and identifying open questions and critical challenges that deserve further research efforts.

21 citations


Journal ArticleDOI
TL;DR: Applying machine learning methods to the formation and adaptation of the EDMS interface makes it possible to automate its personalization to the user’s individual characteristics, increase the system’s flexibility and provide a good user experience at the first interaction with the EDMS, based on intelligent analysis of data about other users.
Abstract: A topical problem in the development of electronic document management systems (EDMS) is their adaptation and personalization to the individual characteristics of the user. This article discusses the development of an adaptation algorithm using machine learning methods for solving the problem of structural-parametric synthesis of an EDMS. Within the presented algorithm, approaches to the formalization of workflow processes, ways to adapt the interface to user parameters using artificial neural networks, and a comprehensive assessment of the system’s adaptability are considered. The scientific novelty of the approach consists in the algorithmic and software development for automating data collection, analysis and interface adaptation through the use and integration of neural networks in the information system. Applying machine learning methods to the formation and adaptation of the EDMS interface makes it possible to automate its personalization to the user’s individual characteristics, increase the system’s flexibility and provide a good user experience at the first interaction with the EDMS, based on intelligent analysis of data about other users. The main scientific results obtained in the article include: formalized criteria for adapting an EDMS; an algorithm for designing and adapting an EDMS; and software for adapting an EDMS, including a trained neural network and an API.

13 citations


Journal ArticleDOI
TL;DR: A Tag-based Query Semantic Reformulation process is proposed, which aims at reformulating tag-based users’ queries according to multiple semantic facets of the different images’ views, using a set of predefined ontological semantic rules.
Abstract: With the increasing popularity of social photograph-sharing Web sites, a huge mass of digital images, associated with sets of tags voluntarily introduced by amateur photographers, is hosted daily, and consequently the Tag-based social Image Retrieval technique has been widely adopted. However, tag-based queries are often too ambiguous and abstract to be considered an efficient solution for retrieving the most relevant images that meet users’ needs. As an alternative, the Semantic-based social Image Retrieval technique has emerged for the purpose of retrieving relevant images covering as many as possible of the topics that a given ambiguous query (q) may have. Diversification strategies are therefore a great challenge for researchers. In this context, we jointly investigate two processes at the ambiguous query preprocessing and postprocessing levels. On the one hand, we propose a Tag-based Query Semantic Reformulation process, which aims at reformulating tag-based users’ queries according to multiple semantic facets of the different images’ views, using a set of predefined ontological semantic rules. On the other hand, we propose a Multi-level Image Diversification process that first performs a two-level image clustering offline and, second, filters and re-ranks the image cluster retrieval results online according to their pertinence with respect to the reformulated query. The experimental results and statistical analysis performed on a collection of 25,000 socio-tagged images shared on Flickr demonstrate the effectiveness of the proposed technique, which is compared with a retrieval technique based on one-level image clustering, a tag-based image retrieval technique and recent CBIR techniques.

12 citations


Journal ArticleDOI
TL;DR: A protocol is proposed to describe the review process, including the search sources, inclusion and exclusion criteria of candidate papers, the data extraction procedure and the categorisation of primary studies, which gives a precise picture of the current research state of the community, trends and future challenges.
Abstract: Since its appearance in 2001, search-based software engineering has allowed software engineers to use optimisation techniques to automate distinctive human problems related to software management and development. The scientific community in Spain has not been alien to these advances. Their contributions cover both the optimisation of software engineering tasks and the proposal of new search algorithms. This review compiles the research efforts of this community in the area. With this aim, we propose a protocol to describe the review process, including the search sources, inclusion and exclusion criteria of candidate papers, the data extraction procedure and the categorisation of primary studies. After retrieving more than 3700 papers, 232 primary studies have been selected, whose analysis gives a precise picture of the current research state of the community, trends and future challenges. With 145 authors from 19 distinct institutions, results show that a diversity of tasks, including software planning, requirements, design and testing, and a large variety of techniques has been used, from exact search to evolutionary computation and swarm intelligence. Further, since 2015, specific scientific events have helped to bring together the community, improving collaborations, financial funding and internationalisation.

5 citations


Journal ArticleDOI
TL;DR: An empirical mode decomposition (EMD)-based replay spoofing detection system is presented; results show that the initial IMFs have the potential to carry replay attack patterns, and that processing them is sufficient rather than processing the entire signal.
Abstract: Automatic speaker verification (ASV) systems face their greatest threat from replay spoofing attacks. High-frequency regions of the underlying audio signal exhibit evidence of their presence. It is therefore useful to decompose the underlying audio signal into frequency bands or regions for analysis. In this paper, an empirical mode decomposition (EMD)-based replay spoofing detection system is presented. Using EMD, each signal is decomposed into several intrinsic mode functions (IMFs). The signal is reconstructed and represented using one or more subsets of these IMFs, evaluating different combinations for spoofing detection. Results on the ASVspoof 2017 version 2.0 and AVspoof benchmark replay attack datasets indicate that the initial IMFs have the potential to carry replay attack patterns, and that processing them is sufficient rather than processing the entire signal. The proposed approach can also serve as a preprocessing technique by employing a dimension-reduction strategy. Cross-corpus experiments on the systems indicate the limitations of ASV antispoofing systems under mismatched conditions.

4 citations
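
A minimal sketch of the IMF-subset idea using the PyEMD package (assuming its EMD class and call signature); the synthetic signal and the choice of keeping the first two IMFs are illustrative assumptions, not the configuration evaluated on ASVspoof/AVspoof.

```python
import numpy as np
from PyEMD import EMD  # the PyEMD package (PyPI name: EMD-signal)

# Synthetic stand-in for an utterance: two tones plus noise, 1 s at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
signal = (np.sin(2 * np.pi * 3000 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)
          + 0.05 * np.random.default_rng(0).normal(size=fs))

imfs = EMD()(signal)                 # decompose into intrinsic mode functions
print("number of IMFs:", len(imfs))

# Keep only the first (highest-frequency) IMFs, where replay artifacts are
# reported to concentrate, and reconstruct a reduced signal from them.
reduced = imfs[:2].sum(axis=0)
# 'reduced' would then feed the usual feature extraction and classifier
# of the spoofing-detection pipeline.
```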


Journal ArticleDOI
TL;DR: Results showed the potential of the motor current signal for bearing fault diagnosis with high classification accuracy, and the possibility of providing a promising diagnostic model that can diagnose real bearing faults with different fault severities using MCS.
Abstract: This study aims to enhance the condition monitoring of external ball bearings using the raw data provided by Paderborn University, which includes sufficient data for the motor current signal (MCS). Three classes of bearings have been used: healthy bearings, bearings with an inner race defect, and bearings with an outer race defect. Online data at different operating conditions, with different bearings and fault extents of artificial and real damage, have been chosen to ensure the generalization and robustness of the model. After proper preprocessing of the raw vibration and MCS data, time, frequency, and time–frequency domain features have been extracted. Then, optimal features have been selected using a genetic algorithm. An artificial neural network with a structure optimized by a genetic algorithm has been implemented. A comparison between the performance of vibration and motor current signals is presented. Moreover, our results are compared to previous work using the same raw data. Results showed the potential of the motor current signal for bearing fault diagnosis with high classification accuracy. Moreover, the results showed the possibility of providing a promising diagnostic model that can diagnose real bearing faults with different fault severities using MCS.
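
As one small piece of the pipeline outlined above (preprocessing, feature extraction, GA feature selection, ANN), here is a hedged sketch of a few common time-domain features computed from a signal window; the specific features and the synthetic segment are assumptions, not the paper's exact feature set.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def time_domain_features(segment: np.ndarray) -> dict:
    """A handful of standard time-domain statistics for one signal window."""
    rms = np.sqrt(np.mean(segment ** 2))
    peak = np.max(np.abs(segment))
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": kurtosis(segment),
        "skewness": skew(segment),
    }

# Example: a noisy 50 Hz current-like segment sampled at 64 kHz for 0.25 s.
fs, dur = 64000, 0.25
t = np.arange(int(fs * dur)) / fs
segment = np.sin(2 * np.pi * 50 * t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
print(time_domain_features(segment))
```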

Journal ArticleDOI
TL;DR: An algorithm is introduced, called Large Width (LW), that produces a multi-category classifier (defined on a distance space) with the property that the classifier has a large ‘sample width’ (width is a notion similar to classification margin).
Abstract: We introduce an algorithm, called Large Width (LW), that produces a multi-category classifier (defined on a distance space) with the property that the classifier has a large ‘sample width.’ (Width is a notion similar to classification margin.) LW is an incremental instance-based (also known as ‘lazy’) learning algorithm. Given a sample of labeled and unlabeled examples, it iteratively picks the next unlabeled example and classifies it while maintaining a large distance between each labeled example and its nearest-unlike prototype. (A prototype is either a labeled example or an unlabeled example which has already been classified.) Thus, LW gives a higher priority to unlabeled points whose classification decision ‘interferes’ less with the labeled sample. On a collection of UCI benchmark datasets, the LW algorithm ranks at the top when compared to 11 instance-based learning algorithms (or configurations). When compared to the best candidates among instance-based learners, MLP, SVM, a decision tree learner (C4.5) and Naive Bayes, LW ranks second, behind only MLP, which takes first place by a single extra win against LW. The LW algorithm can be implemented with parallel distributed processing to yield a high speedup factor and is suitable for any distance space, with a distance function which need not necessarily satisfy the conditions of a metric.
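
A much-simplified, hedged sketch of the greedy step described above: each unlabeled point is assigned the label that keeps the minimum distance from labeled examples to their nearest-unlike prototype as large as possible. Tie-breaking and the exact width definition are assumptions, not the paper's specification.

```python
import numpy as np

def lw_classify(X_lab, y_lab, X_unl, metric=lambda a, b: np.linalg.norm(a - b)):
    """Greedy Large-Width-style labeling of unlabeled points (simplified sketch)."""
    protos = [(x, y) for x, y in zip(X_lab, y_lab)]   # prototypes: labeled + already classified
    pending = list(range(len(X_unl)))
    labels = {}
    classes = sorted(set(y_lab))

    def width(extra=None):
        # Minimum distance from any labeled example to its nearest unlike prototype.
        pool = protos + ([extra] if extra is not None else [])
        return min(
            min(metric(xl, xp) for xp, yp in pool if yp != yl)
            for xl, yl in zip(X_lab, y_lab)
        )

    while pending:
        # Choose the (point, label) pair that interferes least with the labeled sample.
        i, c = max(((i, c) for i in pending for c in classes),
                   key=lambda ic: width((X_unl[ic[0]], ic[1])))
        labels[i] = c
        protos.append((X_unl[i], c))
        pending.remove(i)
    return [int(labels[i]) for i in range(len(X_unl))]

# Tiny 1-D example with two labeled points per class.
X_lab = np.array([[0.0], [1.0], [10.0], [11.0]])
y_lab = np.array([0, 0, 1, 1])
X_unl = np.array([[2.0], [9.0], [5.0]])
print(lw_classify(X_lab, y_lab, X_unl))   # [0, 1, 0]
```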

Journal ArticleDOI
TL;DR: This work presents the implementation and evaluation of a clustering algorithm based on a multi-agent system, which automatically detects the number of groups and the group labels for a given dataset.
Abstract: Clustering algorithms aim to detect groups based on similarity, from a given set of objects. Many clustering techniques have been proposed, most requiring the user to set critical parameters, such as the number of groups. This work presents the implementation and evaluation of a clustering algorithm based on a multi-agent system, which automatically detects the number of groups and the group labels for a given dataset. Groups formed during the clustering process emerge as patterns from the interaction among agents. The proposed algorithm is experimentally validated over benchmark datasets from the literature. The quality of clustering results is computed using seven internal indexes and one external index. Under this methodology, the proposed algorithm is compared to K-means and DBSCAN (density-based spatial clustering of applications with noise).

Journal ArticleDOI
TL;DR: SMPSO/RPD is proposed, an algorithm that provides the search capabilities of SMPSO, incorporates an interactive preference articulation mechanism based on defining one or more reference points, and is able to deal with dynamic problems.
Abstract: Multi-objective optimization deals with problems having two or more conflicting objectives that have to be optimized simultaneously. When the objectives change somehow with time, the problems become dynamic, and if the decision maker indicates preferences at runtime, then the algorithms to solve them become interactive. In this paper, we propose the integration of SMPSO/RP, an interactive multi-objective particle swarm optimizer based on SMPSO, with InDM2, an algorithmic template for dynamic interactive optimization with metaheuristics. The result is SMPSO/RPD, an algorithm that provides the search capabilities of SMPSO, incorporates an interactive preference articulation mechanism based on defining one or more reference points, and is able to deal with dynamic problems. We conduct a qualitative study showing the working of SMPSO/RPD on three benchmark problems, leaving a quantitative analysis as an open line of future research.

Journal ArticleDOI
TL;DR: A combination of triplet loss manifold regularization with a novel denoising regularizer is injected into the objective function to generate features which are robust against perpendicular perturbations around the data manifold and sensitive enough to variations along the manifold.
Abstract: Although regularized over-complete auto-encoders have shown great ability to extract meaningful representations from data and reveal their underlying manifold, their unsupervised learning nature prevents the consideration of class distinctions in the representations. The present study aims to learn sparse, robust, and discriminative features through supervised manifold-regularized auto-encoders, by preserving locality along the manifold directions around each data point and enhancing between-class discrimination. A combination of triplet loss manifold regularization with a novel denoising regularizer is injected into the objective function to generate features which are robust against perpendicular perturbations around the data manifold and sensitive enough to variations along the manifold. Also, the sparsity ratio of the obtained representation is adaptive based on the data distribution. The experimental results on 12 real-world classification problems show that the proposed method has better classification performance in comparison with several recently proposed relevant models.
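
A hedged PyTorch sketch of the kind of objective described above: a reconstruction term, a triplet loss on the encoded representation, and a denoising term encouraging codes that are stable under input perturbations. The architecture, margin and weights are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegularizedAE(nn.Module):
    def __init__(self, d_in=64, d_hid=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, d_hid), nn.Sigmoid())
        self.dec = nn.Linear(d_hid, d_in)

    def forward(self, x):
        h = self.enc(x)
        return h, self.dec(h)

def loss_fn(model, anchor, positive, negative, noise_std=0.1, alpha=1.0, beta=1.0):
    h_a, rec = model(anchor)
    h_p, _ = model(positive)
    h_n, _ = model(negative)
    recon = F.mse_loss(rec, anchor)                                   # reconstruction
    triplet = F.triplet_margin_loss(h_a, h_p, h_n, margin=1.0)        # class discrimination
    h_noisy, _ = model(anchor + noise_std * torch.randn_like(anchor)) # denoising regularizer:
    denoise = F.mse_loss(h_noisy, h_a)                                # stable codes off-manifold
    return recon + alpha * triplet + beta * denoise

model = RegularizedAE()
a, p, n = (torch.randn(32, 64) for _ in range(3))
print(float(loss_fn(model, a, p, n)))
```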

Journal ArticleDOI
TL;DR: Comparison experiments show that the proposed unsupervised keyphrase extraction model achieves the best results on long documents and a competitive result on the short document, indicating that the model is effective and superior to state-of-the-art unsupervised models.
Abstract: We propose an unsupervised keyphrase extraction model that incorporates both the structural information and the semantic information of a document. The structural information refers to a directed graph composed of keyphrase candidates and topics. The weight between two candidates is computed from their relative distance in the document and the positions of the corresponding sentences. A graph ranking algorithm is then applied to obtain the structural scores of the candidates. The semantic score is obtained from the similarity between a candidate and all sentences. The final score of a candidate is the sum of the structural score and the semantic score. The top N candidates with the highest scores are selected as the recommended keyphrases. Comparison experiments on three widely used datasets show that our model achieves the best results on the long documents and a competitive result on the short document, indicating that our model is effective and superior to state-of-the-art unsupervised models.
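
A hedged toy sketch of combining a structural score with a semantic score: networkx PageRank over a small candidate graph plus mean TF-IDF cosine similarity between each candidate and the sentences. The example text, graph construction and equal weighting are illustrative assumptions, not the paper's model.

```python
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = [
    "graph ranking scores keyphrase candidates",
    "semantic similarity compares candidates with sentences",
    "the final score sums the structural and semantic scores",
]
candidates = ["graph ranking", "keyphrase candidates", "semantic similarity", "final score"]

# Structural score: PageRank over a toy co-occurrence graph of candidates.
G = nx.Graph()
G.add_weighted_edges_from([
    ("graph ranking", "keyphrase candidates", 1.0),
    ("semantic similarity", "keyphrase candidates", 1.0),
    ("final score", "semantic similarity", 0.5),
])
structural = nx.pagerank(G, weight="weight")

# Semantic score: mean TF-IDF cosine similarity between a candidate and all sentences.
vec = TfidfVectorizer().fit(sentences + candidates)
S = vec.transform(sentences)
semantic = {c: float(cosine_similarity(vec.transform([c]), S).mean()) for c in candidates}

# Final score: sum of the structural and semantic scores; keep the top N.
final = {c: structural.get(c, 0.0) + semantic[c] for c in candidates}
print(sorted(final, key=final.get, reverse=True)[:2])
```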

Journal ArticleDOI
TL;DR: GDTM, a single-pass graph-based DTM algorithm, is presented; it combines a context-rich and incremental feature representation method with graph partitioning to address scalability and dynamicity, and uses a rich language model to account for sparsity.
Abstract: Dynamic Topic Modeling (DTM) is the ultimate solution for extracting topics from short texts generated in Online Social Networks (OSNs) like Twitter. It is required to be scalable and to account for the sparsity and dynamicity of short texts. Current solutions combine probabilistic mixture models like the Dirichlet Multinomial or Pitman-Yor Process with approximate inference approaches like Gibbs Sampling and Stochastic Variational Inference to account, respectively, for the dynamicity and scalability of DTM. However, these methods basically rely on weak probabilistic language models, which do not account for sparsity in short texts. In addition, their inference is based on iterative optimizations, which have scalability issues when it comes to DTM. We present GDTM, a single-pass graph-based DTM algorithm, to solve the problem. GDTM combines a context-rich and incremental feature representation method with graph partitioning to address scalability and dynamicity, and uses a rich language model to account for sparsity. We run multiple experiments over a large-scale Twitter dataset to analyze the accuracy and scalability of GDTM and compare the results with four state-of-the-art models. As a result, GDTM outperforms the best model by 11% on accuracy and runs an order of magnitude faster, while producing four times better topic quality on standard evaluation metrics.

Journal ArticleDOI
TL;DR: Several encoding strategies based on neural networks are analyzed and applied; the results show that the order in which the musical pieces were listened to is relevant for the codification of items (songs), and that the encoding of user profiles should use a different amount of historical data depending on the learning task to be solved.
Abstract: The aim of Recommender Systems is to suggest items (products) to satisfy each user’s particular taste. Representation strategies play a very important role in these systems, as an adequate codification of users and items is expected to ease the induction of a model which synthesizes their tastes and make better recommendations. However, in addition to gathering information about users’ tastes, there is an additional aspect that can be relevant for a proper codification strategy, namely the order in which the user interacted with the items. In this paper, several encoding strategies based on neural networks are analyzed and applied to solve two different recommendation tasks in the context of music playlists. The results show that the order in which the musical pieces were listened to is relevant for the codification of items (songs). We also find that the encoding of user profiles should use a different amount of historical data depending on the learning task to be solved. In other words, we do not always have to use all the available data; sometimes, it is better to discard old information, as tastes change over time.

Journal ArticleDOI
TL;DR: This paper deals with a subtask arising within the picking task in a warehouse, when the picking policy follows the order batching strategy and orders are received online.
Abstract: Warehousing includes many different regular activities such as receiving, batching, picking, packaging, and shipping goods. Several authors indicate that the picking operation might consume up to 55% of the total operational costs. In this paper, we deal with a subtask arising within the picking task in a warehouse, when the picking policy follows the order batching strategy (i.e., orders are grouped into batches before being collected) and orders are received online. In particular, once the batches have been compiled it is necessary to determine the moment in time when the picker starts collecting each batch. The waiting time of the picker before starting to collect the next available batch is usually known as the time window. In this paper, we compare the performance of two different time window strategies: Fixed Time Window and Variable Time Window. Since those strategies cannot be tested in isolation, we have considered two different batching algorithms (First Come First Served and a greedy algorithm based on weight), one routing algorithm (S-Shape), and a greedy selection algorithm for choosing the next batch to collect based on weight.

Journal ArticleDOI
TL;DR: The experimental results on synthetic and real datasets, using the well-known neighborhood-based clustering (NBC) algorithm and the DBSCAN (density-based spatial clustering of applications with noise) algorithm, illustrate the superiority of the proposed index over some classical and recent indices and show its effectiveness for the evaluation of clustering algorithms and the selection of their appropriate parameters.
Abstract: Clustering has an important role in the data mining field. However, there is a large variety of clustering algorithms and each can generate quite different results depending on its input parameters. In the research literature, several cluster validity indices have been proposed to evaluate clustering results and find the partition that best fits the input dataset. However, these validity indices may fail to achieve satisfactory results, especially in the case of clusters with arbitrary shapes. In this paper, we propose a new cluster validity index for density-based, arbitrarily shaped clusters. Our new index is based on the density and connectivity relations extracted among the data points, using a proximity graph, the Gabriel graph. The incorporation of the connectivity and density relations allows achieving the best clustering results in the case of clusters of any shape, size or density. The experimental results on synthetic and real datasets, using the well-known neighborhood-based clustering (NBC) algorithm and the DBSCAN (density-based spatial clustering of applications with noise) algorithm, illustrate the superiority of the proposed index over some classical and recent indices and show its effectiveness for the evaluation of clustering algorithms and the selection of their appropriate parameters.
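
Since the index builds on the Gabriel graph, here is a small NumPy sketch of its construction: points p and q are connected iff no third point r satisfies d(p,q)^2 > d(p,r)^2 + d(q,r)^2, i.e. no other point lies inside the circle having pq as diameter. How the resulting edges feed the density and connectivity relations of the index is not shown; the random data are illustrative.

```python
import numpy as np

def gabriel_graph(points: np.ndarray):
    """Return the edge list of the Gabriel graph of a 2-D point set (O(n^3) sketch)."""
    n = len(points)
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # (i, j) is a Gabriel edge iff no other point k falls inside the
            # circle whose diameter is the segment ij.
            others = [k for k in range(n) if k not in (i, j)]
            if all(d2[i, j] <= d2[i, k] + d2[j, k] for k in others):
                edges.append((i, j))
    return edges

pts = np.random.default_rng(0).random((12, 2))
print(gabriel_graph(pts))
```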

Journal ArticleDOI
TL;DR: This paper analyzes some open-source technological frameworks available for data streams, detailing their main characteristics, and makes a performance and latency comparison between Spark Streaming, Spark Structured Streaming, Storm, Flink and Samza following the Yahoo Streaming Benchmark methodology.
Abstract: Real-time data analysis is becoming increasingly important in Big Data environments for addressing data stream issues. To this end, several technological frameworks have been developed, both open-source and proprietary, for the analysis of streaming data. This paper analyzes some open-source technological frameworks available for data streams, detailing their main characteristics. The objective is to facilitate decisions on which framework to use, meeting the needs of data mining methods for data streams. In this sense, there are important factors affecting the choice about which framework is most suitable for this purpose. Some of these factors are the existence of data mining libraries, the available documentation, the maturity of the platform, fault tolerance and processing guarantees, among others. Another decisive factor when choosing a data stream framework is its performance. For this reason, two comparisons have been made: a performance and latency comparison between Spark Streaming, Spark Structured Streaming, Storm, Flink and Samza following the Yahoo Streaming Benchmark methodology, and a comparison between Spark Streaming and Flink with a clustering algorithm for data streaming called streaming K-means.

Journal ArticleDOI
TL;DR: This paper aims to compute similarity between the words using their context information, syntactic information and occurrence statistics in external corpora through a kernel function that combines two sub-kernels.
Abstract: The performance of word sequential labelling tasks like named entity recognition and parts-of-speech tagging largely depends on the features chosen for the task. In general, however, representing a word and capturing its characteristics properly through a set of features is quite difficult. Moreover, external resources often become essential in order to build a high-performance system, but acquiring the required knowledge demands domain-specific processing and feature engineering. Kernel functions along with support vector machines may offer an alternative way to capture similarity between words more efficiently, using both the local context and external corpora. In this paper, we aim to compute similarity between words using their context information, syntactic information and occurrence statistics in external corpora. This similarity value is computed through a kernel function. The proposed kernel function combines two sub-kernels. One of these captures global information through word co-occurrence statistics accumulated from a large corpus. The second kernel captures local semantic information of the words through word-specific parse tree fragmentation. We test the proposed kernel on the JNLPBA 2004 Biomedical Named Entity Recognition and BioCreative II 2006 Gene Mention Recognition task datasets. In our experiments, we observe that the proposed method is effective on both datasets.
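
A hedged sketch of combining two sub-kernels inside an SVM, using scikit-learn's support for callable kernels; the placeholder "global" and "local" feature blocks and their simple sum stand in for the paper's far richer co-occurrence and parse-tree sub-kernels.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import linear_kernel, rbf_kernel

rng = np.random.default_rng(0)
# Placeholder word representations: the first 10 dims stand in for "global"
# co-occurrence features, the last 5 dims for "local" context features.
X = rng.normal(size=(200, 15))
y = (X[:, 0] + X[:, 10] > 0).astype(int)   # toy labels

def combined_kernel(A, B):
    # Sum of two sub-kernels: a linear kernel on the global part and an RBF
    # kernel on the local part (valid, since sums of kernels are kernels).
    return linear_kernel(A[:, :10], B[:, :10]) + rbf_kernel(A[:, 10:], B[:, 10:])

clf = SVC(kernel=combined_kernel).fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```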

Journal ArticleDOI
TL;DR: This paper applies a class of Bayesian models that have been successfully used in streaming data contexts to the problem of comparing multinomial populations, and shows how it is possible, by means of a relevant parameter, to decide whether two populations are different or not.
Abstract: Two-sample statistical tests are commonly used when deciding whether two samples can be considered to be drawn from the same population. However, statistical tests face problems when confronted with situations involving extremely large volumes of data, in which case the power of the test is so high that it rejects the null hypothesis even if the differences found in the data are minimal. Furthermore, the fact that they may require exploring the whole sample each time they are applied is a serious limitation, for instance, in streaming data contexts. In this paper, we apply a class of Bayesian models that have been successfully used in streaming data contexts to the problem of comparing multinomial populations. The underlying tool is latent variable models with hierarchical power priors. We show how it is possible, by means of a relevant parameter, to decide whether two populations are different or not.

Journal ArticleDOI
TL;DR: The proposed “coaching” approach focuses on accelerating learning in systems with a sparse environmental reward setting; it works well with linear epsilon-greedy Q-learning with eligibility traces, and experiments show that the method can speed up the learning process of an agent in all tasks.
Abstract: The learning process in reinforcement learning is time-consuming because in early episodes the agent relies too much on exploration. The proposed “coaching” approach focuses on accelerating learning in systems with a sparse environmental reward setting. This approach works well with linear epsilon-greedy Q-learning with eligibility traces. To coach an agent, an intermediate target is given by a human coach as a sub-goal for the agent to pursue. This sub-goal provides an additional clue that guides the agent toward the actual terminal state. In the coaching phase, the agent pursues the intermediate target with an aggressive policy. The aggressive reward from this intermediate target is not used to update the state-action value directly; only the environmental reward is used. After a small number of coaching episodes, learning proceeds normally with an ε-greedy policy. In this way, the agent ends up with an optimal policy which is not under the influence or supervision of a human coach. The proposed method has been tested on three experimental tasks: mountain car, ball following, and obstacle avoidance. Even with human coaches of various skill levels, the experimental results show that this method can speed up the learning process of an agent in all tasks.
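
A hedged tabular sketch of the coaching idea on a toy corridor task: during a few coaching episodes the agent greedily pursues a chosen sub-goal state, but Q-values are updated only with the environmental reward; afterwards learning continues with an ordinary epsilon-greedy policy. The environment, the sub-goal and the use of plain one-step Q-learning (no eligibility traces, unlike the paper) are assumptions.

```python
import numpy as np

N, GOAL, SUBGOAL = 10, 9, 5         # short corridor; sparse reward only at the terminal GOAL
ACTIONS = (-1, +1)                   # move left / move right
Q = np.zeros((N, len(ACTIONS)))
rng = np.random.default_rng(0)

def step(s, a_idx):
    s2 = int(np.clip(s + ACTIONS[a_idx], 0, N - 1))
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL   # environmental reward only

def run_episode(coaching, alpha=0.5, gamma=0.99, eps=0.1, max_steps=100):
    s = 0
    for _ in range(max_steps):
        if coaching and s < SUBGOAL:
            a = 1                                           # aggressively pursue the coach's sub-goal
        elif rng.random() < eps:
            a = int(rng.integers(len(ACTIONS)))             # epsilon-greedy exploration
        else:
            a = int(np.argmax(Q[s] + 1e-9 * rng.random(len(ACTIONS))))  # greedy, random tie-break
        s2, r, done = step(s, a)
        # The Q-update always uses the environmental reward, never a sub-goal bonus.
        Q[s, a] += alpha * (r + gamma * (0.0 if done else Q[s2].max()) - Q[s, a])
        s = s2
        if done:
            return True
    return False

for ep in range(300):
    run_episode(coaching=ep < 15)    # a few coached episodes first, then normal learning
# After learning, most states should prefer action index 1 (move right toward the goal).
print("greedy action per state:", [int(np.argmax(Q[s])) for s in range(N)])
```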