Author

Jorge Rodríguez

Bio: Jorge Rodríguez is an academic researcher from Monterrey Institute of Technology and Higher Education. The author has contributed to research on the topics Abstraction (linguistics) and Feature model. The author has an h-index of 6 and has co-authored 6 publications receiving 134 citations.

Papers
Journal ArticleDOI
TL;DR: A pattern-based classification mechanism is applied to social bot detection, specifically for Twitter, and a new feature model is introduced, which extends (part of) an existing model with features drawn from Twitter account usage and tweet content sentiment analysis.
Abstract: Detecting non-human activity in social networks has become an area of great interest for both industry and academia. In this context, obtaining a high detection accuracy is not the only desired quality; experts in the application domain would also like to have an understandable model, with which one may explain a decision. An explanatory decision model may help experts to consider, for example, taking legal action against an account that has displayed offensive behavior, or forewarning an account holder about suspicious activity. In this paper, we shall apply a pattern-based classification mechanism to social bot detection, specifically for Twitter. Furthermore, we shall introduce a new feature model for social bot detection, which extends (part of) an existing model with features drawn from Twitter account usage and tweet content sentiment analysis. From our experimental results, we shall see that our mechanism outperforms other state-of-the-art classifiers that are not based on patterns, and that our feature model yields better classification results than others reported in the literature.

53 citations

Journal ArticleDOI
TL;DR: This paper introduces and applies a protocol that evaluates minutia descriptors for latent fingerprint identification in terms of the identification rate plotted in the cumulative match characteristic (CMC) curve, and finds that all the evaluated minutia descriptors obtained identification rates lower than 10% for Rank-1 and 24% for Rank-100 comparing the minutiae in the database NIST SD27.
Abstract: Latent fingerprint identification is attracting increasing interest because of its important role in law enforcement. Although the use of various fingerprint features might be required for successful latent fingerprint identification, methods based on minutiae are often readily applicable and commonly outperform other methods. However, as many fingerprint feature representations exist, we sought to determine if the selection of feature representation has an impact on the performance of automated fingerprint identification systems. In this paper, we review the most prominent fingerprint feature representations reported in the literature, identify trends in fingerprint feature representation, and observe that representations designed for verification are commonly used in latent fingerprint identification. We aim to evaluate the performance of the most popular fingerprint feature representations over a common latent fingerprint database. Therefore, we introduce and apply a protocol that evaluates minutia descriptors for latent fingerprint identification in terms of the identification rate plotted in the cumulative match characteristic (CMC) curve. From our experiments, we found that all the evaluated minutia descriptors obtained identification rates lower than 10% for Rank-1 and 24% for Rank-100 comparing the minutiae in the database NIST SD27, illustrating the need of new minutia descriptors for latent fingerprint identification.

51 citations
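
The CMC curve mentioned in the abstract above reports, for each rank k, the fraction of queries whose true mate appears among the top k candidates. The following is a minimal sketch of that computation, not the paper's evaluation protocol: the function name `cmc_curve`, the score matrix, and the mate indices are hypothetical, and it assumes higher scores mean greater similarity.

```python
def cmc_curve(score_matrix, true_mates, max_rank):
    """Return identification rates for ranks 1..max_rank.

    score_matrix[i][j]: similarity of query i to gallery entry j
    (assumption: higher means more similar).
    true_mates[i]: gallery index of the true mate of query i.
    """
    hits = [0] * max_rank
    for scores, mate in zip(score_matrix, true_mates):
        # Rank of the true mate: 1 plus the number of entries scoring strictly higher.
        rank = 1 + sum(1 for s in scores if s > scores[mate])
        if rank <= max_rank:
            hits[rank - 1] += 1

    # Accumulate hits so that rates[k-1] is the rate at rank k.
    rates = []
    cumulative = 0
    for h in hits:
        cumulative += h
        rates.append(cumulative / len(true_mates))
    return rates


# Toy data (hypothetical): 4 queries against a gallery of 3 entries.
scores = [
    [0.9, 0.2, 0.1],  # true mate 0 is ranked 1st
    [0.3, 0.4, 0.8],  # true mate 1 is ranked 2nd
    [0.5, 0.6, 0.1],  # true mate 2 is ranked 3rd
    [0.1, 0.7, 0.2],  # true mate 1 is ranked 1st
]
mates = [0, 1, 2, 1]
rates = cmc_curve(scores, mates, max_rank=3)
print(rates)  # [0.5, 0.75, 1.0]
```

Read this way, a Rank-1 rate below 10% means that fewer than one in ten latent queries had its true mate as the top-scoring candidate.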

Journal ArticleDOI
TL;DR: A new cluster validity index is proposed, which attempts to avoid this bias using an ensemble of distinct supervised classifiers; this way the bias is not attributable to a specific classifier, but to a collection thereof, hence alleviating the problem.
Abstract: A cluster validity index is used to select which clustering algorithm to apply for a given problem. It works by evaluating the quality of a partition, as output by a candidate clustering algorithm, getting around the common case of the lack of an expert in the given domain of discourse. Most existing validity indexes make assumptions, such as each cluster of the partition having an underlying structure, for example, a hypersphere, yielding incorrect evaluations when they do not hold. Here, we propose a new cluster validity index, which attempts to avoid this bias using an ensemble of distinct supervised classifiers; this way the bias is not attributable to a specific classifier, but to a collection thereof, hence alleviating the problem. The rationale behind our index is that a good partition should induce the construction of also a good classifier; the better the classification performance, the better the quality of the partition under evaluation. Notice how we use the partition to be assessed as a sort of labeled dataset, where each object is labeled with the cluster label it belongs to. We have tested our index on 50 numerical datasets, grouped using six different clustering algorithms. In our experiments, our index outperforms five validity indexes, including the most popular ones.

30 citations
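
The rationale in the abstract above (treat the partition as a labeled dataset and judge it by how well classifiers can learn it) can be sketched as follows. This is an illustration under stated assumptions, not the authors' index: the two classifiers (1-nearest-neighbor and nearest-centroid), the leave-one-out accuracy measure, and all function names are hypothetical stand-ins, since the abstract does not name the ensemble members or the performance measure.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(train_x, train_y, x):
    """Predict the label of the closest training point."""
    idx = min(range(len(train_x)), key=lambda i: euclidean(train_x[i], x))
    return train_y[idx]


def nearest_centroid(train_x, train_y, x):
    """Predict the label whose class mean is closest."""
    groups = {}
    for point, label in zip(train_x, train_y):
        groups.setdefault(label, []).append(point)
    centroids = {
        label: [sum(col) / len(pts) for col in zip(*pts)]
        for label, pts in groups.items()
    }
    return min(centroids, key=lambda label: euclidean(centroids[label], x))


def loo_accuracy(classify, points, labels):
    """Leave-one-out accuracy of one classifier on the cluster labels."""
    correct = 0
    for i in range(len(points)):
        train_x = points[:i] + points[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, points[i]) == labels[i]:
            correct += 1
    return correct / len(points)


def ensemble_validity(points, labels,
                      classifiers=(nearest_neighbor, nearest_centroid)):
    """Average the per-classifier scores so no single classifier's bias dominates."""
    return sum(loo_accuracy(c, points, labels) for c in classifiers) / len(classifiers)


# Toy data (hypothetical): two well-separated groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good_partition = [0, 0, 0, 1, 1, 1]  # matches the visible structure
bad_partition = [0, 1, 0, 1, 0, 1]   # ignores it

good_score = ensemble_validity(points, good_partition)
bad_score = ensemble_validity(points, bad_partition)
print(good_score, bad_score)
```

A partition that follows the structure of the data is easy to learn and scores higher; candidate clusterings can then be compared by this score.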

Journal ArticleDOI
29 Sep 2016 - Sensors
TL;DR: This study introduces the One-Class K-means with Randomly-projected features Algorithm (OCKRA), an ensemble of one-class classifiers built over multiple projections of a dataset according to random feature subsets to improve the detection performance in the problem posed by the Personal RIsk DEtection dataset.
Abstract: This study introduces the One-Class K-means with Randomly-projected features Algorithm (OCKRA). OCKRA is an ensemble of one-class classifiers built over multiple projections of a dataset according to random feature subsets. Algorithms found in the literature spread over a wide range of applications where ensembles of one-class classifiers have been satisfactorily applied; however, none is oriented to the area under our study: personal risk detection. OCKRA has been designed with the aim of improving the detection performance in the problem posed by the Personal RIsk DEtection (PRIDE) dataset. PRIDE was built based on 23 test subjects, where the data for each user were captured using a set of sensors embedded in a wearable band. The performance of OCKRA was compared against support vector machine and three versions of the Parzen window classifier. On average, experimental results show that OCKRA outperformed the other classifiers by at least 0.53% of the area under the curve (AUC). In addition, OCKRA achieved an AUC above 90% for more than 57% of the users.

19 citations
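
The general idea described in the abstract above (an ensemble of one-class models, each trained on a random feature subset of the normal data) can be sketched as follows. This is not the published OCKRA implementation: the small k-means routine, the `exp(-distance)` similarity, the averaging rule, and every parameter value are assumptions chosen for illustration.

```python
import math
import random


def _distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _kmeans(points, k, rng, iterations=10):
    """Plain k-means; returns k centroids."""
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Keep the old centroid when a cluster ends up empty.
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def train_ensemble(normal_data, n_members=10, subset_size=2, k=2, seed=0):
    """Train one k-means model per random feature subset, on normal data only."""
    rng = random.Random(seed)
    n_features = len(normal_data[0])
    ensemble = []
    for _ in range(n_members):
        features = rng.sample(range(n_features), subset_size)
        projected = [tuple(row[f] for f in features) for row in normal_data]
        ensemble.append((features, _kmeans(projected, k, rng)))
    return ensemble


def normality_score(ensemble, row):
    """Average similarity to the nearest centroid across members (assumed rule)."""
    total = 0.0
    for features, centroids in ensemble:
        projected = tuple(row[f] for f in features)
        nearest = min(_distance(projected, c) for c in centroids)
        total += math.exp(-nearest)  # assumed distance-to-similarity mapping
    return total / len(ensemble)


# Toy data (hypothetical): sensor-like readings observed under normal conditions.
normal = [(1.0, 2.0, 3.0), (1.1, 2.1, 2.9), (0.9, 1.9, 3.1),
          (1.2, 2.0, 3.0), (1.0, 2.2, 3.2), (0.8, 1.8, 2.8)]
ensemble = train_ensemble(normal)
typical_score = normality_score(ensemble, (1.0, 2.0, 3.0))
outlier_score = normality_score(ensemble, (9.0, 9.0, 9.0))
print(typical_score > outlier_score)  # True
```

A low score flags an observation as unlike anything seen during training; in practice a threshold on this score would be tuned per user, which this sketch leaves out.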

Book ChapterDOI
17 Sep 2014
TL;DR: Two experiments compare the performance of two one-class classifiers, namely Naive Bayes and Markov chains, on single objects and on the abstraction to user tasks; in both cases, the task-based masquerade detector outperforms the individual object-based one.
Abstract: Nowadays, computers store critical information, prompting the development of mechanisms aimed at the timely detection of any kind of intrusion. Some such mechanisms, called masquerade detectors, are often designed to signal an alarm whenever they detect an anomaly in system behavior. Usually, the profile of ordinary system behavior is built out of a history of command execution. However, in [1,2], we suggested that it is not a command, but the object upon which it is carried out, that may distinguish a masquerade from user participation; also, we hypothesized that this approach provides a means for building masquerade detectors that work at a higher level of abstraction. In this paper, we report on a successful step towards validating this hypothesis. The crux of our abstraction stems from the fact that a directory often holds closely related objects, resembling a user task; thus, we do not have to account for the accesses to individual objects; instead, we simply take each to be an access to some ancestor directory of the object, the user task. Indeed, we shall prove that by looking into the access to only a few such user tasks, we can build a masquerade detector just as powerful as if we looked into the access to every single file system object. The advantages of this abstraction are paramount: it eases the construction and maintenance of a masquerade detection mechanism, as it yields much shorter models. Using the WUIL dataset [2], we have conducted two experiments for distinguishing the performance of two one-class classifiers, namely Naive Bayes and Markov chains, considering single objects and our abstraction to user tasks. We shall see that in both cases, the task-based masquerade detector outperforms the individual object-based one.

19 citations
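
The task abstraction described in the abstract above replaces each accessed object with an ancestor directory, so that a behavior model is built over fewer distinct states. The sketch below illustrates this under assumptions: the abstract does not say how the ancestor is chosen, so a fixed directory depth is assumed here, and the paths, the `to_task` helper, and the transition-count model are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath


def to_task(path, depth=3):
    """Map an object path to its ancestor directory at most `depth` levels below the root.

    Choosing the ancestor by fixed depth is an assumption made for this sketch.
    """
    parts = PurePosixPath(path).parts   # e.g. ('/', 'home', 'ana', 'thesis', 'ch1.tex')
    dir_parts = parts[:-1]              # drop the object name itself
    return str(PurePosixPath(*dir_parts[:depth + 1]))


def transition_counts(sequence):
    """Count consecutive pairs, as a first-order Markov chain model would."""
    return Counter(zip(sequence, sequence[1:]))


# Toy access log (hypothetical paths).
accesses = [
    "/home/ana/thesis/ch1.tex",
    "/home/ana/thesis/ch2.tex",
    "/home/ana/thesis/refs.bib",
    "/home/ana/photos/a.jpg",
    "/home/ana/photos/b.jpg",
    "/home/ana/thesis/ch1.tex",
]
tasks = [to_task(p, depth=3) for p in accesses]

object_model = transition_counts(accesses)
task_model = transition_counts(tasks)
print(len(object_model), len(task_model))  # 5 4
```

Even on this tiny log the task-level model has fewer distinct transitions than the object-level one; on a real file system with many objects per directory the reduction would be expected to be much larger.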


Cited by
01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations

Journal ArticleDOI
TL;DR: The focus of this review is to provide in-depth and comprehensive analysis of data fusion and multiple classifier systems techniques for human activity recognition with emphasis on mobile and wearable devices.

262 citations

Journal ArticleDOI
TL;DR: Both explainable and black-box models are suitable for solving practical problems, but experts in machine learning need to understand the input data, the problem to solve, and the best way for showing the output data before applying a machine learning model.
Abstract: Nowadays, in the international scientific community of machine learning, there is an extensive discussion about the use of black-box models or explainable models, especially in practical problems. On the one hand, part of the community argues that black-box models are more accurate than explainable models in some contexts, like image preprocessing. On the other hand, another part of the community holds that explainable models are better than black-box models because they can obtain comparable results and can also explain these results in a language close to that of a human expert by using patterns. In this paper, advantages and weaknesses of each approach are shown, taking into account a state-of-the-art review of both approaches, their practical applications, trends, and future challenges. This paper shows that both approaches are suitable for solving practical problems, but experts in machine learning need to understand the input data, the problem to solve, and the best way of showing the output data before applying a machine learning model. Also, we propose some ideas for fusing both explainable and black-box approaches to provide better solutions to experts in real-world domains. Additionally, we show one way to measure the effectiveness of the applied machine learning model by using expert opinions jointly with statistical methods. Throughout this paper, we show the impact of using explainable and black-box models on security and medical applications.

205 citations

Journal ArticleDOI
TL;DR: This paper is the first systematic review, based on a predefined search strategy, of literature concerned with social media bot detection methods published between 2010 and 2019, and includes a refined taxonomy of detection methods.
Abstract: Social media bot (automated account) attacks are organized crimes that pose potential threats to public opinion, democracy, public health, the stock market and other disciplines. While researchers are building many models to detect social media bot accounts, attackers, on the other hand, evolve their bots to evade detection. This everlasting cat and mouse game makes this field vibrant and demands continuous development. To guide and enhance future solutions, this work provides an overview of social media bot attacks, current detection methods and challenges in this area. To the best of our knowledge, this paper is the first systematic review based on a predefined search strategy, which includes literature concerned with social media bot detection methods, published between 2010 and 2019. The results of this review include a refined taxonomy of detection methods, a highlight of the techniques used to detect bots in social media and a comparison between current detection methods. Some of the gaps identified by this work are: the literature mostly focuses on the Twitter platform only and rarely uses methods other than supervised machine learning, most of the public datasets are not accurate or large enough, integrated systems and real-time detection are required, and efforts to spread awareness are needed to arm legitimate users with knowledge.

101 citations

Journal ArticleDOI
TL;DR: In this article, a structural taxonomy of insider threat incidents is presented, which is based on existing taxonomies and the 5W1H questions of the information gathering problem.
Abstract: Insider threats are one of today's most challenging cybersecurity issues that are not well addressed by commonly employed security solutions. Despite several scientific works published in this domain, we argue that the field can benefit from the proposed structural taxonomy and novel categorization of research that contribute to the organization and disambiguation of insider threat incidents and the defense solutions used against them. The objective of our categorization is to systematize knowledge in insider threat research, while leveraging existing grounded theory method for rigorous literature review. The proposed categorization depicts the workflow among particular categories that include: 1) Incidents and datasets, 2) Analysis of attackers, 3) Simulations, and 4) Defense solutions. Special attention is paid to the definitions and taxonomies of the insider threat; we present a structural taxonomy of insider threat incidents, which is based on existing taxonomies and the 5W1H questions of the information gathering problem. Our survey will enhance researchers' efforts in the domain of insider threat, because it provides: a) a novel structural taxonomy that contributes to orthogonal classification of incidents and defining the scope of defense solutions employed against them, b) an updated overview on publicly available datasets that can be used to test new detection solutions against other works, c) references of existing case studies and frameworks modeling insiders' behaviors for the purpose of reviewing defense solutions or extending their coverage, and d) a discussion of existing trends and further research directions that can be used for reasoning in the insider threat domain.

87 citations