Showing papers by "Ifeoma Nwogu" published in 2015

Proceedings Article

Malware detection via API calls, topic models and machine learning

G. Ganesh Sundarkumar¹, Vadlamani Ravi, Ifeoma Nwogu², Venu Govindaraju

University of Hyderabad¹, University at Buffalo²

08 Oct 2015

TL;DR: This work presents a model that uses text mining and topic modeling to detect malware based on the types of API call sequences, and recommends Decision Tree because it yields ‘if-then’ rules that could be used as an early warning expert system.

Abstract: Dissemination of malicious code, also known as malware, poses severe challenges to cyber security. Malware authors embed software in seemingly innocuous executables, unknown to a user. The malware subsequently interacts with security-critical OS resources on the host system or network, in order to destroy their information or to gather sensitive information such as passwords and credit card numbers. Malware authors typically use Application Programming Interface (API) calls to perpetrate these crimes. We present a model that uses text mining and topic modeling to detect malware, based on the types of API call sequences. We evaluated our technique on two publicly available datasets. We observed that Decision Tree and Support Vector Machine yielded significant results. We performed a t-test with respect to sensitivity for the two models and found that there is no statistically significant difference between them. We recommend Decision Tree as it yields ‘if-then’ rules, which could be used as an early warning expert system.
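
As a rough illustration of the text-mining view described in the abstract, the sketch below treats one API call trace as a "document" and counts n-grams of consecutive calls, the kind of term counts a topic model could start from. It also includes a sensitivity helper, since the abstract compares models on that metric. The function names, the choice of bigrams, and the sample trace are assumptions made for illustration; they are not taken from the paper or its datasets.

```python
from collections import Counter


def api_ngram_counts(trace, n=2):
    """Count n-grams of consecutive API calls in one trace (one 'document')."""
    grams = zip(*(trace[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def sensitivity(y_true, y_pred, positive=1):
    """True-positive rate: share of actual positives that were predicted positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical trace: the API names are illustrative only.
trace = ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "WriteFile"]
counts = api_ngram_counts(trace, n=2)

# Hypothetical labels (1 = malware, 0 = benign) and predictions.
score = sensitivity([1, 1, 0, 1], [1, 0, 0, 1])
```

The resulting counts could be stacked into a document-term matrix, one row per executable, before any topic modeling or classification step.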

31 citations

Proceedings Article

Automated analysis of line plots in documents

Rathin Radhakrishnan Nair¹, Nishant Sankaran¹, Ifeoma Nwogu¹, Venu Govindaraju¹

University at Buffalo¹

23 Aug 2015

TL;DR: This work contends that information extracted from graphics can help readers better understand the ideas conveyed in a document; its algorithm achieves a classification accuracy of 91% across the dataset and successfully extracts the axes from 92% of line plots.

Abstract: Information graphics, such as graphs and plots, are used in technical documents to convey information to humans and to facilitate greater understanding. Usually, graphics are a key component in a technical document, as they enable the author to convey complex ideas in a simplified visual format. However, in automatic text recognition systems, which are typically used to digitize documents, the ideas conveyed in a graphical format are lost. We contend that the message or extracted information can be used to help better understand the ideas conveyed in the document. In scientific papers, line plots are the most commonly used graphic for representing experimental results, in the form of the correlation between the values represented on the axes. The contribution of our work is a series of image processing algorithms that automatically extract relevant information, including text and plots, from graphics found in technical documents. We validate the approach by running experiments on a dataset of line plots obtained from computer science conference papers, and we evaluate the variation of a reconstructed curve from the original curve. Our algorithm achieves a classification accuracy of 91% across the dataset and successfully extracts the axes from 92% of line plots. Axes label extraction and line curve tracing are performed successfully in about half the line plots as well.
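
The abstract mentions evaluating how far a reconstructed curve varies from the original but does not name a metric. One plausible way to quantify that variation, offered here only as an assumption, is the root-mean-square error between the two curves sampled at the same x positions. The function name and the sample values below are hypothetical.

```python
import math


def curve_rmse(original, reconstructed):
    """Root-mean-square deviation between two curves sampled at the same x positions."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("curves must be non-empty and equally sampled")
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical y-values: the traced curve drifts on the last sample only.
original = [0.0, 1.0, 2.0, 3.0]
reconstructed = [0.0, 1.0, 2.0, 5.0]
error = curve_rmse(original, reconstructed)
```

A lower value means the traced curve follows the original more closely. Curves sampled at different x positions would need to be resampled onto a common grid first.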

22 citations

Book Chapter

Document Informatics for Scientific Learning and Accelerated Discovery

Venu Govindaraju¹, Ifeoma Nwogu¹, Srirangaraj Setlur¹

University at Buffalo¹

01 Jan 2015-Handbook of Statistics

TL;DR: This concept paper proposes using technology to extract “deep” meaning from a large corpus of relevant materials science documents, enabling faster recognition and use of important theoretical, computational, and experimental information aggregated from peer-reviewed and published materials-related scientific documents online.

Abstract: This chapter presents a concept paper that describes methods to accelerate new materials discovery and optimization, by enabling faster recognition and use of important theoretical, computational, and experimental information aggregated from peer-reviewed and published materials-related scientific documents online. To obtain insights for the discovery of new materials and to study existing materials, research and development scientists and engineers rely heavily on an ever-growing number of materials research publications, mostly available online, that date back many decades. The major thrust of this concept paper is therefore the use of technology to (i) extract “deep” meaning from a large corpus of relevant materials science documents; (ii) navigate, cluster, and present documents in a meaningful way; and (iii) evaluate and revise the materials-related query responses until the researchers are guided to their information destination. While the proposed methodology targets the interdisciplinary field of materials research, the tools to be developed can be generalized to enhance scientific discoveries and learning across a broad swathe of disciplines. The research will advance the machine-learning area of developing hierarchical, dynamic topic models to investigate trends in materials discovery over user-specified time periods. Also, the field of image-based document analysis will benefit tremendously from machine learning tools such as the use of deep belief networks for classification and text separation from document images. Developing an interactive visualization tool that can display modeling results from a large materials network perspective as well as a time-based perspective is an advancement in visualization studies.
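
To make the idea of trends "over user-specified time periods" concrete, the sketch below counts word occurrences per time window. This is a much simpler stand-in for the hierarchical, dynamic topic models the chapter proposes, not an implementation of them. The function name, the whitespace tokenization, and the tiny corpus are all hypothetical.

```python
from collections import Counter


def term_counts_by_period(docs, periods):
    """Count lowercase word occurrences per (start_year, end_year) window, inclusive."""
    counts = {period: Counter() for period in periods}
    for year, text in docs:
        for start, end in periods:
            if start <= year <= end:
                counts[(start, end)].update(text.lower().split())
    return counts


# Hypothetical corpus of (publication year, text) pairs, for illustration only.
docs = [
    (1998, "perovskite oxide synthesis"),
    (2012, "perovskite solar cell"),
    (2014, "perovskite solar absorber"),
]
trend = term_counts_by_period(docs, [(1990, 2005), (2006, 2015)])
```

Comparing a term's count across windows gives a crude picture of how its usage shifts over time. A dynamic topic model would instead track how whole topics evolve.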

1 citation

Book Chapter

A Large-Scale Study of Language Usage as a Cognitive Biometric Trait

Neeti Pokhriyal¹, Ifeoma Nwogu¹, Venu Govindaraju¹

State University of New York System¹

01 Jan 2015-Handbook of Statistics

TL;DR: Whether the cognitive state of a person can be learned and used as a soft biometric trait is studied and the authors' large-scale experimental setup, which yielded encouraging results, is discussed.

Abstract: In this chapter, we discuss a novel biometric trait, called cognitive biometrics. It is defined as the process of identifying an individual through extracting and matching unique signatures based on the cognitive, affective, and conative state of that individual. Currently, there is an increasing need for novel biometric systems that engage multiple modalities because of the changing notion of privacy in today’s world. Also, cognitive biometrics will become essential as pervasive computing becomes more prevalent, where computing can happen anywhere and anytime. The use of cognitive traits for biometrics is relatively underexplored in the research community. We study whether the cognitive state of a person can be learned and used as a soft biometric trait and discuss our large-scale experimental setup, which yielded encouraging results.
