Author

Randall Wald

Bio: Randall Wald is an academic researcher from Florida Atlantic University. The author has contributed to research on the topics of feature selection and feature (computer vision). The author has an h-index of 21 and has co-authored 78 publications receiving 3,040 citations. Previous affiliations of Randall Wald include Atlantic University.

Papers published on a yearly basis (chart; years shown: 2009 to 2016)

Papers

Journal Article•DOI•

Deep learning applications and challenges in big data analytics

Maryam M. Najafabadi¹, Flavio Villanustre², Taghi M. Khoshgoftaar¹, Naeem Seliya¹, Randall Wald¹, Edin Muharemagic²

Florida Atlantic University¹, LexisNexis²

24 Feb 2015-Journal of Big Data

TL;DR: This study explores how Deep Learning can be utilized for addressing some important problems in Big Data Analytics, including extracting complex patterns from massive volumes of data, semantic indexing, data tagging, fast information retrieval, and simplifying discriminative tasks.

Abstract: Big Data Analytics and Deep Learning are two high-focus areas of data science. Big Data has become important as many organizations, both public and private, have been collecting massive amounts of domain-specific information, which can contain useful information about problems such as national intelligence, cyber security, fraud detection, marketing, and medical informatics. Companies such as Google and Microsoft are analyzing large volumes of data for business analysis and decisions, impacting existing and future technology. Deep Learning algorithms extract high-level, complex abstractions as data representations through a hierarchical learning process. Complex abstractions are learnt at a given level based on relatively simpler abstractions formulated in the preceding level in the hierarchy. A key benefit of Deep Learning is the analysis and learning of massive amounts of unsupervised data, making it a valuable tool for Big Data Analytics where raw data is largely unlabeled and uncategorized. In the present study, we explore how Deep Learning can be utilized for addressing some important problems in Big Data Analytics, including extracting complex patterns from massive volumes of data, semantic indexing, data tagging, fast information retrieval, and simplifying discriminative tasks. We also investigate some aspects of Deep Learning research that need further exploration to incorporate specific challenges introduced by Big Data Analytics, including streaming data, high-dimensional data, scalability of models, and distributed computing. We conclude by presenting insights into relevant future works by posing some questions, including defining data sampling criteria, domain adaptation modeling, defining criteria for obtaining useful data abstractions, improving semantic indexing, semi-supervised learning, and active learning.

1,827 citations

Journal Article•DOI•

A review of data mining using big data in health informatics

Matthew Herland¹, Taghi M. Khoshgoftaar¹, Randall Wald¹

Florida Atlantic University¹

24 Jun 2014-Journal of Big Data

TL;DR: This paper will present recent research using Big Data tools and approaches for the analysis of Health Informatics data gathered at multiple levels, including the molecular, tissue, patient, and population levels.

Abstract: The amount of data produced within Health Informatics has grown to be quite vast, and analysis of this Big Data grants potentially limitless possibilities for knowledge to be gained. In addition, this information can improve the quality of healthcare offered to patients. However, there are a number of issues that arise when dealing with these vast quantities of data, especially how to analyze this data in a reliable manner. The basic goal of Health Informatics is to take in real world medical data from all levels of human existence to help advance our understanding of medicine and medical practice. This paper will present recent research using Big Data tools and approaches for the analysis of Health Informatics data gathered at multiple levels, including the molecular, tissue, patient, and population levels. In addition to gathering data at multiple levels, multiple levels of questions are addressed: human-scale biology, clinical-scale, and epidemic-scale. We will also analyze and examine possible future work for each of these areas, as well as how combining data from each level may provide the most promising approach to gain the most knowledge in Health Informatics.

264 citations

Journal Article•DOI•

Intrusion detection and Big Heterogeneous Data: a Survey

Richard Zuech¹, Taghi M. Khoshgoftaar¹, Randall Wald¹

Florida Atlantic University¹

27 Feb 2015-Journal of Big Data

TL;DR: This survey reviews work on Big Heterogeneous Data in intrusion detection, presents areas where more research opportunities exist, and concludes that both cyber threat analysis and cyber intelligence could be enhanced by correlating security events across many diverse heterogeneous sources.

Abstract: Intrusion Detection has been heavily studied in both industry and academia, but cybersecurity analysts still desire much more alert accuracy and overall threat analysis in order to secure their systems within cyberspace. Improvements to Intrusion Detection could be achieved by embracing a more comprehensive approach in monitoring security events from many different heterogeneous sources. Correlating security events from heterogeneous sources can grant a more holistic view and greater situational awareness of cyber threats. One problem with this approach is that currently, even a single event source (e.g., network traffic) can experience Big Data challenges when considered alone. Attempts to use more heterogeneous data sources pose an even greater Big Data challenge. Big Data technologies for Intrusion Detection can help solve these Big Heterogeneous Data challenges. In this paper, we review the scope of works considering the problem of heterogeneous data and in particular Big Heterogeneous Data. We discuss the specific issues of Data Fusion, Heterogeneous Intrusion Detection Architectures, and Security Information and Event Management (SIEM) systems, as well as presenting areas where more research opportunities exist. Overall, both cyber threat analysis and cyber intelligence could be enhanced by correlating security events across many diverse heterogeneous sources.

257 citations

Proceedings Article•DOI•

Feature Selection with High-Dimensional Imbalanced Data

Jason Van Hulse¹, Taghi M. Khoshgoftaar¹, Amri Napolitano¹, Randall Wald¹

Florida Atlantic University¹

06 Dec 2009

TL;DR: This work considers three filters using classifier performance metrics and six commonly used filters; all nine filtering techniques are compared and contrasted using five different microarray expression datasets.

Abstract: Feature selection is an important topic in data mining, especially for high-dimensional datasets. Filtering techniques in particular have received much attention, but detailed comparisons of their performance are lacking. This work considers three filters using classifier performance metrics and six commonly used filters. All nine filtering techniques are compared and contrasted using five different microarray expression datasets. In addition, given that these datasets exhibit an imbalance between the number of positive and negative examples, the utilization of sampling techniques in the context of feature selection is examined.

145 citations

Proceedings Article•DOI•

A review of the stability of feature selection techniques for bioinformatics data

Wael Awada¹, Taghi M. Khoshgoftaar¹, David J. Dittman¹, Randall Wald¹, Amri Napolitano¹

Florida Atlantic University¹

17 Sep 2012

TL;DR: The role of stability in feature selection with DNA microarray data is introduced, various ways of improving feature ranking stability are listed, and feature selection techniques are discussed, specifically explaining ensemble feature ranking and presenting various ensemble feature ranking aggregation methods.

Abstract: Feature selection is an important step in data mining and is used in various domains including genetics, medicine, and bioinformatics. Choosing the important features (genes) is essential for the discovery of new knowledge hidden within the genetic code as well as the identification of important biomarkers. Although feature selection methods can help sort through large numbers of genes based on their relevance to the problem at hand, the results generated tend to be unstable and thus cannot be reproduced in other experiments. Relatedly, research interest in the stability of feature ranking methods has grown recently and researchers have produced experimental designs for testing the stability of feature selection, creating new metrics for measuring stability and new techniques designed to improve the stability of the feature selection process. In this paper, we will introduce the role of stability in feature selection with DNA microarray data. We list various ways of improving feature ranking stability, and discuss feature selection techniques, specifically explaining ensemble feature ranking and presenting various ensemble feature ranking aggregation methods. Finally, we discuss experimental procedures such as dataset perturbation, fixed overlap partitioning, and cross validation procedures that help researchers analyze and measure the stability of feature ranking methods. Throughout this work, we investigate current research in the field and discuss possible avenues of continuing such research efforts.

114 citations

Cited by

Journal Article•

Data Mining Practical Machine Learning Tools and Techniques

อนิรุธ สืบสิงห์

01 Jan 2014-Journal of management science

9,185 citations

Journal Article•DOI•

A survey of deep neural network architectures and their applications

Weibo Liu¹, Zidong Wang¹, Xiaohui Liu¹, Nianyin Zeng², Yurong Liu³,⁴, Fuad E. Alsaadi⁴

Brunel University London¹, Xiamen University², Yangzhou University³, King Abdulaziz University⁴

19 Apr 2017-Neurocomputing

TL;DR: This work was supported in part by the Royal Society of the UK, the National Natural Science Foundation of China, and the Alexander von Humboldt Foundation of Germany.

2,404 citations

Journal Article•DOI•

Deep learning in agriculture: A survey

Andreas Kamilaris, Francesc X. Prenafeta-Boldú

01 Apr 2018-Computers and Electronics in Agriculture

TL;DR: A survey of 40 research efforts that employ deep learning techniques, applied to various agricultural and food production challenges indicates that deep learning provides high accuracy, outperforming existing commonly used image processing techniques.

2,100 citations

Journal Article•DOI•

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

Naveed Akhtar¹, Ajmal Mian¹

University of Western Australia¹

19 Feb 2018-IEEE Access

TL;DR: This paper presents a comprehensive survey of adversarial attacks on deep learning in computer vision, reviewing works that design adversarial attacks, analyze the existence of such attacks, and propose defenses against them.

Abstract: Deep learning is at the heart of the current rise of artificial intelligence. In the field of computer vision, it has become the workhorse for applications ranging from self-driving cars to surveillance and security. Whereas deep neural networks have demonstrated phenomenal success (often beyond human capabilities) in solving complex problems, recent studies show that they are vulnerable to adversarial attacks in the form of subtle perturbations to inputs that lead a model to predict incorrect outputs. For images, such perturbations are often too small to be perceptible, yet they completely fool the deep learning models. Adversarial attacks pose a serious threat to the success of deep learning in practice. This fact has recently led to a large influx of contributions in this direction. This paper presents the first comprehensive survey on adversarial attacks on deep learning in computer vision. We review the works that design adversarial attacks, analyze the existence of such attacks, and propose defenses against them. To emphasize that adversarial attacks are possible in practical conditions, we separately review the contributions that evaluate adversarial attacks in real-world scenarios. Finally, drawing on the reviewed literature, we provide a broader outlook of this research direction.

1,542 citations

Journal Article•DOI•

Classification in the Presence of Label Noise: A Survey

Benoît Frénay¹, Michel Verleysen¹

Université catholique de Louvain¹

01 May 2014-IEEE Transactions on Neural Networks

TL;DR: In this paper, label noise consists of mislabeled instances: no additional information, such as confidences on labels, is assumed to be available.

Abstract: Label noise is an important issue in classification, with many potential negative consequences. For example, the accuracy of predictions may decrease, whereas the complexity of inferred models and the number of necessary training samples may increase. Many works in the literature have been devoted to the study of label noise and the development of techniques to deal with label noise. However, the field lacks a comprehensive survey on the different types of label noise, their consequences, and the algorithms that consider label noise. This paper proposes to fill this gap. First, the definitions and sources of label noise are considered and a taxonomy of the types of label noise is proposed. Second, the potential consequences of label noise are discussed. Third, label noise-robust, label noise cleansing, and label noise-tolerant algorithms are reviewed. For each category of approaches, a short discussion is proposed to help practitioners choose the most suitable technique for their own particular field of application. Finally, the design of experiments is discussed, which may interest researchers who would like to test their own algorithms. In this paper, label noise consists of mislabeled instances: no additional information, such as confidences on labels, is assumed to be available.

1,440 citations
