Author

Katarina Grolinger

Other affiliations: University of Zagreb

Bio: Katarina Grolinger is an academic researcher at the University of Western Ontario. The author has contributed to research on deep learning and energy consumption, has an h-index of 18, and has co-authored 51 publications that have received 1783 citations. Previous affiliations of Katarina Grolinger include the University of Zagreb.

Papers published on a yearly basis (years with publications): 1970, 1996, 1997, 1999, 2002, 2011 to 2023

Papers

Journal Article

Machine Learning With Big Data: Challenges and Approaches

Alexandra L'Heureux¹, Katarina Grolinger¹, Hany F. ElYamany¹, Miriam A. M. Capretz¹

University of Western Ontario¹

20 Apr 2017-IEEE Access

TL;DR: This paper compiles, summarizes, and organizes machine learning challenges with Big Data, highlighting the cause–effect relationship by organizing challenges according to Big Data Vs or dimensions that instigated the issue: volume, velocity, variety, or veracity.

Abstract: The Big Data revolution promises to transform how we live, work, and think by enabling process optimization, empowering insight discovery and improving decision making. The realization of this grand potential relies on the ability to extract value from such massive data through data analytics; machine learning is at its core because of its ability to learn from data and provide data-driven insights, decisions, and predictions. However, traditional machine learning approaches were developed in a different era, and thus are based upon multiple assumptions, such as the data set fitting entirely into memory, which unfortunately no longer holds true in this new context. These broken assumptions, together with the Big Data characteristics, are creating obstacles for the traditional techniques. Consequently, this paper compiles, summarizes, and organizes machine learning challenges with Big Data. In contrast to other research that discusses challenges, this work highlights the cause–effect relationship by organizing challenges according to Big Data Vs or dimensions that instigated the issue: volume, velocity, variety, or veracity. Moreover, emerging machine learning approaches and techniques are discussed in terms of how they are capable of handling the various challenges with the ultimate objective of helping practitioners select appropriate solutions for their use cases. Finally, a matrix relating the challenges and approaches is presented. Through this process, this paper provides a perspective on the domain, identifies research gaps and opportunities, and provides a strong foundation and encouragement for further research in the field of machine learning with Big Data.

592 citations

Journal Article

Data management in cloud environments: NoSQL and NewSQL data stores

Katarina Grolinger¹, Wilson A. Higashino², Abhinav Tiwari¹, Miriam A. M. Capretz¹

University of Western Ontario¹, State University of Campinas²

01 Dec 2013

TL;DR: This study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages.

Abstract: Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective in the field, (2) providing guidance to practitioners and researchers to choose the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared focusing on data models, querying, scaling, and security-related capabilities. Features driving the ability to scale read requests and write requests, or scaling data storage are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages.

304 citations

Proceedings Article

MLaaS: Machine Learning as a Service

Mauro Ribeiro¹, Katarina Grolinger¹, Miriam A. M. Capretz¹

University of Western Ontario¹

01 Dec 2015

TL;DR: This paper proposes an architecture for flexible and scalable machine learning as a service. As a case study, a forecast of electricity demand was generated from real-world sensor and weather data by running different algorithms at the same time.

Abstract: The demand for knowledge extraction has been increasing. With the growing amount of data being generated by global data sources (e.g., social media and mobile apps) and the popularization of context-specific data (e.g., the Internet of Things), companies and researchers need to connect all these data and extract valuable information. Machine learning has been gaining much attention in data mining, leveraging the birth of new solutions. This paper proposes an architecture to create a flexible and scalable machine learning as a service. An open source solution was implemented and presented. As a case study, a forecast of electricity demand was generated using real-world sensor and weather data by running different algorithms at the same time.

281 citations

Journal Article

An ensemble learning framework for anomaly detection in building energy consumption

Daniel B. Araya¹, Katarina Grolinger¹, Hany F. ElYamany¹,², Miriam A. M. Capretz¹, Girma Bitsuamlak¹

University of Western Ontario¹, Suez Canal University²

01 Jun 2017-Energy and Buildings

TL;DR: A new pattern-based anomaly classifier is proposed, the collective contextual anomaly detection using sliding window (CCAD-SW) framework, which improved the anomaly detection capacity of the CCAD-SW by 3.6% and reduced false alarm rate by 2.7%.

162 citations

Proceedings Article

Challenges for MapReduce in Big Data

Katarina Grolinger¹, Michael Hayes¹, Wilson A. Higashino¹, Alexandra L'Heureux¹, David S. Allison¹, Miriam A. M. Capretz¹

University of Western Ontario¹

27 Jun 2014

TL;DR: The identified issues and challenges MapReduce faces when handling Big Data are grouped into four main categories corresponding to Big Data tasks types: data storage, Big Data analytics, online processing, and security and privacy.

Abstract: In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the MapReduce paradigm which allows for massively parallel and distributed execution over a large number of computing nodes. This paper identifies MapReduce issues and challenges in handling Big Data with the objective of providing an overview of the field, facilitating better planning and management of Big Data projects, and identifying opportunities for future research in this field. The identified challenges are grouped into four main categories corresponding to Big Data tasks types: data storage (relational databases and NoSQL stores), Big Data analytics (machine learning and interactive analytics), online processing, and security and privacy. Moreover, current efforts aimed at improving and extending MapReduce to address identified challenges are presented. Consequently, by identifying issues and challenges MapReduce faces when handling Big Data, this study encourages future Big Data research.

149 citations

Cited by

Pattern Recognition and Machine Learning

Christopher M. Bishop¹

Microsoft¹

01 Jan 2006

TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal Article

Data Mining: Practical Machine Learning Tools and Techniques

อนิรุธ สืบสิงห์

01 Jan 2014-Journal of management science

9,185 citations

Book Chapter

World energy outlook

Pierre Desprairies

01 Jan 1982

TL;DR: In this article, the authors discuss leading problems linked to energy that the world is now confronting and propose some ideas concerning possible solutions, and conclude that it is necessary to pursue actively the development of coal, natural gas, and nuclear power.

Abstract: This chapter discusses leading problems linked to energy that the world is now confronting and proposes some ideas concerning possible solutions. Oil deserves special attention among all energy sources. Since the beginning of 1981, it has merely been continuing and enhancing the downward movement in consumption and prices caused by excessive rises, especially for light crudes such as those from Africa, and the slowing down of worldwide economic growth. Densely populated oil-producing countries need to produce to live, to pay for their food and their equipment. If the economic growth of the industrialized countries were to be 4%, even if investment in the rational use of energy were pushed to the limit and the development of nonpetroleum energy sources were also pursued actively, it would be extremely difficult to prevent a sharp rise in prices. It is evident that it is absolutely necessary to pursue actively the development of coal, natural gas, and nuclear power if a physical shortage of energy is not to block economic growth.

2,283 citations

Journal Article

Statistics: Methods and Applications.

Ernest M. Scheuer, John I. Griffin

01 Feb 1963-American Mathematical Monthly

979 citations

Journal Article

Review of Smart Meter Data Analytics: Applications, Methodologies, and Challenges

Yi Wang¹, Qixin Chen¹, Tao Hong², Chongqing Kang¹

Tsinghua University¹, University of North Carolina at Charlotte²

01 May 2019-IEEE Transactions on Smart Grid

TL;DR: An application-oriented review of smart meter data analytics identifies the key application areas as load analysis, load forecasting, and load management and reviews the techniques and methodologies adopted or developed to address each application.

Abstract: The widespread popularity of smart meters enables an immense amount of fine-grained electricity consumption data to be collected. Meanwhile, the deregulation of the power industry, particularly on the delivery side, has continuously been moving forward worldwide. How to employ massive smart meter data to promote and enhance the efficiency and sustainability of the power grid is a pressing issue. To date, substantial works have been conducted on smart meter data analytics. To provide a comprehensive overview of the current research and to identify challenges for future research, this paper conducts an application-oriented review of smart meter data analytics. Following the three stages of analytics, namely, descriptive, predictive, and prescriptive analytics, we identify the key application areas as load analysis, load forecasting, and load management. We also review the techniques and methodologies adopted or developed to address each application. In addition, we also discuss some research trends, such as big data issues, novel machine learning technologies, new business models, the transition of energy systems, and data privacy and security.

621 citations
