Spark (mathematics)

About: Spark (mathematics) is a research topic. Over its lifetime, 7,304 publications have been published within this topic, receiving 63,322 citations.

Papers published on a yearly basis

[Chart: papers per year, 1968 to 2022]

Papers

Journal Article

Spark plasma extrusion (SPE): Prospects and potential

Khaled Morsi¹, A. El-Desouky¹, B. Johnson¹, A. Mar¹, S. Lanka¹

San Diego State University¹

01 Aug 2009 - Scripta Materialia

TL;DR: This article discusses the prospects and potential of spark plasma extrusion, a process that can produce extended geometries via electric-current processing, and shows the feasibility of this approach, which has major implications for the spark plasma sintering field.

27 citations

Proceedings Article

Predictive mapping of urban air pollution using Apache Spark on a Hadoop cluster

Marjan Asgari¹, Mahdi Farnaghi¹, Zeinab Ghaemi¹

K.N.Toosi University of Technology¹

17 Sep 2017

TL;DR: A solution based on distributed processing concepts predicts air quality classes for the next 24 hours at monitoring stations in Tehran, the capital of Iran, and generates a predictive air pollution map from them. The results show that the approach can achieve a reasonable speed in processing big spatial data along with horizontal scalability.

Abstract: Air pollution is one of the major environmental problems in industrial and populated cities. Predictive mapping of urban air pollution and sharing the generated maps with the public and city officials have positive impacts on society and the environment. This article presents a solution based on distributed processing concepts to generate a predictive map of air pollution for the next 24 hours. Apache Hadoop has been utilized as the underlying framework to form a cluster of processing machines. In order to improve the processing speed and provide the required machine learning functionality, Apache Spark has been employed on the Hadoop cluster. The solution enables us to efficiently predict air quality classes at monitoring stations in Tehran, the capital of Iran, for the next 24 hours. Using the inverse distance weighting (IDW) method, the predictive map of air quality classes is then generated for the whole city. The results showed that the proposed approach can achieve a reasonable speed in processing big spatial data along with horizontal scalability.

27 citations

Proceedings Article

Effects of Compression Ratio on Spark-Ignited Engine Efficiency

Patrick W. Smith¹, John B. Heywood¹, Wai K. Cheng¹

Massachusetts Institute of Technology¹

13 Oct 2014

26 citations

Proceedings Article

In-Memory Distributed Matrix Computation Processing and Optimization

Yongyang Yu¹, Mingjie Tang¹, Walid G. Aref¹, Qutaibah M. Malluhi², Mostafa M. Abbas³, Mourad Ouzzani³

Purdue University¹, Qatar University², Qatar Computing Research Institute³

19 Apr 2017

TL;DR: New efficient and scalable matrix processing and optimization techniques for in-memory distributed clusters are presented. An evaluation plan generator for complex matrix computations is introduced, as well as a distributed plan optimizer that exploits dynamic cost-based analysis and rule-based heuristics to optimize the cost of matrix computations in an in-memory distributed environment.

Abstract: The use of large-scale machine learning and data mining methods is becoming ubiquitous in many application domains ranging from business intelligence and bioinformatics to self-driving cars. These methods heavily rely on matrix computations, and it is hence critical to make these computations scalable and efficient. These matrix computations are often complex and involve multiple steps that need to be optimized and sequenced properly for efficient execution. This paper presents new efficient and scalable matrix processing and optimization techniques for in-memory distributed clusters. The proposed techniques estimate the sparsity of intermediate matrix-computation results and optimize communication costs. An evaluation plan generator for complex matrix computations is introduced as well as a distributed plan optimizer that exploits dynamic cost-based analysis and rule-based heuristics to optimize the cost of matrix computations in an in-memory distributed environment. The result of a matrix operation will often serve as an input to another matrix operation, thus defining the matrix data dependencies within a matrix program. The matrix query plan generator produces query execution plans that minimize memory usage and communication overhead by partitioning the matrix based on the data dependencies in the execution plan. We implemented the proposed matrix processing and optimization techniques in Spark, a distributed in-memory computing platform. Experiments on both real and synthetic data demonstrate that our proposed techniques achieve up to an order-of-magnitude performance improvement over state-of-the-art distributed matrix computation systems on a wide range of applications.

26 citations

Book Chapter

Movie Recommender System Based on Collaborative Filtering Using Apache Spark

Mohammed Fadhel Aljunid¹, D. H. Manjaiah¹

Mangalore University¹

01 Jan 2019

TL;DR: This research focuses on the selection of ALS algorithm parameters that can affect the performance of building a robust recommender system (RS), and proposes a movie recommender system based on ALS using Apache Spark.

Abstract: Recently, the building of recommender systems (RS) has become a significant research area that has attracted scientists and researchers across the world. Recommender systems are used in a variety of areas including music, movies, books, news, search queries, and commercial products. Collaborative filtering (CF) is one of the most popular and successful RS techniques; it aims to find users closely similar to the active one in order to recommend items. CF with the alternating least squares (ALS) algorithm is one of the most important techniques used for building a movie recommendation engine. The ALS algorithm is one of the matrix factorization models related to CF, which is considered as the values in the item list of the user matrix. There is a need to analyze the ALS algorithm under different parameter selections, which can eventually help in building an efficient movie recommender engine. In this paper, we propose a movie recommender system based on ALS using Apache Spark. This research focuses on the selection of ALS algorithm parameters that can affect the performance of building a robust RS. From the results, a conclusion is drawn about how the selection of ALS parameters can affect the performance of building a movie recommender engine. The model evaluation is done using different metrics such as execution time, root mean squared error (RMSE) of rating prediction, and the rank in which the best model was trained. The two best cases are chosen based on the best parameter selection from the experimental results, which can lead to good rating prediction for a movie recommender.

26 citations

Metrics

Papers: 7,304
Citations: 74,604

No. of papers in the topic in previous years
Year	Papers
2022	10
2021	429
2020	525
2019	661
2018	758
2017	683
