Topic

Spark (mathematics)

About: Spark (mathematics) is a research topic. Over its lifetime, 7,304 publications have been published within this topic, receiving 63,322 citations.


Papers
Journal Article
TL;DR: A distributed course recommender system for an e-learning platform that discovers relationships between students' activities using the association rules method, in order to help each student choose the most appropriate learning materials.
Abstract: The present work is part of the ESTenLigne project, which is the result of several years of experience developing e-learning at Sidi Mohamed Ben Abdellah University through the implementation of an open, online and adaptive learning environment. However, this platform faces many challenges, such as the increasing amount of data, the diversity of pedagogical resources and the large number of learners, which make it harder to find what learners are really looking for. Furthermore, most of the students on this platform are new graduates who have just entered higher education and who need a system that helps them choose relevant courses, taking into account the requirements and needs of each learner. In this article, we develop a distributed course recommender system for the e-learning platform. It aims to discover relationships between students' activities using the association rules method in order to help the student choose the most appropriate learning materials. We also focus on the analysis of historical course enrollment data or log data. The article discusses in particular the concept of frequent itemsets, used to determine the interesting rules in the transaction database. Then, we use the extracted rules to find the catalog of more suitable courses according to the learner’s behaviors and preferences. Next, we deploy our recommender system using big data technologies and techniques. In particular, we implement the parallel FP-growth algorithm provided by the Spark framework and the Hadoop ecosystem. The experimental results show the effectiveness and scalability of the proposed system. Finally, we evaluate the performance of the Spark MLlib library compared to traditional machine learning tools including Weka and R.

46 citations
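
The association-rule idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' system and it does not implement FP-growth: it counts itemsets by brute force, which is only practical for tiny inputs. The course names, thresholds, and function names are all hypothetical, chosen for illustration.

```python
from itertools import combinations
from collections import Counter

# Hypothetical enrollment "transactions": one set of courses per student.
transactions = [
    {"algebra", "statistics", "python"},
    {"algebra", "statistics"},
    {"statistics", "python"},
    {"algebra", "statistics", "databases"},
    {"python", "databases"},
]

def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every itemset up to max_size and keep those meeting min_support."""
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(t), size):
                counts[frozenset(itemset)] += 1
    # Support is the fraction of transactions that contain the itemset.
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def association_rules(supports, min_confidence):
    """Derive rules A -> B from frequent pairs; confidence = supp(A and B) / supp(A)."""
    rules = []
    for itemset, supp in supports.items():
        if len(itemset) != 2:
            continue
        for antecedent in itemset:
            (consequent,) = itemset - {antecedent}
            conf = supp / supports[frozenset([antecedent])]
            if conf >= min_confidence:
                rules.append((antecedent, consequent, supp, conf))
    return rules

supports = frequent_itemsets(transactions, min_support=0.4)
rules = association_rules(supports, min_confidence=0.7)
for antecedent, consequent, supp, conf in rules:
    print(f"{antecedent} -> {consequent} (support={supp:.2f}, confidence={conf:.2f})")
```

A rule such as "algebra -> statistics" would then be read as a candidate recommendation: students enrolled in the first course are often enrolled in the second. For data at the scale the abstract describes, Spark MLlib provides an FP-growth implementation (`pyspark.ml.fpm.FPGrowth` in the Python API) that avoids enumerating itemsets this way.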

Posted Content
21 Oct 2019, bioRxiv
TL;DR: In real data applications, SPARK is up to ten times more powerful than existing approaches, and this high power allows the identification of new genes and pathways that reveal biology which existing approaches cannot.
Abstract: Recent development of various spatially resolved transcriptomic techniques has enabled gene expression profiling on complex tissues with spatial localization information. Identifying genes that display spatial expression patterns in these studies is an important first step towards characterizing the spatial transcriptomic landscape. Detecting spatially expressed genes requires the development of statistical methods that can properly model spatial count data, provide effective type I error control, have sufficient statistical power, and are computationally efficient. Here, we developed such a method, SPARK. SPARK directly models count data generated from various spatially resolved transcriptomic techniques through generalized linear spatial models. With a new efficient penalized quasi-likelihood based algorithm, SPARK is scalable to data sets with tens of thousands of genes measured on tens of thousands of samples. Importantly, SPARK relies on newly developed statistical formulas for hypothesis testing, producing well-calibrated p-values and yielding high statistical power. We illustrate the benefits of SPARK through extensive simulations and in-depth analysis of four published spatially resolved transcriptomic data sets. In the real data applications, SPARK is up to ten times more powerful than existing approaches. The high power of SPARK allows us to identify new genes and pathways that reveal new biology in the data that otherwise cannot be revealed by existing approaches.

46 citations

Journal Article
TL;DR: This study aims to help retail companies create personalized deals and promotions for their customers, even during the COVID-19 pandemic, through a big data framework that allows them to handle massive sales volumes with more efficient models.
Abstract: Retail companies recognize the need to analyze and predict their sales and customer behavior against their products and product categories. Our study aims to help retail companies create personalized deals and promotions for their customers, even during the COVID-19 pandemic, through a big data framework that allows them to handle massive sales volumes with more efficient models. In this paper, we used Black Friday sales data taken from a dataset on the Kaggle website, which contains nearly 550,000 observations analyzed with 10 features, both qualitative and quantitative. The class label is purchases and sales (in U.S. dollars). Because the predictor label is continuous, regression models are suited in this case. Using the Apache Spark big data framework, which uses the MLlib machine learning library, we trained two machine learning models: linear regression and random forest. These machine learning algorithms were used to predict future pricing and sales. We first implemented a linear regression model and a random forest model without using the Spark framework and achieved accuracies of 68% and 74%, respectively. Then, we trained these models on the Spark machine learning big data framework, where we achieved an accuracy of 72% for the linear regression model and 81% for the random forest model. © 2021, Tech Science Press. All rights reserved.

45 citations

Journal Article
TL;DR: A two-dimensional code that simulates the early stages of flame kernel formation, shortly after the breakdown discharge, is developed to investigate the development of a stable flame kernel initiated by an electrical spark.
Abstract: Spark ignition, as the first step during the combustion in Otto engines, has a profound impact on the further development of the flame kernel. Over the last ten years, growing concern for environmental protection, including low emission of pollutants, has increased the interest in the numerical simulation of ignition phenomena to guarantee successful flame kernel development even for lean mixtures. However, the process of spark ignition in a combustible mixture is not yet fully understood. The use of detailed reaction mechanisms, combined with electrodynamical modelling of the spark, is necessary to optimize ignition of lean mixtures. This work presents simulations of the coupling of flow, chemical reactions and transport with discharge processes in order to investigate the development of a stable flame kernel initiated by an electrical spark. A two-dimensional code to simulate the early stages of flame kernel formation, shortly after the breakdown discharge, has been developed. The model includes Joule heati...

45 citations


Network Information
Related Topics (5)
Software: 130.5K papers, 2M citations, 76% related
Combustion: 172.3K papers, 1.9M citations, 72% related
Cluster analysis: 146.5K papers, 2.9M citations, 72% related
Cloud computing: 156.4K papers, 1.9M citations, 71% related
Hydrogen: 132.2K papers, 2.5M citations, 69% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    10
2021    429
2020    525
2019    661
2018    758
2017    683