Journal Article
Apache Spark: a unified engine for big data processing
Matei Zaharia,Reynold Xin,Patrick Wendell,Tathagata Das,Michael Armbrust,Ankur Dave,Xiangrui Meng,Josh Rosen,Shivaram Venkataraman,Michael J. Franklin,Ali Ghodsi,Joseph E. Gonzalez,Scott Shenker,Ion Stoica
TL;DR: This open source computing framework unifies streaming, batch, and interactive big data workloads to unlock new applications.
Citations
Journal Article
Artificial intelligence within the interplay between natural and artificial computation: Advances in data science, trends and applications
Juan Manuel Górriz,Javier Ramírez,Andrés Ortiz,Francisco Jesús Martínez-Murcia,Fermín Segovia,John Suckling,Matthew Leming,Yudong Zhang,José Ramón Álvarez-Sánchez,Guido Bologna,Paula Bonomini,Fernando E. Casado,David Charte,Francisco Charte,Ricardo Contreras,Alfredo Cuesta-Infante,Richard J. Duro,Antonio Fernández-Caballero,Eduardo Fernández-Jover,Pedro Gómez-Vilda,Manuel Graña,Francisco Herrera,Roberto Iglesias,Anna Lekova,Javier de Lope,Ezequiel López-Rubio,Rafael Martínez-Tomás,Miguel A. Molina-Cabello,Antonio S. Montemayor,Paulo Novais,Daniel Palacios-Alonso,Juan José Pantrigo,Bryson R. Payne,Félix de la Paz López,María Angélica Pinninghoff,Mariano Rincón,José Santos,Karl Thurnhofer-Hemsi,Athanasios Tsanas,Ramiro Varela,José Manuel Ferrández
TL;DR: Recent works published in the latter field and the state of the art are reviewed and summarized in a comprehensive and self-contained way to provide a baseline framework for the international community in artificial intelligence.
Journal Article
SMILES-based deep generative scaffold decorator for de-novo drug design
Josep Arús-Pous,Atanas Patronov,Esben Jannik Bjerrum,Christian Tyrchan,Jean-Louis Reymond,Hongming Chen,Ola Engkvist
TL;DR: A new SMILES-based molecular generative architecture that generates molecules from scaffolds and can be trained from any arbitrary molecular set; it also serves as a data augmentation technique and is readily coupled with randomized SMILES to obtain even better results with small sets.
Journal Article
Big Data Driven Marine Environment Information Forecasting: A Time Series Prediction Network
TL;DR: A semisupervised prediction model is proposed, which exploits the improved unsupervised clustering algorithm to establish the fuzzy partition function and then utilizes the neural network model to build the information prediction function.
Journal Article
Analyzing the performance of a blockchain-based personal health record implementation
Alex Roehrs,Cristiano André da Costa,Rodrigo da Rosa Righi,Valter Ferreira da Silva,José Roberto Goldim,Douglas C. Schmidt
TL;DR: The performance results indicated that data distributed via a blockchain could be recovered with low average response time and high availability in the scenarios the authors tested, and demonstrated how the OmniPHR model implementation can integrate distributed data into a unified view of health records.
Journal Article
The core decomposition of networks: theory, algorithms and applications
Fragkiskos D. Malliaros,Christos Giatsidis,Apostolos N. Papadopoulos,Michalis Vazirgiannis
TL;DR: In this survey, an in-depth discussion of core decomposition is performed, focusing mainly on the basic theory and fundamental concepts, the algorithmic techniques proposed for computing it efficiently under different settings, and the applications that can benefit significantly from it.
References
Journal Article
MapReduce: simplified data processing on large clusters
Jeffrey Dean,Sanjay Ghemawat
TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.
Proceedings Article
Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing
Matei Zaharia,Mosharaf Chowdhury,Tathagata Das,Ankur Dave,Justin Ma,Murphy McCauley,Michael J. Franklin,Scott Shenker,Ion Stoica
TL;DR: Resilient Distributed Datasets is presented, a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner and is implemented in a system called Spark, which is evaluated through a variety of user applications and benchmarks.
Journal Article
A bridging model for parallel computation
TL;DR: The bulk-synchronous parallel (BSP) model is introduced as a candidate for this role, and results quantifying its efficiency both in implementing high-level language features and algorithms, as well as in being implemented in hardware, are given.
Proceedings Article
Pregel: a system for large-scale graph processing
Grzegorz Malewicz,Matthew H. Austern,Aart J. C. Bik,James C. Dehnert,Ilan Horn,Naty Leiser,Grzegorz Czajkowski
TL;DR: A model for processing large graphs, designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers; its implied synchronicity makes reasoning about programs easier.
Proceedings Article
Dryad: distributed data-parallel programs from sequential building blocks
TL;DR: The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.