scispace - formally typeset
Search or ask a question
Conference

International Conference on Big Data 

About: International Conference on Big Data is an academic conference. The conference publishes majorly in the area(s): Big data & Cloud computing. Over the lifetime, 8883 publications have been published by the conference receiving 65538 citations.

Papers published on a yearly basis

Papers
More filters
Proceedings ArticleDOI
Mu Li1
04 Aug 2014
TL;DR: View on new challenges identified are shared, and some of the application scenarios such as micro-blog data analysis and data processing in building next generation search engines are covered.
Abstract: Big data may contain big values, but also brings lots of challenges to the computing theory, architecture, framework, knowledge discovery algorithms, and domain specific tools and applications. Beyond the 4-V or 5-V characters of big datasets, the data processing shows the features like inexact, incremental, and inductive manner. This brings new research opportunities to research community across theory, systems, algorithms, and applications. Is there some new "theory" for the big data? How to handle the data computing algorithms in an operatable manner? This report shares some view on new challenges identified, and covers some of the application scenarios such as micro-blog data analysis and data processing in building next generation search engines.

1,364 citations

Proceedings ArticleDOI
29 Oct 2015
TL;DR: The presented paper modeled and predicted China stock returns using LSTM and improved the accuracy of stock returns prediction from 14.3% to 27.2% compared with random prediction method.
Abstract: Prediction of stock market has attracted attention from industry to academia [1, 2]. Various machine learning algorithms such as neural networks, genetic algorithms, support vector machine, and others are used to predict stock prices.

455 citations

Proceedings ArticleDOI
01 Dec 2019
TL;DR: The results show that additional training of data and thus BiLSTM-based modeling offers better predictions than regular LSTm-based models.
Abstract: Machine and deep learning-based algorithms are the emerging approaches in addressing prediction problems in time series. These techniques have been shown to produce more accurate results than conventional regression-based modeling. It has been reported that artificial Recurrent Neural Networks (RNN) with memory, such as Long Short-Term Memory (LSTM), are superior compared to Autoregressive Integrated Moving Average (ARIMA) with a large margin. The LSTM-based models incorporate additional “gates” for the purpose of memorizing longer sequences of input data. The major question is that whether the gates incorporated in the LSTM architecture already offers a good prediction and whether additional training of data would be necessary to further improve the prediction. Bidirectional LSTMs (BiLSTMs) enable additional training by traversing the input data twice (i.e., 1) left-to-right, and 2) right-to-left). The research question of interest is then whether BiLSTM, with additional training capability, outperforms regular unidirectional LSTM. This paper reports a behavioral analysis and comparison of BiLSTM and LSTM models. The objective is to explore to what extend additional layers of training of data would be beneficial to tune the involved parameters. The results show that additional training of data and thus BiLSTM-based modeling offers better predictions than regular LSTM-based models. More specifically, it was observed that BiLSTM models provide better predictions compared to ARIMA and LSTM models. It was also observed that BiLSTM models reach the equilibrium much slower than LSTM-based models.

428 citations

Proceedings ArticleDOI
01 Dec 2018
TL;DR: YOLO-LITE as discussed by the authors is a real-time object detection model developed to run on portable devices such as a laptop or cellphone lacking a Graphics Processing Unit (GPU) and achieves a mAP of 33.81% and 12.26% respectively.
Abstract: This paper focuses on YOLO-LITE, a real-time object detection model developed to run on portable devices such as a laptop or cellphone lacking a Graphics Processing Unit (GPU). The model was first trained on the PASCAL VOC dataset then on the COCO dataset, achieving a mAP of 33.81% and 12.26% respectively. YOLO-LITE runs at about 21 FPS on a non-GPU computer and 10 FPS after implemented onto a website with only 7 layers and 482 million FLOPS. This speed is 3.8 × faster than the fastest state of art model, SSD MobilenetvI. Based on the original object detection algorithm YOLOV2, YOLO-LITE was designed to create a smaller, faster, and more efficient model increasing the accessibility of real-time object detection to a variety of devices.

391 citations

Proceedings ArticleDOI
01 Oct 2014
TL;DR: Immersion provides benefits beyond the traditional “desktop” visualization tools: it leads to a demonstrably better perception of a datascape geometry, more intuitive data understanding, and a better retention of the perceived relationships in the data.
Abstract: Effective data visualization is a key part of the discovery process in the era of “big data”. It is the bridge between the quantitative content of the data and human intuition, and thus an essential component of the scientific path from data into knowledge and understanding. Visualization is also essential in the data mining process, directing the choice of the applicable algorithms, and in helping to identify and remove bad data from the analysis. However, a high complexity or a high dimensionality of modern data sets represents a critical obstacle. How do we visualize interesting structures and patterns that may exist in hyper-dimensional data spaces? A better understanding of how we can perceive and interact with multidimensional information poses some deep questions in the field of cognition technology and human-computer interaction. To this effect, we are exploring the use of immersive virtual reality platforms for scientific data visualization, both as software and inexpensive commodity hardware. These potentially powerful and innovative tools for multi-dimensional data visualization can also provide an easy and natural path to a collaborative data visualization and exploration, where scientists can interact with their data and their colleagues in the same visual space. Immersion provides benefits beyond the traditional “desktop” visualization tools: it leads to a demonstrably better perception of a datascape geometry, more intuitive data understanding, and a better retention of the perceived relationships in the data.

290 citations

Performance
Metrics
No. of papers from the Conference in previous years
YearPapers
20221
2021543
20201,776
20191,690
20181,327
20171,364