Journal ArticleDOI

Artificial Intelligence and Big Data

01 Mar 2013-IEEE Intelligent Systems (IEEE)-Vol. 28, Iss: 2, pp 96-99
TL;DR: AI Innovation in Industry is a new department for IEEE Intelligent Systems, and this paper examines some of the basic concerns and uses of AI for big data.
Abstract: AI Innovation in Industry is a new department for IEEE Intelligent Systems, and this paper examines some of the basic concerns and uses of AI for big data: AI has been used in several different ways to facilitate capturing and structuring big data, and it has been used to analyze big data for key insights.
Citations
Journal ArticleDOI
TL;DR: The definition, characteristics, and classification of big data, along with some discussion of cloud computing, are introduced, and research challenges are investigated, with a focus on scalability, availability, data integrity, data transformation, data quality, data heterogeneity, privacy, legal and regulatory issues, and governance.

2,141 citations


Cites background from "Artificial Intelligence and Big Data"

  • ...Internet users also generate an extremely diverse set of structured and unstructured data [12]....


  • ...MapReduce is a popular cloud computing framework that robotically performs scalable distributed applications [56] and provides an interface that allows for parallelization and distributed computing in a cluster of servers [12]....


Journal ArticleDOI
Fei Tao1, Jiangfeng Cheng1, Qinglin Qi1, Meng Zhang1, He Zhang1, Fangyuan Sui1 
TL;DR: In this paper, a new method for product design, manufacturing, and service driven by digital twin is proposed, and three cases are given to illustrate the future applications of digital twin in three phases of a product respectively.
Abstract: Nowadays, along with the application of new-generation information technologies in industry and manufacturing, the big data-driven manufacturing era is coming. However, although various big data in the entire product lifecycle, including product design, manufacturing, and service, can be obtained, it can be found that the current research on product lifecycle data mainly focuses on physical products rather than virtual models. Besides, due to the lack of convergence between product physical and virtual space, the data in product lifecycle is isolated, fragmented, and stagnant, which is useless for manufacturing enterprises. These problems lead to low level of efficiency, intelligence, sustainability in product design, manufacturing, and service phases. However, physical product data, virtual product data, and connected data that tie physical and virtual product are needed to support product design, manufacturing, and service. Therefore, how to generate and use converged cyber-physical data to better serve product lifecycle, so as to drive product design, manufacturing, and service to be more efficient, smart, and sustainable, is emphasized and investigated based on our previous study on big data in product lifecycle management. In this paper, a new method for product design, manufacturing, and service driven by digital twin is proposed. The detailed application methods and frameworks of digital twin-driven product design, manufacturing, and service are investigated. Furthermore, three cases are given to illustrate the future applications of digital twin in the three phases of a product respectively.

1,571 citations


Cites background from "Artificial Intelligence and Big Data"

  • ..., internet of things technology and devices are employed to collect various data generated in the entire product lifecycle [2], cloud technology is used to realize data management and processing [3], and artificial intelligence is used for data mining and realizing added value [4], the big data-driven manufacturing era is coming....


Journal ArticleDOI
TL;DR: This study comprehensively surveys and classifies the various attributes of Big Data, including its nature, definitions, rapid growth rate, volume, management, analysis, and security, and proposes a data life cycle that uses the technologies and terminologies of Big Data.
Abstract: Big Data has gained much attention from academia and the IT industry. In the digital and computing world, information is generated and collected at a rate that rapidly exceeds the boundary range. Currently, over 2 billion people worldwide are connected to the Internet, and over 5 billion individuals own mobile phones. By 2020, 50 billion devices are expected to be connected to the Internet. At this point, predicted data production will be 44 times greater than that in 2009. As information is transferred and shared at light speed on optic fiber and wireless networks, the volume of data and the speed of market growth increase. However, the fast growth rate of such large data generates numerous challenges, such as the rapid growth of data, transfer speed, diverse data, and security. Nonetheless, Big Data is still in its infancy stage, and the domain has not been reviewed in general. Hence, this study comprehensively surveys and classifies the various attributes of Big Data, including its nature, definitions, rapid growth rate, volume, management, analysis, and security. This study also proposes a data life cycle that uses the technologies and terminologies of Big Data. Future research directions in this field are determined based on opportunities and several open issues in the Big Data domain. These research directions facilitate the exploration of the domain and the development of optimal techniques to address Big Data.

419 citations


Cites background from "Artificial Intelligence and Big Data"

  • ...Accordingly, the speed of the access and mining of both structured and unstructured data has increased over time [76]....


Journal ArticleDOI
TL;DR: The findings of this paper show that firms that developed more BDA capabilities than others, both technological and managerial, increased their performance, and that KM orientation plays a significant role in amplifying the effect of BDA capabilities.
Abstract: Big data analytics (BDA) guarantees that data may be analysed and categorised into useful information for businesses and transformed into big data related knowledge and efficient decision-making processes, thereby improving performance. However, the management of the knowledge generated from BDA, as well as its integration and combination with firm knowledge, has scarcely been investigated, despite an emergent need for a structured and integrated approach; the paper aims to discuss these issues. Through an empirical analysis based on structural equation modelling with data collected from 88 Italian SMEs, the authors tested whether BDA capabilities have a positive impact on firm performance, as well as the mediating effect of knowledge management (KM) on this relationship. The findings of this paper show that firms that developed more BDA capabilities than others, both technological and managerial, increased their performance, and that KM orientation plays a significant role in amplifying the effect of BDA capabilities. BDA has the potential to change the way firms compete through better understanding, processing, and exploiting of huge amounts of data coming from different internal and external sources and processes. Some managerial and theoretical implications are proposed and discussed in light of the emergence of this new phenomenon.

298 citations

Proceedings ArticleDOI
01 Jul 2017
TL;DR: This paper provides a comprehensive survey of Big Data (its definitions, methods, research problems, and trends) and of Deep Learning (its methods, frameworks, and algorithms), and then presents the application of Deep Learning to Big Data, along with its challenges, open research problems, and future trends.
Abstract: Big Data means extremely large data sets that can be analyzed to find patterns and trends. One technique that can be used for data analysis, so that we are able to find abstract patterns in Big Data, is Deep Learning. If we apply Deep Learning to Big Data, we can find unknown and useful patterns that were previously out of reach. With the help of Deep Learning, AI is getting smarter. There is a hypothesis in this regard: the more data, the more abstract knowledge. So a handy survey of Big Data, Deep Learning, and its application to Big Data is necessary. In this paper, we provide a comprehensive survey of what Big Data is, comparing methods, its research problems, and trends. Then a survey of Deep Learning, its methods, a comparison of frameworks, and algorithms is presented. Finally, the application of Deep Learning to Big Data, its challenges, open research problems, and future trends are presented.

266 citations


Cites background from "Artificial Intelligence and Big Data"

  • ...Authors in [25] noted that about 75 percent of organizations apply at least one form of Big Data....


References
Journal ArticleDOI
Jeffrey Dean1, Sanjay Ghemawat1
06 Dec 2004
TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.
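The programming model described above can be illustrated with a word-count example, the canonical MapReduce task. The sketch below is a hypothetical single-process simulation of the model (map, shuffle, reduce), not Google's distributed implementation; the function names are illustrative.

```python
from collections import defaultdict
from itertools import chain

def map_fn(_key, text):
    # Map: emit an intermediate (word, 1) pair for every word in the document.
    return [(word, 1) for word in text.split()]

def reduce_fn(word, counts):
    # Reduce: merge all intermediate values that share the same key.
    return (word, sum(counts))

def map_reduce(inputs, map_fn, reduce_fn):
    # "Shuffle" phase: group intermediate values by key before reducing.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(k, v) for k, v in inputs):
        groups[key].append(value)
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))

docs = [("d1", "big data big insights"), ("d2", "big cluster")]
print(map_reduce(docs, map_fn, reduce_fn))
# → {'big': 3, 'cluster': 1, 'data': 1, 'insights': 1}
```

In the real system, the map and reduce calls run in parallel across a cluster, and the run-time system handles partitioning, scheduling, and fault tolerance; the user supplies only the two functions.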

20,309 citations

Journal ArticleDOI
Jeffrey Dean1, Sanjay Ghemawat1
TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.

17,663 citations

Journal ArticleDOI
TL;DR: This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART.
Abstract: This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.
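To make one of these top-10 algorithms concrete, here is a minimal sketch of k-Means written from the standard textbook description (alternating assignment and update steps), not from the surveyed paper; the data and parameters are illustrative.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[i].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
    return centers

pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9)]
print(sorted(kmeans(pts, 2)))  # two centers, one near each cluster
```

A fixed iteration count keeps the sketch short; production implementations instead stop when assignments no longer change and typically restart from several random initializations.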

4,944 citations

01 Jan 1999
TL;DR: The author recounts that the phrase "Internet of Things" started life as the title of a presentation he made at Procter & Gamble (P&G) in 1999, linking the new idea of RFID in P&G's supply chain to the then-red-hot topic of the Internet.
Abstract: Jun 22, 2009—I could be wrong, but I'm fairly sure the phrase "Internet of Things" started life as the title of a presentation I made at Procter & Gamble (P&G) in 1999. Linking the new idea of RFID in P&G's supply chain to the then-red-hot topic of the Internet was more than just a good way to get executive attention. It summed up an important insight—one that 10 years later, after the Internet of Things has become the title of everything from an article in Scientific American to the name of a European Union conference, is still often misunderstood.

2,608 citations

Proceedings ArticleDOI
23 Aug 2004
TL;DR: A system is presented that, given a topic, automatically finds the people who hold opinions about that topic and the sentiment of each opinion; the system contains one module for determining word sentiment and another for combining sentiments within a sentence.
Abstract: Identifying sentiments (the affective parts of opinions) is a challenging problem. We present a system that, given a topic, automatically finds the people who hold opinions about that topic and the sentiment of each opinion. The system contains a module for determining word sentiment and another for combining sentiments within a sentence. We experiment with various models of classifying and combining sentiment at word and sentence levels, with promising results.
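The two-module structure described above can be sketched as a word-level sentiment lexicon plus a simple rule for combining sentiments within a sentence. The tiny lexicon and the negation-flip rule below are illustrative assumptions, not the paper's actual models.

```python
# Module 1 (word sentiment): a hypothetical polarity lexicon.
LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}
NEGATORS = {"not", "never", "no"}

# Module 2 (sentence combination): sum word polarities, flipping the
# polarity of a sentiment word that follows a negator.
def sentence_sentiment(sentence):
    score, negate = 0, False
    for word in sentence.lower().strip(".!?").split():
        if word in NEGATORS:
            negate = True
        elif word in LEXICON:
            polarity = LEXICON[word]
            score += -polarity if negate else polarity
            negate = False
    return score

print(sentence_sentiment("The battery is not good"))     # → -1
print(sentence_sentiment("A great phone, bad camera"))   # → 1
```

The paper experiments with several combination models at the word and sentence levels; this additive rule is only the simplest member of that family.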

1,541 citations

Trending Questions
How can AI be used to improve big data processing?

AI can be used to identify and clean dirty data, establish context knowledge, and analyze big data for key insights.