Proceedings ArticleDOI

Foundations to frontiers of big data analytics

TL;DR: The paper highlights the importance of cloud computing in the Big Data paradigm, and a case study using Spark is given as an example.
Abstract: In recent times, big data analytics has become a major trend in catering to data queries, which have been growing dramatically. The present paper gives a brief description of the latest developments in Big Data analytics. A case study using Spark is given as an example. The paper highlights the importance of cloud computing in the Big Data paradigm.

Summary (1 min read)


Summary

  • In recent times, big data analytics has become a major trend in catering to data queries, which have been growing dramatically.
  • The present paper gives a brief description of the latest developments in Big Data analytics.
  • The paper highlights the importance of cloud computing in the Big Data paradigm.



Keywords: Big data; MapReduce; Spark; Cloud computing; Hadoop; HaLoop
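The keywords name MapReduce, the programming model behind Hadoop and, in generalized form, Spark. As an illustrative sketch only, not code from the paper's Spark case study, the map and reduce phases of a word count can be written in plain Python:

```python
# Minimal sketch of the map-reduce pattern (a stand-in for Hadoop/Spark,
# not the paper's actual case study code).
from collections import defaultdict

def map_phase(documents):
    """Map step: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """Reduce step: sum the counts for each distinct word."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

docs = ["big data analytics", "big data on Spark"]
word_counts = reduce_phase(map_phase(docs))  # e.g. word_counts["big"] == 2
```

In Hadoop or Spark the same two phases are distributed across a cluster, with a shuffle stage grouping the emitted pairs by key between map and reduce.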
Citations
Journal ArticleDOI
TL;DR: Aliphatic carbons, from the polymerization reactions, especially in Dycal and TheraCal, were found to mask the other components and showed that Ca in the surface layer could vary from 0 to 18%, depending on the material.

23 citations


Cites background from "Foundations to frontiers of big dat..."

  • ...Surface energy and its chemical composition control macromolecular and protein adsorption and desorption, which will initiate the inflammatory and/or immune responses, resulting in repair or necrosis [18-21]....


  • ...These reactions occur at the interface between cells and the biomaterial [19-20], and the nanoscale surface composition of the biomaterial is a major variable that determines the host response [21-22]....


DissertationDOI
01 Jan 2017
TL;DR: In this article, the authors used angle-resolved spectroscopy in two dimensions to analyze the full emission spectrum in terms of exit angle and emission energy, showing that the in-plane surface source is stronger than predicted.
Abstract: The investigation of surfaces and thin films is of particular interest in current research as it provides a basis for a multiplicity of applications such as waveguides, sensors, solar cells and optoelectronics. The origins of complex phenomena on surfaces and in thin films can be revealed by applying angle-resolved spectroscopy in two dimensions: the angle of incidence is scanned while analyzing the full emission spectrum in terms of exit angle and emission energy. The interface of a metal layer and a dielectric can support collective electron plasma resonances, i.e. surface plasmon polaritons, which are accompanied by giant field enhancement while propagating along the interface. We characterize the Kretschmann and the Otto coupling configuration in terms of their coupling efficiency and their impact on the surface plasmon resonance as a function of wave-vector. Although being commonly considered as equivalent in terms of plasmonic coupling, we identify differing dependencies of their respective coupling efficiency on the coupling layer thickness and the excitation wavelength which is fundamental for sensing applications. Provided that a metal layer is embedded in a symmetric cladding in terms of its dielectric function and the film thickness is reduced to the order of λ/10, modes from both interfaces can couple and propagate as long-range surface waves. Surprisingly, even intrinsically absorbing films support low-loss surface waves, whose propagation length can become arbitrarily long in the limit of vanishing film thickness. This phenomenon requires only that the material’s dielectric function be predominantly imaginary over that particular range of optical frequencies. Furthermore we show that the orientation of transition dipole moments inside thin monolayer films of effective media that contain oriented CdSe nano-platelets can be determined by applying k-space spectroscopy. 
Thus we determine electronic and dielectric contributions to the emission anisotropy and reveal the intrinsic nature of the directionality in the emission. We show that this phenomenon is related to the anisotropy of the electronic Bloch states that govern the transition dipole moment of the exciton. Beyond the linear investigation of surfaces and thin films, 2D-k-space spectroscopy can provide an insight into the principles of nonlinear wave-mixing interactions. The role of surface plasmons in second harmonic generation, whether they act as field-enhancing catalysts or as quasiparticles converted in the interaction, can be revealed by k-space spectroscopy: by way of the signature in k-space, we identify a nonlinear interaction where two surface plasmons annihilate to create a second-harmonic photon as well as the interaction of a plasmon and a photon by virtue of a degenerate three-wave mixing process. We analyze the intrinsic origin of surface plasmon enhanced second harmonic generation in metal films by comparing the absolute nonlinear yield in attenuated internal reflection configurations to theoretical calculations based on the hydrodynamic model. A first estimation of the nonlinear parameters in the hydrodynamic model is given and the contributions of the bulk and surface source are determined, showing that the in-plane surface source is stronger than predicted. For absorbing thin films, however, we report the first evidence of field enhancement and long-range surface wave enhanced second harmonic generation. Here, we identify the out-of-plane surface source to have the strongest contribution to the second harmonic yield. As the nonlinear susceptibility of a material can greatly increase if the probing frequency approaches an absorption resonance, absorbing materials can indeed be considered as low-loss optical media for doing surface-wave optics in the nonlinear regime.
We show further that, in contrast to the isotropic linear absorption, the two-photon absorption (TPA) in oriented nano-platelets is highly anisotropic. This transition dipole orientation is dependent on the probabilities of the involved processes and their selection rules. We demonstrate that an additional silver layer covered with SiO2 enables surface plasmon enhanced excitation of oriented nano-platelets, boosting the photoluminescent emission, which is highly directed through coupling to the plasmonic mode. The combination of TPA and the plasmonic resonance even leads to further concentration of the absorption range as a function of excitation wave-vector. In summary, this work has shown that 2D-k-space spectroscopy, as applied to solid surfaces, thin films and nano-particles, provides insight into the intrinsic material properties, as well as the surface-wave and radiation phenomena supported by these structures.

7 citations

Proceedings ArticleDOI
28 May 2022
TL;DR: Decision Tree (DT), Random Forest (RF), Stochastic Gradient Descent (SGD), Logistic Regression (LR), Gaussian Naive Bayes (GNB), and K-Nearest Neighbors (KNN) models are used to quickly predict which patients would be diagnosed with COVID-19 via CXR image classification, and KNN revealed the best accuracy compared with the others.
Abstract: Coronavirus (COVID-19) changed how people in every country of the world viewed life in December 2019. The virus caused chaos that could not be predicted. This problem requires a variety of technologies to aid in identifying COVID-19 patients and controlling the spread of the disease. For suspected cases of COVID-19, chest X-ray (CXR) imaging is a low-cost standard, but on its own it is not sufficient for a reliable diagnosis without technological assistance. In response to this issue, a large dataset of CXR images, divided into four classes, was obtained from the website Kaggle. Handling such a large image dataset requires preprocessing, choosing the optimal method to obtain the best speed and accuracy. Preprocessing converts the images to gray level, then adjusts image intensity, resizes the images, and extracts the best features before applying machine learning (ML) models. Several prediction models (ML algorithms) were applied, and their performance was evaluated on the preprocessed dataset. Decision Tree (DT), Random Forest (RF), Stochastic Gradient Descent (SGD), Logistic Regression (LR), Gaussian Naive Bayes (GNB), and K-Nearest Neighbors (KNN) models are used to quickly predict which patients would be diagnosed with COVID-19 via CXR image classification. KNN revealed the best accuracy compared with GNB, DT, SGD, LR, and RF. KNN also had the best weighted average for all metrics (precision, sensitivity, and F1-score) compared with the other models.
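The abstract reports KNN as the most accurate of the compared models. As a hedged illustration of the k-nearest-neighbors step only (not the study's actual pipeline), here is a minimal pure-Python KNN classifier; the 2-D feature vectors and class labels below are invented stand-ins for the extracted CXR image features:

```python
# Illustrative KNN classifier: majority vote among the k nearest
# training points by Euclidean distance. Toy data, not the CXR study's.
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train, labels))
    top_k = [label for _, label in dists[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Invented 2-D features standing in for extracted image features.
X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
y = ["normal", "normal", "covid", "covid"]
pred = knn_predict(X, y, (0.85, 0.85), k=3)  # nearest neighbours are "covid"
```

In practice a library implementation (e.g. scikit-learn's `KNeighborsClassifier`) would be used on the full feature matrix, but the voting logic is the same.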

2 citations

Journal ArticleDOI
01 Feb 2021
TL;DR: Adaptive resource provisioning is integrated with a reinforcement learning mechanism during the service admission process, ensuring that the collaborating cloud providers gain more profit without violating the SLA.
Abstract: In cloud computing, the cloud provider agent offers quality of service (QoS) guarantees to different categories of cloud consumer agents. In general, the inter-cloud environment provides resources as virtual machine (VM) instances, representing processing power, RAM, and secondary storage, to consumer agents with QoS guarantees. A service level agreement (SLA) framework with a reinforcement learning mechanism is considered for provisioning VMs to all categories of client classes. Parameters such as cost of service, availability, and service demand are considered while provisioning VMs in the inter-cloud environment. QoS violations happen when some cloud consumer agents receive fewer VMs than agreed. In our approach, adaptive resource provisioning is integrated with a reinforcement learning mechanism during the service admission process, ensuring that the collaborating cloud providers gain more profit without violating the SLA.
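The abstract describes folding reinforcement learning into the service admission decision. As an illustrative sketch only (the states, actions, rewards, and hyperparameters below are invented for demonstration, not taken from the paper), tabular Q-learning on a toy VM-admission problem looks like this:

```python
# Toy tabular Q-learning for VM admission: admitting below capacity earns
# profit; admitting at capacity incurs an SLA-violation penalty.
# All numbers here are invented for illustration.
import random

random.seed(0)

STATES, ACTIONS = [0, 1, 2], ["admit", "reject"]  # state = VMs allocated
CAPACITY = 2

def step(state, action):
    """Toy dynamics: reward admission unless capacity would be exceeded."""
    if action == "admit":
        if state < CAPACITY:
            return state + 1, 1.0   # profit for serving a request
        return state, -5.0          # SLA violation penalty
    return state, 0.0               # rejection: no profit, no penalty

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2   # learning rate, discount, exploration

for episode in range(500):
    s = 0
    for _ in range(5):  # five arriving requests per episode
        if random.random() < eps:
            a = random.choice(ACTIONS)            # explore
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])  # exploit
        s2, r = step(s, a)
        # Standard Q-learning update toward the bootstrapped target.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# Greedy policy: admit while below capacity, reject at capacity.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
```

The paper's setting additionally models consumer categories and provider QoS conditions as an MDP; this sketch keeps only the admit/reject core to show the mechanism.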

1 citation


Cites background from "Foundations to frontiers of big dat..."

  • ...The handling of arrival of requests from different categories of cloud consumers uses the representation of handling of requests through Markov Decision Process (MDP) [25] to check for the likelihood of meeting different cloud providers QoS conditions for different cloud consumer requests....


  • ...Q-learning [14] provides an alternate mechanism for the RL technique to enhance the utilization of MDP for different blocking probabilities by treating the requests for VM instances a state space....


References
Journal Article
TL;DR: Big data, the authors write, is far more powerful than the analytics of the past, and executives can measure and therefore manage more precisely than ever before, and make better predictions and smarter decisions.
Abstract: Big data, the authors write, is far more powerful than the analytics of the past. Executives can measure and therefore manage more precisely than ever before. They can make better predictions and smarter decisions. They can target more-effective interventions in areas that so far have been dominated by gut and intuition rather than by data and rigor. The differences between big data and analytics are a matter of volume, velocity, and variety: More data now cross the internet every second than were stored in the entire internet 20 years ago. Nearly real-time information makes it possible for a company to be much more agile than its competitors. And that information can come from social networks, images, sensors, the web, or other unstructured sources. The managerial challenges, however, are very real. Senior decision makers have to learn to ask the right questions and embrace evidence-based decision making. Organizations must hire scientists who can find patterns in very large data sets and translate them into useful business information. IT departments have to work hard to integrate all the relevant internal and external sources of data. The authors offer two success stories to illustrate how companies are using big data: PASSUR Aerospace enables airlines to match their actual and estimated arrival times. Sears Holdings directly analyzes its incoming store data to make promotions much more precise and faster.

3,616 citations

Journal ArticleDOI
TL;DR: Evidence is found that the effect of DDD on productivity does not appear to be due to reverse causality, providing some of the first large-scale data on the direct connection between data-driven decision making and firm performance.
Abstract: We examine whether firms that emphasize decision making based on data and business analytics (“data driven decision making” or DDD) show higher performance. Using detailed survey data on the business practices and information technology investments of 179 large publicly traded firms, we find that firms that adopt DDD have output and productivity that is 5-6% higher than what would be expected given their other investments and information technology usage. Furthermore, the relationship between DDD and performance also appears in other performance measures such as asset utilization, return on equity and market value. Using instrumental variables methods, we find evidence that the effect of DDD on productivity does not appear to be due to reverse causality. Our results provide some of the first large scale data on the direct connection between data-driven decision making and firm performance.

542 citations

Journal ArticleDOI
TL;DR: Analysis of how labor market factors have shaped early returns on big data investment using a new data source---the LinkedIn skills database--- underscores the importance of geography, corporate investment, and skill acquisition channels for explaining productivity growth differences during the spread of new information technology innovations.
Abstract: This paper considers how labor market factors have shaped early returns to investment in big data technologies. It tests the hypothesis that returns to early investments in Hadoop — a key big data infrastructure technology — have been concentrated in select labor markets due to the importance of aggregate corporate investment levels within a labor market for producing a supply of complementary technical skills during the early stages of technology diffusion. The analysis uses a new data source — the LinkedIn skills database — enabling direct measurement of firms’ investments into emerging technical skills such as Hadoop, Map/Reduce, and Apache Pig. Productivity estimates indicate that from 2006 to 2011, firms’ Hadoop investments were associated with 3% faster productivity growth, but only for firms a) with significant existing data assets and b) in labor networks characterized by significant aggregate Hadoop investment. Evidence for the importance of labor market concentration disappears for investments in mature data technologies, such as SQL-based databases, for which the skills are diffused and readily available through universities and other channels. These findings underscore the importance of geography, corporate investment, and channels for technical skill acquisition for explaining differences in productivity growth rates across labor markets during the spread of new IT innovations.

252 citations

Book
04 Feb 2014
TL;DR: Big Data at Work, as discussed by the authors, covers all the bases: what big data means from a technical, consumer, and management perspective; what its opportunities and costs are; where it can have real business impact; and which aspects of this hot topic have been oversold.
Abstract: Go ahead, be skeptical about big data. The author was, at first. When the term big data first came on the scene, bestselling author Tom Davenport (Competing on Analytics, Analytics at Work) thought it was just another example of technology hype. But his research in the years that followed changed his mind. Now, in clear, conversational language, Davenport explains what big data means, and why everyone in business needs to know about it. Big Data at Work covers all the bases: what big data means from a technical, consumer, and management perspective; what its opportunities and costs are; where it can have real business impact; and which aspects of this hot topic have been oversold. This book will help you understand: why big data is important to you and your organization; what technology you need to manage it; how big data could change your job, your company, and your industry; how to hire, rent, or develop the kinds of people who make big data work; the key success factors in implementing any big data project; and how big data is leading to a new approach to managing analytics. With dozens of company examples, including UPS, GE, Amazon, United Healthcare, Citigroup, and many others, this book will help you seize all opportunities, from improving decisions, products, and services to strengthening customer relationships. It will show you how to put big data to work in your own organization so that you too can harness the power of this ever-evolving new resource.

247 citations


"Foundations to frontiers of big dat..." refers background in this paper

  • ...been undertaken through the lens of big data investment [4] impact on company productivity, e....


  • ...successful in investing in big data Hadooplike infrastructures [4]....


Journal ArticleDOI
TL;DR: The authors analyzed how labor market factors have shaped early returns on big data investment using a new data source, the LinkedIn skills database, which enables firm-level measurement of the employment of workers with technical skills.
Abstract: This paper analyzes how labor market factors have shaped early returns on big data investment using a new data source---the LinkedIn skills database. The data source enables firm-level measurement of the employment of workers with technical skills such as Hadoop, MapReduce, and Apache Pig. From 2006 to 2011, Hadoop investments were associated with 3% faster productivity growth, but only for firms (a) with significant data assets and (b) in labor markets where similar investments by other firms helped to facilitate the development of a cadre of workers with complementary technical skills. The benefits of labor market concentration decline for investments in mature data technologies, such as Structured Query Language-based databases, for which the complementary skills can be acquired by workers through universities or other channels. These findings underscore the importance of geography, corporate investment, and skill acquisition channels for explaining productivity growth differences during the spread of new information technology innovations. This paper was accepted by Alok Gupta, special issue on business analytics.

219 citations

Frequently Asked Questions (1)
Q1. What are the contributions in this paper?

A case study using Spark is given as an example, and the paper highlights the importance of cloud computing in the Big Data paradigm.