Author

Rohan Arora

Bio: Rohan Arora is an academic researcher. He has contributed to research in topics including ATM cards and cloud testing, has an h-index of 3, and has co-authored 4 publications receiving 116 citations.

Papers
Journal ArticleDOI
TL;DR: A comparison of Hadoop MapReduce and the recently introduced Apache Spark, both of which provide a processing model for analyzing big data; their performance varies significantly based on the use case under implementation.
Abstract: Data has long been a topic of fascination for computer science enthusiasts around the world, and has gained even more prominence in recent times with the continuous explosion of data from the likes of social media and the quest of tech giants to gain deeper analysis of their data. This paper compares Hadoop MapReduce and the recently introduced Apache Spark, both of which provide a processing model for analyzing big data. Although both options are based on the concept of Big Data, their performance varies significantly based on the use case under implementation. This is what makes the two worthy of analysis with respect to their variability and variety in the dynamic field of Big Data. In this paper we compare the two frameworks and provide a performance analysis using a standard machine learning algorithm for clustering (K-Means).
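The workload behind this comparison can be sketched without either framework. The toy below (pure Python, illustrative names, 1-D data) runs one K-Means iteration in map/reduce style: the "map" phase tags each point with its nearest center, the "reduce" phase averages per cluster. The performance gap the abstract alludes to comes from the loop at the bottom: Spark keeps the dataset in memory across iterations, while Hadoop MapReduce re-reads it from disk on every pass.

```python
import random

def closest(point, centers):
    # index of the nearest center (squared distance; 1-D points here)
    return min(range(len(centers)), key=lambda i: (point - centers[i]) ** 2)

def kmeans_step(points, centers):
    # one K-Means iteration in map/reduce style:
    # map: emit (cluster_id, (point, 1)); reduce: sum per key, then average
    sums = {i: [0.0, 0] for i in range(len(centers))}
    for p in points:                      # "map" phase
        i = closest(p, centers)
        sums[i][0] += p                   # "reduce" phase (combined here)
        sums[i][1] += 1
    return [s / n if n else centers[i] for i, (s, n) in sums.items()]

random.seed(0)
data = [random.gauss(0, 1) for _ in range(100)] + [random.gauss(10, 1) for _ in range(100)]
centers = [0.5, 9.5]
for _ in range(10):                       # Spark would keep `data` cached in memory here;
    centers = kmeans_step(data, centers)  # Hadoop MapReduce re-reads it from disk each pass
print(sorted(round(c, 1) for c in centers))
```

The centers converge to roughly 0 and 10, the means of the two synthetic clusters.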

124 citations

Proceedings ArticleDOI
31 Dec 2012
TL;DR: This paper focuses on cloud-based management of projects by means of software as a service and its various utilities, making the process less complex and highly feasible while benefiting from the robustness of cloud services.
Abstract: Project management has long been an arduous job for individuals as well as companies. Yet with the dynamic architecture of cloud service facilities, it can be managed easily with respect to availability and cost efficiency. This paper focuses on cloud-based management of projects by means of software as a service and its various utilities. Project management software is used to ensure efficient usage of resources, and it is best implemented with the help of a cloud service. This would in turn reduce the effort needed by parties to meet the software requirements of a project from different third parties, providing them instead from one source to make the process less complex and highly feasible, while benefiting from the robustness of cloud services.

5 citations

Proceedings ArticleDOI
11 Apr 2013
TL;DR: This paper emphasizes the hypothetical, yet very possible, scenario of an individual's ATM card falling into the wrong hands and the PIN being cracked by a theft-perpetrating entity.
Abstract: ATM security has always been one of the most prominent concerns for daily users and infrequent users alike. This paper emphasizes the hypothetical, yet very possible, scenario of an individual's ATM card falling into the wrong hands and the PIN being cracked by a theft-perpetrating entity. Our proposed model monitors certain factors from the initiation to the end of the respective transaction. With the help of these factors, we declare the status of the transaction before proceeding with cash withdrawal. Such monitoring gives the transaction a secure approach to bank upon.

5 citations

Journal ArticleDOI
TL;DR: The elitist strategy of ACO (the elitist ant system) is applied to solve the SDPSP in a parameter-constrained environment, taking the project's cost and duration into consideration; experimental results show that the proposed ACOSDPSP methodology is promising in achieving the desired results.
Abstract: Resource allocation and the assignment of tasks to software development teams are crucial and arduous activities that can affect a project's cost and completion time. The solution to such a problem is NP-hard, so software managers need to be supported with efficient tools that can perform such allocation and resolve the software development project scheduling problem (SDPSP) more efficiently. Ant colony optimization (ACO) is a rapidly evolving meta-heuristic technique based on the real-life behavior of ants and can be used to solve the NP-hard SDPSP. Different versions of the ACO meta-heuristic have been applied to the software project scheduling problem in the past, taking various resources into account. We apply the elitist strategy of ACO (the elitist ant system) to solve the SDPSP in a parameter-constrained environment, taking the project's cost and duration into consideration. The ACOSDPSP methodology allows software project managers and schedulers to assign the most effective set of employees to minimize the cost and duration of the software project. Experimental results show that the proposed ACOSDPSP methodology is promising in achieving the desired results. General Terms: Simulation; Data dictionary/directory; Data warehouse and repository; Logging recovery; Security, integrity and protection
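The elitist ant system named above can be sketched on a toy instance. The code below is not the paper's ACOSDPSP implementation; it is a minimal illustration with an assumed 3-task, 3-employee cost matrix. Ants build task-to-employee assignments sampled by pheromone divided by cost, pheromone evaporates and is deposited each iteration, and the elitist step adds extra pheromone along the best-so-far solution.

```python
import random

random.seed(1)
# Hypothetical cost[task][employee]: person-days if that employee does that task
cost = [[3, 6, 4],
        [5, 2, 6],
        [4, 5, 2]]
n = 3
tau = [[1.0] * n for _ in range(n)]   # pheromone on (task, employee) pairs

def build():
    # one ant: assign every task to an employee, sampled by pheromone/cost
    assign = []
    for t in range(n):
        weights = [tau[t][e] / cost[t][e] for e in range(n)]
        assign.append(random.choices(range(n), weights=weights)[0])
    return assign, sum(cost[t][assign[t]] for t in range(n))

best, best_cost = build()
for _ in range(50):
    solutions = [build() for _ in range(10)]
    it_best = min(solutions, key=lambda s: s[1])
    if it_best[1] < best_cost:
        best, best_cost = it_best
    for t in range(n):                 # evaporation
        for e in range(n):
            tau[t][e] *= 0.9
    for s, c in solutions:             # deposit: cheaper solutions deposit more
        for t in range(n):
            tau[t][s[t]] += 1.0 / c
    for t in range(n):                 # elitist step: reinforce the best-so-far
        tau[t][best[t]] += 2.0 / best_cost

print(best, best_cost)
```

On this instance the unique optimum assigns each task to its cheapest employee, for a total cost of 7.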

2 citations


Cited by
Journal ArticleDOI
TL;DR: To provide relevant solutions for improving public health, healthcare providers are required to be fully equipped with appropriate infrastructure to systematically generate and analyze big data.
Abstract: ‘Big data’ is massive amounts of information that can work wonders. It has become a topic of special interest for the past two decades because of the great potential hidden in it. Various public and private sector industries generate, store, and analyze big data with an aim to improve the services they provide. In the healthcare industry, sources of big data include hospital records, medical records of patients, results of medical examinations, and devices that are part of the internet of things. Biomedical research also generates a significant portion of big data relevant to public healthcare. This data requires proper management and analysis in order to derive meaningful information; otherwise, seeking a solution by analyzing big data quickly becomes comparable to finding a needle in a haystack. There are various challenges associated with each step of handling big data, which can only be surpassed by using high-end computing solutions for big data analysis. That is why, to provide relevant solutions for improving public health, healthcare providers are required to be fully equipped with appropriate infrastructure to systematically generate and analyze big data. Efficient management, analysis, and interpretation of big data can change the game by opening new avenues for modern healthcare. That is exactly why various industries, including the healthcare industry, are taking vigorous steps to convert this potential into better services and financial advantages. With a strong integration of biomedical and healthcare data, modern healthcare organizations can possibly revolutionize medical therapies and personalized medicine.

615 citations

01 Jan 1981
TL;DR: In this article, the authors provide an overview of economic analysis techniques and their applicability to software engineering and management, including the major estimation techniques available, the state of the art in algorithmic cost models, and the outstanding research issues in software cost estimation.
Abstract: This paper summarizes the current state of the art and recent trends in software engineering economics. It provides an overview of economic analysis techniques and their applicability to software engineering and management. It surveys the field of software cost estimation, including the major estimation techniques available, the state of the art in algorithmic cost models, and the outstanding research issues in software cost estimation.
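The "algorithmic cost models" this survey refers to can be made concrete with Basic COCOMO, the best-known such model (Boehm, 1981): effort and schedule are power functions of delivered size in KLOC, with published coefficients per project mode. The sketch below uses those standard constants; the 32-KLOC example project is invented for illustration.

```python
# Basic COCOMO (Boehm, 1981): effort and schedule from size in KLOC.
# (a, b, c, d) per project mode; these are the standard published values.
MODES = {
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}

def cocomo(kloc, mode="organic"):
    a, b, c, d = MODES[mode]
    effort = a * kloc ** b          # person-months
    schedule = c * effort ** d      # calendar months
    return effort, schedule

effort, schedule = cocomo(32, "organic")
print(f"{effort:.1f} person-months over {schedule:.1f} months")
```

For a 32-KLOC organic-mode project this estimates roughly 91 person-months spread over about 14 calendar months, which is the kind of first-cut figure the survey's cost models are meant to produce.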

283 citations

Journal ArticleDOI
TL;DR: This review shows what Apache Spark has for designing and implementing big data algorithms and pipelines for machine learning, graph analysis and stream processing and highlights some research and development directions on Apache Spark for big data analytics.
Abstract: Apache Spark has emerged as the de facto framework for big data analytics with its advanced in-memory programming model and upper-level libraries for scalable machine learning, graph analysis, streaming and structured data processing. It is a general-purpose cluster computing framework with language-integrated APIs in Scala, Java, Python and R. As a rapidly evolving open source project, with an increasing number of contributors from both academia and industry, it is difficult for researchers to comprehend the full body of development and research behind Apache Spark, especially those who are beginners in this area. In this paper, we present a technical review on big data analytics using Apache Spark. This review focuses on the key components, abstractions and features of Apache Spark. More specifically, it shows what Apache Spark has for designing and implementing big data algorithms and pipelines for machine learning, graph analysis and stream processing. In addition, we highlight some research and development directions on Apache Spark for big data analytics.
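The in-memory programming model this review centers on rests on two ideas: transformations are recorded lazily and only evaluated when an action is called, and intermediate results can be cached for reuse. The toy class below mimics that shape in plain Python; it is not the Spark API, just an illustration of the abstraction (`ToyRDD` and its methods are invented names).

```python
class ToyRDD:
    """A minimal stand-in for Spark's RDD abstraction: transformations are
    recorded lazily and only run when an action (collect) is called."""
    def __init__(self, source):
        self._source = source          # callable returning an iterator
        self._cache = None

    def map(self, f):                  # transformation: nothing runs yet
        return ToyRDD(lambda: (f(x) for x in self._iter()))

    def filter(self, p):               # transformation: nothing runs yet
        return ToyRDD(lambda: (x for x in self._iter() if p(x)))

    def cache(self):
        self._cache = list(self._iter())   # materialize once, reuse in memory
        return self

    def _iter(self):
        return iter(self._cache) if self._cache is not None else self._source()

    def collect(self):                 # action: triggers actual evaluation
        return list(self._iter())

nums = ToyRDD(lambda: iter(range(10)))
evens = nums.filter(lambda x: x % 2 == 0).map(lambda x: x * x).cache()
print(evens.collect())   # [0, 4, 16, 36, 64]
```

Calling `collect` a second time reuses the cached list instead of re-running the generator chain, which is the behavior that makes iterative workloads (like the machine learning pipelines the review discusses) fast on Spark.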

241 citations

Journal ArticleDOI
TL;DR: This paper shows how to exploit the most recent technological tools and advances in Statistical Learning Theory (SLT) in order to efficiently build an Extreme Learning Machine (ELM) and assess the resultant model's performance when applied to big social data analysis.
Abstract: The science of opinion analysis based on data from social networks and other forms of mass media has garnered the interest of the scientific community and the business world. Dealing with the increasing amount of information present on the Web is a critical task and requires efficient models developed by the emerging field of sentiment analysis. To this end, current research proposes an efficient approach to support emotion recognition and polarity detection in natural language text. In this paper, we show how to exploit the most recent technological tools and advances in Statistical Learning Theory (SLT) in order to efficiently build an Extreme Learning Machine (ELM) and assess the resultant model's performance when applied to big social data analysis. ELM represents a powerful learning tool, developed to overcome some issues in back-propagation networks. The main problem with ELM is in training them to work in the event of a large number of available samples, where the generalization performance has to be carefully assessed. For this reason, we propose an ELM implementation that exploits the Spark distributed in memory technology and show how to take advantage of the most recent advances in SLT in order to address the issue of selecting ELM hyperparameters that give the best generalization performance.
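The ELM itself is simple enough to sketch: input weights and biases are drawn at random and frozen, and only the output weights are solved in closed form by regularized least squares, with no back-propagation. The single-node numpy toy below shows that core idea on an invented regression task; the paper's actual contribution, the Spark-distributed implementation and SLT-based hyperparameter selection, is not reproduced here.

```python
import numpy as np

# Extreme Learning Machine sketch: random hidden layer, output weights
# solved in closed form by ridge-regularized least squares.
rng = np.random.default_rng(0)

def elm_fit(X, y, hidden=50, reg=1e-3):
    W = rng.normal(size=(X.shape[1], hidden))   # random input weights (frozen)
    b = rng.normal(size=hidden)                 # random biases (frozen)
    H = np.tanh(X @ W + b)                      # hidden-layer activations
    # closed-form output weights: (H'H + reg*I) beta = H'y
    beta = np.linalg.solve(H.T @ H + reg * np.eye(hidden), H.T @ y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0])                             # toy regression target
W, b, beta = elm_fit(X, y)
pred = elm_predict(X, W, b, beta)
print(float(np.mean((pred - y) ** 2)))          # small training MSE
```

Because training reduces to one linear solve, scaling ELM to many samples is mostly a matter of distributing the matrix products, which is what motivates the Spark implementation described in the abstract.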

101 citations

Proceedings ArticleDOI
01 Dec 2017
TL;DR: This contribution explores the expanding body of the Apache Spark MLlib 2.0 as an open-source, distributed, scalable, and platform independent machine learning library, and performs several real world machine learning experiments to examine the qualitative and quantitative attributes of the platform.
Abstract: Artificial intelligence, and particularly machine learning, has been used in many ways by the research community to turn a variety of diverse and even heterogeneous data sources into high-quality facts and knowledge, providing premier capabilities for accurate pattern discovery. However, applying machine learning strategies to big and complex datasets is computationally expensive, and it consumes a very large amount of logical and physical resources, such as data file space, CPU, and memory. A sophisticated platform for efficient big data analytics is becoming more important these days, as the amount of data generated on a daily basis exceeds a quintillion bytes. Apache Spark MLlib is one of the most prominent platforms for big data analysis, offering a set of excellent functionalities for different machine learning tasks ranging from regression, classification, and dimension reduction to clustering and rule extraction. In this contribution, we explore, from the computational perspective, the expanding body of Apache Spark MLlib 2.0 as an open-source, distributed, scalable, and platform-independent machine learning library. Specifically, we perform several real-world machine learning experiments to examine the qualitative and quantitative attributes of the platform. Furthermore, we highlight current trends in big data machine learning research and provide insights for future work.

66 citations