
Showing papers in "ACM Computing Surveys in 2017"


Journal ArticleDOI
TL;DR: This survey revisits feature selection research from a data perspective and reviews representative feature selection algorithms for conventional data, structured data, heterogeneous data, and streaming data, and categorizes them into four main groups: similarity-based, information-theoretical-based, sparse-learning-based, and statistical-based.
Abstract: Feature selection, as a data preprocessing strategy, has been proven to be effective and efficient in preparing data (especially high-dimensional data) for various data-mining and machine-learning problems. The objectives of feature selection include building simpler and more comprehensible models, improving data-mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented some substantial challenges and opportunities to feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data. Methodologically, to emphasize the differences and similarities of most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity-based, information-theoretical-based, sparse-learning-based, and statistical-based methods. To facilitate and promote the research in this community, we also present an open source feature selection repository that consists of most of the popular feature selection algorithms (http://featureselection.asu.edu/). Also, we use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we present a discussion about some open problems and challenges that require more attention in future research.
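The four method families named in the abstract can be made concrete with a toy information-theoretical filter: rank features by their empirical mutual information with the class label. This is an illustrative sketch, not code from the survey or its repository; the data and feature names are invented.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

# Toy data: feature A mirrors the label, feature B is constant noise.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
feat_a = [0, 0, 1, 1, 0, 1, 0, 1]   # perfectly informative
feat_b = [1, 1, 1, 1, 1, 1, 1, 1]   # carries no information

scores = {"A": mutual_information(feat_a, labels),
          "B": mutual_information(feat_b, labels)}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # informative feature first
```

A real filter method would compute such scores for thousands of features and keep the top-k; the other families (similarity-based, sparse-learning-based, statistical-based) differ mainly in the scoring or selection criterion.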

1,566 citations


Journal ArticleDOI
TL;DR: This article surveys imitation learning methods and presents design options in different steps of the learning process, and extensively discusses combining imitation learning approaches using different sources and methods, as well as incorporating other motion learning methods to enhance imitation.
Abstract: Imitation learning techniques aim to mimic human behavior in a given task. An agent (a learning machine) is trained to perform a task from demonstrations by learning a mapping between observations and actions. The idea of teaching by imitation has been around for many years; however, the field is gaining attention recently due to advances in computing and sensing as well as rising demand for intelligent applications. The paradigm of learning by imitation is gaining popularity because it facilitates teaching complex tasks with minimal expert knowledge of the tasks. Generic imitation learning methods could potentially reduce the problem of teaching a task to that of providing demonstrations, without the need for explicit programming or designing reward functions specific to the task. Modern sensors are able to collect and transmit high volumes of data rapidly, and processors with high computational power allow fast processing that maps the sensory data to actions in a timely manner. This opens the door for many potential AI applications that require real-time perception and reaction such as humanoid robots, self-driving vehicles, human-computer interaction, and computer games, to name a few. However, specialized algorithms are needed to effectively and robustly learn models as learning by imitation poses its own set of challenges. In this article, we survey imitation learning methods and present design options in different steps of the learning process. We introduce a background and motivation for the field as well as highlight challenges specific to the imitation problem. Methods for designing and evaluating imitation learning tasks are categorized and reviewed. Special attention is given to learning methods in robotics and games as these domains are the most popular in the literature and provide a wide array of problems and methodologies. We extensively discuss combining imitation learning approaches using different sources and methods, as well as incorporating other motion learning methods to enhance imitation. We also discuss the potential impact on industry, present major applications, and highlight current and future research directions.

535 citations


Journal ArticleDOI
TL;DR: There is an urgent need to develop intelligent methods for effective and efficient malware detection from the real and large daily sample collection and a comprehensive investigation on both the feature extraction and the classification/clustering techniques is provided.
Abstract: In the Internet age, malware (such as viruses, trojans, ransomware, and bots) has posed serious and evolving security threats to Internet users. To protect legitimate users from these threats, anti-malware software products from different companies, including Comodo, Kaspersky, Kingsoft, and Symantec, provide the major defense against malware. Unfortunately, driven by economic benefits, the number of new malware samples has increased explosively: anti-malware vendors are now confronted with millions of potential malware samples per year. To keep combating this increase in malware samples, there is an urgent need to develop intelligent methods for effective and efficient malware detection from the real and large daily sample collection. In this article, we first provide a brief overview of malware as well as the anti-malware industry, and present the industrial needs for malware detection. We then survey intelligent malware detection methods. In these methods, the process of detection is usually divided into two stages: feature extraction and classification/clustering. The performance of such intelligent malware detection approaches critically depends on the extracted features and the methods for classification/clustering. We provide a comprehensive investigation of both the feature extraction and the classification/clustering techniques. We also discuss additional issues and challenges of malware detection using data mining techniques and finally forecast the trends of malware development.

443 citations


Journal ArticleDOI
TL;DR: This work proposes a taxonomy for data stream ensemble learning as derived from reviewing over 60 algorithms, and important aspects such as combination, diversity, and dynamic updates are thoroughly discussed.
Abstract: Ensemble-based methods are among the most widely used techniques for data stream classification. Their popularity is attributable to their good performance in comparison to strong single learners while being relatively easy to deploy in real-world applications. Ensemble algorithms are especially useful for data stream learning as they can be integrated with drift detection algorithms and incorporate dynamic updates, such as selective removal or addition of classifiers. This work proposes a taxonomy for data stream ensemble learning as derived from reviewing over 60 algorithms. Important aspects such as combination, diversity, and dynamic updates are thoroughly discussed. Additional contributions include a listing of popular open-source tools and a discussion about current data stream research challenges and how they relate to ensemble learning (big data streams, concept evolution, feature drifts, temporal dependencies, and others).
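The "dynamic updates" the abstract mentions can be sketched with a toy weighted-majority ensemble in which members that keep erring are down-weighted and eventually removed. The experts, data, and thresholds below are invented for illustration; this is not an algorithm from the survey's taxonomy, just a minimal example of the mechanism.

```python
class WeightedMajorityEnsemble:
    """Toy streaming ensemble: down-weight mistaken members, drop weak ones."""
    def __init__(self, experts, beta=0.5, drop_below=0.1):
        self.experts = list(experts)          # callables: x -> 0/1
        self.weights = [1.0] * len(self.experts)
        self.beta, self.drop_below = beta, drop_below

    def predict(self, x):
        vote = sum(w for e, w in zip(self.experts, self.weights) if e(x) == 1)
        return 1 if vote >= sum(self.weights) / 2 else 0

    def update(self, x, y):
        for i, e in enumerate(self.experts):
            if e(x) != y:                     # penalise mistakes
                self.weights[i] *= self.beta
        # dynamic update: selectively remove members that fell behind
        keep = [i for i, w in enumerate(self.weights) if w >= self.drop_below]
        self.experts = [self.experts[i] for i in keep]
        self.weights = [self.weights[i] for i in keep]

# Stream whose concept is "x > 5"; one expert is right, one is always wrong.
ens = WeightedMajorityEnsemble([lambda x: int(x > 5), lambda x: 0])
for x in range(20):
    ens.update(x, int(x > 5))
print(len(ens.experts))  # 1: the always-wrong expert was dropped
```

Production stream learners combine this kind of weighting with explicit drift detectors and the addition of freshly trained classifiers, which the toy omits.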

395 citations


Journal ArticleDOI
TL;DR: A general overview of the requirements and system architectures of disaster management systems is presented, and state-of-the-art data-driven techniques that have been applied to improving situation awareness as well as to addressing users’ information needs in disaster management are summarized.
Abstract: Improving disaster management and recovery techniques is a national priority, given the huge toll caused by man-made and natural calamities. Data-driven disaster management aims at applying advanced data collection and analysis technologies to achieve more effective and responsive disaster management, and has undergone considerable progress in the last decade. However, to the best of our knowledge, there is currently no work that both summarizes recent progress and suggests future directions for this emerging research area. To remedy this situation, we provide a systematic treatment of the recent developments in data-driven disaster management. Specifically, we first present a general overview of the requirements and system architectures of disaster management systems and then summarize state-of-the-art data-driven techniques that have been applied to improving situation awareness as well as to addressing users’ information needs in disaster management. We also discuss and categorize general data-mining and machine-learning techniques in disaster management. Finally, we recommend several research directions for further investigation.

364 citations


Journal ArticleDOI
TL;DR: This survey draws up a systematic inventory of approaches to customizable process modeling and provides a comparative evaluation with the aim of identifying common and differentiating modeling features, providing criteria for selecting among multiple approaches, and identifying gaps in the state of the art.
Abstract: It is common for organizations to maintain multiple variants of a given business process, such as multiple sales processes for different products or multiple bookkeeping processes for different countries. Conventional business process modeling languages do not explicitly support the representation of such families of process variants. This gap triggered significant research efforts over the past decade, leading to an array of approaches to business process variability modeling. In general, each of these approaches extends a conventional process modeling language with constructs to capture customizable process models. A customizable process model represents a family of process variants in a way that a model of each variant can be derived by adding or deleting fragments according to customization options or according to a domain model. This survey draws up a systematic inventory of approaches to customizable process modeling and provides a comparative evaluation with the aim of identifying common and differentiating modeling features, providing criteria for selecting among multiple approaches, and identifying gaps in the state of the art. The survey puts into evidence an abundance of customizable process-modeling languages, which contrasts with a relative scarcity of available tool support and empirical comparative evaluations.

358 citations


Journal ArticleDOI
TL;DR: This survey describes several inspiring use case scenarios of Fog computing, identifies major functionalities that ideal Fog computing platforms should support along with open challenges toward implementing them, and sheds light on future research directions for realizing Fog computing in building sustainable smart cities.
Abstract: The Internet of Things (IoT) aims to connect billions of smart objects to the Internet, which can bring a promising future to smart cities. These objects are expected to generate large amounts of data and send the data to the cloud for further processing, especially for knowledge discovery, so that appropriate actions can be taken. However, in reality, sensing all possible data items captured by a smart object and then sending the complete captured data to the cloud is of limited use, and such an approach would also lead to resource wastage (e.g., network and storage). The Fog (Edge) computing paradigm has been proposed to counter this weakness by pushing knowledge-discovery processes, via data analytics, to the network edge. However, edge devices have limited computational capabilities. Due to their inherent strengths and weaknesses, neither the Cloud computing nor the Fog computing paradigm addresses these challenges alone. Therefore, both paradigms need to work together in order to build a sustainable IoT infrastructure for smart cities. In this article, we review existing approaches that have been proposed to tackle the challenges in the Fog computing domain. Specifically, we describe several inspiring use case scenarios of Fog computing, identify ten key characteristics and common features of Fog computing, and compare more than 30 existing research efforts in this domain. Based on our review, we further identify several major functionalities that ideal Fog computing platforms should support and a number of open challenges toward implementing them, to shed light on future research directions on realizing Fog computing for building sustainable smart cities.

341 citations


Journal ArticleDOI
TL;DR: This work presents the state-of-the-art methods and proposes the following contributions: a taxonomy of sentiment analysis; a survey on polarity classification methods and resources, especially those related to emotion mining; a complete survey on emotion theories and emotion-mining research; and some useful resources, including lexicons and datasets.
Abstract: Sentiment analysis from text consists of extracting information about opinions, sentiments, and even emotions conveyed by writers towards topics of interest. It is often equated with opinion mining, but it should also encompass emotion mining. Opinion mining involves the use of natural language processing and machine learning to determine the attitude of a writer towards a subject. Emotion mining uses similar technologies but is concerned with detecting and classifying writers’ emotions toward events or topics. Textual emotion-mining methods have various applications, including gaining information about customer satisfaction, helping to select teaching materials in e-learning, recommending products based on users’ emotions, and even predicting mental-health disorders. In existing surveys on sentiment analysis, which are often old or incomplete, the strong link between opinion mining and emotion mining is understated. This motivates the need for a new perspective on the literature on sentiment analysis, with a focus on emotion mining. We present the state-of-the-art methods and propose the following contributions: (1) a taxonomy of sentiment analysis; (2) a survey on polarity classification methods and resources, especially those related to emotion mining; (3) a complete survey on emotion theories and emotion-mining research; and (4) some useful resources, including lexicons and datasets.
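The polarity-classification side of this taxonomy can be illustrated with a toy lexicon-based scorer; the mini-lexicon and negation handling below are invented stand-ins for real resources such as SentiWordNet, not methods taken from the survey.

```python
# Hypothetical mini-lexicon; real systems use resources such as SentiWordNet.
LEXICON = {"good": 1, "great": 2, "lovely": 1, "bad": -1, "awful": -2}
NEGATORS = {"not", "never", "no"}

def polarity(text):
    """Sum lexicon scores, flipping the sign of the word after a negator."""
    score, negate = 0, False
    for token in text.lower().split():
        word = token.strip(".,!?")
        if word in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(word, 0)
        score += -value if negate else value
        negate = False
    return score

print(polarity("The movie was great"))      # positive
print(polarity("The movie was not good"))   # negation flips polarity
```

Emotion mining extends the same idea from a positive/negative axis to discrete emotion categories (e.g., joy, anger, fear), typically with emotion lexicons or supervised classifiers.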

331 citations


Journal ArticleDOI
TL;DR: This article focuses on research on assisting individuals’ privacy and security choices with soft paternalistic interventions that nudge users toward more beneficial choices and identifies key ethical, design, and research challenges.
Abstract: Advancements in information technology often task users with complex and consequential privacy and security decisions. A growing body of research has investigated individuals’ choices in the presence of privacy and information security tradeoffs, the decision-making hurdles affecting those choices, and ways to mitigate such hurdles. This article provides a multi-disciplinary assessment of the literature pertaining to privacy and security decision making. It focuses on research on assisting individuals’ privacy and security choices with soft paternalistic interventions that nudge users toward more beneficial choices. The article discusses potential benefits of those interventions, highlights their shortcomings, and identifies key ethical, design, and research challenges.

301 citations


Journal ArticleDOI
TL;DR: A meta-analysis of existing presence models is presented and a model of presence informed by Slater’s Place Illusion and Plausibility Illusion constructs is proposed.
Abstract: The presence construct, most commonly defined as the sense of “being there,” has driven research and development of virtual environments (VEs) for decades. Despite that, there is not widespread agreement on how to define or operationalize this construct. The literature contains many different definitions of presence and many proposed measures for it. This article reviews many of the definitions, measures, and models of presence from the literature. We also review several related constructs, including social presence, copresence, immersion, agency, transportation, reality judgment, and embodiment. In addition, we present a meta-analysis of existing presence models and propose a model of presence informed by Slater’s Place Illusion and Plausibility Illusion constructs.

292 citations


Journal ArticleDOI
TL;DR: A comprehensive survey of leading Android malware analysis and detection techniques, and their effectiveness against evolving malware, is presented; systems are categorized by methodology and date to evaluate progression and weaknesses.
Abstract: With the integration of mobile devices into daily life, smartphones are privy to increasing amounts of sensitive information. Sophisticated mobile malware, particularly Android malware, acquires or utilizes such data without user consent. It is therefore essential to devise effective techniques to analyze and detect these threats. This article presents a comprehensive survey on leading Android malware analysis and detection techniques, and their effectiveness against evolving malware. It categorizes systems by methodology and date to evaluate progression and weaknesses. It also discusses evaluations of industry solutions, malware statistics, and malware evasion techniques, and concludes by suggesting future research paths.

Journal ArticleDOI
TL;DR: This paper describes the various aspects of face presentation attacks, including different types of face artifacts, state-of-the-art PAD algorithms and an overview of the respective research labs working in this domain, vulnerability assessments and performance evaluation metrics, the outcomes of competitions, the availability of public databases for benchmarking new PAD algorithms in a reproducible manner, and a summary of the relevant international standardization in this field.
Abstract: The vulnerability of face recognition systems to presentation attacks (also known as direct attacks or spoof attacks) has received a great deal of interest from the biometric community. The rapid evolution of face recognition systems into real-time applications has raised new concerns about their ability to resist presentation attacks, particularly in unattended application scenarios such as automated border control. The goal of a presentation attack is to subvert the face recognition system by presenting a facial biometric artifact. Popular face biometric artifacts include a printed photo, the electronic display of a facial photo, replaying video using an electronic display, and 3D face masks. These have demonstrated a high security risk for state-of-the-art face recognition systems. However, several presentation attack detection (PAD) algorithms (also known as countermeasures or antispoofing methods) have been proposed that can automatically detect and mitigate such targeted attacks. The goal of this survey is to present a systematic overview of the existing work on face presentation attack detection that has been carried out. This paper describes the various aspects of face presentation attacks, including different types of face artifacts, state-of-the-art PAD algorithms and an overview of the respective research labs working in this domain, vulnerability assessments and performance evaluation metrics, the outcomes of competitions, the availability of public databases for benchmarking new PAD algorithms in a reproducible manner, and finally a summary of the relevant international standardization in this field. Furthermore, we discuss the open challenges and future work that need to be addressed in this evolving field of biometrics.

Journal ArticleDOI
TL;DR: Automatic sarcasm detection is the task of predicting sarcasm in text, a crucial step for sentiment analysis given the prevalence and challenges of sarcasm in sentiment-bearing text.
Abstract: Automatic sarcasm detection is the task of predicting sarcasm in text. This is a crucial step to sentiment analysis, considering prevalence and challenges of sarcasm in sentiment-bearing text. Beginning with an approach that used speech-based features, automatic sarcasm detection has witnessed great interest from the sentiment analysis community. This article is a compilation of past work in automatic sarcasm detection. We observe three milestones in the research so far: semi-supervised pattern extraction to identify implicit sentiment, use of hashtag-based supervision, and incorporation of context beyond target text. In this article, we describe datasets, approaches, trends, and issues in sarcasm detection. We also discuss representative performance values, describe shared tasks, and provide pointers to future work, as given in prior works. In terms of resources to understand the state-of-the-art, the survey presents several useful illustrations—most prominently, a table that summarizes past papers along different dimensions such as the types of features, annotation techniques, and datasets used.

Journal ArticleDOI
TL;DR: It is concluded that systems employing 2D views of 3D data typically surpass voxel-based (3D) deep models, which can, however, perform better with more layers and heavy data augmentation; larger-scale datasets and increased resolutions are therefore required.
Abstract: Deep learning has recently gained popularity achieving state-of-the-art performance in tasks involving text, sound, or image processing. Due to its outstanding performance, there have been efforts to apply it in more challenging scenarios, for example, 3D data processing. This article surveys methods applying deep learning on 3D data and provides a classification based on how they exploit them. From the results of the examined works, we conclude that systems employing 2D views of 3D data typically surpass voxel-based (3D) deep models, which however, can perform better with more layers and severe data augmentation. Therefore, larger-scale datasets and increased resolutions are required.

Journal ArticleDOI
TL;DR: An extensive review of the many different works in the field of software vulnerability analysis and discovery that utilize machine-learning and data-mining techniques is provided, discussing both their advantages and shortcomings in this domain.
Abstract: Software security vulnerabilities are one of the critical issues in the realm of computer security. Due to their potential high severity impacts, many different approaches have been proposed in the past decades to mitigate the damages of software vulnerabilities. Machine-learning and data-mining techniques are also among the many approaches to address this issue. In this article, we provide an extensive review of the many different works in the field of software vulnerability analysis and discovery that utilize machine-learning and data-mining techniques. We review different categories of works in this domain, discuss both advantages and shortcomings, and point out challenges and some uncharted territories in the field.

Journal ArticleDOI
TL;DR: An overview of the history of the ESM, usage of this methodology in the computer science discipline, as well as its evolution over time, is provided and important considerations for ESM studies on mobile devices are identified.
Abstract: The Experience Sampling Method (ESM) is used by scientists from various disciplines to gather insights into the intra-psychic elements of human life. Researchers have used the ESM in a wide variety of studies, with the method seeing increased popularity. Mobile technologies have enabled new possibilities for the use of the ESM, while simultaneously leading to new conceptual, methodological, and technological challenges. In this survey, we provide an overview of the history of the ESM, usage of this methodology in the computer science discipline, as well as its evolution over time. Next, we identify and discuss important considerations for ESM studies on mobile devices, and analyse the particular methodological parameters scientists should consider in their study design. We reflect on the existing tools that support the ESM methodology and discuss the future development of such tools. Finally, we discuss the effect of future technological developments on the use of the ESM and identify areas requiring further investigation.

Journal ArticleDOI
TL;DR: In this paper, the authors present a survey of the fundamental graph querying functionalities, such as graph patterns and navigational expressions, which are used in modern graph query languages such as SPARQL, Cypher and Gremlin.
Abstract: We survey foundational features underlying modern graph query languages. We first discuss two popular graph data models: edge-labelled graphs, where nodes are connected by directed, labelled edges, and property graphs, where nodes and edges can further have attributes. Next we discuss the two most fundamental graph querying functionalities: graph patterns and navigational expressions. We start with graph patterns, in which a graph-structured query is matched against the data. Thereafter, we discuss navigational expressions, in which patterns can be matched recursively against the graph to navigate paths of arbitrary length; we give an overview of what kinds of expressions have been proposed and how they can be combined with graph patterns. We also discuss several semantics under which queries using the previous features can be evaluated, what effects the selection of features and semantics has on complexity, and offer examples of such features in three modern languages that are used to query graphs: SPARQL, Cypher, and Gremlin. We conclude by discussing the importance of formalisation for graph query languages; a summary of what is known about SPARQL, Cypher, and Gremlin in terms of expressivity and complexity; and an outline of possible future directions for the area.
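The basic graph-pattern functionality described above, a SPARQL-style basic graph pattern matched against an edge-labelled graph, can be sketched as a small backtracking matcher. The graph, pattern, and variable convention below are invented for illustration and are not taken from any of the surveyed languages' implementations.

```python
# Edge-labelled graph as a set of (source, label, target) triples.
GRAPH = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
}

def match(pattern, graph):
    """Yield all bindings of ?-variables making every pattern triple an edge."""
    def extend(bindings, triples):
        if not triples:
            yield dict(bindings)
            return
        (s, p, o), rest = triples[0], triples[1:]
        for gs, gp, go in graph:
            b, ok = dict(bindings), True
            for q, g in ((s, gs), (p, gp), (o, go)):
                if q.startswith("?"):
                    if b.setdefault(q, g) != g:   # clash with earlier binding
                        ok = False
                elif q != g:                      # constant must match exactly
                    ok = False
            if ok:
                yield from extend(b, rest)
    yield from extend({}, list(pattern))

# Friend-of-a-friend pattern, analogous to a basic SPARQL graph pattern.
results = list(match([("alice", "knows", "?x"), ("?x", "knows", "?y")], GRAPH))
print(results)  # [{'?x': 'bob', '?y': 'carol'}]
```

Navigational expressions go beyond this by matching paths of arbitrary length (e.g., `knows*`), which requires recursion or fixpoint evaluation rather than a fixed set of triples.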

Journal ArticleDOI
TL;DR: The survey explores the relationship between research into mixed criticality systems and other topics such as hard and soft time constraints, fault tolerant scheduling, hierarchical scheduling, cyber physical systems, probabilistic real-time systems, and industrial safety standards.
Abstract: This survey covers research into mixed criticality systems that has been published since Vestal’s seminal paper in 2007, up until the end of 2016. The survey is organised along the lines of the major research areas within this topic. These include single processor analysis (including fixed priority and Earliest Deadline First (EDF) scheduling, shared resources, and static and synchronous scheduling), multiprocessor analysis, realistic models, and systems issues. The survey also explores the relationship between research into mixed criticality systems and other topics such as hard and soft time constraints, fault tolerant scheduling, hierarchical scheduling, cyber physical systems, probabilistic real-time systems, and industrial safety standards.

Journal ArticleDOI
TL;DR: This article provides a comprehensive survey and tutorial of the fundamental aspects of data science: the evolution from data analysis to data science, the data science concepts, a big picture of the era of data science, the major challenges and directions in data innovation, the nature of data analytics, new industrialization and service opportunities in the data economy, the profession and competency of data education, and the future of data science.
Abstract: The 21st century has ushered in the age of big data and data economy, in which data DNA, which carries important knowledge, insights, and potential, has become an intrinsic constituent of all data-based organisms. An appropriate understanding of data DNA and its organisms relies on the new field of data science and its keystone, analytics. Although it is widely debated whether big data is only hype and buzz, and data science is still in a very early phase, significant challenges and opportunities are emerging or have been inspired by the research, innovation, business, profession, and education of data science. This article provides a comprehensive survey and tutorial of the fundamental aspects of data science: the evolution from data analysis to data science, the data science concepts, a big picture of the era of data science, the major challenges and directions in data innovation, the nature of data analytics, new industrialization and service opportunities in the data economy, the profession and competency of data education, and the future of data science. This article is the first in the field to draw a comprehensive big picture, in addition to offering rich observations, lessons, and thinking about data science and analytics.

Journal ArticleDOI
TL;DR: A survey of the metrics used for community detection and evaluation can be found in this paper, where the authors also conduct experiments on synthetic and real networks to present a comparative analysis of these metrics in measuring the goodness of the underlying community structure.
Abstract: Detecting and analyzing dense groups or communities from social and information networks has attracted immense attention over the last decade due to its enormous applicability in different domains. Community detection is an ill-defined problem, as the nature of the communities is not known in advance. The problem has become even more complicated due to the fact that communities emerge in the network in various forms such as disjoint, overlapping, and hierarchical. Various heuristics have been proposed to address these challenges, depending on the application at hand. All these heuristics have been materialized in the form of new metrics, which in most cases are used as optimization functions for detecting the community structure, or provide an indication of the goodness of detected communities during evaluation. Over the last decade, a large number of such metrics have been proposed. Thus, there arises a need for an organized and detailed survey of the metrics proposed for community detection and evaluation. Here, we present a survey of the state-of-the-art metrics used for the detection and the evaluation of community structure. We also conduct experiments on synthetic and real networks to present a comparative analysis of these metrics in measuring the goodness of the underlying community structure.
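One of the most widely used metrics of the kind surveyed here is Newman's modularity, often employed both as an optimization objective and as a goodness measure. The sketch below computes it directly from its definition on an invented toy graph (two triangles joined by a bridge edge); it is illustrative only, not code from the survey's experiments.

```python
def modularity(edges, communities):
    """Newman modularity Q for an undirected graph and a node->community map."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Fraction of edges inside communities...
    q = sum(1.0 for u, v in edges if communities[u] == communities[v]) / m
    # ...minus the fraction expected if edges were placed at random.
    for u in degree:
        for v in degree:
            if communities[u] == communities[v]:
                q -= degree[u] * degree[v] / (4.0 * m * m)
    return q

# Two triangles joined by a single bridge edge: a clear two-community split.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
comms = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(edges, comms), 3))  # ≈ 0.357
```

Putting all six nodes in one community drives Q to zero, which illustrates why modularity rewards splits that concentrate edges inside groups.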

Journal ArticleDOI
TL;DR: This survey not only demonstrates how to employ game-theoretic approaches to security and privacy but also encourages researchers to employ game theory to establish a comprehensive understanding of emerging security and privacy problems in cyberspace and potential solutions.
Abstract: In this survey, we review the existing game-theoretic approaches for cyber security and privacy issues, categorizing their application into two classes, security and privacy. To show how game theory is utilized in cyberspace security and privacy, we select research regarding three main applications: cyber-physical security, communication security, and privacy. We present game models, features, and solutions of the selected works and describe their advantages and limitations from design to implementation of the defense mechanisms. We also identify some emerging trends and topics for future research. This survey not only demonstrates how to employ game-theoretic approaches to security and privacy but also encourages researchers to employ game theory to establish a comprehensive understanding of emerging security and privacy problems in cyberspace and potential solutions.

Journal ArticleDOI
TL;DR: A broad range of CFI mechanisms are compared using a unified nomenclature based on (i) a qualitative discussion of the conceptual security guarantees, (ii) a quantitative security evaluation, and (iii) an empirical evaluation of their performance in the same test environment.
Abstract: Memory corruption errors in C/C++ programs remain the most common source of security vulnerabilities in today’s systems. Control-flow hijacking attacks exploit memory corruption vulnerabilities to divert program execution away from the intended control flow. Researchers have spent more than a decade studying and refining defenses based on Control-Flow Integrity (CFI); this technique is now integrated into several production compilers. However, so far, no study has systematically compared the various proposed CFI mechanisms nor is there any protocol on how to compare such mechanisms. We compare a broad range of CFI mechanisms using a unified nomenclature based on (i) a qualitative discussion of the conceptual security guarantees, (ii) a quantitative security evaluation, and (iii) an empirical evaluation of their performance in the same test environment. For each mechanism, we evaluate (i) protected types of control-flow transfers and (ii) precision of the protection for forward and backward edges. For open-source, compiler-based implementations, we also evaluate (iii) generated equivalence classes and target sets and (iv) runtime performance.
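At its core, forward-edge CFI instruments each indirect call to verify that the target belongs to the call site's equivalence class before transferring control. The toy Python simulation below conveys the check only; real CFI is enforced by compilers on machine-level indirect branches, and the labels here are invented:

```python
ALLOWED = {}  # call-site label -> set of permitted targets

def cfi_register(label, func):
    """Add func to the equivalence class identified by label."""
    ALLOWED.setdefault(label, set()).add(func)
    return func

def cfi_call(label, func, *args):
    """Indirect call guarded by a CFI target-set check."""
    if func not in ALLOWED.get(label, set()):
        raise RuntimeError("CFI violation: control-flow transfer blocked")
    return func(*args)

def handler_ok(x):
    return x + 1

def attacker_gadget(x):
    return "pwned"

cfi_register("int->int", handler_ok)
print(cfi_call("int->int", handler_ok, 41))  # 42
# cfi_call("int->int", attacker_gadget, 0) raises RuntimeError:
# the gadget was never registered for this call site's class.
```

The precision dimension the survey evaluates corresponds to how finely the `ALLOWED` classes partition the set of functions: one coarse class for all targets versus one class per function signature or per call site.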

Journal ArticleDOI
TL;DR: In this paper, the authors survey recent research efforts that use spatio-temporal data from team sports as input and involve non-trivial computation and identify a number of open research questions.
Abstract: Team-based invasion sports such as football, basketball, and hockey are similar in the sense that the players are able to move freely around the playing area and that player and team performance cannot be fully analysed without considering the movements and interactions of all players as a group. State-of-the-art object tracking systems now produce spatio-temporal traces of player trajectories with high definition and high frequency, and this, in turn, has facilitated a variety of research efforts, across many disciplines, to extract insight from the trajectories. We survey recent research efforts that use spatio-temporal data from team sports as input and involve non-trivial computation. This article categorises the research efforts in a coherent framework and identifies a number of open research questions.
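A basic computation on such traces, sketched below with toy data, derives distance covered and peak speed from a fixed-frequency position stream; the 25 Hz sampling rate is an assumption, chosen only because it is typical of optical tracking systems:

```python
from math import hypot

def summarize(trace, hz=25):
    """trace: list of (x, y) positions in metres, sampled at hz frames/s.

    Returns (total distance in metres, peak speed in m/s).
    """
    dists = [hypot(x2 - x1, y2 - y1)
             for (x1, y1), (x2, y2) in zip(trace, trace[1:])]
    total = sum(dists)
    peak_speed = max(dists) * hz if dists else 0.0
    return total, peak_speed

# Four consecutive frames of a single (hypothetical) player:
trace = [(0, 0), (0.2, 0), (0.5, 0.1), (0.9, 0.3)]
total, peak = summarize(trace)
print(round(total, 3), round(peak, 2))
```

Real pipelines add smoothing before differencing, since raw tracking noise inflates both quantities.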

Journal ArticleDOI
TL;DR: Various RL models and algorithms applied to traffic signal control are reviewed with respect to the representation of the RL model (i.e., state, action, and reward), performance measures, and complexity, to establish a foundation for further investigation in this research field.
Abstract: Traffic congestion has become a vexing and complex issue in many urban areas. Of particular interest are the intersections where traffic bottlenecks are known to occur despite being traditionally signalized. Reinforcement learning (RL), which is an artificial intelligence approach, has been adopted in traffic signal control for monitoring and ameliorating traffic congestion. RL enables autonomous decision makers (e.g., traffic signal controllers) to observe, learn, and select the optimal action (e.g., determining the appropriate traffic phase and its timing) to manage traffic such that system performance is improved. This article reviews various RL models and algorithms applied to traffic signal control with respect to the representation of the RL model (i.e., state, action, and reward), performance measures, and complexity, to establish a foundation for further investigation in this research field. Open issues are presented toward the end of this article to discover new research areas with the objective to spark new interest in this research field.
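The state/action/reward formulation can be made concrete with a deliberately tiny sketch: a single intersection where the state records which approach has the longer queue, the action selects the phase to serve, and the reward is the negative total queue length. The environment dynamics and all parameters below are invented for illustration, not drawn from any surveyed system:

```python
import random

random.seed(0)
actions = ("serve_NS", "serve_EW")
alpha, gamma, eps = 0.1, 0.9, 0.1  # learning rate, discount, exploration
Q = {}  # (state, action) -> estimated value

def state(queues):
    return "NS_longer" if queues["NS"] >= queues["EW"] else "EW_longer"

def step(queues, action):
    served = "NS" if action == "serve_NS" else "EW"
    queues[served] = max(0, queues[served] - 3)  # green discharges up to 3 cars
    for approach in queues:
        queues[approach] += 1                    # one arrival per approach
    return -(queues["NS"] + queues["EW"])        # reward: negative total queue

queues = {"NS": 5, "EW": 5}
s = state(queues)
for _ in range(2000):
    # epsilon-greedy action selection
    a = (random.choice(actions) if random.random() < eps
         else max(actions, key=lambda x: Q.get((s, x), 0.0)))
    r = step(queues, a)
    s2 = state(queues)
    # standard Q-learning update
    best_next = max(Q.get((s2, x), 0.0) for x in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    s = s2

policy = {st: max(actions, key=lambda x: Q.get((st, x), 0.0))
          for st in ("NS_longer", "EW_longer")}
print(policy)
```

The surveyed works differ mainly in how much richer the state (queue lengths, waiting times, neighbouring intersections) and action spaces are, which is exactly the complexity axis the review examines.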

Journal ArticleDOI
TL;DR: This article provides a basic introduction to ABAC and a comprehensive review of recent research efforts toward developing formal models of ABAC, including a taxonomy of ABAC research that is presented and used to categorize and evaluate surveyed articles.
Abstract: Attribute-based access control (ABAC) is a promising alternative to traditional models of access control (i.e., discretionary access control (DAC), mandatory access control (MAC), and role-based access control (RBAC)) that is drawing attention in both recent academic literature and industry application. However, formalization of a foundational model of ABAC and large-scale adoption are still in their infancy. The relatively recent emergence of ABAC still leaves a number of problems unexplored. Issues like delegation, administration, auditability, scalability, hierarchical representations, and the like have been largely ignored or left to future work. This article provides a basic introduction to ABAC and a comprehensive review of recent research efforts toward developing formal models of ABAC. A taxonomy of ABAC research is presented and used to categorize and evaluate surveyed articles. Open problems are identified based on the shortcomings of the reviewed works and potential solutions discussed.
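The core idea distinguishing ABAC from RBAC can be shown in a few lines: access is decided by predicates over subject, object, and environment attributes rather than by a role assignment. The rules, attribute names, and values below are hypothetical:

```python
def can_access(subject, obj, env, rules):
    """Grant access if any policy rule's predicate holds."""
    return any(rule(subject, obj, env) for rule in rules)

rules = [
    # A physician may read records in their own department during duty hours.
    lambda s, o, e: (s["role"] == "physician"
                     and s["department"] == o["department"]
                     and 8 <= e["hour"] < 18),
    # Record owners may always read their own record.
    lambda s, o, e: s["id"] == o["owner"],
]

alice = {"id": "alice", "role": "physician", "department": "cardiology"}
record = {"owner": "bob", "department": "cardiology"}
print(can_access(alice, record, {"hour": 10}, rules))  # True
print(can_access(alice, record, {"hour": 22}, rules))  # False
```

Note how the environment attribute (`hour`) changes the decision without any change to the subject, which is precisely what DAC, MAC, and RBAC struggle to express directly.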

Journal ArticleDOI
TL;DR: The reliability and fault-tolerance paradigms suggested for WBANs are investigated thoroughly, and emerging trends in these areas are discussed.
Abstract: The Wireless Body Area Network (WBAN) has been a key element of e-health for monitoring the human body. This technology enables new applications across different domains, including the medical field, entertainment, and ambient intelligence. This survey paper places substantial emphasis on the concept and key features of the WBAN technology. First, the WBAN concept is introduced and a review of key applications facilitated by this networking technology is provided. The study then explores a wide variety of communication standards and methods deployed in this technology. Due to the sensitivity and criticality of the data carried and handled by WBAN, fault tolerance is a critical issue and widely discussed in this paper. Hence, this survey thoroughly investigates the reliability and fault-tolerance paradigms suggested for WBANs. Open research and challenging issues pertaining to fault tolerance, coexistence and interference management, and power consumption are also discussed, along with some suggested trends in these aspects.

Journal ArticleDOI
TL;DR: In this article, the state of the art in software platforms for smart cities is surveyed and a reference architecture is derived to guide the development of next-generation software platforms in Smart Cities.
Abstract: Information and communication technologies (ICT) can be instrumental in progressing towards smarter city environments, which improve city services, sustainability, and citizens’ quality of life. Smart City software platforms can support the development and integration of Smart City applications. However, the ICT community must overcome current technological and scientific challenges before these platforms can be widely adopted. This article surveys the state of the art in software platforms for Smart Cities. We analyzed 23 projects concerning the most used enabling technologies, as well as functional and non-functional requirements, classifying them into four categories: Cyber-Physical Systems, Internet of Things, Big Data, and Cloud Computing. Based on these results, we derived a reference architecture to guide the development of next-generation software platforms for Smart Cities. Finally, we enumerated the most frequently cited open research challenges and discussed future opportunities. This survey provides important references to help application developers, city managers, system operators, end-users, and Smart City researchers make project, investment, and research decisions.

Journal ArticleDOI
TL;DR: An extensive performance analysis is performed on a corpus of 1,000 authors to investigate authorship attribution, verification, and clustering using 14 algorithms from the literature.
Abstract: The analysis of authorial style, termed stylometry, assumes that style is quantifiably measurable for evaluation of distinctive qualities. Stylometry research has yielded several methods and tools over the past 200 years to handle a variety of challenging cases. This survey reviews several articles within five prominent subtasks: authorship attribution, authorship verification, authorship profiling, stylochronometry, and adversarial stylometry. Discussions on datasets, features, experimental techniques, and recent approaches are provided. Further, a current research challenge lies in the inability of authorship analysis techniques to scale to a large number of authors with few text samples. Here, we perform an extensive performance analysis on a corpus of 1,000 authors to investigate authorship attribution, verification, and clustering using 14 algorithms from the literature. Finally, several remaining research challenges are discussed, along with descriptions of various open-source and commercial software that may be useful for stylometry subtasks.
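A profile-based attribution baseline of the kind benchmarked in such comparisons builds a character n-gram frequency profile per author and attributes an unknown text to the nearest profile by cosine similarity. The tiny candidate texts are toy data, not the 1,000-author corpus used in the survey's experiments:

```python
from collections import Counter
from math import sqrt

def profile(text, n=3):
    """Character n-gram frequency profile of a text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(p, q):
    dot = sum(p[g] * q.get(g, 0) for g in p)
    return dot / (sqrt(sum(v * v for v in p.values())) *
                  sqrt(sum(v * v for v in q.values())))

def attribute(unknown, candidates):
    """Return the candidate author whose profile is closest."""
    scores = {a: cosine(profile(unknown), profile(t))
              for a, t in candidates.items()}
    return max(scores, key=scores.get)

candidates = {
    "A": "it is a truth universally acknowledged that a single man",
    "B": "the whale the whale to the last i grapple with thee",
}
print(attribute("a truth universally acknowledged by a single woman",
                candidates))  # A
```

Character n-grams are a popular stylometric feature precisely because they capture sub-word habits (function words, punctuation, morphology) that survive topic shifts; the scalability challenge the survey highlights arises when thousands of such profiles must be discriminated from few samples each.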

Journal ArticleDOI
TL;DR: This work addresses that gap by detailing how SSE’s underlying structures are designed and how they give rise to the many properties of an SSE scheme, and by presenting recent state-of-the-art advances in SSE.
Abstract: Searchable Symmetric Encryption (SSE), when deployed in the cloud, allows one to query encrypted data without the risk of data leakage. Despite the widespread interest, existing surveys do not examine in detail how SSE’s underlying structures are designed and how these give rise to the many properties of an SSE scheme. This is the gap we seek to address, as well as presenting recent state-of-the-art advances in SSE. Specifically, we present a general framework and believe the discussions may lead to insights for potential new designs. We draw a few observations. First, most schemes use an index table, where optimal index size and sublinear search time can be achieved using an inverted index. Straightforward updating can only be achieved using a direct index, but search time would then be linear. A recent trend is the combination of an index table and a tree, deployed for efficient updating and storage. Second, mechanisms from related fields such as Oblivious RAM (ORAM) have been integrated to reduce leakages. However, using these mechanisms to minimise leakages in schemes with richer functionalities (e.g., ranked, range) is relatively unexplored. Third, a new approach (e.g., multiple servers) is required to mitigate new and emerging attacks on leakage. Lastly, we observe that a proposed index may not be practically efficient when implemented, where I/O access must be taken into consideration.
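The inverted-index idea behind many SSE schemes can be sketched briefly: the server stores keyword-token → encrypted posting lists, so a search touches only the matching list (sublinear in the corpus) while learning only a deterministic token. The HMAC tokens and XOR "encryption" below are illustrative stand-ins, not a secure construction:

```python
import hashlib
import hmac

KEY = b"client-secret-key"  # hypothetical client key

def token(word):
    """Deterministic search token: the server never sees the keyword."""
    return hmac.new(KEY, word.encode(), hashlib.sha256).hexdigest()

def enc(doc_id, i):
    """Toy stand-in for symmetric encryption of the i-th posting."""
    pad = hashlib.sha256(KEY + str(i).encode()).digest()
    return bytes(a ^ b for a, b in zip(doc_id.encode(), pad))

def dec(ct, i):
    pad = hashlib.sha256(KEY + str(i).encode()).digest()
    return bytes(a ^ b for a, b in zip(ct, pad)).decode()

def build_index(docs):
    """Inverted index: keyword token -> encrypted list of doc ids."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.split()):
            lst = index.setdefault(token(word), [])
            lst.append(enc(doc_id, len(lst)))
    return index

def search(index, word):
    lst = index.get(token(word), [])
    return [dec(ct, i) for i, ct in enumerate(lst)]

docs = {"d1": "cloud encrypted search", "d2": "encrypted index",
        "d3": "cloud storage"}
idx = build_index(docs)
print(sorted(search(idx, "encrypted")))  # ['d1', 'd2']
```

Even in this toy form, the characteristic leakage is visible: the server learns the access pattern (which list, which length) for each query, which is exactly the leakage ORAM-based refinements try to suppress.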

Journal ArticleDOI
TL;DR: A systematic classification of approaches that increase system resilience in the presence of functional hardware (HW)-induced errors is presented, dealing with higher system abstractions, such as the (micro)architecture, the mapping, and platform software (SW).
Abstract: Nanoscale technology nodes bring reliability concerns back to the center stage of digital system design. A systematic classification of approaches that increase system resilience in the presence of functional hardware (HW)-induced errors is presented, dealing with higher system abstractions, such as the (micro)architecture, the mapping, and platform software (SW). The field is surveyed in a systematic way based on nonoverlapping categories, which add insight into the ongoing work by exposing similarities and differences. HW and SW solutions are discussed in a similar fashion so that interrelationships become apparent. The presented categories are illustrated by representative examples from the literature to highlight their properties. Moreover, it is demonstrated how hybrid schemes can be decomposed into their primitive components.