
Showing papers in "ACM Computing Surveys in 2020"


Journal ArticleDOI
TL;DR: A thorough survey to fully understand Few-shot Learning (FSL), which categorizes FSL methods from three perspectives: data, which uses prior knowledge to augment the supervised experience; model, which uses prior knowledge to reduce the size of the hypothesis space; and algorithm, which uses prior knowledge to alter the search for the best hypothesis in the given hypothesis space.
Abstract: Machine learning has been highly successful in data-intensive applications but is often hampered when the data set is small. Recently, Few-shot Learning (FSL) is proposed to tackle this problem. Using prior knowledge, FSL can rapidly generalize to new tasks containing only a few samples with supervised information. In this article, we conduct a thorough survey to fully understand FSL. Starting from a formal definition of FSL, we distinguish FSL from several relevant machine learning problems. We then point out that the core issue in FSL is that the empirical risk minimizer is unreliable. Based on how prior knowledge can be used to handle this core issue, we categorize FSL methods from three perspectives: (i) data, which uses prior knowledge to augment the supervised experience; (ii) model, which uses prior knowledge to reduce the size of the hypothesis space; and (iii) algorithm, which uses prior knowledge to alter the search for the best hypothesis in the given hypothesis space. With this taxonomy, we review and discuss the pros and cons of each category. Promising directions, in the aspects of the FSL problem setups, techniques, applications, and theories, are also proposed to provide insights for future research.1
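The core issue the survey identifies, that the empirical risk minimizer (ERM) is unreliable with few samples, can be illustrated with a toy sketch (all function names and numbers are ours, not the survey's): for squared loss over constant hypotheses, the ERM is the sample mean, and its error grows sharply as the training set shrinks.

```python
import random

def empirical_risk_minimizer(samples):
    """For squared loss over constant hypotheses, the ERM is the sample mean."""
    return sum(samples) / len(samples)

def estimation_error(n, true_mean=0.0, trials=200, seed=0):
    """Average |ERM - true mean| over many draws of an n-sample training set."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        total += abs(empirical_risk_minimizer(samples) - true_mean)
    return total / trials

few_shot_error = estimation_error(n=5)     # few-shot regime
big_data_error = estimation_error(n=500)   # data-rich regime
print(few_shot_error > big_data_error)     # the ERM is far less reliable with few samples
```

The error scales roughly as 1/sqrt(n), which is exactly why prior knowledge (data, model, or algorithm) is needed to compensate in the few-shot regime.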

1,129 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a survey of methods that can detect fake news from four perspectives: the false knowledge it carries, its writing style, its propagation patterns, and the credibility of its source.
Abstract: The explosive growth in fake news and its erosion to democracy, justice, and public trust has increased the demand for fake news detection and intervention. This survey reviews and evaluates methods that can detect fake news from four perspectives: the false knowledge it carries, its writing style, its propagation patterns, and the credibility of its source. The survey also highlights some potential research tasks based on the review. In particular, we identify and detail related fundamental theories across various disciplines to encourage interdisciplinary research on fake news. It is our hope that this survey can facilitate collaborative efforts among experts in computer and information sciences, social sciences, political science, and journalism to research fake news, where such efforts can lead to fake news detection that is not only efficient but, more importantly, explainable.

372 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide an extensive overview of the current state-of-the-art in the field by outlining the challenges and opportunities of distributed machine learning over conventional (centralized) machine learning.
Abstract: The demand for artificial intelligence has grown significantly over the past decade, and this growth has been fueled by advances in machine learning techniques and the ability to leverage hardware acceleration. However, to increase the quality of predictions and render machine learning solutions feasible for more complex applications, a substantial amount of training data is required. Although small machine learning models can be trained with modest amounts of data, the input for training larger models such as neural networks grows exponentially with the number of parameters. Since the demand for processing training data has outpaced the increase in computation power of computing machinery, there is a need for distributing the machine learning workload across multiple machines, and turning the centralized into a distributed system. These distributed systems present new challenges: first and foremost, the efficient parallelization of the training process and the creation of a coherent model. This article provides an extensive overview of the current state-of-the-art in the field by outlining the challenges and opportunities of distributed machine learning over conventional (centralized) machine learning, discussing the techniques used for distributed machine learning, and providing an overview of the systems that are available.
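The parallelization challenge the article surveys can be sketched in its simplest synchronous data-parallel form (a toy one-parameter model; names are ours): each worker computes a gradient on its own data shard, and a central step averages the gradients before updating the shared model.

```python
def gradient(w, batch):
    """Gradient of mean squared error for a one-parameter model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_step(w, shards, lr=0.1):
    """Each 'worker' computes a gradient on its shard; the step averages them."""
    grads = [gradient(w, shard) for shard in shards]  # run in parallel in practice
    return w - lr * sum(grads) / len(grads)           # synchronous averaging

# Data drawn from y = 3x, split across two workers.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w = 0.0
for _ in range(100):
    w = data_parallel_step(w, shards)
print(round(w, 3))  # converges toward 3.0
```

Real systems replace the averaging line with an all-reduce or a parameter server, which is where the coherence and efficiency challenges discussed above arise.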

358 citations


Journal ArticleDOI
TL;DR: This article gives an architecture overview of popular IoT-blockchain systems by analyzing their network structures and protocols, discusses various consensus protocols for IoT blockchains, and compares different consensus algorithms.
Abstract: Blockchain technology can be extensively applied in diverse services, including online micro-payments, supply chain tracking, digital forensics, health-care record sharing, and insurance payments. Extending the technology to the Internet of things (IoT), we can obtain a verifiable and traceable IoT network. Emerging research in IoT applications exploits blockchain technology to record transaction data, optimize current system performance, or construct next-generation systems, which can provide additional security, automatic transaction management, decentralized platforms, offline-to-online data verification, and so on. In this article, we conduct a systematic survey of the key components of IoT blockchain and examine a number of popular blockchain applications. In particular, we first give an architecture overview of popular IoT-blockchain systems by analyzing their network structures and protocols. Then, we discuss variant consensus protocols for IoT blockchains, and make comparisons among different consensus algorithms. Finally, we analyze the traffic model for P2P and blockchain systems and provide several metrics. We also provide a suitable traffic model for IoT-blockchain systems to illustrate network traffic distribution.

217 citations


Journal ArticleDOI
TL;DR: This survey provides a comprehensive and structured review of both traditional and frontier methods in learning causal effects and relations along with the connections between causality and machine learning.
Abstract: This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from—or the same as—the traditional one? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods in learning causality and relations along with the connections between causality and machine learning. This work points out on a case-by-case basis how big data facilitates, complicates, or motivates each approach.
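A classic confounding example shows why copious correlational data alone does not settle causal questions (numbers adapted from the well-known kidney-stone illustration of Simpson's paradox; they are not from this survey): a treatment can win in every subgroup yet lose on the pooled data when a confounder drives treatment assignment.

```python
def rate(patients, recoveries):
    return recoveries / patients

# Hypothetical (patients, recoveries) counts; stone severity confounds treatment choice.
small = {"A": (87, 81),  "B": (270, 234)}   # A: ~93%  >  B: ~87%
large = {"A": (263, 192), "B": (80, 55)}    # A: ~73%  >  B: ~69%

overall_a = rate(87 + 263, 81 + 192)        # ~78%
overall_b = rate(270 + 80, 234 + 55)        # ~83%

print(rate(*small["A"]) > rate(*small["B"]))  # True: A better on small stones
print(rate(*large["A"]) > rate(*large["B"]))  # True: A better on large stones
print(overall_a > overall_b)                  # False: the association reverses when pooled
```

No amount of additional observational data of the same kind fixes this; causal knowledge (here, that severity influences both treatment and outcome) is what resolves it.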

207 citations


Journal ArticleDOI
TL;DR: This work systematize three aspects of Ethereum systems security: vulnerabilities, attacks, and defenses, and draws insights into vulnerability root causes, attack consequences, and defense capabilities, which shed light on future research directions.
Abstract: Blockchain technology is believed by many to be a game changer in many application domains. While the first generation of blockchain technology (i.e., Blockchain 1.0) is almost exclusively used for cryptocurrency, the second generation (i.e., Blockchain 2.0), as represented by Ethereum, is an open and decentralized platform enabling a new paradigm of computing—Decentralized Applications (DApps) running on top of blockchains. The rich applications and semantics of DApps inevitably introduce many security vulnerabilities, which have no counterparts in pure cryptocurrency systems like Bitcoin. Since Ethereum is a new, yet complex, system, it is imperative to have a systematic and comprehensive understanding on its security from a holistic perspective, which was previously unavailable in the literature. To the best of our knowledge, the present survey, which can also be used as a tutorial, fills this void. We systematize three aspects of Ethereum systems security: vulnerabilities, attacks, and defenses. We draw insights into vulnerability root causes, attack consequences, and defense capabilities, which shed light on future research directions.
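One vulnerability class with no counterpart in pure cryptocurrency systems is reentrancy, exploited in the infamous DAO attack. Its control flow can be modeled in plain Python (a language-agnostic sketch, not Ethereum code; class and function names are ours): an external call made before the balance update lets the callee re-enter `withdraw` and drain funds.

```python
class VulnerableBank:
    """Toy model of a contract whose withdraw() pays out *before* updating state."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def withdraw(self, account, receive):
        amount = self.balances[account]
        if amount > 0:
            receive(amount)              # external call happens first...
            self.balances[account] = 0   # ...state is updated only afterwards

bank = VulnerableBank({"attacker": 10, "victim": 90})
stolen = []

def malicious_receive(amount):
    stolen.append(amount)
    if len(stolen) < 3:                  # re-enter before the balance is zeroed
        bank.withdraw("attacker", malicious_receive)

bank.withdraw("attacker", malicious_receive)
print(sum(stolen))  # 30 drained from a 10-unit balance
```

The standard defense mirrors the fix here: update state before making the external call (the checks-effects-interactions pattern).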

204 citations


Journal ArticleDOI
TL;DR: Deep convolutional networks–based super-resolution is a fast-growing field with numerous practical applications; this exposition extensively compares more than 30 state-of-the-art super-resolution CNNs.
Abstract: Deep convolutional networks–based super-resolution is a fast-growing field with numerous practical applications. In this exposition, we extensively compare more than 30 state-of-the-art super-resolution Convolutional Neural Networks (CNNs) over three classical and three recently introduced challenging datasets to benchmark single image super-resolution. We introduce a taxonomy for deep learning–based super-resolution networks that groups existing methods into nine categories including linear, residual, multi-branch, recursive, progressive, attention-based, and adversarial designs. We also provide comparisons between the models in terms of network complexity, memory footprint, model input and output, learning details, the type of network losses, and important architectural differences (e.g., depth, skip-connections, filters). The extensive evaluation performed shows the consistent and rapid growth in the accuracy in the past few years along with a corresponding boost in model complexity and the availability of large-scale datasets. It is also observed that the pioneering methods identified as the benchmarks have been significantly outperformed by the current contenders. Despite the progress in recent years, we identify several shortcomings of existing techniques and provide future research directions towards the solution of these open problems. Datasets and codes for evaluation are publicly available at https://github.com/saeed-anwar/SRsurvey.
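Benchmarks like these are typically reported in PSNR (an assumption on our part, since the abstract does not name its metrics; PSNR is the standard fidelity measure in the super-resolution literature). A minimal computation over flat pixel lists:

```python
import math

def psnr(reference, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio between two equally sized images (flat pixel lists)."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

ground_truth = [52.0, 55.0, 61.0, 59.0]
upscaled     = [50.0, 57.0, 60.0, 58.0]
print(round(psnr(ground_truth, upscaled), 2))  # ~44.15 dB
```

Higher is better; the "consistent and rapid growth in accuracy" above is usually measured as PSNR/SSIM gains on the benchmark datasets.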

162 citations


Journal ArticleDOI
TL;DR: A survey of current research on the Service Placement Problem (SPP) in Fog/Edge Computing is presented; a categorization of current proposals is given, and identified issues and challenges are discussed.
Abstract: To support the large and varied applications generated by the Internet of Things (IoT), Fog Computing was introduced to complement Cloud Computing and offer Cloud-like services at the edge of the network with low latency and real-time responses. The large scale, geographical distribution, and heterogeneity of edge computational nodes make service placement in such infrastructure a challenging issue. The diversity of user expectations and the characteristics of IoT devices also complicate the deployment problem. This article presents a survey of current research on the Service Placement Problem (SPP) in Fog/Edge Computing. Based on a new classification scheme, a categorization of current proposals is given, and identified issues and challenges are discussed.

159 citations


Journal ArticleDOI
TL;DR: This survey addresses the challenges in DL-based Android malware detection and classification by systematically reviewing the latest progress and organizing the literature according to the DL architecture used, including FCN, CNN, RNN, DBN, AE, and hybrid models.
Abstract: Deep Learning (DL) is a disruptive technology that has changed the landscape of cyber security research. Deep learning models have many advantages over traditional Machine Learning (ML) models, particularly when there is a large amount of data available. Android malware detection or classification qualifies as a big data problem because of the fast booming number of Android malware, the obfuscation of Android malware, and the potential protection of huge values of data assets stored on the Android devices. It seems a natural choice to apply DL on Android malware detection. However, there exist challenges for researchers and practitioners, such as choice of DL architecture, feature extraction and processing, performance evaluation, and even gathering adequate data of high quality. In this survey, we aim to address the challenges by systematically reviewing the latest progress in DL-based Android malware detection and classification. We organize the literature according to the DL architecture, including FCN, CNN, RNN, DBN, AE, and hybrid models. The goal is to reveal the research frontier, with the focus on representing code semantics for Android malware detection. We also discuss the challenges in this emerging field and provide our view of future research opportunities and directions.

151 citations


Journal ArticleDOI
TL;DR: This study aims to address research into the applications of the blockchain healthcare area by discussing the management of medical information, as well as the sharing of medical records, image sharing, and log management, and summarizes the methods used in healthcare per application area.
Abstract: Blockchain technology has been gaining visibility owing to its ability to enhance the security, reliability, and robustness of distributed systems. Several areas have benefited from research based on this technology, such as finance, remote sensing, data analysis, and healthcare. Data immutability, privacy, transparency, decentralization, and distributed ledgers are the main features that make blockchain an attractive technology. However, healthcare records that contain confidential patient data make this system very complicated because there is a risk of a privacy breach. This study aims to address research into the applications of the blockchain healthcare area. It sets out by discussing the management of medical information, as well as the sharing of medical records, image sharing, and log management. We also discuss papers that intersect with other areas, such as the Internet of Things, the management of information, tracking of drugs along their supply chain, and aspects of security and privacy. As we are aware that there are other surveys of blockchain in healthcare, we analyze and compare both the positive and negative aspects of their papers. Finally, we seek to examine the concepts of blockchain in the medical area, by assessing their benefits and drawbacks and thus giving guidance to other researchers in the area. Additionally, we summarize the methods used in healthcare per application area and show their pros and cons.

134 citations


Journal ArticleDOI
TL;DR: A survey of stance detection in social media posts and (online) regular texts is presented, in the hope that it will act as a significant resource on this newly emerging topic for interested researchers and practitioners.
Abstract: Automatic elicitation of semantic information from natural language texts is an important research problem with many practical application areas. Especially after the recent proliferation of online content through channels such as social media sites, news portals, and forums; solutions to problems such as sentiment analysis, sarcasm/controversy/veracity/rumour/fake news detection, and argument mining gained increasing impact and significance, revealed with large volumes of related scientific publications. In this article, we tackle an important problem from the same family and present a survey of stance detection in social media posts and (online) regular texts. Although stance detection is defined in different ways in different application settings, the most common definition is “automatic classification of the stance of the producer of a piece of text, towards a target, into one of these three classes: {Favor, Against, Neither}.” Our survey includes definitions of related problems and concepts, classifications of the proposed approaches so far, descriptions of the relevant datasets and tools, and related outstanding issues. Stance detection is a recent natural language processing topic with diverse application areas, and our survey article on this newly emerging topic will act as a significant resource for interested researchers and practitioners.
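The {Favor, Against, Neither} formulation quoted above can be made concrete with a deliberately naive lexicon-based baseline (the cue lists and names are illustrative only; the approaches surveyed use supervised and neural models):

```python
FAVOR_CUES = {"support", "agree", "great", "benefit"}
AGAINST_CUES = {"oppose", "against", "harmful", "ban"}

def classify_stance(text):
    """Toy lexicon-based stance classifier: Favor / Against / Neither."""
    tokens = set(text.lower().split())
    favor = len(tokens & FAVOR_CUES)
    against = len(tokens & AGAINST_CUES)
    if favor > against:
        return "Favor"
    if against > favor:
        return "Against"
    return "Neither"

print(classify_stance("I fully support this policy"))  # Favor
print(classify_stance("we must ban this proposal"))    # Against
print(classify_stance("the vote is on tuesday"))       # Neither
```

Note the target is fixed implicitly here; real stance detection classifies stance *towards a given target*, which is what makes it harder than plain sentiment analysis.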

Journal ArticleDOI
TL;DR: This survey provides an end-to-end view of ER workflows for Big Data, critically reviews the pros and cons of existing methods, and concludes with the main open research directions.
Abstract: One of the most critical tasks for improving data quality and increasing the reliability of data analytics is Entity Resolution (ER), which aims to identify different descriptions that refer to the same real-world entity. Despite several decades of research, ER remains a challenging problem. In this survey, we highlight the novel aspects of resolving Big Data entities when we should satisfy more than one of the Big Data characteristics simultaneously (i.e., Volume and Velocity with Variety). We present the basic concepts, processing steps, and execution strategies that have been proposed by database, semantic Web, and machine learning communities in order to cope with the loose structuredness, extreme diversity, high speed, and large scale of entity descriptions used by real-world applications. We provide an end-to-end view of ER workflows for Big Data, critically review the pros and cons of existing methods, and conclude with the main open research directions.

Journal ArticleDOI
TL;DR: This article presents a systematic and comprehensive review of virtualization techniques explicitly designed for IoT networks, classifying the literature into software-defined networks designed for IoT, function virtualization for IoT networks, and software-defined IoT networks.
Abstract: Internet of Things (IoT) and Network Softwarization are fast becoming core technologies of information systems and network management for the next-generation Internet. The deployment and applications of IoT range from smart cities to urban computing and from ubiquitous healthcare to tactile Internet. For this reason, the physical infrastructure of heterogeneous network systems has become more complicated and thus requires efficient and dynamic solutions for management, configuration, and flow scheduling. Network softwarization in the form of Software Defined Networks and Network Function Virtualization has been extensively researched for IoT in the recent past. In this article, we present a systematic and comprehensive review of virtualization techniques explicitly designed for IoT networks. We have classified the literature into software-defined networks designed for IoT, function virtualization for IoT networks, and software-defined IoT networks. These categories are further divided into works that present architectural, security, and management solutions. Besides, the article highlights several short-term and long-term research challenges and open issues related to the adoption of software-defined Internet of Things.

Journal ArticleDOI
TL;DR: This study explores existing studies of AI-based cyber attacks and maps them onto a proposed framework, providing insight into new threats, and explains how to apply this framework to analyze AI-based cyber attacks in a hypothetical scenario of a critical smart grid infrastructure.
Abstract: Recent advancements in artificial intelligence (AI) technologies have induced tremendous growth in innovation and automation. Although these AI technologies offer significant benefits, they can be used maliciously. Highly targeted and evasive attacks in benign carrier applications, such as DeepLocker, have demonstrated the intentional use of AI for harmful purposes. Threat actors are constantly changing and improving their attack strategy with particular emphasis on the application of AI-driven techniques in the attack process, called AI-based cyber attack, which can be used in conjunction with conventional attack techniques to cause greater damage. Despite several studies on AI and security, researchers have not summarized AI-based cyber attacks enough to be able to understand the adversary’s actions and to develop proper defenses against such attacks. This study aims to explore existing studies of AI-based cyber attacks and to map them onto a proposed framework, providing insight into new threats. Our framework includes the classification of several aspects of malicious uses of AI during the cyber attack life cycle and provides a basis for their detection to predict future threats. We also explain how to apply this framework to analyze AI-based cyber attacks in a hypothetical scenario of a critical smart grid infrastructure.

Journal ArticleDOI
TL;DR: A thorough review of the state-of-the-art of recommender systems that leverage multimedia content is presented, by classifying the reviewed papers with respect to their media type, the techniques employed to extract and represent their content features, and the recommendation algorithm.
Abstract: Recommender systems have become a popular and effective means to manage the ever-increasing amount of multimedia content available today and to help users discover interesting new items. Today’s recommender systems suggest items of various media types, including audio, text, visual (images), and videos. In fact, scientific research related to the analysis of multimedia content has made possible effective content-based recommender systems capable of suggesting items based on an analysis of the features extracted from the item itself. The aim of this survey is to present a thorough review of the state-of-the-art of recommender systems that leverage multimedia content, by classifying the reviewed papers with respect to their media type, the techniques employed to extract and represent their content features, and the recommendation algorithm. Moreover, for each media type, we discuss various domains in which multimedia content plays a key role in human decision-making and is therefore considered in the recommendation process. Examples of the identified domains include fashion, tourism, food, media streaming, and e-commerce.
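Content-based recommendation over extracted features, as described above, commonly reduces to ranking catalog items by feature-vector similarity to a user profile. A minimal sketch (the audio features and catalog entries are hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def recommend(profile, catalog, k=2):
    """Rank catalog items by similarity of their content features to the user profile."""
    ranked = sorted(catalog.items(), key=lambda kv: cosine(profile, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical extracted audio features, e.g. (tempo, energy, acousticness).
catalog = {
    "track_a": [0.9, 0.8, 0.1],
    "track_b": [0.2, 0.1, 0.9],
    "track_c": [0.85, 0.9, 0.1],
}
user_profile = [0.85, 0.85, 0.15]  # aggregated from items the user liked
print(recommend(user_profile, catalog))
```

The interesting part in practice is the feature extractor that produces these vectors per media type, which is exactly what the survey classifies.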

Journal ArticleDOI
Abstract: Machine Learning (ML) and Internet of Things (IoT) are complementary advances: ML techniques unlock the potential of IoT with intelligence, and IoT applications increasingly feed data collected by sensors into ML models, using the results to improve their business processes and services. Hence, orchestrating ML pipelines that encompass model training and inference within the holistic development lifecycle of an IoT application often leads to complex system integration. This article provides a comprehensive and systematic survey of the development lifecycle of ML-based IoT applications. We outline the core roadmap and taxonomy, and subsequently assess and compare existing standard techniques used at individual stages.

Journal ArticleDOI
TL;DR: In this article, the authors review a large number of relevant works under two different but related frameworks, Blocking and Filtering, providing a comprehensive list of the relevant works and discussing them in the greater context.
Abstract: Entity Resolution (ER), a core task of Data Integration, detects different entity profiles that correspond to the same real-world object. Due to its inherently quadratic complexity, a series of techniques accelerate it so that it scales to voluminous data. In this survey, we review a large number of relevant works under two different but related frameworks: Blocking and Filtering. The former restricts comparisons to entity pairs that are more likely to match, while the latter identifies quickly entity pairs that are likely to satisfy predetermined similarity thresholds. We also elaborate on hybrid approaches that combine different characteristics. For each framework we provide a comprehensive list of the relevant works, discussing them in the greater context. We conclude with the most promising directions for future work in the field.
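The two frameworks can be sketched minimally (the profile data and threshold are illustrative, not from the survey): token blocking proposes only pairs of profiles that share a token, and a Jaccard filter then keeps the pairs that meet a predetermined similarity threshold.

```python
from collections import defaultdict
from itertools import combinations

def token_blocking(profiles):
    """Group profile ids by shared tokens; only ids in the same block get compared."""
    blocks = defaultdict(set)
    for pid, text in profiles.items():
        for token in text.lower().split():
            blocks[token].add(pid)
    return blocks

def jaccard(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def candidate_pairs(profiles, threshold=0.5):
    """Blocking proposes pairs; filtering keeps those above the similarity threshold."""
    candidates = set()
    for ids in token_blocking(profiles).values():
        candidates.update(combinations(sorted(ids), 2))
    return {p for p in candidates if jaccard(profiles[p[0]], profiles[p[1]]) >= threshold}

profiles = {
    1: "john smith london",
    2: "john smith london uk",
    3: "alice jones paris",
}
print(candidate_pairs(profiles))  # only (1, 2) survives
```

Blocking turns the quadratic all-pairs comparison into comparisons within blocks only; filtering then prunes the surviving pairs cheaply before expensive matching runs.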

Journal ArticleDOI
TL;DR: The main purpose of the article is to introduce scientific and practical audiences to the intricacies of DLT designs and to support development of viable applications on DLT.
Abstract: When developing peer-to-peer applications on distributed ledger technology (DLT), a crucial decision is the selection of a suitable DLT design (e.g., Ethereum), because it is hard to change the underlying DLT design post hoc. To facilitate the selection of suitable DLT designs, we review DLT characteristics and identify trade-offs between them. Furthermore, we assess how DLT designs account for these trade-offs and we develop archetypes for DLT designs that cater to specific requirements of applications on DLT. The main purpose of our article is to introduce scientific and practical audiences to the intricacies of DLT designs and to support development of viable applications on DLT.

Journal ArticleDOI
TL;DR: The authors present a survey on multilingual neural machine translation (MNMT), which has gained a lot of traction in recent years as many approaches have been proposed to exploit multilingual parallel corpora for improving translation quality.
Abstract: We present a survey on multilingual neural machine translation (MNMT), which has gained a lot of traction in recent years. MNMT has been useful in improving translation quality as a result of translation knowledge transfer (transfer learning). MNMT is more promising and interesting than its statistical machine translation counterpart, because end-to-end modeling and distributed representations open new avenues for research on machine translation. Many approaches have been proposed to exploit multilingual parallel corpora for improving translation quality. However, the lack of a comprehensive survey makes it difficult to determine which approaches are promising and, hence, deserve further exploration. In this article, we present an in-depth survey of existing literature on MNMT. We first categorize various approaches based on their central use-case and then further categorize them based on resource scenarios, underlying modeling principles, core-issues, and challenges. Wherever possible, we address the strengths and weaknesses of several techniques by comparing them with each other. We also discuss the future directions for MNMT. This article is aimed towards both beginners and experts in NMT. We hope this article will serve as a starting point as well as a source of new ideas for researchers and engineers interested in MNMT.

Journal ArticleDOI
TL;DR: This survey performs a broad and thorough investigation on challenges, techniques and tools for scalable DL on distributed infrastructures, and highlights future research trends in DL systems that deserve further research.
Abstract: Deep Learning (DL) has had an immense success in the recent past, leading to state-of-the-art results in various domains, such as image recognition and natural language processing. One of the reasons for this success is the increasing size of DL models and the proliferation of vast amounts of training data being available. To keep on improving the performance of DL, increasing the scalability of DL systems is necessary. In this survey, we perform a broad and thorough investigation on challenges, techniques and tools for scalable DL on distributed infrastructures. This incorporates infrastructures for DL, methods for parallel DL training, multi-tenant resource scheduling, and the management of training and model data. Further, we analyze and compare 11 current open-source DL frameworks and tools and investigate which of the techniques are commonly implemented in practice. Finally, we highlight future research trends in DL systems that deserve further research.

Journal ArticleDOI
TL;DR: This survey provides a comprehensive summary of the state-of-the-art of research on compiler testing, covering the construction of test programs, the choice of test oracles, efficient test execution, and helping compiler developers act on discovered bugs, and outlines open challenges for future work.
Abstract: Virtually any software running on a computer has been processed by a compiler or a compiler-like tool. Because compilers are such a crucial piece of infrastructure for building software, their correctness is of paramount importance. To validate and increase the correctness of compilers, significant research efforts have been devoted to testing compilers. This survey article provides a comprehensive summary of the current state-of-the-art of research on compiler testing. The survey covers different aspects of the compiler testing problem, including how to construct test programs, what test oracles to use for determining whether a compiler behaves correctly, how to execute compiler tests efficiently, and how to help compiler developers take action on bugs discovered by compiler testing. Moreover, we survey work that empirically studies the strengths and weaknesses of current compiler testing research and practice. Based on the discussion of existing work, we outline several open challenges that remain to be addressed in future work.
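One widely used test oracle in this literature is differential testing: run the same program through two implementations and flag any disagreement. A toy sketch (tiny arithmetic-expression "compilers" stand in for real ones, and the seeded bug is ours):

```python
import ast
import operator
import random

OPS = {ast.Add: operator.add, ast.Mult: operator.mul}

def reference_compile(expr):
    """Reference implementation: Python's own evaluator."""
    return eval(expr)

def buggy_compile(expr):
    """A tiny AST interpreter with a seeded bug: it flips subtraction operands."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            left, right = walk(node.left), walk(node.right)
            if isinstance(node.op, ast.Sub):
                return right - left          # the bug
            return OPS[type(node.op)](left, right)
        return node.value                    # ast.Constant leaf
    return walk(ast.parse(expr, mode="eval").body)

def differential_test(trials=100, seed=1):
    """Generate random programs; report inputs on which the two compilers disagree."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        expr = f"{rng.randint(0, 9)} {rng.choice(['+', '-', '*'])} {rng.randint(0, 9)}"
        if reference_compile(expr) != buggy_compile(expr):
            failures.append(expr)
    return failures

print(len(differential_test()) > 0)  # the oracle exposes the seeded bug
```

Note the oracle never needs to know the correct answer, only that two independent implementations should agree, which is what makes it practical for real compilers.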

Journal ArticleDOI
TL;DR: A comprehensive survey of IB-MOEAs for continuous search spaces since their origins up to the current state-of-the-art approaches is presented and a taxonomy that classifies IB-mechanisms into two main categories is proposed: (1) IB-Selection (which is divided into IB-Environmental Selection, IB-Density Estimation, and IB-Archiving) and (2)IB-Mating Selection.
Abstract: For over 25 years, most multi-objective evolutionary algorithms (MOEAs) have adopted selection criteria based on Pareto dominance. However, the performance of Pareto-based MOEAs quickly degrades when solving multi-objective optimization problems (MOPs) having four or more objective functions (the so-called many-objective optimization problems), mainly because of the loss of selection pressure. Consequently, in recent years, MOEAs have been coupled with indicator-based selection mechanisms in furtherance of increasing the selection pressure so that they can properly solve many-objective optimization problems. Several research efforts have been conducted since 2003 regarding the design of the so-called indicator-based (IB) MOEAs. In this article, we present a comprehensive survey of IB-MOEAs for continuous search spaces since their origins up to the current state-of-the-art approaches. We propose a taxonomy that classifies IB-mechanisms into two main categories: (1) IB-Selection (which is divided into IB-Environmental Selection, IB-Density Estimation, and IB-Archiving) and (2) IB-Mating Selection. Each of these classes is discussed in detail in this article, emphasizing the advantages and drawbacks of the selection mechanisms. In the final part, we provide some possible paths for future research.
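The indicator most commonly used in IB-MOEAs is the hypervolume. For two minimization objectives it reduces to a sweep over the sorted front (a sketch assuming a non-dominated front and a reference point dominated by all of its points):

```python
def hypervolume_2d(front, reference):
    """Hypervolume (to be maximized) of a 2-objective minimization front.

    Sums the rectangles between consecutive non-dominated points and the
    reference point, sweeping along the first objective.
    """
    rx, ry = reference
    pts = sorted(front)            # ascending in f1, hence descending in f2
    hv, prev_y = 0.0, ry
    for x, y in pts:
        hv += (rx - x) * (prev_y - y)
        prev_y = y
    return hv

front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, reference=(4.0, 4.0)))  # 6.0
```

An IB-Environmental Selection scheme would typically discard the point whose removal loses the least hypervolume, which is how the indicator restores selection pressure in many-objective problems.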

Journal ArticleDOI
TL;DR: The DL advances in language modeling, machine translation, and paragraph understanding are so prominent that the potential of DL in software engineering cannot be overlooked, especially in the field of program learning; this paper provides a comprehensive review of DL methods for source code modeling and generation.
Abstract: Deep Learning (DL) techniques for Natural Language Processing have been evolving remarkably fast. Recently, the DL advances in language modeling, machine translation, and paragraph understanding are so prominent that the potential of DL in Software Engineering cannot be overlooked, especially in the field of program learning. To facilitate further research and applications of DL in this field, we provide a comprehensive review to categorize and investigate existing DL methods for source code modeling and generation. To address the limitations of the traditional source code models, we formulate common program learning tasks under an encoder-decoder framework. After that, we introduce recent DL mechanisms suitable to solve such problems. Then, we present the state-of-the-art practices and discuss their challenges with some recommendations for practitioners and researchers as well.

Journal ArticleDOI
TL;DR: This tutorial explains the fundamental elements of blockchains and uses Ethereum as a case study to describe the inner workings in detail before comparing blockchains to traditional distributed systems.
Abstract: Blockchains are a topic of immense interest in academia and industry, but their true nature is often obscured by marketing and hype. In this tutorial, we explain the fundamental elements of blockchains. We discuss their ability to achieve availability, consistency, and data integrity as well as their inherent limitations. Using Ethereum as a case study, we describe the inner workings of blockchains in detail before comparing blockchains to traditional distributed systems. In the second part of our tutorial, we discuss the major challenges facing blockchains and summarize ongoing research and commercial offerings that seek to address these challenges.
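The data-integrity property this tutorial discusses comes from hash-chaining blocks: each block commits to the hash of its predecessor, so altering history invalidates every later link. A minimal sketch with illustrative field names (no consensus, Merkle trees, or networking, which real blockchains such as Ethereum add on top):

```python
# Minimal hash-chain sketch of a blockchain's integrity mechanism.
import hashlib
import json

def make_block(index, data, prev_hash):
    block = {"index": index, "data": data, "prev_hash": prev_hash}
    # Hash a canonical serialization of the block body.
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

def chain_is_valid(chain):
    for prev, cur in zip(chain, chain[1:]):
        body = {k: v for k, v in cur.items() if k != "hash"}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if cur["prev_hash"] != prev["hash"] or cur["hash"] != recomputed:
            return False
    return True

genesis = make_block(0, "genesis", "0" * 64)
chain = [genesis, make_block(1, "tx: A->B 5", genesis["hash"])]
print(chain_is_valid(chain))       # True
chain[1]["data"] = "tx: A->B 500"  # tampering breaks integrity
print(chain_is_valid(chain))       # False
```

The second check fails because the stored hash no longer matches the tampered body; in a deployed blockchain, consensus among many nodes is what makes rewriting the chain from the tampered block onward prohibitively expensive.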

Journal ArticleDOI
TL;DR: The extraction and usage of various crowd intelligence in FID is investigated, which paves a promising way to tackle FID challenges, and the views on the open issues and future research directions are given.
Abstract: The massive spread of false information on social media has become a global risk, implicitly influencing public opinion and threatening social/political development. False information detection (FID) has thus become a surging research topic in recent years. As a promising and rapidly developing research field, we find that much effort has been paid to new research problems and approaches of FID. Therefore, it is necessary to give a comprehensive review of the new research trends of FID. We first give a brief review of the literature history of FID, based on which we present several new research challenges and techniques of it, including early detection, detection by multimodal data fusion, and explanatory detection. We further investigate the extraction and usage of various crowd intelligence in FID, which paves a promising way to tackle FID challenges. Finally, we give our views on the open issues and future research directions of FID, such as model adaptivity/generality to new events, embracing of novel machine learning models, aggregation of crowd wisdom, adversarial attack and defense in detection models, and so on.

Journal ArticleDOI
TL;DR: A taxonomy of the recently designed outlier detection strategies while underlying their fundamental characteristics and properties is proposed and several newly trending outlier Detection methods designed for high-dimensional data, data streams, big data, and minimally labeled data are introduced.
Abstract: Over the past decade, we have witnessed an enormous amount of research effort dedicated to the design of efficient outlier detection techniques while taking into consideration efficiency, accuracy, high-dimensional data, and distributed environments, among other factors. In this article, we present and examine these characteristics, current solutions, as well as open challenges and future research directions in identifying new outlier detection strategies. We propose a taxonomy of the recently designed outlier detection strategies while underlying their fundamental characteristics and properties. We also introduce several newly trending outlier detection methods designed for high-dimensional data, data streams, big data, and minimally labeled data. Last, we review their advantages and limitations and then discuss future and new challenging issues.
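As a concrete instance of one classic strategy such taxonomies cover, a median-absolute-deviation (MAD) test flags points far from the median; it is robust to the outliers it is hunting, unlike a plain z-score, whose standard deviation the outliers themselves inflate. The threshold 3.5 is a conventional but illustrative choice, and the data are invented.

```python
# Robust univariate outlier detection via the median absolute deviation.
import statistics

def mad_outliers(values, threshold=3.5):
    med = statistics.median(values)
    mad = statistics.median(abs(x - med) for x in values)
    # 0.6745 rescales MAD so the score is comparable to a z-score
    # under a normal distribution.
    return [x for x in values if 0.6745 * abs(x - med) / mad > threshold]

data = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 42.0]
print(mad_outliers(data))  # [42.0]
```

On this sample a mean/standard-deviation z-score would miss 42.0 entirely (its z-score is about 2.3), which is the masking effect that motivates robust estimators in the outlier-detection literature.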

Journal ArticleDOI
TL;DR: A comprehensive survey of 63 peer-reviewed articles on recognizing driver emotions, covering the affective states studied, the cardiac, electrodermal, and speech signals used to monitor them, and the supervised machine-learning methods used to infer them, which is essential for road safety and long-term human health.
Abstract: Driving can occupy a large portion of daily life and often can elicit negative emotional states like anger or stress, which can significantly impact road safety and long-term human health. In recent decades, the arrival of new tools to help recognize human affect has inspired increasing interest in how to develop emotion-aware systems for cars. To help researchers make needed advances in this area, this article provides a comprehensive literature survey of work addressing the problem of human emotion recognition in an automotive context. We systematically review the literature back to 2002 and identify 63 peer-review published articles on this topic. We overview each study’s methodology to measure and recognize emotions in the context of driving. Across the literature, we find a strong preference toward studying emotional states associated with high arousal and negative valence, monitoring the different states with cardiac, electrodermal activity, and speech signals, and using supervised machine learning to automatically infer the underlying human affective states. This article summarizes the existing work together with publicly available resources (e.g., datasets and tools) to help new researchers get started in this field. We also identify new research opportunities to help advance progress for improving driver emotion recognition.
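To illustrate the supervised-learning step this survey highlights, a toy nearest-centroid classifier over hand-crafted cardiac and electrodermal features is sketched below. The labels, feature choices, and every number are invented for illustration; real studies use far richer signals and models.

```python
# Toy nearest-centroid classifier for driver affective states.
# All sample values are invented for illustration.
import math

train = {  # label -> (heart_rate_bpm, eda_microsiemens) samples
    "calm":     [(65.0, 2.1), (70.0, 2.4), (68.0, 2.0)],
    "stressed": [(95.0, 6.5), (102.0, 7.0), (98.0, 6.2)],
}

def centroid(samples):
    n = len(samples)
    return tuple(sum(dim) / n for dim in zip(*samples))

centroids = {label: centroid(s) for label, s in train.items()}

def predict(features):
    # Assign the label of the closest class centroid.
    return min(centroids,
               key=lambda lbl: math.dist(features, centroids[lbl]))

print(predict((99.0, 6.8)))  # stressed
print(predict((66.0, 2.2)))  # calm
```

Nearest-centroid stands in here for the supervised classifiers the surveyed studies actually train (e.g., SVMs or neural networks); the pipeline shape, labeled physiological features in, inferred affective state out, is the same.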

Journal ArticleDOI
TL;DR: In this paper, the authors present a survey of the recent visual surveillance-related research on anomaly detection in public places, particularly on road, and analyze various vision-guided anomaly detection techniques using a generic framework such that the key technical components can be easily understood.
Abstract: Computer vision has evolved in the last decade as a key technology for numerous applications replacing human supervision. Timely detection of traffic violations and abnormal behavior of pedestrians at public places through computer vision and visual surveillance can be highly effective for maintaining traffic order in cities. However, despite a handful of computer vision–based techniques proposed in recent times to understand the traffic violations or other types of on-road anomalies, no methodological survey is available that provides a detailed insight into the classification techniques, learning methods, datasets, and application contexts. Thus, this study aims to investigate the recent visual surveillance–related research on anomaly detection in public places, particularly on road. The study analyzes various vision-guided anomaly detection techniques using a generic framework such that the key technical components can be easily understood. Our survey includes definitions of related terminologies and concepts, judicious classifications of the vision-guided anomaly detection approaches, detailed analysis of anomaly detection methods including deep learning–based methods, descriptions of the relevant datasets with environmental conditions, and types of anomalies. The study also reveals vital gaps in the available datasets and anomaly detection capability in various contexts, and thus gives future directions to the computer vision–guided anomaly detection research. As anomaly detection is an important step in automatic road traffic surveillance, this survey can be a useful resource for interested researchers working on solving various issues of Intelligent Transportation Systems (ITS).

Journal ArticleDOI
TL;DR: Extended Berkeley Packet Filter (eBPF) as discussed by the authors is an instruction set and an execution environment inside the Linux kernel that enables modification, interaction, and kernel programmability at runtime.
Abstract: Extended Berkeley Packet Filter (eBPF) is an instruction set and an execution environment inside the Linux kernel. It enables modification, interaction, and kernel programmability at runtime. eBPF can be used to program the eXpress Data Path (XDP), a kernel network layer that processes packets closer to the NIC for fast packet processing. Developers can write programs in C or P4 languages and then compile to eBPF instructions, which can be processed by the kernel or by programmable devices (e.g., SmartNICs). Since its introduction in 2014, eBPF has been rapidly adopted by major companies such as Facebook, Cloudflare, and Netronome. Use cases include network monitoring, network traffic manipulation, load balancing, and system profiling. This work aims to present eBPF to an inexpert audience, covering the main theoretical and fundamental aspects of eBPF and XDP, as well as introducing the reader to simple examples to give insight into the general operation and use of both technologies.
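Real XDP filters are written in restricted C, compiled to eBPF bytecode, and checked by the kernel verifier before attachment. Purely as an illustration of the per-packet logic such a filter performs, this Python sketch parses an Ethernet + IPv4 header and returns verdicts mirroring the kernel's XDP_DROP/XDP_PASS action codes; the blocked address is hypothetical.

```python
# Plain-Python illustration of the decision logic inside an XDP filter.
import struct
import ipaddress

XDP_DROP, XDP_PASS = 1, 2  # values from the kernel's enum xdp_action
BLOCKED = {ipaddress.IPv4Address("203.0.113.7")}  # hypothetical blocklist

def xdp_filter(frame: bytes) -> int:
    if len(frame) < 34:                  # 14-byte Ethernet + 20-byte IPv4
        return XDP_PASS
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    if ethertype != 0x0800:              # not IPv4
        return XDP_PASS
    src = ipaddress.IPv4Address(frame[26:30])
    return XDP_DROP if src in BLOCKED else XDP_PASS

# 14 bytes of Ethernet header, then an IPv4 header with src 203.0.113.7.
frame = (b"\x00" * 12 + b"\x08\x00"            # ethertype IPv4
         + b"\x45\x00\x00\x14" + b"\x00" * 8   # minimal IPv4 fields
         + bytes([203, 0, 113, 7])             # source address
         + bytes([192, 0, 2, 1]))              # destination address
print(xdp_filter(frame))  # 1 (XDP_DROP)
```

An equivalent eBPF program would perform the same bounds check and header walk, but the verifier enforces it statically, and the drop happens before the kernel allocates an skb, which is where XDP's speed comes from.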

Journal ArticleDOI
TL;DR: The demand for blockchain innovation and the significance of its application have inspired ever-progressing exploration in various scientific and practical areas, even though it is still in the initial testing stage.
Abstract: The demand for blockchain innovation and the significance of its application have inspired ever-progressing exploration in various scientific and practical areas. Even though it is still in the initial testing stage, blockchain is being viewed as a progressive solution to address present-day technology concerns, such as decentralization, identity, trust, ownership of data, and information-driven choices. Simultaneously, the world is facing an increase in the diversity and quantity of digital information produced by machines and users. In the search for the ideal approach to storing and processing cloud data, blockchain innovation provides significant input. This article reviews the application of blockchain technology for securing cloud storage.