
Showing papers in "ACM Computing Surveys in 2013"


Journal ArticleDOI
TL;DR: A framework is proposed for evaluating algorithms' ability to detect overlapping nodes, which helps to assess overdetection and underdetection, and for low overlapping density networks, SLPA, OSLOM, Game, and COPRA offer better performance than the other tested algorithms.
Abstract: This article reviews the state-of-the-art in overlapping community detection algorithms, quality measures, and benchmarks. A thorough comparison of different algorithms (a total of fourteen) is provided. In addition to community-level evaluation, we propose a framework for evaluating algorithms' ability to detect overlapping nodes, which helps to assess overdetection and underdetection. After considering community-level detection performance measured by normalized mutual information, the Omega index, and node-level detection performance measured by F-score, we reached the following conclusions. For low overlapping density networks, SLPA, OSLOM, Game, and COPRA offer better performance than the other tested algorithms. For networks with high overlapping density and high overlapping diversity, both SLPA and Game provide relatively stable performance. However, test results also suggest that the detection in such networks is not yet fully resolved. A common feature observed by various algorithms in real-world networks is the relatively small fraction of overlapping nodes (typically less than 30%), each of which belongs to only 2 or 3 communities.

1,166 citations
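
The node-level evaluation reduces to set-overlap precision and recall between detected and ground-truth overlapping nodes. A minimal illustrative sketch (not the authors' implementation; the node sets are hypothetical):

```python
def overlap_fscore(detected, truth):
    """F-score between detected overlapping nodes and the ground truth.

    Overdetection lowers precision; underdetection lowers recall.
    """
    detected, truth = set(detected), set(truth)
    if not detected or not truth:
        return 0.0
    tp = len(detected & truth)
    precision = tp / len(detected)
    recall = tp / len(truth)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: 3 of 4 detected nodes are truly overlapping, 1 true node is missed.
print(overlap_fscore({1, 2, 3, 9}, {1, 2, 3, 4}))  # 0.75
```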


Journal ArticleDOI
TL;DR: A fresh treatment is introduced that classifies and discusses existing work within three rational aspects: what and how EA components contribute to exploration and exploitation; when and how exploration and exploitation are controlled; and how the balance between exploration and exploitation is achieved.
Abstract: “Exploration and exploitation are the two cornerstones of problem solving by search.” For more than a decade, Eiben and Schippers' [1998] advocacy for balancing these two antagonistic cornerstones has greatly influenced the research directions of evolutionary algorithms (EAs). This article revisits nearly 100 existing works and surveys how such works have answered the advocacy. The article introduces a fresh treatment that classifies and discusses existing work within three rational aspects: (1) what and how EA components contribute to exploration and exploitation; (2) when and how exploration and exploitation are controlled; and (3) how the balance between exploration and exploitation is achieved. With a more comprehensive and systematic understanding of exploration and exploitation, more research in this direction may be motivated and refined.

1,029 citations
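
As one illustration of "when and how exploration and exploitation are controlled", the sketch below runs a (1+1) EA whose mutation rate decays over time, shifting the search from broad exploration to local exploitation. The schedule, parameters, and fitness function are invented for illustration, not drawn from the surveyed works:

```python
import random

def one_plus_one_ea(fitness, n_bits=20, steps=500):
    """(1+1) EA on bitstrings with a decaying mutation rate.

    A high early rate favors exploration; the decay shifts the balance
    toward exploitation of the current best solution.
    """
    parent = [random.randint(0, 1) for _ in range(n_bits)]
    best = fitness(parent)
    for t in range(1, steps + 1):
        rate = max(1.0 / n_bits, 0.5 / t)  # anneal from broad to local search
        child = [b ^ (random.random() < rate) for b in parent]
        f = fitness(child)
        if f >= best:  # elitist acceptance (exploitation)
            parent, best = child, f
    return parent, best

# OneMax benchmark: fitness is the number of ones; optimum is all ones.
print(one_plus_one_ea(sum))
```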


Journal ArticleDOI
TL;DR: This survey provides a structured and comprehensive overview of research on security and privacy in computer and communication networks that use game-theoretic approaches and provides a discussion on the advantages, drawbacks, and future direction of using game theory in this field.
Abstract: This survey provides a structured and comprehensive overview of research on security and privacy in computer and communication networks that use game-theoretic approaches. We present a selected set of works to highlight the application of game theory in addressing different forms of security and privacy problems in computer networks and mobile applications. We organize the presented works in six main categories: security of the physical and MAC layers, security of self-organizing networks, intrusion detection systems, anonymity and privacy, economics of network security, and cryptography. In each category, we identify security problems, players, and game models. We summarize the main results of selected works, such as equilibrium analysis and security mechanism designs. In addition, we provide a discussion on the advantages, drawbacks, and future direction of using game theory in this field. In this survey, our goal is to instill in the reader an enhanced understanding of different research approaches in applying game-theoretic methods to network security. This survey can also help researchers from various fields develop game-theoretic solutions to current and emerging security problems in computer networking.

791 citations
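
To make the "equilibrium analysis" mentioned above concrete, this sketch solves a hypothetical 2x2 zero-sum defender/attacker game in closed form. The payoff numbers are invented, and the formula assumes the game has no pure-strategy saddle point:

```python
def mixed_equilibrium_2x2(a11, a12, a21, a22):
    """Mixed-strategy equilibrium of a 2x2 zero-sum defender/attacker game.

    Entries are the defender's payoffs (the attacker gets the negation).
    Assumes no pure-strategy saddle point, so the closed form applies.
    """
    d = a11 - a12 - a21 + a22
    p = (a22 - a21) / d          # probability the defender plays row 1
    q = (a22 - a12) / d          # probability the attacker plays column 1
    value = (a11 * a22 - a12 * a21) / d
    return p, q, value

# Hypothetical payoffs: defender guards node A or B, attacker hits A or B.
print(mixed_equilibrium_2x2(3, -1, -2, 2))  # (0.5, 0.375, 0.5)
```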


Journal ArticleDOI
TL;DR: In this article, the authors survey the channel state information (CSI) in 802.11 a/g/n and highlight the differences between CSI and RSSI with respect to network layering, time resolution, frequency resolution, stability, and accessibility.
Abstract: The spatial features of emitted wireless signals are the basis of location distinction and determination for wireless indoor localization. Available in mainstream wireless signal measurements, the Received Signal Strength Indicator (RSSI) has been adopted in a vast number of indoor localization systems. However, it suffers from dramatic performance degradation in complex situations due to multipath fading and temporal dynamics. Breakthrough techniques resort to finer-grained wireless channel measurement than RSSI. Different from RSSI, the PHY layer power feature, channel response, is able to discriminate multipath characteristics, and thus holds the potential for the convergence of accurate and pervasive indoor localization. Channel State Information (CSI, reflecting channel response in 802.11 a/g/n) has attracted many research efforts and some pioneer works have demonstrated submeter or even centimeter-level accuracy. In this article, we survey this new trend of channel response in localization. The differences between CSI and RSSI are highlighted with respect to network layering, time resolution, frequency resolution, stability, and accessibility. Furthermore, we investigate a large body of recent works and classify them overall into three categories according to how CSI is used. For each category, we emphasize the basic principles and address future directions of research in this new and largely open area.

704 citations
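
The RSSI/CSI contrast is easy to see in code: RSSI collapses a packet into one scalar, while CSI exposes a complex channel response per subcarrier and antenna, preserving the multipath signature. A sketch on synthetic data (the 30-subcarrier-by-3-antenna shape is a common 802.11n layout, assumed here purely for illustration):

```python
import numpy as np

# Hypothetical CSI for one packet: 30 subcarriers x 3 receive antennas,
# each entry a complex channel response (amplitude and phase).
rng = np.random.default_rng(0)
csi = rng.standard_normal((30, 3)) + 1j * rng.standard_normal((30, 3))

# RSSI-like view: a single power scalar per packet, blind to
# frequency selectivity caused by multipath.
rssi_db = 10 * np.log10(np.mean(np.abs(csi) ** 2))

# CSI view: per-subcarrier amplitude and phase, which preserve the
# multipath signature that localization systems fingerprint or model.
amplitude = np.abs(csi)                      # shape (30, 3)
phase = np.unwrap(np.angle(csi), axis=0)     # shape (30, 3)

print(rssi_db, amplitude.shape, phase.shape)
```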


Journal ArticleDOI
TL;DR: This article presents the first comprehensive review of social and computer science literature on trust in social networks and discusses recent works addressing three aspects of social trust: trust information collection, trust evaluation, and trust dissemination.
Abstract: Web-based social networks have become popular as a medium for disseminating information and connecting like-minded people. The public accessibility of such networks with the ability to share opinions, thoughts, information, and experience offers great promise to enterprises and governments. In addition to individuals using such networks to connect to their friends and families, governments and enterprises have started exploiting these platforms for delivering their services to citizens and customers. However, the success of such attempts relies on the level of trust that members have with each other as well as with the service provider. Therefore, trust becomes an essential and important element of a successful social network. In this article, we present the first comprehensive review of social and computer science literature on trust in social networks. We first review the existing definitions of trust and define social trust in the context of social networks. We then discuss recent works addressing three aspects of social trust: trust information collection, trust evaluation, and trust dissemination. Finally, we compare and contrast the literature and identify areas for further research in social trust.

615 citations


Journal ArticleDOI
TL;DR: A survey of the approaches and techniques for constructing trajectories from movement tracks, enriching trajectories with semantic information to enable the desired interpretations of movements, and using data mining to analyze semantic trajectories to extract knowledge about their characteristics.
Abstract: Focus on movement data has increased as a consequence of the larger availability of such data due to current GPS, GSM, RFID, and sensor techniques. In parallel, interest in movement has shifted from raw movement data analysis to more application-oriented ways of analyzing segments of movement suitable for the specific purposes of the application. This trend has promoted semantically rich trajectories, rather than raw movement, as the core object of interest in mobility studies. This survey provides the definitions of the basic concepts about mobility data, an analysis of the issues in mobility data management, and a survey of the approaches and techniques for: (i) constructing trajectories from movement tracks, (ii) enriching trajectories with semantic information to enable the desired interpretations of movements, and (iii) using data mining to analyze semantic trajectories and extract knowledge about their characteristics, in particular the behavioral patterns of the moving objects. Last but not least, the article surveys the new privacy issues that arise due to the semantic aspects of trajectories.

520 citations
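
One of the simplest steps in constructing trajectories from movement tracks is splitting a raw fix sequence wherever the time gap between consecutive fixes grows too large. A toy sketch (the threshold and track are hypothetical; real pipelines also filter noise and detect stops):

```python
def split_trajectories(track, max_gap=300.0):
    """Split a raw (t, x, y) movement track into trajectories.

    A new trajectory starts whenever the time gap between consecutive
    fixes exceeds max_gap seconds -- one common construction heuristic.
    """
    trajectories, current = [], []
    for point in track:
        if current and point[0] - current[-1][0] > max_gap:
            trajectories.append(current)
            current = []
        current.append(point)
    if current:
        trajectories.append(current)
    return trajectories

track = [(0, 0.0, 0.0), (60, 1.0, 0.5), (1200, 5.0, 5.0), (1260, 5.5, 5.2)]
print(len(split_trajectories(track)))  # 2: the 1140 s gap splits the track
```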


Journal ArticleDOI
TL;DR: In this article, the authors describe the current status and needed developments in order to achieve a functional Metaverse and consider factors that support the formation of a viable Metaverse, such as institutional and popular interest and ongoing improvements in hardware performance, and factors that constrain the achievement of this goal.
Abstract: Moving from a set of independent virtual worlds to an integrated network of 3D virtual worlds or Metaverse rests on progress in four areas: immersive realism, ubiquity of access and identity, interoperability, and scalability. For each area, the current status and needed developments in order to achieve a functional Metaverse are described. Factors that support the formation of a viable Metaverse, such as institutional and popular interest and ongoing improvements in hardware performance, and factors that constrain the achievement of this goal, including limits in computational methods and unrealized collaboration among virtual world stakeholders and developers, are also considered.

501 citations


Journal ArticleDOI
TL;DR: A survey of data stream clustering algorithms is presented, providing a thorough discussion of the main design components of state-of-the-art algorithms and an overview of the usually employed experimental methodologies.
Abstract: Data stream mining is an active research area that has recently emerged to discover knowledge from large amounts of continuously generated data. In this context, several data stream clustering algorithms have been proposed to perform unsupervised learning. Nevertheless, data stream clustering imposes several challenges to be addressed, such as dealing with nonstationary, unbounded data that arrive in an online fashion. The intrinsic nature of stream data requires the development of algorithms capable of performing fast and incremental processing of data objects, suitably addressing time and memory limitations. In this article, we present a survey of data stream clustering algorithms, providing a thorough discussion of the main design components of state-of-the-art algorithms. In addition, this work addresses the temporal aspects involved in data stream clustering, and presents an overview of the usually employed experimental methodologies. A number of references are provided that describe applications of data stream clustering in different domains, such as network intrusion detection, sensor networks, and stock market analysis. Information regarding software packages and data repositories is also provided to help researchers and practitioners. Finally, some important issues and open questions that can be the subject of future research are discussed.

479 citations
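
A recurring design component in such algorithms is a summary structure that can be updated in O(1) per arriving point, for example CluStream-style cluster features (count, linear sum, square sum). A minimal one-dimensional sketch with an invented distance threshold:

```python
import math

class MicroCluster:
    """Cluster feature vector (n, linear_sum, square_sum) for 1-D points.

    Supports O(1) incremental updates, the key requirement for unbounded
    streams processed under time and memory limits.
    """
    def __init__(self, x):
        self.n, self.ls, self.ss = 1, x, x * x

    def add(self, x):
        self.n += 1
        self.ls += x
        self.ss += x * x

    @property
    def centroid(self):
        return self.ls / self.n

    @property
    def radius(self):
        return math.sqrt(max(self.ss / self.n - self.centroid ** 2, 0.0))

def absorb(clusters, x, threshold=2.0):
    """Assign x to the nearest micro-cluster, or open a new one."""
    nearest = min(clusters, key=lambda c: abs(c.centroid - x), default=None)
    if nearest and abs(nearest.centroid - x) <= threshold:
        nearest.add(x)
    else:
        clusters.append(MicroCluster(x))

clusters = []
for x in [1.0, 1.2, 0.9, 8.0, 8.3, 1.1]:
    absorb(clusters, x)
print([round(c.centroid, 2) for c in clusters])  # [1.05, 8.15]
```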


Journal ArticleDOI
TL;DR: There is a mean estimation of egocentric distances in virtual environments of about 74% of the modeled distances.
Abstract: Over the last 20 years research has been done on the question of how egocentric distances, i.e., the subjectively reported distance from a human observer to an object, are perceived in virtual environments. This review surveys the existing literature on empirical user studies on this topic. In summary, there is a mean estimation of egocentric distances in virtual environments of about 74% of the modeled distances. Many factors possibly influencing distance estimates were reported in the literature. We arranged these factors into four groups, namely measurement methods, technical factors, compositional factors, and human factors. The research on these factors is summarized, conclusions are drawn, and promising areas for future research are outlined.

403 citations


Journal ArticleDOI
TL;DR: The goal of this article is to compare the approaches to QoS description in the literature, where several models and metamodels are included, and to analyze where the need for further research and investigation lies.
Abstract: Quality of service (QoS) can be a critical element for achieving the business goals of a service provider, for the acceptance of a service by the user, or for guaranteeing service characteristics in a composition of services, where a service is defined as either a software or a software-support (i.e., infrastructural) service which is available on any type of network or electronic channel. The goal of this article is to compare the approaches to QoS description in the literature, where several models and metamodels are included. We consider a large spectrum of models and metamodels to describe service quality, ranging from ontological approaches to define quality measures, metrics, and dimensions, to metamodels enabling the specification of quality-based service requirements and capabilities as well as of SLAs (Service-Level Agreements) and SLA templates for service provisioning. Our survey is performed by inspecting the characteristics of the available approaches to reveal which are the consolidated ones and which are the ones specific to given aspects, and to analyze where the need for further research and investigation lies. The approaches here illustrated have been selected based on a systematic review of conference proceedings and journals spanning various research areas in computer science and engineering, including: distributed, information, and telecommunication systems, networks and security, and service-oriented and grid computing.

397 citations


Journal ArticleDOI
TL;DR: The survey aims to tackle all the issues and challenging aspects of people reidentification while simultaneously describing the previously proposed solutions for the encountered problems, from the first attempts based on holistic descriptors to the more recently adopted 2D and 3D model-based approaches.
Abstract: The field of surveillance and forensics research is currently shifting focus and is now showing an ever increasing interest in the task of people reidentification. This is the task of assigning the same identifier to all instances of a particular individual captured in a series of images or videos, even after the occurrence of significant gaps over time or space. People reidentification can be a useful tool for people analysis in security as a data association method for long-term tracking in surveillance. However, current identification techniques being utilized present many difficulties and shortcomings. For instance, they rely solely on the exploitation of visual cues such as color, texture, and the object’s shape. Despite the many advances in this field, reidentification is still an open problem. This survey aims to tackle all the issues and challenging aspects of people reidentification while simultaneously describing the previously proposed solutions for the encountered problems. It begins with the first attempts based on holistic descriptors and progresses to the more recently adopted 2D and 3D model-based approaches. The survey also includes an exhaustive treatise of all the aspects of people reidentification, including available datasets, evaluation metrics, and benchmarking.

Journal ArticleDOI
TL;DR: This article surveys the approaches and algorithms proposed to date in Sequential Pattern Mining, a subfield of data mining that focuses on detecting and analyzing frequent subsequences in data.
Abstract: Sequences of events, items, or tokens occurring in an ordered metric space appear often in data, and the requirement to detect and analyze frequent subsequences is a common problem. Sequential Pattern Mining arose as a subfield of data mining to focus on this problem. This article surveys the approaches and algorithms proposed to date.
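
The core primitive behind most sequential pattern mining algorithms is testing whether a candidate pattern occurs as a (gap-allowed) subsequence of each data sequence and counting its support. A minimal sketch over a hypothetical toy database:

```python
def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence with items in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def support(pattern, database):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in database) / len(database)

db = [list("abcde"), list("acbde"), list("aedcb")]
print(support(list("abd"), db))  # 2/3: the third sequence breaks the order
```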

Journal ArticleDOI
TL;DR: This survey reviews algorithmic paradigms—search based, cycle based, transformation based, and BDD based—as well as specific algorithms for reversible synthesis, both exact and heuristic, and outlines key open challenges in synthesis of reversible and quantum logic.
Abstract: Reversible logic circuits have been historically motivated by theoretical research in low-power electronics as well as practical improvement of bit manipulation transforms in cryptography and computer graphics. Recently, reversible circuits have attracted interest as components of quantum algorithms, as well as in photonic and nano-computing technologies where some switching devices offer no signal gain. Research in generating reversible logic distinguishes between circuit synthesis, postsynthesis optimization, and technology mapping. In this survey, we review algorithmic paradigms—search based, cycle based, transformation based, and BDD based—as well as specific algorithms for reversible synthesis, both exact and heuristic. We conclude the survey by outlining key open challenges in synthesis of reversible and quantum logic, as well as most common misconceptions.
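
The defining property of such circuits is bijectivity: every gate, and hence every circuit, is invertible. A tiny sketch verifying this for the Toffoli (CCNOT) gate, a standard universal reversible gate:

```python
def toffoli(a, b, c):
    """Toffoli (CCNOT) gate: flips target c iff both controls a and b are 1.

    The gate is its own inverse -- the defining property of reversible
    logic -- and it is universal for classical reversible computation.
    """
    return a, b, c ^ (a & b)

# Applying the gate twice restores every input: the map is a bijection.
for bits in [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]:
    assert toffoli(*toffoli(*bits)) == bits
print("Toffoli is self-inverse on all 8 inputs")
```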

Journal ArticleDOI
TL;DR: In this article, the authors provide an exhaustive survey of the work on mining taxi traces and provide a formalization of the data sets, along with an overview of different mechanisms for preprocessing the data.
Abstract: Vehicles equipped with GPS localizers are an important sensory device for examining people’s movements and activities. Taxis equipped with GPS localizers serve the transportation needs of a large number of people driven by diverse needs; their traces can tell us where passengers were picked up and dropped off, which route was taken, and what steps the driver took to find a new passenger. In this article, we provide an exhaustive survey of the work on mining these traces. We first provide a formalization of the data sets, along with an overview of different mechanisms for preprocessing the data. We then classify the existing work into three main categories: social dynamics, traffic dynamics, and operational dynamics. Social dynamics refers to the study of the collective behavior of a city’s population based on their observed movements; traffic dynamics studies the resulting flow of the movement through the road network; operational dynamics refers to the study and analysis of taxi drivers’ modus operandi. We discuss the different problems currently being researched and the various approaches proposed, and suggest new avenues of research. Finally, we present a historical overview of the research work in this field and discuss which areas hold the most promise for future research.

Journal ArticleDOI
TL;DR: This survey describes and provides a taxonomy of existing reactive programming approaches along six axes: representation of time-varying values, evaluation model, lifting operations, multidirectionality, glitch avoidance, and support for distribution.
Abstract: Reactive programming has recently gained popularity as a paradigm that is well-suited for developing event-driven and interactive applications. It facilitates the development of such applications by providing abstractions to express time-varying values and automatically managing dependencies between such values. A number of approaches have recently been proposed, embedded in various languages such as Haskell, Scheme, JavaScript, Java, .NET, etc. This survey describes and provides a taxonomy of existing reactive programming approaches along six axes: representation of time-varying values, evaluation model, lifting operations, multidirectionality, glitch avoidance, and support for distribution. From this taxonomy, we observe that there are still open challenges in the field of reactive programming. For instance, multidirectionality is supported only by a small number of languages, which do not automatically track dependencies between time-varying values. Similarly, glitch avoidance, which is subtle in reactive programs, cannot be ensured in distributed reactive programs using the current techniques.
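
Two of the taxonomy's axes, representation of time-varying values and lifting operations, can be illustrated with a toy signal graph. This naive push-based sketch is hypothetical; notably, it can glitch on diamond-shaped dependency graphs, which is precisely the glitch-avoidance problem the survey discusses:

```python
class Signal:
    """A time-varying value that recomputes its dependents on update."""
    def __init__(self, value=None, compute=None, inputs=()):
        self.value, self.compute = value, compute
        self.dependents = []
        for s in inputs:
            s.dependents.append(self)

    def set(self, value):
        self.value = value
        for d in self.dependents:
            d.refresh()

    def refresh(self):
        self.value = self.compute()
        for d in self.dependents:
            d.refresh()

def lift(fn, *signals):
    """Lift an ordinary function over signals -- the classic lifting operation."""
    out = Signal(compute=lambda: fn(*(s.value for s in signals)), inputs=signals)
    out.refresh()
    return out

a, b = Signal(1), Signal(2)
total = lift(lambda x, y: x + y, a, b)
a.set(10)
print(total.value)  # 12: the dependency updated automatically
```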

Journal ArticleDOI
TL;DR: This work surveys the state-of-the-art in the field of local algorithm design, covering impossibility results, deterministic local algorithms, randomized local algorithms, and local algorithms for geometric graphs.
Abstract: A local algorithm is a distributed algorithm that runs in constant time, independently of the size of the network. Being highly scalable and fault tolerant, such algorithms are ideal in the operation of large-scale distributed systems. Furthermore, even though the model of local algorithms is very limited, in recent years we have seen many positive results for nontrivial problems. This work surveys the state-of-the-art in the field, covering impossibility results, deterministic local algorithms, randomized local algorithms, and local algorithms for geometric graphs.

Journal ArticleDOI
TL;DR: This survey reviews some of the well-known past broadband pricing proposals (both static and dynamic), including their current realizations in various consumer data plans around the world, and discusses several research problems and open questions.
Abstract: Traditionally, network operators have used simple flat-rate broadband data plans for both wired and wireless network access. But today, with the popularity of mobile devices and exponential growth of apps, videos, and clouds, service providers are gradually moving toward more sophisticated pricing schemes. This decade will therefore likely witness a major change in the ways in which network resources are managed, and the role of economics in allocating these resources. This survey reviews some of the well-known past broadband pricing proposals (both static and dynamic), including their current realizations in various consumer data plans around the world, and discusses several research problems and open questions. By exploring the benefits and challenges of pricing data, this article attempts to facilitate both the industrial and the academic communities' efforts in understanding the existing literature, recognizing new trends, and shaping an appropriate and timely research agenda.

Journal ArticleDOI
TL;DR: An up-to-date review of the existing literature revealing the current state-of-the-art in ear detection and recognition is provided, offering insights into some unsolved ear recognition problems as well as ear databases available for researchers.
Abstract: Recognizing people by their ear has recently received significant attention in the literature. Several reasons account for this trend: first, ear recognition does not suffer from some problems associated with other non-contact biometrics, such as face recognition; second, it is the most promising candidate for combination with the face in the context of multi-pose face recognition; and third, the ear can be used for human recognition in surveillance videos where the face may be occluded completely or in part. Further, the ear appears to degrade little with age. Even though current ear detection and recognition systems have reached a certain level of maturity, their success is limited to controlled indoor conditions. In addition to variation in illumination, other open research problems include hair occlusion, earprint forensics, ear symmetry, ear classification, and ear individuality. This article provides a detailed survey of research conducted in ear detection and recognition. It provides an up-to-date review of the existing literature revealing the current state-of-the-art, not only for those who are working in this area but also for those who might exploit this new approach. Furthermore, it offers insights into some unsolved ear recognition problems as well as ear databases available for researchers.

Journal ArticleDOI
TL;DR: The article at hand reviews the failure mechanisms, fault models, diagnosis techniques, and fault-tolerance methods in on-chip networks, and surveys and summarizes the research of the last ten years.
Abstract: Networks-on-Chip constitute the interconnection architecture of future, massively parallel multiprocessors that assemble hundreds to thousands of processing cores on a single chip. Their integration is enabled by ongoing miniaturization of chip manufacturing technologies following Moore's Law. It comes with the downside of the circuit elements' increased susceptibility to failure. Research on fault-tolerant Networks-on-Chip tries to mitigate partial failure and its effect on network performance and reliability by exploiting various forms of redundancy at the suitable network layers. The article at hand reviews the failure mechanisms, fault models, diagnosis techniques, and fault-tolerance methods in on-chip networks, and surveys and summarizes the research of the last ten years. It is structured along three communication layers: the data link, the network, and the transport layers. The most important results are summarized and open research problems and challenges are highlighted to guide future research on this topic.

Journal ArticleDOI
TL;DR: The security considerations and some associated methodologies by which security breaches can occur are explained, and recommendations for how virtualized environments can best be protected are offered, together with a set of generalized recommendations that can be applied to achieve secure virtualized implementations.
Abstract: Although system virtualization is not a new paradigm, the way in which it is used in modern system architectures provides a powerful platform for system building, the advantages of which have only been realized in recent years, as a result of the rapid deployment of commodity hardware and software systems. In principle, virtualization involves the use of an encapsulating software layer (Hypervisor or Virtual Machine Monitor) which surrounds or underlies an operating system and provides the same inputs, outputs, and behavior that would be expected from an actual physical device. This abstraction means that an ideal Virtual Machine Monitor provides an environment to the software equivalent to the host system, but which is decoupled from the hardware state. Because a virtual machine is not dependent on the state of the physical hardware, multiple virtual machines may be installed on a single set of hardware. The decoupling of physical and logical states gives virtualization inherent security benefits. However, the design, implementation, and deployment of virtualization technology have also opened up novel threats and security issues which, while not particular to system virtualization, take on new forms in relation to it. Reverse engineering becomes easier due to introspection capabilities, as encryption keys, security algorithms, low-level protection, intrusion detection, or antidebugging measures can become more easily compromised. Furthermore, associated technologies such as virtual routing and networking can create challenging issues for security, intrusion control, and associated forensic processes. We explain the security considerations and some associated methodologies by which security breaches can occur, and offer recommendations for how virtualized environments can best be protected. Finally, we offer a set of generalized recommendations that can be applied to achieve secure virtualized implementations.

Journal ArticleDOI
TL;DR: Typical defects that make a 3D model unsuitable for key application contexts are analyzed, and existing algorithms that process, repair, and improve its structure, geometry, and topology are surveyed to make it appropriate to case-by-case requirements.
Abstract: Nowadays, digital 3D models are in widespread and ubiquitous use, and each specific application dealing with 3D geometry has its own quality requirements that restrict the class of acceptable and supported models. This article analyzes typical defects that make a 3D model unsuitable for key application contexts, and surveys existing algorithms that process, repair, and improve its structure, geometry, and topology to make it appropriate to case-by-case requirements. The analysis is focused on polygon meshes, which constitute by far the most common 3D object representation. In particular, this article provides a structured overview of mesh repairing techniques from the point of view of the application context. Different types of mesh defects are classified according to the upstream application that produced the mesh, whereas mesh quality requirements are grouped by representative sets of downstream applications where the mesh is to be used. The numerous mesh repair methods that have been proposed during the last two decades are analyzed and classified in terms of their capabilities, properties, and guarantees. Based on these classifications, guidelines can be derived to support the identification of repairing algorithms best-suited to bridge the compatibility gap between the quality provided by the upstream process and the quality required by the downstream applications in a given geometry processing scenario.


Journal ArticleDOI
TL;DR: This article presents an overview of the cloud service models, surveys the main techniques and research prototypes that efficiently support trust management of services in cloud environments, and presents a generic analytical framework that assesses existing trust management research prototypes in cloud computing and relevant areas using a set of assessment criteria.
Abstract: Trust management is one of the most challenging issues in the emerging cloud computing area. Over the past few years, many studies have proposed different techniques to address trust management issues. However, despite these past efforts, several trust management issues such as identification, privacy, personalization, integration, security, and scalability have been mostly neglected and need to be addressed before cloud computing can be fully embraced. In this article, we present an overview of the cloud service models and we survey the main techniques and research prototypes that efficiently support trust management of services in cloud environments. We present a generic analytical framework that assesses existing trust management research prototypes in cloud computing and relevant areas using a set of assessment criteria. Open research issues for trust management in cloud environments are also discussed.

Journal ArticleDOI
TL;DR: This survey identifies over 50 algorithms, including direct adaptations of accuracy-based methods and approaches that use genetic algorithms, anytime methods, boosting, and bagging.
Abstract: The past decade has seen significant interest in the problem of inducing decision trees that take account of both the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including direct adaptations of accuracy-based methods and approaches that use genetic algorithms, anytime methods, boosting, and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy and a historical timeline of how the field has developed, and should provide a useful reference point for future research in this field.
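
A minimal sketch of the core idea: score a candidate split by the expected misclassification cost in its branches plus the cost of acquiring the tested feature. The cost values and binary-feature setting are invented for illustration:

```python
def expected_cost(labels, cost_fp=1.0, cost_fn=5.0):
    """Cost of labeling a leaf with its cheaper class (binary labels 0/1)."""
    pos = sum(labels)
    neg = len(labels) - pos
    return min(neg * cost_fp,   # predict 1: every 0 becomes a false positive
               pos * cost_fn)   # predict 0: every 1 becomes a false negative

def split_cost(feature, rows, labels, acquisition=0.5, **costs):
    """Total cost of testing `feature`: pay to acquire it on every row,
    then pay the misclassification cost within each branch."""
    left = [y for x, y in zip(rows, labels) if x[feature] == 0]
    right = [y for x, y in zip(rows, labels) if x[feature] == 1]
    return (len(rows) * acquisition
            + expected_cost(left, **costs) + expected_cost(right, **costs))

rows = [{"f": 0}, {"f": 0}, {"f": 1}, {"f": 1}]
labels = [0, 0, 1, 1]
print(split_cost("f", rows, labels))  # 2.0: pure branches, acquisition only
```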

Journal ArticleDOI
TL;DR: This study structures previous research by presenting a comprehensive taxonomy of CSFs in the area of ERP and provides a comprehensive bibliography of the reviewed articles that can serve as a guide for future research.
Abstract: Organizations perceive ERP as a vital tool for organizational competition as it integrates dispersed organizational systems and enables flawless transactions and production. This review examines studies investigating Critical Success Factors (CSFs) in implementing Enterprise Resource Planning (ERP) systems. Keywords relating to the theme of this study were defined and used to search known Web engines and journal databases for studies on both implementing ERP systems per se and integrating ERP systems with other well-known systems (e.g., SCM, CRM) that are acknowledged by business organizations and academia to work with ERP in a complementary fashion. A total of 341 articles were reviewed to address three main goals. First, this study structures previous research by presenting a comprehensive taxonomy of CSFs in the area of ERP. Second, it maps studies, identified through an exhaustive and comprehensive literature review, to different dimensions and facets of ERP system implementation. Third, it presents studies investigating CSFs in terms of a specific ERP life-cycle phase and across the entire ERP life cycle. This study not only reviews articles in which an ERP system is the sole or primary field of research, but also articles that refer to an integration of ERP systems and other popular systems (e.g., SCM, CRM). Finally, it provides a comprehensive bibliography of the articles published during this period that can serve as a guide for future research.

Journal ArticleDOI
TL;DR: This article addresses the online exact string matching problem which consists in finding all occurrences of a given pattern p in a text t and presents experimental results in order to bring order among the dozens of articles published in this area.
Abstract: This article addresses the online exact string matching problem which consists in finding all occurrences of a given pattern p in a text t. It is an extensively studied problem in computer science, mainly due to its direct applications to such diverse areas as text, image and signal processing, speech analysis and recognition, information retrieval, data compression, computational biology and chemistry. In the last decade more than 50 new algorithms have been proposed for the problem, which add up to a wide set of (almost 40) algorithms presented before 2000. In this article we review the string matching algorithms presented in the last decade and present experimental results in order to bring order among the dozens of articles published in this area.
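
As one concrete baseline for the problem stated above, here is the classic Knuth-Morris-Pratt algorithm reporting all (possibly overlapping) occurrences in O(n + m) time; it is a single textbook representative, not a summary of the dozens of algorithms the article compares:

```python
def kmp_all(pattern, text):
    """Knuth-Morris-Pratt: all occurrence positions of pattern in text."""
    if not pattern:
        return []
    # Failure function: length of the longest proper border of pattern[:i+1].
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    # Scan the text, reusing matched prefixes instead of backtracking.
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]
    return hits

print(kmp_all("ana", "bananana"))  # [1, 3, 5]: overlapping occurrences found
```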

Journal ArticleDOI
TL;DR: A comprehensive survey of large-scale data processing mechanisms based on the MapReduce framework, including a review of systems that provide declarative programming interfaces on top of it.
Abstract: In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as issues of data distribution, scheduling, and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several follow-up works after its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both the research and industrial communities. We also cover a set of systems that have been introduced to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.
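
The programming model itself fits in a few lines: the user supplies a map and a reduce function, and the framework handles distribution, scheduling, and fault tolerance. A local, single-process word-count sketch (the in-memory grouping stands in for the cluster-wide shuffle):

```python
from collections import defaultdict
from itertools import chain

def map_fn(document):
    """Map phase: emit (word, 1) for every word in one input split."""
    return [(word, 1) for word in document.split()]

def reduce_fn(word, counts):
    """Reduce phase: aggregate all counts emitted for one key."""
    return word, sum(counts)

def mapreduce(documents):
    # Shuffle: group every intermediate pair by key, as the framework
    # would do across the cluster; here it runs locally for illustration.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map(map_fn, documents)):
        groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(mapreduce(["to be or not", "to be"]))
# {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```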


Journal ArticleDOI
TL;DR: A survey of near-duplicate video retrieval covering the latest improvements and progress, as well as related topics including low-level feature extraction, signature generation, and high-dimensional indexing.
Abstract: The exponential growth of online videos, along with increasing user involvement in video-related activities, has been observed as a constant phenomenon during the last decade. Users' time spent on video capturing, editing, uploading, searching, and viewing has grown to an unprecedented level. The massive publishing and sharing of videos has given rise to the existence of an already large amount of near-duplicate content. This imposes urgent demands on near-duplicate video retrieval, which plays a key role in novel tasks such as video search, video copyright protection, video recommendation, and many more. Driven by its significance, near-duplicate video retrieval has recently attracted a lot of attention. The latest improvements and progress in near-duplicate video retrieval, together with related topics including low-level feature extraction, signature generation, and high-dimensional indexing, are covered in this survey. As we survey the works in near-duplicate video retrieval, we comparatively investigate existing variants of the definition of near-duplicate video, describe a generic framework, summarize state-of-the-art practices, and explore the emerging trends in this research topic.

Journal ArticleDOI
TL;DR: This article presents a comprehensive overview of current peer-to-peer solutions for massively multiplayer games using a uniform terminology.
Abstract: Scalability, fast response time, and low cost are of utmost importance in designing a successful massively multiplayer online game. The underlying architecture plays an important role in meeting these conditions. Peer-to-peer architectures, due to their distributed and collaborative nature, have low infrastructure costs and can achieve high scalability. They can also achieve fast response times by creating direct connections between players. However, these architectures face many challenges. Distributing a game among peers makes maintaining control over the game more complex. Peer-to-peer architectures also tend to be vulnerable to churn and cheating. Moreover, different genres of games have different requirements that should be met by the underlying architecture, rendering the task of designing a general-purpose architecture harder. Many peer-to-peer gaming solutions have been proposed that utilize a range of techniques while using somewhat different and confusing terminologies. This article presents a comprehensive overview of current peer-to-peer solutions for massively multiplayer games using a uniform terminology.