
Showing papers in "ACM Computing Surveys in 2008"


Journal ArticleDOI
TL;DR: Almost 300 key theoretical and empirical contributions in the current decade related to image retrieval and automatic image annotation are surveyed, the spawning of related subfields is discussed, and the challenges of adapting existing image retrieval techniques to build systems that can be useful in the real world are examined.
Abstract: We have witnessed great interest and a wealth of promise in content-based image retrieval as an emerging technology. While the last decade laid foundation to such promise, it also paved the way for a large number of new techniques and systems, got many new people involved, and triggered stronger association of weakly related fields. In this article, we survey almost 300 key theoretical and empirical contributions in the current decade related to image retrieval and automatic image annotation, and in the process discuss the spawning of related subfields. We also discuss significant challenges involved in the adaptation of existing image retrieval techniques to build systems that can be useful in the real world. In retrospect of what has been achieved so far, we also conjecture what the future may hold for image retrieval research.

3,433 citations


Journal ArticleDOI
TL;DR: The main objective of this survey is to present the work that has been conducted in the area of graph database modeling, concentrating on data structures, query languages, and integrity constraints.
Abstract: Graph database models can be defined as those in which data structures for the schema and instances are modeled as graphs or generalizations of them, and data manipulation is expressed by graph-oriented operations and type constructors. These models took off in the eighties and early nineties alongside object-oriented models. Their influence gradually died out with the emergence of other database models, in particular geographical, spatial, semistructured, and XML. Recently, the need to manage information with graph-like nature has reestablished the relevance of this area. The main objective of this survey is to present the work that has been conducted in the area of graph database modeling, concentrating on data structures, query languages, and integrity constraints.
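The graph-oriented modeling and manipulation style the survey describes can be sketched in a few lines. This is an illustrative toy, assuming a minimal labeled directed-graph store; the class and method names are invented here, not taken from any surveyed system:

```python
from collections import defaultdict

class GraphDB:
    """Toy graph store: data is a set of labeled, directed edges,
    and manipulation happens through graph-oriented operations
    (edge insertion, label-constrained traversal)."""

    def __init__(self):
        self.edges = defaultdict(list)  # node -> [(label, target)]

    def add_edge(self, src, label, dst):
        """Graph-oriented update: insert one labeled edge."""
        self.edges[src].append((label, dst))

    def neighbours(self, node, label):
        """Graph-oriented query: targets of `label` edges out of `node`."""
        return [dst for lab, dst in self.edges[node] if lab == label]
```

For example, `db.add_edge("alice", "knows", "bob")` followed by `db.neighbours("alice", "knows")` returns `["bob"]`; query languages in the surveyed models generalize such traversals to path expressions and pattern matching.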

1,669 citations


Journal ArticleDOI
TL;DR: An introduction to the motivation and concepts of autonomic computing is provided and some research that has been seen as seminal in influencing a large proportion of early work is described, including the works that have provided significant contributions to an established reference model.
Abstract: Autonomic Computing is a concept that brings together many fields of computing with the purpose of creating computing systems that self-manage. In its early days it was criticised as being a “hype topic” or a rebadging of some Multi Agent Systems work. In this survey, we hope to show that it was indeed not ‘hype’ and that, though it draws on much work already carried out by the Computer Science and Control communities, its innovation is strong and lies in its robust application to the specific self-management of computing systems. To this end, we first provide an introduction to the motivation and concepts of autonomic computing and describe some research that has been seen as seminal in influencing a large proportion of early work. Taking the components of an established reference model in turn, we discuss the works that have provided significant contributions to each area. We then look at larger-scale systems composed of autonomic systems, illustrating the hierarchical nature of their architectures. Autonomicity is not a well-defined subject, and different systems adhere to it to different degrees; we therefore cross-slice the body of work in terms of these degrees. From this we list the key applications of autonomic computing, identify the research work that is missing, and discuss what we believe the community should be considering.

918 citations


Journal ArticleDOI
TL;DR: This survey describes and classifies top-k processing techniques in relational databases, including query models, data access methods, implementation levels, data and query certainty, and supported scoring functions, and shows the implications of each dimension on the design of the underlying techniques.
Abstract: Efficient processing of top-k queries is a crucial requirement in many interactive environments that involve massive amounts of data. In particular, efficient top-k processing in domains such as the Web, multimedia search, and distributed systems has shown a great impact on performance. In this survey, we describe and classify top-k processing techniques in relational databases. We discuss different design dimensions in the current techniques including query models, data access methods, implementation levels, data and query certainty, and supported scoring functions. We show the implications of each dimension on the design of the underlying techniques. We also discuss top-k queries in the XML domain, and show their connections to relational approaches.
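As a minimal illustration of the problem these techniques optimize, here is a bounded-heap top-k selection over scored items. This is a generic sketch, not one of the surveyed algorithms, which crucially avoid scoring every input by exploiting sorted or indexed access paths:

```python
import heapq

def top_k(items, score, k):
    """Return the k highest-scoring items, highest first.

    Keeps at most k entries in a min-heap, so memory is O(k)
    even when `items` is a large stream.
    """
    heap = []  # min-heap of (score, index, item); index breaks ties
    for i, item in enumerate(items):
        s = score(item)
        if len(heap) < k:
            heapq.heappush(heap, (s, i, item))
        elif s > heap[0][0]:  # beats the current k-th best
            heapq.heapreplace(heap, (s, i, item))
    return [item for s, i, item in sorted(heap, reverse=True)]
```

For instance, `top_k([3, 1, 4, 1, 5, 9, 2, 6], lambda x: x, 3)` yields `[9, 6, 5]`. The surveyed relational techniques aim to produce the same answer while reading as few tuples as possible.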

893 citations


Journal ArticleDOI
TL;DR: An overview of techniques based on dynamic analysis that are used to analyze potentially malicious samples and analysis programs that employ these techniques to assist human analysts in assessing whether a given sample deserves closer manual inspection due to its unknown malicious behavior is provided.
Abstract: Anti-virus vendors are confronted with a multitude of potentially malicious samples today. Receiving thousands of new samples every day is not uncommon. The signatures that detect confirmed malicious threats are still mainly created manually, so it is important to discriminate between samples that pose a new unknown threat and those that are mere variants of known malware. This survey article provides an overview of techniques based on dynamic analysis that are used to analyze potentially malicious samples. It also covers analysis programs that employ these techniques to assist human analysts in assessing, in a timely and appropriate manner, whether a given sample deserves closer manual inspection due to its unknown malicious behavior.

815 citations


Journal ArticleDOI
TL;DR: This work develops a framework for organizing the literature based on the input-mediator-output-input (IMOI) model from the small groups literature, and suggests topics for future research.
Abstract: We review the empirical research on Free/Libre and Open-Source Software (FLOSS) development and assess the state of the literature. We develop a framework for organizing the literature based on the input-mediator-output-input (IMOI) model from the small groups literature. We present a quantitative summary of articles selected for the review and then discuss findings of this literature categorized into issues pertaining to inputs (e.g., member characteristics, technology use, and project characteristics), processes (software development practices, social processes, and firm involvement practices), emergent states (e.g., social states and task-related states), and outputs (e.g. team performance, FLOSS implementation, and project evolution). Based on this review, we suggest topics for future research, as well as identify methodological and theoretical issues for future inquiry in this area, including issues relating to sampling and the need for more longitudinal studies.

466 citations


Journal ArticleDOI
TL;DR: In this paper, state-of-the-art sequential 2D EDT algorithms are reviewed and compared, in an effort to reach more solid conclusions regarding their differences in speed and their exactness.
Abstract: The distance transform (DT) is a general operator forming the basis of many methods in computer vision and geometry, with great potential for practical applications. However, all the optimal algorithms for the computation of the exact Euclidean DT (EDT) were proposed only since the 1990s. In this work, state-of-the-art sequential 2D EDT algorithms are reviewed and compared, in an effort to reach more solid conclusions regarding their differences in speed and their exactness. Six of the best algorithms were fully implemented and compared in practice.
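The exact EDT the survey studies is easy to state as a brute-force reference implementation. The sketch below computes it in O(pixels × feature-pixels), which is only a correctness baseline; the six optimal sequential algorithms the survey compares reach the same result far faster:

```python
import math

def exact_edt(grid):
    """Exact Euclidean distance transform of a binary image.

    grid[y][x] == 1 marks feature (foreground) pixels. Returns, for
    every pixel, the Euclidean distance to the nearest feature pixel.
    Brute force: compares each pixel against all feature pixels.
    """
    h, w = len(grid), len(grid[0])
    feats = [(y, x) for y in range(h) for x in range(w) if grid[y][x]]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = min(math.hypot(y - fy, x - fx) for fy, fx in feats)
    return out
```

Because every distance is computed directly from the definition, a reference like this is useful for checking the exactness claims of the faster algorithms on small images.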

451 citations


Journal ArticleDOI
TL;DR: A tutorial overview of the state of the art of statistical machine translation, which describes the context of the current research and presents a taxonomy of some different approaches within the main subproblems: translation modeling, parameter estimation, and decoding.
Abstract: Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and new ideas are constantly introduced. This survey presents a tutorial overview of the state of the art. We describe the context of the current research and then move to a formal problem description and an overview of the main subproblems: translation modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and a discussion of future directions.
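The parameter-estimation and decoding subproblems can be caricatured at the single-word level. This maximum-likelihood sketch assumes word-aligned training pairs are already given; real SMT systems instead learn alignments from unaligned sentence pairs with EM (e.g., the IBM models) and decode whole sentences under a combined translation and language model:

```python
from collections import Counter, defaultdict

def estimate_table(aligned_pairs):
    """Parameter estimation: relative-frequency t(e|f) from
    (foreign_word, english_word) alignment links."""
    counts = defaultdict(Counter)
    for f, e in aligned_pairs:
        counts[f][e] += 1
    return {f: {e: c / sum(cnt.values()) for e, c in cnt.items()}
            for f, cnt in counts.items()}

def decode_word(table, f):
    """'Decoding' for a single word: pick argmax_e t(e|f)."""
    return max(table[f], key=table[f].get)
```

With training links like `("maison", "house")` seen twice and `("maison", "home")` once, the table assigns t(house|maison) = 2/3 and the decoder picks "house"; the sentence-level problem adds reordering and language-model scores on top of this.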

275 citations


Journal ArticleDOI
TL;DR: The aim of this survey is to provide a clear vision of what has been developed so far, focusing on methods that make use of theoretical frameworks developed for classes of real functions rather than for a single function, even if they are applied in a restricted manner.
Abstract: Differential topology, and specifically Morse theory, provide a suitable setting for formalizing and solving several problems related to shape analysis. The fundamental idea behind Morse theory is that of combining the topological exploration of a shape with quantitative measurement of geometrical properties provided by a real function defined on the shape. The added value of approaches based on Morse theory is in the possibility of adopting different functions as shape descriptors according to the properties and invariants that one wishes to analyze. In this sense, Morse theory allows one to construct a general framework for shape characterization, parametrized with respect to the mapping function used, and possibly the space associated with the shape. The mapping function plays the role of a lens through which we look at the properties of the shape, and different functions provide different insights. In the last decade, an increasing number of methods that are rooted in Morse theory and make use of properties of real-valued functions for describing shapes have been proposed in the literature. The methods proposed range from approaches which use the configuration of contours for encoding topographic surfaces to more recent work on size theory and persistent homology. All these have been developed over the years with a specific target domain, and it is not trivial to systematize this work and understand the links, similarities, and differences among the different methods. Moreover, different terms have been used to denote the same mathematical constructs, which often overwhelms the understanding of the underlying common framework. The aim of this survey is to provide a clear vision of what has been developed so far, focusing on methods that make use of theoretical frameworks that are developed for classes of real functions rather than for a single function, even if they are applied in a restricted manner. The term geometrical-topological used in the title is meant to underline that both levels of information content are relevant for the applications of shape descriptions: geometrical, or metrical, properties and attributes are crucial for characterizing specific instances of features, while topological properties are necessary to abstract and classify shapes according to invariant aspects of their geometry. The approaches surveyed will be discussed in detail, with respect to theory, computation, and application. Several properties of the shape descriptors will be analyzed and compared. We believe this is a crucial step to exploit fully the potential of such approaches in many applications, as well as to identify important areas of future research.

231 citations


Journal ArticleDOI
TL;DR: A process-centered template is used for summarizing the object-oriented software development methodologies, highlighting the activities prescribed in the methodology while describing the modeling languages used (mainly diagrams and tables) as secondary to the activities.
Abstract: We provide a detailed review of existing object-oriented software development methodologies, focusing on their development processes. The review aims at laying bare their core philosophies, processes, and internal activities. This is done by using a process-centered template for summarizing the methodologies, highlighting the activities prescribed in the methodology while describing the modeling languages used (mainly diagrams and tables) as secondary to the activities. The descriptions produced using this template aim not to offer a critique on the methodologies and processes, but instead provide an abstract and structured description in a way that facilitates their elaborate analysis for the purposes of improving understanding, and making it easier to tailor, select, and evaluate the processes.

118 citations


Journal ArticleDOI
TL;DR: This survey focuses on emerging approaches to spam filtering built on recent developments in computing technologies, which include peer-to-peer computing, grid computing, semantic Web, and social networks.
Abstract: From just an annoying characteristic of the electronic mail epoch, spam has evolved into a costly, resource- and time-consuming problem. In this survey, we focus on emerging approaches to spam filtering built on recent developments in computing technologies. These include peer-to-peer computing, grid computing, the semantic Web, and social networks. We also address a number of perspectives related to personalization and privacy in spam filtering. We conclude that, while important advancements have been made in spam filtering in recent years, high-performance approaches remain to be explored due to the large scale of the problem.

Journal ArticleDOI
TL;DR: This paper surveys the modern state of the art in trust management authorization, focusing on features of policy and rights languages that provide the necessary expressiveness for modern practice.
Abstract: Trust management systems are frameworks for authorization in modern distributed systems, allowing remotely accessible resources to be protected by providers. By allowing providers to specify policy, and access requesters to possess certain access rights, trust management automates the process of determining whether access should be allowed on the basis of policy, rights, and an authorization semantics. In this paper we survey the modern state of the art in trust management authorization, focusing on features of policy and rights languages that provide the necessary expressiveness for modern practice. We characterize systems in light of a generic structure that takes into account components of practical implementations. We emphasize systems that have a formal foundation, since their security properties can be rigorously guaranteed. Underlying formalisms are reviewed to provide the necessary background.

Journal ArticleDOI
TL;DR: This research work presents a detailed framework to evaluate collaborative systems according to given variables and performance levels and assumes that evaluation is an evolving process during the system lifecycle.
Abstract: Collaborative systems evaluation is always necessary to determine the impact a solution will have on the individuals, groups, and the organization. Several methods of evaluation have been proposed. These methods comprise a variety of approaches with various goals. Thus, the need for a strategy to select the most appropriate method for a specific case is clear. This research work presents a detailed framework to evaluate collaborative systems according to given variables and performance levels. The proposal assumes that evaluation is an evolving process during the system lifecycle. Therefore, the framework, illustrated with two examples, is complemented with a collection of guidelines to evaluate collaborative systems according to product development status.

Journal ArticleDOI
TL;DR: This work provides a survey of decentralized access control mechanisms in distributed file systems intended to scale in both administrative domains and users, and identifies essential properties of such access control mechanisms.
Abstract: The Internet enables global sharing of data across organizational boundaries. Distributed file systems facilitate data sharing in the form of remote file access. However, traditional access control mechanisms used in distributed file systems are intended for machines under common administrative control, and rely on maintaining a centralized database of user identities. They fail to scale to a large user base distributed across multiple organizations. We provide a survey of decentralized access control mechanisms in distributed file systems designed to scale in both administrative domains and users. We identify essential properties of such access control mechanisms. We analyze both popular production and experimental distributed file systems in the context of our survey.

Journal ArticleDOI
TL;DR: Two Web framework taxonomies are proposed, reflecting two orthogonal ways of characterizing a framework: the way in which the markup language content of a browser-destined document is specified in the framework and the framework's facilities for the user to control the flow of events between browser and server.
Abstract: Most contemporary Web frameworks may be classified as server-centric. An overview of such Web frameworks is presented. It is based on information gleaned from surveying 80 server-centric Web frameworks, as well as from popular related specifications. Requirements typically expected of a server-centric Web framework are discussed. Two Web framework taxonomies are proposed, reflecting two orthogonal ways of characterizing a framework: the way in which the markup language content of a browser-destined document is specified in the framework (presentation concerns); and the framework's facilities for the user to control the flow of events between browser and server (control concerns).

Journal ArticleDOI
TL;DR: In this article, the authors define a "base" of structural attributes with which application-level fault-tolerance structures can be qualitatively assessed and compared with each other and with respect to the aforementioned needs.
Abstract: Structures for the expression of fault-tolerance provisions in application software comprise the central topic of this article. Structuring techniques answer questions as to how to incorporate fault tolerance in the application layer of a computer program and how to manage the fault-tolerant code. As such, they provide the means to control complexity, the latter being a relevant factor for the introduction of design faults. This fact and the ever-increasing complexity of today's distributed software justify the need for simple, coherent, and effective structures for the expression of fault-tolerance in the application software. In this text we first define a “base” of structural attributes with which application-level fault-tolerance structures can be qualitatively assessed and compared with each other and with respect to the aforementioned needs. This result is then used to provide an elaborated survey of the state-of-the-art of application-level fault-tolerance structures.