Institution
IEEE Computer Society
Nonprofit • Washington D.C., District of Columbia, United States
About: IEEE Computer Society is a nonprofit organization based in Washington D.C., District of Columbia, United States. It is known for research contributions in the topics Software and The Internet. The organization has 523 authors who have published 629 publications receiving 35133 citations. The organization is also known as Computer Society and IEEE CS.
Topics: Software, The Internet, Software system, Visualization, Cluster analysis
Papers
TL;DR: This paper presents a middleware platform which addresses the issue of selecting Web services for the purpose of their composition in a way that maximizes user satisfaction expressed as utility functions over QoS attributes, while satisfying the constraints set by the user and by the structure of the composite service.
Abstract: The paradigmatic shift from a Web of manual interactions to a Web of programmatic interactions driven by Web services is creating unprecedented opportunities for the formation of online business-to-business (B2B) collaborations. In particular, the creation of value-added services by composition of existing ones is gaining a significant momentum. Since many available Web services provide overlapping or identical functionality, albeit with different quality of service (QoS), a choice needs to be made to determine which services are to participate in a given composite service. This paper presents a middleware platform which addresses the issue of selecting Web services for the purpose of their composition in a way that maximizes user satisfaction expressed as utility functions over QoS attributes, while satisfying the constraints set by the user and by the structure of the composite service. Two selection approaches are described and compared: one based on local (task-level) selection of services and the other based on global allocation of tasks to services using integer programming.
2,872 citations
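The local (task-level) selection mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's middleware: the attribute names, the min-max scaling, and the weighted-sum utility are all assumptions chosen for the example, since the abstract only states that utility functions are defined over QoS attributes.

```python
from typing import Dict, List

# Hypothetical QoS attributes. Lower is better for these two;
# every other attribute is treated as higher-is-better.
LOWER_IS_BETTER = {"price", "latency"}


def utility(candidate: Dict, candidates: List[Dict], weights: Dict[str, float]) -> float:
    """Weighted sum of min-max scaled QoS attributes (an assumed utility form)."""
    score = 0.0
    for attr, weight in weights.items():
        values = [c[attr] for c in candidates]
        lo, hi = min(values), max(values)
        if hi == lo:
            scaled = 1.0  # all candidates are equal on this attribute
        elif attr in LOWER_IS_BETTER:
            scaled = (hi - candidate[attr]) / (hi - lo)
        else:
            scaled = (candidate[attr] - lo) / (hi - lo)
        score += weight * scaled
    return score


def select_locally(tasks: Dict[str, List[Dict]], weights: Dict[str, float]) -> Dict[str, str]:
    """For each task, independently pick the candidate service with the highest utility."""
    return {
        task: max(cands, key=lambda c: utility(c, cands, weights))["name"]
        for task, cands in tasks.items()
    }


# Hypothetical candidate services for a single task.
tasks = {
    "book_flight": [
        {"name": "A", "price": 10, "latency": 100, "availability": 0.90},
        {"name": "B", "price": 20, "latency": 50, "availability": 0.99},
    ]
}
weights = {"price": 0.6, "latency": 0.2, "availability": 0.2}
selection = select_locally(tasks, weights)
```

Because each task is decided on its own, a sketch like this cannot enforce constraints that span the whole composite service; the abstract assigns that case to the global, integer-programming approach.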
TL;DR: This paper presents an extensive set of duplicate detection algorithms that can detect approximately duplicate records in a database and covers similarity metrics that are commonly used to detect similar field entries.
Abstract: Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task. Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats, or any combination of these factors. In this paper, we present a thorough analysis of the literature on duplicate record detection. We cover similarity metrics that are commonly used to detect similar field entries, and we present an extensive set of duplicate detection algorithms that can detect approximately duplicate records in a database. We also cover multiple techniques for improving the efficiency and scalability of approximate duplicate detection algorithms. We conclude with coverage of existing tools and with a brief discussion of the big open problems in the area.
1,640 citations
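As an illustration of the field-level similarity metrics the abstract above refers to, here is a minimal sketch built on edit (Levenshtein) distance, one commonly used character-based metric. The record layout, the averaging of per-field scores, and the 0.8 threshold are assumptions made for this example, not details taken from the survey.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1], where 1.0 means identical after normalisation."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def likely_duplicates(rec1: dict, rec2: dict, threshold: float = 0.8) -> bool:
    """Flag two records when their shared fields are, on average, similar enough.

    The threshold is a hypothetical value; in practice it would be tuned per dataset.
    """
    fields = rec1.keys() & rec2.keys()
    if not fields:
        return False
    average = sum(similarity(rec1[f], rec2[f]) for f in fields) / len(fields)
    return average >= threshold
```

For example, `likely_duplicates({"name": "Jon Smith", "city": "Boston"}, {"name": "John Smith", "city": "Boston"})` flags the pair even though the two records share no exact key.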
TL;DR: This paper proposes a projection-based, sequential pattern-growth approach for efficient mining of sequential patterns, and shows that PrefixSpan outperforms the Apriori-based algorithm GSP, FreeSpan, and SPADE and is the fastest among all the tested algorithms.
Abstract: Sequential pattern mining is an important data mining problem with broad applications. However, it is also a difficult problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Most of the previously developed sequential pattern mining methods, such as GSP, explore a candidate generation-and-test approach [R. Agrawal et al. (1994)] to reduce the number of candidates to be examined. However, this approach may not be efficient in mining large sequence databases having numerous patterns and/or long patterns. In this paper, we propose a projection-based, sequential pattern-growth approach for efficient mining of sequential patterns. In this approach, a sequence database is recursively projected into a set of smaller projected databases, and sequential patterns are grown in each projected database by exploring only locally frequent fragments. Based on an initial study of the pattern growth-based sequential pattern mining, FreeSpan [J. Han et al. (2000)], we propose a more efficient method, called PSP, which offers ordered growth and reduced projected databases. To further improve the performance, a pseudoprojection technique is developed in PrefixSpan. A comprehensive performance study shows that PrefixSpan, in most cases, outperforms the Apriori-based algorithm GSP, FreeSpan, and SPADE [M. Zaki (2001)] (a sequential pattern mining algorithm that adopts vertical data format), and PrefixSpan integrated with pseudoprojection is the fastest among all the tested algorithms. Furthermore, this mining methodology can be extended to mining sequential patterns with user-specified constraints. The high promise of the pattern-growth approach may lead to its further extension toward efficient mining of other kinds of frequent patterns, such as frequent substructures.
1,334 citations
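The recursive projection described in the abstract above can be sketched in a few lines. This is a simplified illustration under stated assumptions: each sequence element is a single item rather than an itemset, projected databases are copied physically (no pseudoprojection), and the function name `prefixspan` is chosen here for the example.

```python
from typing import List, Sequence, Tuple


def prefixspan(sequences: Sequence[Sequence[str]], min_support: int) -> List[Tuple[List[str], int]]:
    """Return (pattern, support) pairs for all frequent sequential patterns.

    Simplification: every element of a sequence is one item, not an itemset.
    """
    results: List[Tuple[List[str], int]] = []

    def grow(prefix: List[str], projected: List[List[str]]) -> None:
        # Count, for each item, how many projected suffixes contain it.
        counts = {}
        for suffix in projected:
            for item in set(suffix):
                counts[item] = counts.get(item, 0) + 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue  # only locally frequent items extend the prefix
            pattern = prefix + [item]
            results.append((pattern, support))
            # Project: keep what follows the first occurrence of the item.
            next_projected = [s[s.index(item) + 1:] for s in projected if item in s]
            grow(pattern, next_projected)

    grow([], [list(s) for s in sequences])
    return results


# Three toy sequences; patterns must appear in at least two of them.
patterns = prefixspan(["abc", "ac", "bc"], min_support=2)
```

Each recursive call only scans the suffixes that already contain the current prefix, which is the sense in which patterns are grown from locally frequent fragments instead of from generated candidates.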
01 Jan 2010
TL;DR: This short paper shows that unnormalized cross correlation can be efficiently normalized using precomputed integrals of the image and image² over the search window.
Abstract: Although it is well known that cross correlation can be efficiently implemented in the transform domain, the normalized form of cross correlation preferred for feature matching applications does not have a simple frequency domain expression. Normalized cross correlation has been computed in the spatial domain for this reason. This short paper shows that unnormalized cross correlation can be efficiently normalized using precomputed integrals of the image and image² over the search window.
1,198 citations
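The normalization idea in the abstract above can be sketched with summed-area tables of the image and of its square, which give the sum and energy of any window in constant time. This is a minimal pure-Python illustration, not the paper's implementation: the function names are hypothetical, and the numerator is computed directly in the spatial domain for brevity rather than in the transform domain.

```python
from math import sqrt
from typing import List

Grid = List[List[float]]


def summed_area(img: Grid) -> Grid:
    """Summed-area table with an extra zero row and column at the top and left."""
    height, width = len(img), len(img[0])
    table = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_total = 0.0
        for x in range(width):
            row_total += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_total
    return table


def window_sum(table: Grid, y: int, x: int, h: int, w: int) -> float:
    """Sum over the h-by-w window whose top-left corner is (y, x), in O(1)."""
    return table[y + h][x + w] - table[y][x + w] - table[y + h][x] + table[y][x]


def ncc(image: Grid, template: Grid) -> Grid:
    """Normalized cross correlation of the template at every valid offset."""
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    n = h * w
    t_mean = sum(map(sum, template)) / n
    t0 = [[v - t_mean for v in row] for row in template]  # zero-mean template
    t_energy = sum(v * v for row in t0 for v in row)

    s1 = summed_area(image)                                # integral of image
    s2 = summed_area([[v * v for v in row] for row in image])  # integral of image squared

    out = []
    for y in range(H - h + 1):
        out_row = []
        for x in range(W - w + 1):
            # Unnormalized correlation with the zero-mean template.
            numer = sum(image[y + i][x + j] * t0[i][j] for i in range(h) for j in range(w))
            f_sum = window_sum(s1, y, x, h, w)
            f_energy = window_sum(s2, y, x, h, w) - f_sum * f_sum / n
            denom = sqrt(max(f_energy, 0.0) * t_energy)
            out_row.append(numer / denom if denom > 1e-12 else 0.0)
        out.append(out_row)
    return out


image = [
    [1, 2, 3, 4, 5],
    [5, 3, 8, 1, 2],
    [2, 7, 4, 9, 6],
    [6, 1, 5, 2, 8],
]
template = [[8, 1], [4, 9]]  # cropped from the image at row 1, column 2
scores = ncc(image, template)
```

Because the template is made zero-mean, the window mean drops out of the numerator, so only the denominator needs the per-window sums that the two tables provide.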
01 Jan 2003
TL;DR: A possible taxonomy for the classification of several existing and proposed model transformation approaches is proposed, described with a feature model that makes the different design choices for model transformations explicit.
Abstract: The Model-Driven Architecture is an initiative by the Object Management Group to automate the generation of platform-specific models from platform-independent models. While there exist some well-established standards for modeling platform models, there is currently no matured foundation for specifying transformations between such models. In this paper, we propose a possible taxonomy for the classification of several existing and proposed model transformation approaches. The taxonomy is described with a feature model that makes the different design choices for model transformations explicit. Based on our analysis, we propose a few major categories in which most model transformation approaches fit.
884 citations
Authors
Showing all 524 results
Name | H-index | Papers | Citations |
---|---|---|---|
John Hart | 108 | 1081 | 54283 |
Hermann Ney | 98 | 997 | 49231 |
Jean-Pierre Hubaux | 90 | 415 | 35837 |
George Karypis | 86 | 471 | 58073 |
Sajal K. Das | 85 | 1124 | 29785 |
Divesh Srivastava | 82 | 451 | 25082 |
Jian Pei | 82 | 465 | 41357 |
Jiri Matas | 78 | 345 | 44739 |
Marlon Dumas | 77 | 465 | 26791 |
Michal Irani | 73 | 150 | 25714 |
Daniel A. Keim | 72 | 462 | 27795 |
Tie-Yan Liu | 71 | 569 | 26266 |
Barbara Kitchenham | 71 | 233 | 26140 |
Jeffrey Xu Yu | 69 | 492 | 17376 |
Hui Xiong | 69 | 470 | 16776 |