Institution

IEEE Computer Society

Nonprofit, Washington D.C., District of Columbia, United States
About: IEEE Computer Society is a nonprofit organization based in Washington D.C., District of Columbia, United States. It is known for research contributions in the topics Software and The Internet. The organization has 523 authors who have published 629 publications receiving 35,133 citations. It is also known as Computer Society and IEEE CS.


Papers
Journal Article
TL;DR: This paper presents a middleware platform which addresses the issue of selecting Web services for the purpose of their composition in a way that maximizes user satisfaction expressed as utility functions over QoS attributes, while satisfying the constraints set by the user and by the structure of the composite service.
Abstract: The paradigmatic shift from a Web of manual interactions to a Web of programmatic interactions driven by Web services is creating unprecedented opportunities for the formation of online business-to-business (B2B) collaborations. In particular, the creation of value-added services by composition of existing ones is gaining a significant momentum. Since many available Web services provide overlapping or identical functionality, albeit with different quality of service (QoS), a choice needs to be made to determine which services are to participate in a given composite service. This paper presents a middleware platform which addresses the issue of selecting Web services for the purpose of their composition in a way that maximizes user satisfaction expressed as utility functions over QoS attributes, while satisfying the constraints set by the user and by the structure of the composite service. Two selection approaches are described and compared: one based on local (task-level) selection of services and the other based on global allocation of tasks to services using integer programming.

2,872 citations
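
The local (task-level) selection described in the abstract above can be illustrated with a small scoring routine. This is a minimal sketch, not the paper's middleware: it assumes a simple weighted-sum utility over min-max scaled QoS attributes, and the names `normalize`, `select_locally`, and the attributes `price` and `availability` are hypothetical. The global integer-programming approach is not shown.

```python
def normalize(values, higher_is_better):
    """Min-max scale values to [0, 1], where 1 is always the preferred end."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates are equal on this attribute, so it cannot discriminate.
        return [1.0] * len(values)
    if higher_is_better:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


def select_locally(candidates, weights, higher_is_better):
    """Pick the candidate service with the highest weighted utility for one task.

    candidates: {service name: {attribute: value}}  (hypothetical layout)
    weights: {attribute: weight}, assumed to sum to 1
    higher_is_better: {attribute: bool}
    """
    names = list(candidates)
    scores = dict.fromkeys(names, 0.0)
    for attr, weight in weights.items():
        column = [candidates[name][attr] for name in names]
        scaled = normalize(column, higher_is_better[attr])
        for name, value in zip(names, scaled):
            scores[name] += weight * value
    best = max(names, key=scores.get)
    return best, scores


candidates = {
    "A": {"price": 10, "availability": 0.90},
    "B": {"price": 5, "availability": 0.80},
    "C": {"price": 8, "availability": 0.99},
}
weights = {"price": 0.5, "availability": 0.5}
higher_is_better = {"price": False, "availability": True}
best, scores = select_locally(candidates, weights, higher_is_better)
```

Because each task is scored independently, a sketch like this cannot enforce constraints that span the whole composite service, which is the case the abstract's global approach addresses.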

Journal Article
TL;DR: This paper presents an extensive set of duplicate detection algorithms that can detect approximately duplicate records in a database and covers similarity metrics that are commonly used to detect similar field entries.
Abstract: Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task. Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats, or any combination of these factors. In this paper, we present a thorough analysis of the literature on duplicate record detection. We cover similarity metrics that are commonly used to detect similar field entries, and we present an extensive set of duplicate detection algorithms that can detect approximately duplicate records in a database. We also cover multiple techniques for improving the efficiency and scalability of approximate duplicate detection algorithms. We conclude with coverage of existing tools and with a brief discussion of the big open problems in the area.

1,640 citations
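
As one concrete instance of the field-level similarity metrics mentioned in the abstract above, the sketch below computes a normalized edit-distance similarity between two strings. The choice of Levenshtein distance and the helper names `edit_distance` and `similarity` are assumptions for illustration; the abstract does not say which metrics the survey covers.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Scale the distance to [0, 1], where 1 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

Two field values could then be flagged as likely duplicates when `similarity` exceeds a chosen threshold; the threshold itself would have to be tuned per data set.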

Journal Article
TL;DR: This paper proposes a projection-based, sequential pattern-growth approach for efficient mining of sequential patterns, and shows that PrefixSpan outperforms the Apriori-based algorithm GSP, FreeSpan, and SPADE and is the fastest among all the tested algorithms.
Abstract: Sequential pattern mining is an important data mining problem with broad applications. However, it is also a difficult problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Most of the previously developed sequential pattern mining methods, such as GSP, explore a candidate generation-and-test approach [R. Agrawal et al. (1994)] to reduce the number of candidates to be examined. However, this approach may not be efficient in mining large sequence databases having numerous patterns and/or long patterns. In this paper, we propose a projection-based, sequential pattern-growth approach for efficient mining of sequential patterns. In this approach, a sequence database is recursively projected into a set of smaller projected databases, and sequential patterns are grown in each projected database by exploring only locally frequent fragments. Based on an initial study of the pattern growth-based sequential pattern mining, FreeSpan [J. Han et al. (2000)], we propose a more efficient method, called PSP, which offers ordered growth and reduced projected databases. To further improve the performance, a pseudoprojection technique is developed in PrefixSpan. A comprehensive performance study shows that PrefixSpan, in most cases, outperforms the Apriori-based algorithm GSP, FreeSpan, and SPADE [M. Zaki (2001)] (a sequential pattern mining algorithm that adopts vertical data format), and PrefixSpan integrated with pseudoprojection is the fastest among all the tested algorithms. Furthermore, this mining methodology can be extended to mining sequential patterns with user-specified constraints. The high promise of the pattern-growth approach may lead to its further extension toward efficient mining of other kinds of frequent patterns, such as frequent substructures.

1,334 citations
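
The recursive projection idea in the abstract above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: every sequence element is a single item rather than an itemset, and the projected database is kept as (sequence index, start offset) pairs, in the spirit of the pseudoprojection the abstract mentions. The function name `prefixspan` and its parameters are hypothetical.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return (pattern, support) pairs for sequences of single items."""
    results = []

    def grow(prefix, projected):
        # projected holds (sequence index, start offset) pairs, so suffixes
        # are referenced by position instead of being copied.
        counts = defaultdict(int)
        for seq_id, start in projected:
            # Count each item once per sequence (support, not frequency).
            for item in set(sequences[seq_id][start:]):
                counts[item] += 1
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue  # only locally frequent items extend the prefix
            pattern = prefix + [item]
            results.append((pattern, support))
            next_projected = []
            for seq_id, start in projected:
                try:
                    pos = sequences[seq_id].index(item, start)
                except ValueError:
                    continue  # item does not occur in this suffix
                next_projected.append((seq_id, pos + 1))
            grow(pattern, next_projected)

    grow([], [(i, 0) for i in range(len(sequences))])
    return results


db = [list("abc"), list("acb"), list("ab")]
patterns = {tuple(p): s for p, s in prefixspan(db, min_support=2)}
```

Each recursive call only scans the suffixes that follow the current prefix, so the search space shrinks as patterns grow, which is the property the abstract contrasts with candidate generation-and-test.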

01 Jan 2010
TL;DR: This short paper shows that unnormalized cross correlation can be efficiently normalized using precomputed integrals of the image and the squared image over the search window.
Abstract: Although it is well known that cross correlation can be efficiently implemented in the transform domain, the normalized form of cross correlation preferred for feature matching applications does not have a simple frequency domain expression. Normalized cross correlation has been computed in the spatial domain for this reason. This short paper shows that unnormalized cross correlation can be efficiently normalized using precomputed integrals of the image and the squared image over the search window.

1,198 citations
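
The normalization step described in the abstract above can be sketched with summed-area tables of the image and its square, which give the energy of any window in constant time. This is a minimal pure-Python sketch under assumptions: images are lists of lists of numbers, the helper names are hypothetical, and the unnormalized numerator is computed directly in the spatial domain for brevity rather than in the transform domain.

```python
import math


def integral_image(img, power=1):
    """Summed-area table with a zero border:
    s[y][x] is the sum of img[j][i] ** power for j < y and i < x."""
    h, w = len(img), len(img[0])
    s = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x] ** power
            s[y + 1][x + 1] = s[y][x + 1] + row_sum
    return s


def window_sum(s, y, x, h, w):
    """Sum over the h-by-w window whose top-left corner is (y, x)."""
    return s[y + h][x + w] - s[y][x + w] - s[y + h][x] + s[y][x]


def ncc_map(image, template):
    """Normalized cross correlation of template at every valid offset."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    n = th * tw
    t_mean = sum(map(sum, template)) / n
    t_zero = [[v - t_mean for v in row] for row in template]
    t_energy = sum(v * v for row in t_zero for v in row)
    s1 = integral_image(image, 1)  # running sums of the image
    s2 = integral_image(image, 2)  # running sums of the squared image
    out = []
    for y in range(ih - th + 1):
        out_row = []
        for x in range(iw - tw + 1):
            # Unnormalized correlation with the zero-mean template.
            num = sum(image[y + j][x + i] * t_zero[j][i]
                      for j in range(th) for i in range(tw))
            f_sum = window_sum(s1, y, x, th, tw)
            # Window energy about its own mean, from the two tables.
            f_energy = window_sum(s2, y, x, th, tw) - f_sum * f_sum / n
            denom = math.sqrt(max(f_energy, 0.0) * t_energy)
            out_row.append(num / denom if denom > 0 else 0.0)
        out.append(out_row)
    return out


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 1, 2, 3], [4, 5, 6, 7]]
template = [[6, 7], [1, 2]]  # copied from the image at row 1, column 1
scores = ncc_map(image, template)
```

Flat windows, where the denominator is zero, are assigned a score of 0.0 here; that convention is a choice made for this sketch.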

01 Jan 2003
TL;DR: A possible taxonomy for the classification of several existing and proposed model transformation approaches is proposed, described with a feature model that makes the different design choices for model transformations explicit.
Abstract: The Model-Driven Architecture is an initiative by the Object Management Group to automate the generation of platform-specific models from platform-independent models. While there exist some well-established standards for modeling platform models, there is currently no matured foundation for specifying transformations between such models. In this paper, we propose a possible taxonomy for the classification of several existing and proposed model transformation approaches. The taxonomy is described with a feature model that makes the different design choices for model transformations explicit. Based on our analysis, we propose a few major categories in which most model transformation approaches fit.

884 citations


Authors

Showing all 524 results

Name                 H-index  Papers  Citations
John Hart            108      1081    54283
Hermann Ney          98       997     49231
Jean-Pierre Hubaux   90       415     35837
George Karypis       86       471     58073
Sajal K. Das         85       1124    29785
Divesh Srivastava    82       451     25082
Jian Pei             82       465     41357
Jiri Matas           78       345     44739
Marlon Dumas         77       465     26791
Michal Irani         73       150     25714
Daniel A. Keim       72       462     27795
Tie-Yan Liu          71       569     26266
Barbara Kitchenham   71       233     26140
Jeffrey Xu Yu        69       492     17376
Hui Xiong            69       470     16776
Network Information
Related Institutions (5)
Hewlett-Packard
59.8K papers, 1.4M citations

84% related

Microsoft
86.9K papers, 4.1M citations

83% related

Intel
68.8K papers, 1.6M citations

82% related

Google
39.8K papers, 2.1M citations

82% related

IBM
253.9K papers, 7.4M citations

82% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2022  2
2021  18
2020  10
2019  12
2018  6
2017  8