
Showing papers on "Field (computer science)" published in 2005


Journal ArticleDOI
TL;DR: Clustering algorithms for data sets appearing in statistics, computer science, and machine learning are surveyed, and their applications to some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts, are illustrated.
Abstract: Data analysis plays an indispensable role in understanding various phenomena. Cluster analysis, exploratory analysis carried out with little or no prior knowledge, comprises research developed across a wide variety of communities. This diversity, on the one hand, equips us with many tools; on the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications to some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several closely related topics, proximity measures and cluster validation, are also discussed.

5,744 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: A framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks, developed using a Products-of-Experts framework.
Abstract: We develop a framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks. The approach extends traditional Markov random field (MRF) models by learning potential functions over extended pixel neighborhoods. Field potentials are modeled using a Products-of-Experts framework that exploits nonlinear functions of many linear filter responses. In contrast to previous MRF approaches, all parameters, including the linear filters themselves, are learned from training data. We demonstrate the capabilities of this Field of Experts model with two example applications, image denoising and image inpainting, which are implemented using a simple, approximate inference scheme. While the model is trained on a generic image database and is not tuned toward a specific application, we obtain results that compete with and even outperform specialized techniques.
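For readers unfamiliar with the form of such priors, a Field-of-Experts density of the kind described above is typically written as a product of experts applied to linear filter responses over all overlapping cliques; the Student-t expert shape shown here is the common choice, and the filters J_i and exponents alpha_i are the quantities learned from data (the notation below is generic, not the paper's trained values):

    \[
      p(\mathbf{x}) \;=\; \frac{1}{Z(\Theta)} \prod_{k \in \mathcal{C}} \prod_{i=1}^{N}
        \phi\bigl(J_i^{\top}\mathbf{x}_{(k)};\, \alpha_i\bigr),
      \qquad
      \phi(y;\alpha) \;=\; \Bigl(1 + \tfrac{1}{2}\,y^{2}\Bigr)^{-\alpha},
    \]

where x_(k) denotes the pixels of clique k, the J_i are the learned linear filters, and Z(Theta) is the partition function.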

1,167 citations


Journal ArticleDOI
01 Jun 2005
TL;DR: This review paper presents the state-of-the-art in data stream mining, which is concerned with extracting knowledge structures represented in models and patterns from non-stopping streams of information.
Abstract: Recent advances in hardware and software have enabled the capture of different measurements of data in a wide range of fields. These measurements are generated continuously and at very high, fluctuating data rates. Examples include sensor networks, web logs, and computer network traffic. The storage, querying and mining of such data sets are highly computationally challenging tasks. Mining data streams is concerned with extracting knowledge structures represented in models and patterns from non-stopping streams of information. Research in data stream mining has attracted much attention due to the importance of its applications and the increasing generation of streaming information. Applications of data stream analysis can vary from critical scientific and astronomical applications to important business and financial ones. Algorithms, systems and frameworks that address streaming challenges have been developed over the past three years. In this review paper, we present the state-of-the-art in this growing vital field.

999 citations


Proceedings Article
30 Jul 2005
TL;DR: Machine learning algorithms for text categorization are enhanced with generated features based on domain-specific and common-sense knowledge, addressing the two main problems of natural language processing--synonymy and polysemy.
Abstract: We enhance machine learning algorithms for text categorization with generated features based on domain-specific and common-sense knowledge. This knowledge is represented using publicly available ontologies that contain hundreds of thousands of concepts, such as the Open Directory; these ontologies are further enriched by several orders of magnitude through controlled Web crawling. Prior to text categorization, a feature generator analyzes the documents and maps them onto appropriate ontology concepts, which in turn induce a set of generated features that augment the standard bag of words. Feature generation is accomplished through contextual analysis of document text, implicitly performing word sense disambiguation. Coupled with the ability to generalize concepts using the ontology, this approach addresses the two main problems of natural language processing--synonymy and polysemy. Categorizing documents with the aid of knowledge-based features leverages information that cannot be deduced from the documents alone. Experimental results confirm improved performance, breaking through the plateau previously reached in the field.
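As a rough sketch of the feature-generation step described above (the tiny concept index below is a hypothetical stand-in for the paper's large, Web-enriched ontologies such as the Open Directory):

    # Minimal sketch: augment a bag-of-words representation with
    # ontology-concept features generated from the document's terms.
    from collections import Counter

    CONCEPT_INDEX = {            # hypothetical term -> concept mapping
        "goalkeeper": "Sports/Soccer",
        "midfielder": "Sports/Soccer",
        "transistor": "Science/Electronics",
    }

    def generate_features(document_text):
        words = document_text.lower().split()
        bag_of_words = Counter(words)
        # Contextual analysis is reduced to direct term lookup here; the real
        # feature generator maps whole passages onto ranked ontology concepts.
        concept_features = Counter(
            "CONCEPT=" + CONCEPT_INDEX[w] for w in words if w in CONCEPT_INDEX
        )
        return bag_of_words + concept_features   # generated features augment the bag of words

    print(generate_features("the goalkeeper and the midfielder trained hard"))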

278 citations


Book
21 Nov 2005
TL;DR: This book provides an extensive survey of the R-tree evolution, studying the applicability of the structure and its variations to efficient query processing, accurate proposed cost models, and implementation issues like concurrency control and parallelism.
Abstract: Space support in databases poses new challenges in every part of a database management system, and the capability of spatial support in the physical layer is considered very important. This has led to the design of spatial access methods to enable the effective and efficient management of spatial objects. R-trees have a simplicity of structure and, together with their resemblance to the B-tree, allow developers to incorporate them easily into existing database management systems for the support of spatial query processing. This book provides an extensive survey of the R-tree evolution, studying the applicability of the structure and its variations to efficient query processing, accurate proposed cost models, and implementation issues like concurrency control and parallelism. Written for database researchers, designers and programmers as well as graduate students, this comprehensive monograph will be a welcome addition to the field.
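As a minimal illustration of the kind of structure the book surveys, here is a toy R-tree range search over minimum bounding rectangles (MBRs); real R-trees add insertion, node splitting, fill factors, and the many variants the book covers:

    # Toy R-tree range query: descend only into children whose MBR
    # intersects the query rectangle. Rectangles are (xmin, ymin, xmax, ymax).

    def intersects(a, b):
        return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

    class Node:
        def __init__(self, mbr, children=None, entries=None):
            self.mbr = mbr                     # bounding box of everything below
            self.children = children or []     # internal node: child Nodes
            self.entries = entries or []       # leaf node: (mbr, object_id) pairs

    def range_query(node, query):
        results = []
        if not intersects(node.mbr, query):
            return results
        for mbr, obj in node.entries:
            if intersects(mbr, query):
                results.append(obj)
        for child in node.children:
            results.extend(range_query(child, query))
        return results

    leaf = Node((0, 0, 10, 10), entries=[((1, 1, 2, 2), "a"), ((8, 8, 9, 9), "b")])
    root = Node((0, 0, 10, 10), children=[leaf])
    print(range_query(root, (0, 0, 3, 3)))     # -> ['a']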

247 citations


01 Jan 2005
TL;DR: This paper traces some of the recent progress in the field of learning from imbalanced data, reviews the approaches adopted for this problem, and identifies challenges and future directions in this relatively new field.
Abstract: This paper traces some of the recent progress in the field of learning from imbalanced data. It reviews the approaches adopted for this problem, identifies challenges, and points out future directions in this relatively new field.
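One of the simplest rebalancing approaches reviewed in this literature is random oversampling of the minority class; the sketch below is a generic baseline of that kind, not a method proposed by the paper:

    # Random oversampling: duplicate minority-class examples until the
    # class counts match. A deliberately simple baseline from this field.
    import random

    def oversample(examples, labels, minority_label, seed=0):
        rng = random.Random(seed)
        minority = [(x, y) for x, y in zip(examples, labels) if y == minority_label]
        majority = [(x, y) for x, y in zip(examples, labels) if y != minority_label]
        extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
        balanced = majority + minority + extra
        rng.shuffle(balanced)
        xs, ys = zip(*balanced)
        return list(xs), list(ys)

    xs, ys = oversample([1, 2, 3, 4, 5], [0, 0, 0, 0, 1], minority_label=1)
    print(ys.count(0), ys.count(1))   # -> 4 4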

190 citations


Patent
29 Apr 2005
TL;DR: The present invention pertains to the field of computer software and relates to populating, indexing, and searching a database of fine-grained web objects or object specifications.
Abstract: The present invention pertains to the field of computer software. More specifically, the present invention relates to populating, indexing, and searching a database of fine-grained web objects or object specifications.

187 citations


Proceedings ArticleDOI
03 Jan 2005
TL;DR: PET, a personalized trust model, is proposed to help construct good cooperation, especially in the context of economic-based solutions for P2P resource sharing; it models risk as the opinion of short-term trustworthiness and combines it with traditional reputation evaluation to derive trustworthiness in this field.
Abstract: Building good cooperation in P2P resource sharing is a fundamental and challenging research topic because of peer anonymity, peer independence, high dynamics of peer behaviors and network conditions, and the absence of an effective security mechanism. In this paper, we propose PET, a personalized trust model, to help construct good cooperation, especially in the context of economic-based solutions for P2P resource sharing. The trust model consists of two parts: reputation evaluation and risk evaluation. Reputation is the accumulative assessment of long-term behavior, while risk evaluation is the opinion of short-term behavior. The risk part is employed to deal with the dramatic spoiling of peers, which distinguishes PET from other trust models that are based on reputation only. This paper contributes by first modeling risk as the opinion of short-term trustworthiness and then combining it with traditional reputation evaluation to derive trustworthiness in this field.
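The abstract does not give PET's exact evaluation functions, so the sketch below only illustrates the general idea of combining a long-term reputation score with a short-term risk score; the specific functions and the mixing weight are assumptions for illustration:

    # Illustrative only: combine long-term reputation with short-term risk.
    # PET's actual evaluation functions are defined in the paper, not here.

    def reputation(history):
        """Accumulative assessment over the peer's whole interaction history (1 = good, 0 = bad)."""
        return sum(history) / len(history) if history else 0.0

    def risk(recent):
        """Opinion of short-term behaviour: fraction of recent bad interactions."""
        return recent.count(0) / len(recent) if recent else 0.0

    def trustworthiness(history, recent, weight=0.7):
        # weight is a hypothetical mixing parameter between the two parts
        return weight * reputation(history) + (1 - weight) * (1 - risk(recent))

    # A peer with a good long-term record that suddenly starts misbehaving:
    print(trustworthiness(history=[1] * 50, recent=[0, 0, 1, 0]))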

172 citations


Book
Linda Cohen1, Allie Young1
14 Nov 2005
TL;DR: Based on extensive, multi-year research, the authors unveil a new operational model that seamlessly blends internally and externally delivered services not just to cut costs or gain efficiencies, but to maximize growth, agility, and bottom-line results.
Abstract: Over the last decade, the number of services that can be outsourced has grown exponentially. Yet, research suggests that 50% of outsourcing contracts signed during the last three years will fail to meet expectations. Gartner sourcing experts Linda Cohen and Allie Young argue that this is because most organizations are utilizing ad-hoc approaches to outsourcing that are both shortsighted and ineffective. Based on extensive, multiyear research, this book unveils a new operational model--multisourcing--that seamlessly blends internally and externally delivered services not just to cut costs or gain efficiencies, but to maximize growth, agility, and bottom-line results. Through practical frameworks and illustrative company examples, the authors guide managers in creating a customized plan for managed multisourcing, including how to: assess their current sourcing strategy, strike the right types of sourcing deals, set up effective governance systems, select and evaluate service providers, and measure progress. A new approach to a timely business issue from leading experts in the field, Multisourcing presents a roadmap managers can follow to position their firms as tomorrow's industry leaders.

168 citations


Book
05 May 2005
TL;DR: This book sheds light on the principles behind the relational model, which is fundamental to all database-backed applications--and, consequently, most of the work that goes on in the computing world today.
Abstract: This book sheds light on the principles behind the relational model, which is fundamental to all database-backed applications--and, consequently, most of the work that goes on in the computing world today. Database in Depth: The Relational Model for Practitioners goes beyond the hype and gets to the heart of how relational databases actually work. Ideal for experienced database developers and designers, this concise guide gives you a clear view of the technology--a view that's not influenced by any vendor or product. Featuring an extensive set of exercises, it will help you: understand why and how the relational model is still directly relevant to modern database technology (and will remain so for the foreseeable future); see why and how the SQL standard is seriously deficient; use the best current theoretical knowledge in the design of your databases and database applications; and make informed decisions in your daily database professional activities. Database in Depth will appeal not only to database developers and designers, but also to a diverse field of professionals and academics, including database administrators (DBAs), information modelers, database consultants, and more. Virtually everyone who deals with relational databases should have at least a passing understanding of the fundamentals of working with relational models. Author C.J. Date has been involved with the relational model from its earliest days. An exceptionally clear-thinking writer, Date lays out principle and theory in a manner that is easily understood. Few others can speak as authoritatively on the topic of relational databases as Date can.

93 citations


Proceedings ArticleDOI
12 Nov 2005
TL;DR: It is shown that it is possible for a visualization system to "learn" to extract and track features in complex 4D flow fields according to their "visual" properties, location, shape, and size.
Abstract: Terascale simulations produce data that is vast in spatial, temporal, and variable domains, creating a formidable challenge for subsequent analysis. Feature extraction as a data reduction method offers a viable solution to this large data problem. This paper presents a new approach to the problem of extracting and visualizing 4D features within large volume data. Conventional methods require either an analytical description of the feature of interest or tedious manual intervention throughout the feature extraction and tracking process. We show that it is possible for a visualization system to "learn" to extract and track features in complex 4D flow fields according to their "visual" properties, location, shape, and size. The basic approach is to employ machine learning in the process of visualization. Such an intelligent system approach is powerful because it allows us to extract and track a feature of interest in a high-dimensional space without explicitly specifying the relations between those dimensions, resulting in a greatly simplified and intuitive visualization interface.

Patent
27 Jun 2005
TL;DR: In this article, a user interface for displaying hierarchical data on a hand-held display device including a first-level display for displaying one or more first level data items in the hierarchical data, and at least one field associated with each first layer data item.
Abstract: A user interface for displaying hierarchical data on a hand-held display device including a first-level display for displaying one or more first-level data items in the hierarchical data, and at least one field associated with each first-level data item, each field configured to display a first-level data sub-item associated with the first-level data item or a subordinate data indicator The presence of the subordinate data indicator in a field indicates that the field has subordinate data associated with the field, the subordinate data being subordinate to the first-level data items in the hierarchical data

Proceedings ArticleDOI
27 Nov 2005
TL;DR: A new machine learning approach is described that creates expert-like rules for field matching and enables more sophisticated relationships to be modeled, better capturing the complex, domain-specific, common-sense phenomena that humans use to judge similarity.
Abstract: Record linkage is the process of determining that two records refer to the same entity. A key subprocess is evaluating how well the individual fields, or attributes, of the records match each other. One approach to matching fields is to use hand-written domain-specific rules. This "expert systems" approach may result in good performance for specific applications, but it is not scalable. This paper describes a new machine learning approach that creates expert-like rules for field matching. In our approach, the relationship between two field values is described by a set of heterogeneous transformations. Previous machine learning methods used simple models to evaluate the distance between two fields. However, our approach enables more sophisticated relationships to be modeled, which better capture the complex, domain-specific, common-sense phenomena that humans use to judge similarity. We compare our approach to methods that rely on simpler homogeneous models in several domains. By modeling more complex relationships we produce more accurate results.
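A minimal sketch of relating a pair of field values through a set of heterogeneous transformations; the transformation set below is hypothetical, and the paper learns expert-like rules over such transformations rather than simply listing which ones fire:

    # Hypothetical transformation set for relating two field values.
    def equal(a, b):
        return a == b

    def acronym(a, b):
        short, full = sorted((a, b), key=len)
        return short == "".join(word[0] for word in full.split())

    def prefix(a, b):
        return a.startswith(b) or b.startswith(a)

    TRANSFORMATIONS = [("equal", equal), ("acronym", acronym), ("prefix", prefix)]

    def field_evidence(a, b):
        """Return the names of the transformations that relate two field values."""
        a, b = a.lower().strip(), b.lower().strip()
        return [name for name, fn in TRANSFORMATIONS if fn(a, b)]

    print(field_evidence("International Business Machines", "IBM"))  # -> ['acronym']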

Journal ArticleDOI
TL;DR: The intent is not to survey the plethora of algorithms in data mining; instead, the current focus being e-commerce, the discussion is limited to data mining in the context of e-commerce.
Abstract: Data mining has matured as a field of basic and applied research in computer science in general and e-commerce in particular. In this paper, we survey some of the recent approaches and architectures where data mining has been applied in the fields of e-commerce and e-business. Our intent is not to survey the plethora of algorithms in data mining; instead, our current focus being e-commerce, we limit our discussion to data mining in the context of e-commerce. We also mention a few directions for further work in this domain, based on the survey.

Journal ArticleDOI
TL;DR: In this article, the authors present the methods of the data processing software developed to extract the astrophysical signal of faint sources from the VIMOS IFU observations, focusing on the treatment of the fiber-to-fiber relative transmission and the sky subtraction, and the dedicated tasks built to address the peculiarities and unprecedented complexity of the dataset.
Abstract: With new-generation spectrographs, integral field spectroscopy is becoming a widely used observational technique. The Integral Field Unit of the VIsible Multi-Object Spectrograph on the ESO-VLT makes it possible to sample a field as large as 54" x 54", covered by 6400 fibers coupled with micro-lenses. We present here the methods of the data processing software developed to extract the astrophysical signal of faint sources from the VIMOS IFU observations. We focus on the treatment of the fiber-to-fiber relative transmission and the sky subtraction, and the dedicated tasks we have built to address the peculiarities and unprecedented complexity of the dataset. We review the automated process we have developed under the VIPGI data organization and reduction environment (Scodeggio et al. 2005), along with the quality control performed to validate the process. The VIPGI-IFU data processing environment has been available to the scientific community to process VIMOS-IFU data since November 2003.
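A schematic of the two reduction steps highlighted above, fiber-to-fiber relative transmission correction and sky subtraction, assuming the spectra have already been extracted into an (n_fibres, n_pixels) array; the real VIPGI pipeline estimates the transmission from calibration data and is considerably more involved:

    # Schematic only: normalise each fibre by its relative transmission,
    # then subtract a median sky spectrum built from sky-dominated fibres.
    import numpy as np

    def correct_and_sky_subtract(spectra, sky_fibre_mask):
        # spectra: (n_fibres, n_pixels) extracted fibre spectra
        # relative transmission crudely estimated from the median flux per fibre
        transmission = np.nanmedian(spectra, axis=1)
        transmission /= np.nanmedian(transmission)
        normalised = spectra / transmission[:, None]
        sky = np.nanmedian(normalised[sky_fibre_mask], axis=0)
        return normalised - sky[None, :]

    # toy sizes; the VIMOS IFU has 6400 fibres
    spectra = np.random.default_rng(0).normal(100.0, 5.0, size=(64, 512))
    sky_fibres = np.zeros(64, dtype=bool)
    sky_fibres[::4] = True
    print(correct_and_sky_subtract(spectra, sky_fibres).shape)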

01 Jan 2005
TL;DR: In this paper, a two-layer hierarchical discriminative field model is proposed to capture spatial interactions in images, which is based on the concept of conditional random fields (CRF) proposed by Lafferty et al.
Abstract: Classification of various image components (pixels, regions and objects) into meaningful categories is a challenging task due to ambiguities inherent to visual data. Natural images exhibit strong contextual dependencies in the form of spatial interactions among components. For example, neighboring pixels tend to have similar class labels, and different parts of an object are related through geometric constraints. Going beyond these, different regions (e.g., sky and water) or objects (e.g., monitor and keyboard) appear in restricted spatial configurations. Modeling these interactions is crucial to achieving good classification accuracy. In this thesis, we present discriminative field models that capture spatial interactions in images in a discriminative framework based on the concept of Conditional Random Fields proposed by Lafferty et al. The discriminative fields offer several advantages over the Markov Random Fields (MRFs) popularly used in computer vision. First, they allow arbitrary dependencies in the observed data to be captured by relaxing the restrictive assumption of conditional independence generally made in MRFs for tractability. Second, the interaction between labels in discriminative fields is based on the observed data, instead of being fixed a priori as in MRFs. This is critical for incorporating different types of context in images within a single framework. Finally, the discriminative fields derive their classification power by exploiting probabilistic discriminative models instead of the generative models used in MRFs. Since the graphs induced by the discriminative fields may have arbitrary topology, exact maximum likelihood parameter learning may not be feasible. We present an approach that approximates the gradients of the likelihood with simple piecewise constant functions constructed using inference techniques. To exploit different levels of contextual information in images, a two-layer hierarchical formulation is also described. It encodes both short-range interactions (e.g., pixelwise label smoothing) as well as long-range interactions (e.g., relative configurations of objects or regions) in a tractable manner. The models proposed in this thesis are general enough to be applied to several challenging computer vision tasks such as contextual object detection, semantic scene segmentation, texture recognition, and image denoising seamlessly within a single framework.
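For reference, discriminative (conditional) fields of this kind condition the label field y directly on the observed image x; in the standard two-term notation (generic here, not the thesis's exact parameterization) the posterior is

    \[
      P(\mathbf{y}\mid\mathbf{x}) \;=\; \frac{1}{Z(\mathbf{x})}
      \exp\!\Bigl(\sum_{i \in S} A_i(y_i, \mathbf{x})
        \;+\; \sum_{i \in S}\sum_{j \in \mathcal{N}_i} I_{ij}(y_i, y_j, \mathbf{x})\Bigr),
    \]

where A_i is a data-dependent association (unary) potential, I_ij is a data-dependent interaction (pairwise) potential over the neighborhood N_i, and Z(x) is the observation-dependent partition function. The data dependence of I_ij is exactly what distinguishes these fields from classical MRFs, whose label interactions are fixed a priori.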

Journal ArticleDOI
TL;DR: This paper introduces fuzzy clustering into the structural mechanics field as a tool to automatically assess stabilization diagrams and presents several advanced algorithms based on the Fuzzy-C-Means clustering technique, including the Gustafson-Kessel and Gath-Geva algorithms.
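For context, the Fuzzy-C-Means technique that these algorithms build on minimizes the following objective over cluster centers v_i and membership degrees u_ik; the Gustafson-Kessel and Gath-Geva variants replace the Euclidean distance with cluster-adaptive distance measures (the notation here is the standard textbook form, not tied to this paper):

    \[
      J_m(U, V) \;=\; \sum_{i=1}^{c}\sum_{k=1}^{N} u_{ik}^{\,m}\,\lVert x_k - v_i\rVert^{2},
      \qquad \text{subject to } \sum_{i=1}^{c} u_{ik} = 1 \;\;\forall k,\quad
      u_{ik}\in[0,1],\quad m > 1.
    \]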

Book ChapterDOI
TL;DR: This work presents a system for the database extension and the user interaction with a DBMS; it also proposes a misuse detection system for the database scheme, compares approaches based on reference values and Δ-relations, and shows that even standard statistical functions yield good detection results.
Abstract: Anomaly detection systems assume that a certain deviation from the regular behaviour of a system can be an indicator of a security violation. They have proved their usefulness for networks and operating systems for a long time, but are much less prominent in the field of databases. Relational databases operate on attributes within relations, i.e., on data with a very uniform structure, which makes them a prime target for anomaly detection systems. This work presents such a system for the database extension and the user interaction with a DBMS; it also proposes a misuse detection system for the database scheme. In a comprehensive investigation we compare two approaches to dealing with the database extension, one based on reference values and one based on Δ-relations, and show that even standard statistical functions already yield good detection results. We then apply our methods to the user interaction, which is split into user input and DBMS behaviour. All methods have been implemented in a semi-automatic anomaly detection tool for MS SQL Server 2000.
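A minimal sketch of the reference-value idea described above: compare a current measurement of the database extension (here, a table's row count) against a stored reference history and flag large deviations. The thresholding rule is an illustrative assumption, not the paper's detection function:

    # Illustrative reference-value check on a table statistic.
    # A deviation beyond k standard deviations of the reference history is flagged.
    from statistics import mean, stdev

    def is_anomalous(current_value, reference_history, k=3.0):
        mu = mean(reference_history)
        sigma = stdev(reference_history)
        sigma = sigma if sigma > 0 else 1.0
        return abs(current_value - mu) > k * sigma

    row_counts = [10_000, 10_050, 9_980, 10_020, 10_010]   # past extension sizes
    print(is_anomalous(10_025, row_counts))   # -> False (normal growth)
    print(is_anomalous(2_000, row_counts))    # -> True  (suspicious mass deletion)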

Patent
27 Jan 2005
TL;DR: A method of creating a database and providing results from a database, for use by a municipality, and operating a community database comprises the steps of: providing first and second source databases addressable under first and second data protocols respectively, which protocols are incompatible one with the other.
Abstract: A method of creating a database and of providing results from a database, for use by a municipality, and operating a community database, comprises the steps of: providing first and second source databases addressable under first and second data protocols respectively, which protocols are incompatible one with the other; transforming, comparing and reconciling the data in the first and second source databases in order to optimize each field of data in each record; and storing the optimized data for each record as a record in a resulting master database. Records from searches are presented in order of recency. The present invention also provides a method of generating revenue from activities associated with a community.

Patent
27 Sep 2005
TL;DR: In this paper, a project management system is enabled to implement filtering, sorting, and field level security for data associated with managed projects, where a filter for field selection is prepared by a project client application and forwarded to a project server.
Abstract: A project management system is enabled to implement filtering, sorting, and field-level security for data associated with managed projects. A filter for field selection is prepared by a project client application and forwarded to a project server. The server generates an access attribute table based on the user permissions that may be set for each field within the managed projects. Upon retrieving the selected fields from the project database, the project server builds a secured list of fields. A data set to be provided to the project client is prepared by removing the fields for which the user lacks the requisite access permission prior to sorting the data. The removed data may be used for user-transparent computations within the project server, but guarded from client applications.
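A sketch of the field-level filtering step described above; the field names, permission table, and function names are hypothetical, standing in for the patent's access attribute table and secured field list:

    # Remove fields the requesting user may not see before the data set
    # is sorted and returned to the project client.
    def build_secured_fields(requested_fields, access_attributes, user):
        return [f for f in requested_fields if user in access_attributes.get(f, set())]

    def secure_rows(rows, allowed_fields):
        return [{f: row[f] for f in allowed_fields if f in row} for row in rows]

    access = {"task_name": {"alice", "bob"}, "budget": {"alice"}}
    rows = [{"task_name": "Design", "budget": 12_000}]
    allowed = build_secured_fields(["task_name", "budget"], access, user="bob")
    print(secure_rows(rows, allowed))   # -> [{'task_name': 'Design'}]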

Book ChapterDOI
01 Jan 2005
TL;DR: This chapter surveys different proposals of query languages made to support the more or less declarative specification of both data and pattern manipulations in the prolific field of association rule mining.
Abstract: Many data mining algorithms make it possible to extract different types of patterns from data (e.g., local patterns like itemsets and association rules, models like classifiers). To support the whole knowledge discovery process, we need integrated systems that can deal with both patterns and data. The inductive database approach has emerged as a unifying framework for such systems. Following this database perspective, knowledge discovery processes become querying processes for which query languages have to be designed. In the prolific field of association rule mining, different proposals of query languages have been made to support the more or less declarative specification of both data and pattern manipulations. In this chapter, we survey some of these proposals. The survey enables us to identify current shortcomings and to point out some promising directions of research in this area.

Patent
06 Oct 2005
TL;DR: In this paper, a method for representing XML information is presented in which a serialized image of the XML information comprises a collection of one or more serialized data values, where each particular serialized data value in the collection includes data associated with a particular serialized data value type of a plurality of serialized data value types.
Abstract: A method for representing XML information is provided. A serialized image of XML information is generated. The serialized image comprises a collection of one or more serialized data values, where each particular serialized data value in the collection includes data associated with a particular serialized data value type of a plurality of serialized data value types. The serialized image may also comprise a first field that includes a first value, which indicates that the serialized image includes the collection of one or more serialized data values. In some embodiments, the method is performed at a database system that supports a native XML data type, wherein the XML information is one or more instances of the native XML data type.

Journal ArticleDOI
TL;DR: This paper suggests a new generation — the second-generation OFD — that is inherently adaptive and requires minimal human intervention, designed to detect system performance degradations, paving the way to more mature fault prediction strategies.
Abstract: Very high reliability/availability at affordable cost requires a proactive approach to system faults and failures. This calls for sophisticated fault detection algorithms that ultimately could evolve into fault prediction strategies. This paper presents statistical algorithms — the Operational Fault Detection (OFD) class of algorithms — toward reaching these goals. OFD algorithms analyze system performance metrics to detect fault signatures. The concept behind OFD is to raise alarms for conditions that adversely impact customer revenue or system performance. Initial versions of OFD, deployed in the field, count meaningful events and raise alarms when a test statistic, based on the event counts, exceeds a predefined threshold. Setting the thresholds required human intervention. This is considered time consuming by our customers, even though the concepts of OFD have been well received. This paper suggests a new generation — the second-generation OFD — that is inherently adaptive and requires minimal human intervention. These new algorithms are designed to detect system performance degradations, paving the way to more mature fault prediction strategies. Detecting degradations is a precursor to fault predictions, as degradations are often early signatures of potentially catastrophic faults.
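The abstract does not spell out the second-generation algorithms here, so the sketch below only illustrates the notion of an adaptive, self-setting alarm threshold on event counts, using an exponentially weighted moving average; it is an assumption for illustration, not the published OFD algorithm:

    # Illustration of an adaptive alarm threshold on event counts (EWMA-based).
    # The real second-generation OFD algorithms are defined in the paper.
    def ewma_alarms(counts, alpha=0.2, k=4.0):
        mean, var = float(counts[0]), 0.0
        alarms = []
        for t, c in enumerate(counts[1:], start=1):
            sigma = var ** 0.5 if var > 0 else 1.0
            if c > mean + k * sigma:
                alarms.append(t)               # count far above the adaptive baseline
            diff = c - mean                    # update baseline with the new observation
            mean += alpha * diff
            var = (1 - alpha) * (var + alpha * diff * diff)
        return alarms

    failures_per_hour = [3, 4, 2, 3, 5, 3, 4, 40, 4, 3]
    print(ewma_alarms(failures_per_hour))      # -> [7]: the spike is flagged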

Patent
06 Apr 2005
TL;DR: In this paper, a system and method for automatically populating appointment fields of an appointment template is presented, where a messaging client provides a message having message data associated with one or more fields.
Abstract: A system and method for automatically populating appointment fields of an appointment template. A messaging client provides a message having message data associated with one or more fields. A field populator automatically transfers the message data associated with the one or more fields to an appropriate field of an appointment response. Time and place data is automatically transferred from a scheduler to an appropriate field of an appointment response.

Patent
07 Sep 2005
TL;DR: In this article, the first and last positional characters are used to distinguish database records from one another in order to improve the efficiency of database search. But the first positional character is nearly as important as the last positional character in distinguishing database records.
Abstract: A more efficient search algorithm introduces a variety of new tools and strategies to more efficiently search and retrieve desired records from an electronic database. Among these are a strategy that utilizes the first and last positional characters, or phonemes, to exploit the fact that often last positional character is nearly as important as a first positional character in distinguishing database records from one another. In addition, virtual search parameters, that are not a portion of the database records, can also be utilized in distinguishing database records, such as by identifying a number of characters in a search field for a requested database record as a way of distinguishing that record from all others with a different number of characters. The invention finds potential application in any database search application, but is particularly useful in delivering directory assistance services.
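A toy version of the indexing idea described above, keying each record on its first character, last character, and length so that a lookup only scans a small candidate bucket; the exact keying scheme is an illustrative simplification of the patent:

    # Index records by (first character, last character, length) to narrow search.
    from collections import defaultdict

    def build_index(records):
        index = defaultdict(list)
        for rec in records:
            key = (rec[0].lower(), rec[-1].lower(), len(rec))
            index[key].append(rec)
        return index

    def lookup(index, query):
        return index.get((query[0].lower(), query[-1].lower(), len(query)), [])

    names = ["Johnson", "Johansen", "Smith", "Jones"]
    idx = build_index(names)
    print(lookup(idx, "johnson"))   # -> ['Johnson']: only the matching bucket is scanned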

Patent
Kurt Piersol1
18 Feb 2005
TL;DR: In this article, the authors present techniques for creating and processing electronic forms that include at least one field that is configured to accept multimedia information as input, and for validating the electronic form based upon the constraints associated with the fields of the form.
Abstract: Techniques for creating and processing electronic forms that include at least one field that is configured to accept multimedia information as input. Techniques are provided for creating an electronic form comprising at least one field that is configured to accept multimedia information. Techniques are provided for specifying and associating a set of constraints (or predicates) with fields of a multimedia form, including specifying and associating one or more constraints with fields configured to accept multimedia information. Techniques are also provided for validating the electronic form based upon the constraints, if any, associated with the fields of the form. Various actions may be performed depending upon the results of the validation.

Proceedings ArticleDOI
13 Mar 2005
TL;DR: This paper considers causal dependencies among the attributes of the data records to improve the construction of classifiers via Bayesian Causal Maps (BCMs), and the method is implemented as an adaptation of the C4.5 algorithm.
Abstract: Classification is one of the most useful techniques for extracting meaningful knowledge from databases. Classifiers, e.g. decision trees, are usually extracted from a table of records, each of which represents an example. However, quite often in real applications there is other knowledge, e.g. held by experts of the field, that can usefully be used in conjunction with the knowledge hidden inside the examples. As a concrete example of this kind of knowledge we consider causal dependencies among the attributes of the data records. In this paper we discuss how to use such knowledge to improve the construction of classifiers. The causal dependencies are represented via Bayesian Causal Maps (BCMs), and our method is implemented as an adaptation of the well-known C4.5 algorithm.

Journal ArticleDOI
TL;DR: This paper describes the method for evolutionary sequence mining, using a specialized piece of hardware for rule evaluation, and shows how the method can be applied to several different mining tasks, such as supervised sequence prediction, unsupervised mining of interesting rules, discovering connections between separate time series, and investigating tradeoffs between contradictory objectives.
Abstract: Data mining in the form of rule discovery is a growing field of investigation. A recent addition to this field is the use of evolutionary algorithms in the mining process. While this has been used extensively in the traditional mining of relational databases, it has hardly, if at all, been used in mining sequences and time series. In this paper we describe our method for evolutionary sequence mining, using a specialized piece of hardware for rule evaluation, and show how the method can be applied to several different mining tasks, such as supervised sequence prediction, unsupervised mining of interesting rules, discovering connections between separate time series, and investigating tradeoffs between contradictory objectives by using multiobjective evolution.

Patent
19 Apr 2005
TL;DR: In this article, a transformation engine is applied to the referenced raw data to produce a result for the given field that is suitable for presentation, and the transformation engine includes multiples sets of presentation rules that may be selectively established for application to the fields.
Abstract: Fields for presentable files can be determined by an application (i) based on a field type and at least one parameter of the fields and (ii) responsive to raw data and a separate transformation engine, even when the application is unaware of the mechanics of the separate transformation engine. In a described implementation for a given field, the field type indicates that the given field is to be evaluated based on raw data that is referenced by the at least one parameter of the given field. The transformation engine is applied to the referenced raw data to produce a result for the given field that is suitable for presentation. In an example implementation, the transformation engine includes multiples sets of presentation rules that may be selectively established for application to the fields. In an example embodiment, respective presentation rule subsets target respective types of raw data.

Proceedings ArticleDOI
21 Aug 2005
TL;DR: A classification approach called MVC (Multi-View Classification) is described that employs traditional single-table mining techniques to mine data straight from a multi-relational database, and achieves promising results in terms of overall accuracy and run time when compared with the FOIL and CrossMine learning methods.
Abstract: Most of today's structured data resides in relational databases where multiple relations are formed by foreign key joins. In recent years, the field of data mining has played a key role in helping humans analyze and explore large databases. Unfortunately, most methods only utilize "flat" data representations. Thus, to apply these single-table data mining techniques, we are forced to incur a computational penalty by first converting the data into this "flat" form. As a result of this transformation, the data not only loses its compact representation but the semantic information present in the relations is reduced or eliminated. In this paper, we describe a classification approach which addresses this issue by operating directly on relational databases. The approach, called MVC (Multi-View Classification), is based on a multi-view learning framework. In this framework, the target concept is represented in different views and then independently learned using single-table data mining techniques. After constructing multiple classifiers for the target concept in each view, the learners are validated and combined by a meta-learning algorithm. Two methods are employed in the MVC approach, namely (1) target concept propagation and (2) multi-view learning. The propagation method constructs training sets directly from relational databases for use by the multi-view learners. The learning method employs traditional single-table mining techniques to mine data straight from a multi-relational database. Our experiments on benchmark real-world databases show that the MVC method achieves promising results in terms of overall accuracy obtained and run time, when compared with the FOIL and CrossMine learning methods.
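A condensed sketch of the multi-view idea described above: each view gets its own single-table learner, and the per-view predictions are combined. The combination is reduced here to a validation-weighted vote, which is a simplification of the paper's meta-learning step, and the per-view learner is a trivial placeholder:

    # Condensed multi-view classification: one simple learner per view,
    # combined by a validation-weighted vote (stand-in for meta-learning).

    def train_majority_learner(rows, labels):
        # placeholder single-table learner: always predicts the majority label;
        # a real system would train a conventional classifier per view
        majority = max(set(labels), key=labels.count)
        return lambda row: majority

    def multi_view_classifier(views, labels, validation_weights):
        learners = [train_majority_learner(view_rows, labels) for view_rows in views]
        def predict(view_rows_for_example):
            votes = {}
            for learner, row, w in zip(learners, view_rows_for_example, validation_weights):
                label = learner(row)
                votes[label] = votes.get(label, 0.0) + w
            return max(votes, key=votes.get)
        return predict

    # two views of the same three training examples (e.g. from different relations)
    view_a = [{"f1": 1}, {"f1": 2}, {"f1": 3}]
    view_b = [{"g1": 9}, {"g1": 8}, {"g1": 7}]
    labels = ["pos", "pos", "neg"]
    predict = multi_view_classifier([view_a, view_b], labels, validation_weights=[0.6, 0.4])
    print(predict([{"f1": 5}, {"g1": 6}]))   # -> 'pos'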