Ion Muslea
Researcher at University of Southern California
Publications - 38
Citations - 3644
Ion Muslea is an academic researcher from the University of Southern California. The author has contributed to research on topics including information extraction and web pages. The author has an h-index of 24 and has co-authored 37 publications receiving 3593 citations. Previous affiliations of Ion Muslea include SRI International and the Information Sciences Institute.
Papers
Proceedings Article
A hierarchical approach to wrapper induction
TL;DR: This work introduces an inductive algorithm, STALKER, that generates high-accuracy extraction rules from user-labeled training examples and can handle information sources that could not be wrapped by existing techniques.
Journal Article
Hierarchical Wrapper Induction for Semistructured Information Sources
TL;DR: This work introduces an inductive algorithm, STALKER, that generates high-accuracy extraction rules from user-labeled training examples and can wrap information sources that could not be wrapped by existing inductive techniques.
Proceedings Article
Active + Semi-supervised Learning = Robust Multi-View Learning
TL;DR: This work introduces Co-EMT, a new multi-view algorithm that combines semi-supervised and active learning; it outperforms the other algorithms both on the parameterized problems and on two additional real-world domains.
Extraction Patterns for Information Extraction Tasks: A Survey
TL;DR: This paper surveys the types of extraction patterns generated by machine learning algorithms, identifies three main categories of patterns that cover a variety of application domains, and compares and contrasts the patterns from each category.
Proceedings Article
Modeling Web sources for information integration
Craig A. Knoblock, Steven Minton, José Luis Ambite, Naveen Ashish, Pragnesh Jay Modi, Ion Muslea, Andrew Philpot, Sheila Tejada
TL;DR: This work has developed methods for mapping web sources into a simple, uniform representation that makes it efficient to integrate multiple sources and makes it easy to maintain these agents and incorporate new sources as they become available.