Proceedings Article

CiteSeer: an automatic citation indexing system

Abstract
We present CiteSeer: an autonomous citation indexing system which indexes academic literature in electronic format (e.g. PostScript files on the Web). CiteSeer understands how to parse citations, identify citations to the same paper in different formats, and identify the context of citations in the body of articles. CiteSeer provides most of the advantages of traditional (manually constructed) citation indexes (e.g. the ISI citation indexes), including: literature retrieval by following citation links (e.g. by providing a list of papers that cite a given paper); the evaluation and ranking of papers, authors, journals, etc. based on the number of citations; and the identification of research trends. CiteSeer has many advantages over traditional citation indexes, including: the ability to create more up-to-date databases which are not limited to a preselected set of journals or restricted by journal publication delays; completely autonomous operation with a corresponding reduction in cost; and powerful interactive browsing of the literature using the context of citations. Given a particular paper of interest, CiteSeer can display the context of how the paper is cited in subsequent publications. This context may contain a brief summary of the paper, another author’s response to the paper, or subsequent work which builds upon the original article. CiteSeer allows papers to be located by keyword search or by citation links. Papers related to a given paper can be located using common citation information or word vector similarity. CiteSeer will soon be available for public use.
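
The two ways of locating related papers named in the abstract (common citation information and word vector similarity) can be illustrated with a minimal sketch. This is not CiteSeer's implementation: the abstract does not specify the tokenizer, the term weighting, or how shared citations are scored, so the function names, the raw term-frequency vectors, and the Jaccard overlap below are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def word_vector(text):
    """Bag-of-words term-frequency vector (assumed tokenizer: lowercase a-z runs)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def common_citation_score(refs_a, refs_b):
    """Share of cited works two papers have in common (assumed: Jaccard overlap)."""
    refs_a, refs_b = set(refs_a), set(refs_b)
    union = refs_a | refs_b
    return len(refs_a & refs_b) / len(union) if union else 0.0


# Toy data: the texts and reference identifiers are made up for illustration.
paper_a = word_vector("autonomous citation indexing of academic literature")
paper_b = word_vector("citation indexing for academic papers on the web")
text_score = cosine_similarity(paper_a, paper_b)
ref_score = common_citation_score(["r1", "r2", "r3"], ["r2", "r3", "r4"])
```

Both scores fall in the range 0 to 1, so candidate papers can be ranked by either one. A real system would likely weight terms rather than use raw counts, and would first have to resolve differently formatted citations to the same work before comparing reference lists.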

