Open Access

Synthesis Lectures on Information Concepts, Retrieval, and Services

TLDR
This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models to create a consolidated and balanced view of the main models.
Abstract
Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). Also, the early 2000s saw the arrival of divergence from randomness (DFR). Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: “It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works.” This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view of the main models. A particular focus of this book is on the “relationships between models.” This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, average term frequency, is a binding link between several retrieval models and model parameters.
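As a rough illustration of two quantities named in the abstract, the sketch below computes a plain TF-IDF score (term frequency times inverse document frequency) and a Poisson probability whose parameter is the average term frequency. It is a minimal sketch under stated assumptions, not the book's formulation: the toy collection `DOCS`, the function names, the use of raw term counts, the natural logarithm, and reading "average term frequency" as occurrences per document are all choices made here for illustration.

```python
import math
from collections import Counter

# Toy collection (hypothetical data, for illustration only).
DOCS = {
    "d1": "sailing boats sailing greece".split(),
    "d2": "boats in greece".split(),
    "d3": "sailing lessons".split(),
}

def idf(term, docs):
    """Inverse document frequency: log(N / n_t), or 0.0 if the term is unseen."""
    n_t = sum(1 for words in docs.values() if term in words)
    return math.log(len(docs) / n_t) if n_t else 0.0

def tf_idf_score(query, doc_id, docs):
    """Sum over query terms of raw term frequency times IDF (assumed variant)."""
    tf = Counter(docs[doc_id])
    return sum(tf[t] * idf(t, docs) for t in query)

def avg_term_frequency(term, docs):
    """Average occurrences of a term per document (assumed Poisson parameter)."""
    return sum(words.count(term) for words in docs.values()) / len(docs)

def poisson(k, lam):
    """Poisson probability of observing k occurrences given rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

if __name__ == "__main__":
    query = ["sailing", "greece"]
    for doc_id in DOCS:
        print(doc_id, tf_idf_score(query, doc_id, DOCS))
    lam = avg_term_frequency("sailing", DOCS)
    print("lambda(sailing) =", lam, "P(k=2) =", poisson(2, lam))
```

With this toy data, `d1` scores highest for the query because it contains "sailing" twice and "greece" once; real systems typically dampen raw term frequency and normalise for document length, which this sketch leaves out.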

