Book Chapter

Stemming and Lemmatization for Information Retrieval Systems in Amazigh Language

Amri Samir, +1 more
- pp 222-233
Abstract
Stemming and lemmatization are two language modeling techniques used to improve the precision of document retrieval. Stemming is a procedure that reduces all words with the same stem to a common form, whereas lemmatization removes inflectional endings and returns the base form of a word.
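
The contrast between the two techniques can be sketched with a toy example. This is a minimal illustration only, not the chapter's method: the suffix list `SUFFIXES`, the lookup table `LEMMAS`, and the functions `stem` and `lemmatize` are hypothetical names invented here, and the rules are for a handful of English words rather than Amazigh morphology.

```python
# Toy illustration: hypothetical English rules, not the chapter's Amazigh rules.

# Suffixes are ordered longest first so the longest match is stripped.
SUFFIXES = ("ions", "ings", "ion", "ing", "ed", "s")

# A lemmatizer needs lexical knowledge; here it is a tiny hand-made table.
LEMMAS = {
    "connected": "connect",
    "connections": "connection",
    "went": "go",
    "mice": "mouse",
}


def stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Return the base form from the table, or the word itself if unknown."""
    word = word.lower()
    return LEMMAS.get(word, word)


if __name__ == "__main__":
    for w in ["connected", "connecting", "connections", "went", "mice"]:
        print(f"{w:12} stem={stem(w):10} lemma={lemmatize(w)}")
```

In this sketch the stemmer maps "connected", "connecting", and "connections" to the same string without knowing any of the words, while the lemmatizer can resolve irregular forms such as "went" only because the table lists them.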


Citations
Journal Article

Sentence Retrieval using Stemming and Lemmatization with Different Length of the Queries

TL;DR: The results show that data pre-processing with stemming and lemmatization is useful for sentence retrieval, as it is for document retrieval.
Journal Article

RweetMiner: Automatic identification and categorization of help requests on twitter during disasters

TL;DR: This research formally defines request tweets in the context of social networking sites, so-called rweets, along with their primary types and sub-types, and introduces an architecture that stores intermediate data to accelerate the development of machine learning classifiers.
Proceedings Article

Systematic Literature Review of Stemming and Lemmatization Performance for Sentence Similarity

TL;DR: The authors conducted a systematic literature review (SLR) of previous studies on stemming and lemmatization and found that many factors go into deciding which preprocessing technique (stemming or lemmatization) is the better option.

Comparison of text preprocessing methods

TL;DR: The pros and cons of several common text preprocessing methods are discussed: removing formatting, tokenization, text normalization, handling punctuation, removing stopwords, stemming and lemmatization, n-gramming, and identifying multiword expressions.
References
Book

Modern Information Retrieval

TL;DR: A rigorous and complete textbook for a first course on information retrieval from the computer science (as opposed to a user-centred) perspective, providing an up-to-date, student-oriented treatment of the subject.

Development of a Stemming Algorithm

TL;DR: A new version of a context-sensitive, longest-match stemming algorithm for English is proposed; though developed for use in a library information transfer system, it is of general application.
Journal Article

Stemming algorithms: a case study for detailed evaluation

TL;DR: A case study of stemming algorithms is presented that introduces a number of novel approaches to evaluation and demonstrates their value.
Proceedings Article

Improving stemming for Arabic information retrieval: light stemming and co-occurrence analysis

TL;DR: Several light stemmers based on heuristics and a statistical stemmer based on co-occurrence were developed for Arabic retrieval, and their retrieval effectiveness was compared with that of a morphological analyzer on the TREC-2001 data.