Book Chapter
Stemming and Lemmatization for Information Retrieval Systems in Amazigh Language
Amri Samir, Zenkouar Lahbib +1 more
- pp 222-233
Abstract:
Stemming and lemmatization are two language modeling techniques used to improve the precision of document retrieval. Stemming reduces all words with the same stem to a common form, whereas lemmatization removes inflectional endings and returns the base form of a word.
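The difference between the two techniques can be illustrated with a minimal sketch. It is not the chapter's method and does not handle Amazigh: the suffix list, the lemma table, and the function names `stem` and `lemmatize` are hypothetical, and English words are used only for readability.

```python
# Toy illustration only: the suffix list and lemma table below are
# hypothetical assumptions and are not taken from the chapter.
SUFFIXES = ("ing", "ed", "es", "s")  # tried in this order

LEMMA_TABLE = {
    "studies": "study",
    "studying": "study",
    "went": "go",
    "better": "good",
}


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def lemmatize(word: str) -> str:
    """Look the word up in the lemma table; fall back to the word itself."""
    w = word.lower()
    return LEMMA_TABLE.get(w, w)


if __name__ == "__main__":
    for token in ("connected", "connecting", "connects", "studies", "went"):
        print(token, stem(token), lemmatize(token))
```

In this sketch the stemmer maps "connected", "connecting" and "connects" to the same form, but it turns "studies" into the non-word "studi" and leaves "went" unchanged. The table-based lemmatizer returns "study" and "go" for those two words, but it only covers words it has an entry for.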
Citations
Journal Article
Sentence Retrieval using Stemming and Lemmatization with Different Length of the Queries
TL;DR: The results show that data pre-processing with stemming and lemmatization is as useful for sentence retrieval as it is for document retrieval.
Journal Article
RweetMiner: Automatic identification and categorization of help requests on twitter during disasters
TL;DR: This research formally defines request tweets in the context of social networking sites, so-called rweets, along with their primary types and sub-types, and introduces an architecture that stores intermediate data to accelerate the development of machine learning classifiers.
Proceedings Article
Systematic Literature Review of Stemming and Lemmatization Performance for Sentence Similarity
TL;DR: The authors conducted a systematic literature review (SLR) of previous studies on stemming and lemmatization, and found that many factors determine which preprocessing technique (stemming or lemmatization) is the better option.
Comparison of text preprocessing methods
TL;DR: The pros and cons of several common text preprocessing methods are discussed: removing formatting, tokenization, text normalization, handling punctuation, removing stopwords, stemming and lemmatization, n-gramming, and identifying multiword expressions.
References
Book
Modern Information Retrieval
TL;DR: A rigorous and complete textbook for a first course on information retrieval from the computer science (as opposed to a user-centred) perspective, providing an up-to-date, student-oriented treatment of the subject.
Development of a Stemming Algorithm
TL;DR: A new version of a context-sensitive, longest-match stemming algorithm for English is proposed; though developed for use in a library information transfer system, it is of general application.
Journal Article
Stemming algorithms: a case study for detailed evaluation
TL;DR: A case study of stemming algorithms is presented that introduces a number of novel approaches to evaluation and demonstrates their value.
Proceedings Article
Improving stemming for Arabic information retrieval: light stemming and co-occurrence analysis
TL;DR: Several light stemmers based on heuristics and a statistical stemmer based on co-occurrence were developed for Arabic retrieval, and the retrieval effectiveness of these stemmers and of a morphological analyzer was compared on the TREC-2001 data.