Indexing of Arabic documents automatically based on lexical analysis

Open Access, Posted Content

Abdulrahman Al Molijy, +2 more

- 08 May 2012 -

arXiv: Information Retrieval

TLDR

This paper proposed and implemented a method to automatically create an index for books written in the Arabic language; the method depends largely on text summarization.

Abstract:

The continuous information explosion through the Internet and all other information sources makes it necessary to perform all information processing activities automatically, quickly, and reliably. In this paper, we proposed and implemented a method to automatically create an index for books written in the Arabic language. The process depends largely on text summarization and

Citations

Journal Article, DOI

Arabic Text Classification Algorithm using TFIDF and Chi Square Measurements

Aymen Abu-Errub

- 16 May 2014 -

International Journal of Computer Applic...

TL;DR: A new method for Arabic text classification is proposed in which a document is compared with predefined document categories based on its contents using the TF.IDF method, and is then classified into the appropriate sub-category using the Chi Square measure.

Journal Article, DOI

Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study

Ghazi Raho, +3 more

- 01 Jan 2015 -

International Journal of Advanced Comput...

TL;DR: This paper compares the performance of different classifiers in different situations, using feature selection with and without stemming, to investigate the effectiveness of feature selection.

Journal Article, DOI

Linked open data of bibliometric networks: analytics research for personalized library services

Miltiadis D. Lytras, +2 more

- 07 Mar 2019 -

Library Hi Tech

TL;DR: Experimental analysis shows that the topic specificity and citation count of publication venues are negatively correlated; the work is described as the first attempt to discover a correlation between topic sensitivity and citation counts of publication venues.

Proceedings Article, DOI

Stemming versus multi-words indexing for Arabic documents classification

Mohamed Salim El Bazzi, +3 more

TL;DR: Empirical results on an Arabic dataset reveal that the type of extracted feature has a significant impact on conserving semantic information and improving classification accuracy, especially given the morphological complexity of the Arabic language.

Book Chapter, DOI

Concatenation Technique for Extracted Arabic Characters for Efficient Content-based Indexing and Searching

Abdul Khader Jilani Saudagar, +1 more

TL;DR: This research paper demonstrates the work accomplished in the last phase of the ongoing research project with an objective of developing a system for moving Arabic video text extraction for efficient content-based indexing and searching.

References

Book

Modern Information Retrieval

Ricardo Baeza-Yates, +1 more

TL;DR: In this article, the authors present a rigorous and complete textbook for a first course on information retrieval from the computer science (as opposed to a user-centred) perspective, which provides an up-to-date student oriented treatment of the subject.

Journal Article, DOI

An improved K-nearest-neighbor algorithm for text categorization

Shengyi Jiang, +3 more

- 01 Jan 2012 -

Expert Systems With Applications

TL;DR: An improved KNN algorithm is proposed, which builds the classification model by combining a constrained one-pass clustering algorithm with KNN text categorization; it can substantially reduce the text similarity computation and outperform state-of-the-art KNN, Naive Bayes, and Support Vector Machine classifiers.

Journal Article

Arabic Text Classification Using Maximum Entropy

Alaa M. El-Halees

- 05 Dec 2015 -

IUG Journal of Natural Studies

TL;DR: In this article, the authors focused on classifying Arabic text documents using a maximum entropy method. They tested their approach on real data and then compared the results with other existing systems.

Automated Arabic Text Categorization Using SVM and NB

Saleh Alsaleem

TL;DR: Experimental results on different Arabic text categorization data sets reveal that the SVM algorithm outperforms the Naive Bayesian (NB) method with regard to all measures.

Posted Content

An Improved k-Nearest Neighbor Algorithm for Text Categorization

Baoli Li, +2 more

- 16 Jun 2003 -

arXiv: Computation and Language

TL;DR: An improved kNN algorithm is proposed, which uses different numbers of nearest neighbors for different categories, rather than a fixed number across all categories, and is promising for some cases, where estimating the parameter k via cross-validation is not allowed.

Related Papers (5)

Query Based Arabic Text Summarization

Building a syntactic rules-based stemmer to improve search effectiveness for arabic language

Arabic text classification: Review study

Arabic sentiment analysis: Lexicon-based and corpus-based

Arabic Script Documents Language Identifications Using Fuzzy ART