Open Access · Posted Content

Indexing of Arabic documents automatically based on lexical analysis

TLDR
This paper proposed and implemented a method to automatically create an index for books written in Arabic; the method depends largely on text summarization.
Abstract
The continuous explosion of information on the Internet and all other information sources makes it necessary to perform information processing activities automatically, quickly, and reliably. In this paper, we propose and implement a method to automatically create an index for books written in Arabic. The process depends largely on text summarization and


Citations
Journal Article

Arabic Text Classification Algorithm using TFIDF and Chi Square Measurements

TL;DR: A new method for Arabic text classification is proposed in which a document is compared with predefined document categories based on its contents using the TF.IDF method; the document is then classified into the appropriate sub-category using the chi-square measure.
Journal Article

Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study

TL;DR: This paper compares the performance of different classifiers in different situations, using feature selection with and without stemming, to investigate the effectiveness of feature selection.
Journal Article

Linked open data of bibliometric networks: analytics research for personalized library services

TL;DR: Experimental analysis shows that the topic specificity and citation count of publication venues are negatively correlated; this is the first attempt to discover a correlation between topic sensitivity and citation counts of publication venues.
Proceedings Article

Stemming versus multi-words indexing for Arabic documents classification

TL;DR: Empirical results on an Arabic dataset reveal that the type of extracted feature has a significant impact on conserving semantic information and improving classification accuracy, especially given the morphological complexity of the Arabic language.
Book Chapter

Concatenation Technique for Extracted Arabic Characters for Efficient Content-based Indexing and Searching

TL;DR: This research paper presents the work accomplished in the last phase of an ongoing research project whose objective is to develop a system that extracts moving Arabic text from video for efficient content-based indexing and searching.
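The TF.IDF comparison mentioned in the first citing paper above (a document compared against predefined categories) can be illustrated with a minimal sketch. This is not that paper's implementation: the whitespace tokenization, the absence of stemming, the use of cosine similarity, and the toy category texts are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Weight = relative term frequency * log inverse document frequency.
        vectors.append({t: (c / len(doc)) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy data: one reference text per category and one new document.
categories = {
    "sports": "كرة القدم مباراة فريق".split(),
    "economy": "سوق المال بنك استثمار".split(),
}
new_doc = "فريق كرة يفوز في مباراة".split()

names = list(categories)
vectors = tf_idf_vectors([categories[name] for name in names] + [new_doc])
# Compare the new document (last vector) with each category vector.
scores = {name: cosine(vectors[-1], vec) for name, vec in zip(names, vectors)}
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy setup the new document shares terms only with the "sports" text, so that category receives the only non-zero score. A real system for Arabic would normally add normalization and stemming before weighting.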
References
Book

Modern Information Retrieval

TL;DR: The authors present a rigorous and complete textbook for a first course on information retrieval from the computer science (as opposed to a user-centred) perspective, providing an up-to-date, student-oriented treatment of the subject.
Journal Article

An improved K-nearest-neighbor algorithm for text categorization

TL;DR: An improved KNN algorithm is proposed that builds the classification model by combining a constrained one-pass clustering algorithm with KNN text categorization; it can substantially reduce the text similarity computation and outperform the state-of-the-art KNN, Naive Bayes, and Support Vector Machine classifiers.
Journal Article

Arabic Text Classification Using Maximum Entropy

TL;DR: In this article, the authors use a maximum entropy method to classify Arabic text documents. They test their approach on real data and compare the results with other existing systems.

Automated Arabic Text Categorization Using SVM and NB

TL;DR: Experimental results on different Arabic text categorization data sets reveal that the SVM algorithm outperforms the Naive Bayesian method (NB) with regard to all measures.
Posted Content

An Improved k-Nearest Neighbor Algorithm for Text Categorization

TL;DR: An improved kNN algorithm is proposed, which uses different numbers of nearest neighbors for different categories, rather than a fixed number across all categories, and is promising for some cases, where estimating the parameter k via cross-validation is not allowed.
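The idea in the last entry above, using a different number of nearest neighbors per category instead of one fixed k, can be sketched as follows. This is one possible reading, not the cited paper's decision rule: the scoring formula, the Jaccard similarity, the `k_per_category` values, and the toy data are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def classify(query, training, k_per_category, similarity=jaccard):
    """Score each category using its own neighbor count k (assumed scoring rule)."""
    # Rank all training examples by similarity to the query, most similar first.
    ranked = sorted(training, key=lambda ex: similarity(query, ex[0]), reverse=True)
    scores = {}
    for category, k in k_per_category.items():
        top = ranked[:k]
        # Sum the similarity of same-category neighbors among the top k,
        # then average over k so large-k categories are not favored.
        total = sum(similarity(query, feats) for feats, label in top if label == category)
        scores[category] = total / k
    return max(scores, key=scores.get), scores

# Hypothetical imbalanced training set: three "X" documents, one "Y" document.
training = [
    ({"a", "b"}, "X"),
    ({"a", "c"}, "X"),
    ({"a", "d"}, "X"),
    ({"e", "f"}, "Y"),
]
query = {"e", "f", "a"}

# Assumed choice: the larger category gets a larger k.
label, scores = classify(query, training, {"X": 3, "Y": 1})
print(label, scores)
```

On this toy data a plain majority vote with a fixed k = 3 would pick "X" (two of the three nearest neighbors), whereas the per-category scoring picks "Y", whose single training document is by far the closest match.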