Book Chapter
Processing Large Text Corpus Using N-Gram Language Modeling and Smoothing
Sandhya Avasthi, Ritu Chauhan, D. P. Acharjya +2 more
pp. 21-32
TL;DR: In this article, N-gram models are discussed and evaluated using Good-Turing estimation, the perplexity measure, and the type-to-token ratio to predict the next word when the user provides input.
Abstract:
Predicting the next word, letter, or phrase while a user is typing is a valuable tool for improving user experience. Users communicate, write reviews, and express their opinions on social media platforms frequently, often while on the move. It has become necessary to provide users with an application that reduces typing effort and spelling errors when they have limited time. Text data keeps growing because of the extensive use of all kinds of social media platforms, so implementing a text prediction application is difficult given the volume of text that must be processed for language modeling. The primary objective of this research paper is to process a large text corpus and implement a probabilistic model, such as N-grams, to predict the next word when the user provides input. In this exploratory research, N-gram models are discussed and evaluated using Good-Turing estimation, the perplexity measure, and the type-to-token ratio.
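The ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the whitespace tokenizer, and the toy corpus are all assumptions made for illustration. It builds a bigram model for next-word prediction, computes the type-to-token ratio, estimates the probability mass of unseen words with the Good-Turing formula N1/N, and computes perplexity. For simplicity the perplexity function uses add-one (Laplace) smoothing rather than Good-Turing smoothed probabilities.

```python
from collections import Counter, defaultdict
import math


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_bigrams(tokens):
    """Count bigrams: follow[w1][w2] is how often w2 follows w1."""
    follow = defaultdict(Counter)
    for w1, w2 in zip(tokens, tokens[1:]):
        follow[w1][w2] += 1
    return follow


def predict_next(follow, word, k=3):
    """Return up to k most frequent continuations of `word`."""
    return [w for w, _ in follow.get(word, Counter()).most_common(k)]


def type_token_ratio(tokens):
    """Number of distinct words (types) divided by total words (tokens)."""
    return len(set(tokens)) / len(tokens)


def good_turing_unseen_mass(counts):
    """Good-Turing estimate of the total probability of unseen events: N1 / N,
    where N1 is the number of events seen exactly once and N is the total count."""
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / sum(counts.values())


def perplexity(follow, tokens, vocab_size):
    """Perplexity of the bigram model on `tokens`.
    Assumption: add-one smoothing is used here, not Good-Turing."""
    log_prob = 0.0
    n = 0
    for w1, w2 in zip(tokens, tokens[1:]):
        context = follow.get(w1, Counter())
        p = (context[w2] + 1) / (sum(context.values()) + vocab_size)
        log_prob += math.log2(p)
        n += 1
    return 2 ** (-log_prob / n)


if __name__ == "__main__":
    # Hypothetical toy corpus; a real corpus would be far larger.
    tokens = tokenize("the cat sat on the mat the cat ate the fish")
    model = train_bigrams(tokens)
    print(predict_next(model, "the"))
    print(type_token_ratio(tokens))
    print(good_turing_unseen_mass(Counter(tokens)))
    print(perplexity(model, tokens, vocab_size=len(set(tokens))))
```

Lower perplexity on held-out text indicates a model that predicts that text better, while a high type-to-token ratio indicates a more varied vocabulary, which usually means more unseen n-grams and a greater need for smoothing.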
Citations
Journal Article
Quranic Optical Text Recognition Using Deep Learning Models
TL;DR: In this paper, a Quranic optical character recognition (OCR) system based on a convolutional neural network (CNN) followed by an RNN is introduced, and six deep learning models are built to study the effect of different representations of the input and output on the accuracy and performance of the models.
Book Chapter
Information Extraction and Sentiment Analysis to Gain Insight into the COVID-19 Crisis
Book Chapter
Augmenting Mental Healthcare With Artificial Intelligence, Machine Learning, and Challenges in Telemedicine
TL;DR: The goal of this chapter is to review the literature on artificial intelligence and machine learning algorithms for detecting a person's mental health using patient health records, and to explain the use of artificial intelligence in treating and monitoring patients with mental illness through telemedicine.
Journal Article
Shedding light on the reverse logistics’ decision-making: a social-media analytics study of the electronics industry in developing vs developed countries
TL;DR: In this paper, a multi-industry applied model is proposed that uses deep learning in social media analysis to make the best decision for returning products in reverse logistics, taking sustainability and circular economy concerns into account.
Journal Article
Extracting information and inferences from a large text corpus
TL;DR: In this article, an incremental topic model with word embedding (ITMWE) is proposed that processes large text data in an incremental environment and extracts latent topics that best describe the document collections.
References
Journal Article
A Survey on Techniques in NLP
TL;DR: Three phases of natural language processing, namely language modelling, parts-of-speech tagging, and parsing, are described, outlining the approaches that can be used.
Book Chapter
Techniques, Applications, and Issues in Mining Large-Scale Text Databases
TL;DR: The main objective is to review text mining techniques, application areas, and existing issues.
Journal Article
Bayesian Analysis in Natural Language Processing
TL;DR: In this article, the methods and algorithms needed to fluently read Bayesian learning papers in NLP and to do research in the area are discussed. These methods are partially borrowed from both machine learning and statistics and partially developed "in-house".
Journal Article
Text Classification Using the N-Gram Graph Representation Model Over High Frequency Data Streams
TL;DR: This research proposes an innovative and highly accurate text stream classification model that is designed in an elastic, distributed way and is capable of servicing text load with fluctuating frequency.
Book Chapter
SPAM: An Effective and Efficient Spatial Algorithm for Mining Grid Data
Ritu Chauhan, Harleen Kaur +1 more
TL;DR: This chapter defines the novel SpaGrid framework and the SPAM algorithm to retrieve clusters of varying shape and size from large databases; the framework is applied to spatial medical databases, and the implementation details are discussed using Matlab 7.1.