Proceedings Article

Kannada Grammar Checker Using LSTM Neural Network

TLDR
A model is advocated that uses deep learning to train an LSTM (Long Short Term Memory) neural network on a massive data set to perform the necessary classification, with context-based word representations obtained through Word2Vec and implemented using the TensorFlow and Keras packages.
Abstract
Language is the most fundamental and natural means of human communication, and grammar plays a critical role in its quality. People acquire grammatical knowledge throughout their lives, mastering rules and constraints of meaning that allow them to understand and interact with one another. Transferring that awareness to a computer, so that it can interpret contextual evidence, classify it into proper syntactic form, and thereby validate that the input is well-formed, is a sophisticated task and a pressing need today. This paper addresses the issue and describes the development of such a grammar-checking mechanism for the Dravidian language Kannada. A first consideration is that the intricacy of the language poses a challenge: a rule-based approach is the easier route and identifies detected flaws competently, but it requires a linguistic specialist to compile hundreds of parallel rules that are difficult to maintain. Here, a model is advocated that uses deep learning to train an LSTM (Long Short Term Memory) neural network on a massive data set to perform the necessary classification, with context-based word representations obtained through Word2Vec and implemented using the TensorFlow and Keras packages. The proposed system performs Grammatical Error Detection (GED) effectively.
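The abstract outlines a pipeline of Word2Vec embeddings feeding an LSTM classifier built with TensorFlow and Keras. A minimal sketch of such a pipeline is given below; the vocabulary size, sequence length, embedding dimension, and random placeholder data are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal sketch: binary grammatical-error detection with an LSTM in Keras.
# All hyperparameters and data below are illustrative assumptions, not the
# paper's reported configuration.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 20000   # assumed vocabulary size
MAX_LEN = 30         # assumed maximum sentence length (tokens)
EMBED_DIM = 100      # assumed Word2Vec embedding dimension

# In the described pipeline, this matrix would come from a Word2Vec model
# trained on a Kannada corpus; here it is random to keep the sketch runnable.
embedding_matrix = np.random.normal(size=(VOCAB_SIZE, EMBED_DIM)).astype("float32")

model = keras.Sequential([
    layers.Input(shape=(MAX_LEN,), dtype="int32"),
    layers.Embedding(VOCAB_SIZE, EMBED_DIM, trainable=False),  # frozen vectors
    layers.LSTM(128),                        # context-retaining recurrent layer
    layers.Dense(1, activation="sigmoid"),   # grammatical (1) vs. ungrammatical (0)
])
# Inject the (placeholder) pretrained vectors into the embedding layer.
model.layers[0].set_weights([embedding_matrix])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# x: integer-encoded, padded sentences; y: 0/1 grammaticality labels (random here).
x = np.random.randint(0, VOCAB_SIZE, size=(64, MAX_LEN))
y = np.random.randint(0, 2, size=(64, 1))
model.fit(x, y, epochs=1, batch_size=16)
```

In a real run, the embedding matrix would be filled from a Word2Vec model trained on Kannada text, and the inputs would be integer-encoded, padded sentences labeled as grammatical or ungrammatical.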


Citations
Book Chapter

KDC: New Dataset for Kannada Document Categorization

TL;DR: In this paper, a new dataset for Kannada document classification is presented, and it is validated by experimenting with various machine learning algorithms on the dataset.
References
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
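As a concrete illustration of skip-gram training with negative sampling and automatic phrase detection, a minimal sketch using gensim follows; gensim is a stand-in implementation chosen here (the cited work describes the technique, not this library), and the toy corpus and thresholds are invented for the example.

```python
# Sketch: skip-gram Word2Vec with negative sampling plus phrase detection.
# gensim is used as a stand-in implementation; the corpus is a toy placeholder.
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser

sentences = [
    ["new", "york", "is", "a", "big", "city"],
    ["she", "moved", "to", "new", "york", "last", "year"],
    ["the", "city", "never", "sleeps"],
] * 100  # repeat so co-occurrence counts clear the thresholds

# Learn frequent collocations such as "new_york" from co-occurrence statistics.
# The low threshold is tuned for this tiny corpus.
bigram = Phraser(Phrases(sentences, min_count=5, threshold=0.01))
phrased = [bigram[s] for s in sentences]

# sg=1 selects skip-gram; negative=5 draws 5 noise words per positive example
# (negative sampling) instead of using the hierarchical softmax.
model = Word2Vec(phrased, vector_size=50, window=5, min_count=1,
                 sg=1, negative=5, epochs=10)
print(model.wv.most_similar("new_york", topn=3))
```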
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best-performing techniques based on different types of neural networks.
Posted Content

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: In this paper, the Skip-gram model is used to learn high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships, together with extensions that improve both the quality of the vectors and the training speed.
Proceedings Article

Linguistic Regularities in Continuous Space Word Representations

TL;DR: The vector-space word representations implicitly learned by the input-layer weights are found to be surprisingly good at capturing syntactic and semantic regularities in language, with each relationship characterized by a relation-specific vector offset.
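The offset property is easy to probe on any pretrained embedding; a short sketch with gensim's KeyedVectors follows, assuming a locally available pretrained Word2Vec file (the file name is a placeholder for whatever vectors are at hand).

```python
# Sketch: probing the relation-specific vector offset on pretrained vectors.
# The .bin file name is a placeholder; any word2vec-format file works.
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("pretrained-vectors.bin", binary=True)

# vector("king") - vector("man") + vector("woman") should land near "queen".
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```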
Proceedings Article

Word Representations: A Simple and General Method for Semi-Supervised Learning

TL;DR: This work evaluates Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking, and finds that each of the three word representations improves the accuracy of these baselines.