Hindi to English Machine Transliteration of Named Entities using Conditional Random Fields

doi:10.5120/7522-0624


M. L. Dhore, +2 more

07 Mar 2012

International Journal of Computer Applic...

Vol. 48, Iss. 23, pp. 31-37

Abstract:

Machine transliteration has received significant research attention in recent years. In most cases, the source language has been English and the target language has been an Asian language. This paper focuses on Hindi to English machine transliteration of Indian named entities such as proper nouns, place names and organization names using conditional random fields (CRF). Hindi is the national language of India and is spoken by more than 500 million Indians. Hindi is the world's fourth most commonly used language after Chinese, English and Spanish. The system takes an Indian place name written in Hindi in Devanagari script as input and transliterates it into English. The input is provided in syllabified form so that n-gram techniques can be applied. As more than 50% of named entities are formed as a combination of two or three syllabic units, the n-gram approach with unigrams, bigrams and trigrams of Hindi syllabic units is used for training on the corpus. The system performs satisfactorily with trigrams, compared with unigrams and bigrams.

General Terms: Machine Transliteration
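The syllabification and n-gram steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the segmentation rule (a new unit starts at each consonant or independent vowel, unless the previous character is a virama), the function names `syllabify`, `ngrams` and `context_features`, and the feature keys are all assumptions made for illustration. The input is assumed to be a single word in Devanagari.

```python
# Illustrative sketch only; the paper's own syllabification rules may differ.
VIRAMA = "\u094D"  # halant: joins consonants into a conjunct cluster


def is_consonant(ch: str) -> bool:
    # Devanagari consonants, including the precomposed nukta forms.
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def is_independent_vowel(ch: str) -> bool:
    return "\u0904" <= ch <= "\u0914" or "\u0960" <= ch <= "\u0961"


def syllabify(word: str) -> list[str]:
    """Split a Devanagari word into syllabic units (assumed rule, see lead-in)."""
    units: list[str] = []
    for ch in word:
        starts_unit = is_consonant(ch) or is_independent_vowel(ch)
        joins_cluster = bool(units) and units[-1].endswith(VIRAMA)
        if not units or (starts_unit and not joins_cluster):
            units.append(ch)   # open a new syllabic unit
        else:
            units[-1] += ch    # vowel sign, modifier, virama, or cluster member
    return units


def ngrams(units: list[str], n: int) -> list[tuple[str, ...]]:
    """Return all contiguous n-grams over the syllabic units."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def context_features(units: list[str], i: int) -> dict[str, str]:
    """Hypothetical unigram/bigram/trigram features for position i,
    of the kind that could be passed to a CRF sequence labeller."""
    feats = {"unigram": units[i]}
    if i >= 1:
        feats["bigram"] = "|".join(units[i - 1:i + 1])
    if i >= 2:
        feats["trigram"] = "|".join(units[i - 2:i + 1])
    return feats
```

Under these assumptions, `syllabify("दिल्ली")` yields two units (`दि` and `ल्ली`) and `syllabify("मुंबई")` yields three (`मुं`, `ब`, `ई`), so a name of two or three units is covered completely by a single bigram or trigram.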
