Conference

International Conference on Asian Language Processing

About: The International Conference on Asian Language Processing is an academic conference. It publishes mainly in the areas of computer science and machine translation. Over its lifetime, the conference has published 900 papers, which have received 3,497 citations.

Topics: Computer science, Machine translation, Sentence, Language translation, Sentiment analysis

Papers

Proceedings Article

Designing an Indonesian part of speech tagset and manually tagged Indonesian corpus

Arawinda Dinakaramani¹, Fam Rashel¹, Andry Luthfi¹, Ruli Manurung¹

University of Indonesia¹

04 Dec 2014

TL;DR: The results of this work are an Indonesian POS tagset consisting of 23 tags and an Indonesian corpus of over 250,000 lexical tokens that have been manually tagged using this tagset.

Abstract: We describe our work on designing a linguistically principled part of speech (POS) tagset for the Indonesian language. The process involves a detailed study and analysis of existing tagsets and the manual tagging of an Indonesian corpus. The results of this work are an Indonesian POS tagset consisting of 23 tags and an Indonesian corpus of over 250,000 lexical tokens that have been manually tagged using this tagset.

76 citations

Proceedings Article

Topic2Vec: Learning distributed representations of topics

Liqiang Niu¹, Xinyu Dai¹, Jianbing Zhang¹, Jiajun Chen¹

Nanjing University¹

01 Oct 2015

TL;DR: Topic2Vec learns topic representations in the same semantic vector space as words, as an alternative to probability distributions, and the authors report interesting and meaningful experimental results.

Abstract: Latent Dirichlet Allocation (LDA), which mines the thematic structure of documents, plays an important role in natural language processing and machine learning. However, the probability distribution from LDA only describes the statistical relationship of occurrences in the corpus, and in practice probability is usually not the best choice for feature representations. Recently, embedding methods such as Word2Vec and Doc2Vec have been proposed to represent words and documents by learning essential concepts and representations. The embedded representations have proven more effective than LDA-style representations in many tasks. In this paper, we propose the Topic2Vec approach, which learns topic representations in the same semantic vector space as words, as an alternative to probability distributions. The experimental results show that Topic2Vec achieves interesting and meaningful results.

69 citations

Proceedings Article

Inset lexicon: Evaluation of a word list for Indonesian sentiment analysis in microblogs

Fajri Koto, Gemala Y. Rahmaningtyas

01 Dec 2017

TL;DR: This study proposes InSet, an Indonesian sentiment lexicon built to identify written opinions and categorize them as positive or negative, which can be used to analyze public sentiment towards a particular topic, event, or product.

Abstract: In this study, we propose InSet, an Indonesian sentiment lexicon built to identify written opinions and categorize them as positive or negative, which can be used to analyze public sentiment towards a particular topic, event, or product. Composed from a collection of words taken from Indonesian tweets, InSet was constructed by manually weighting each word and enhanced by adding stemming and synonym sets. As a result, we obtained 3,609 positive words and 6,609 negative words with scores ranging between −5 and +5. In an experiment using InSet, our method outperforms the few other Indonesian lexicons that we used as baselines.
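
To illustrate how a weighted sentiment lexicon of this kind can be applied, here is a minimal sketch in Python. It is not the authors' implementation: the toy lexicon, its weights, the tokenizer, and the sign-based decision rule (including the "neutral" outcome for a zero score) are all assumptions made for illustration, and none of the entries are taken from InSet.

```python
import re

# Hypothetical toy lexicon. The words are ordinary Indonesian words, but the
# weights are invented for illustration and are NOT taken from InSet.
TOY_LEXICON = {
    "bagus": 4,    # good
    "senang": 3,   # happy
    "buruk": -4,   # bad
    "kecewa": -3,  # disappointed
}


def score_text(text, lexicon):
    """Sum the weights of every lexicon word that occurs in the text."""
    tokens = re.findall(r"\w+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def classify(text, lexicon):
    """Label a text by the sign of its total score (an assumed decision rule)."""
    total = score_text(text, lexicon)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify("Filmnya bagus, saya senang", TOY_LEXICON))
    print(classify("Pelayanan buruk dan saya kecewa", TOY_LEXICON))
```

Words missing from the lexicon contribute zero, so coverage of the word list directly limits what such a scorer can detect; the stemming and synonym expansion mentioned in the abstract would address exactly that, but they are left out of this sketch.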

49 citations

Proceedings Article

Automatic Speech Recognition of Code Switching Speech Using 1-Best Rescoring

Basem H. A. Ahmed¹, Tien-Ping Tan¹

Universiti Sains Malaysia¹

13 Nov 2012

TL;DR: A novel approach to automatic recognition of code-switching speech that runs automatic speech recognizers in parallel and then joins and rescores their lattices, reducing WER by more than 5% on English/Malay code-switching speech.

Abstract: In this paper, we propose a novel approach to automatic recognition of code-switching speech. The proposed method consists of two phases: automatic speech recognition and rescoring. The framework uses parallel automatic speech recognizers for speech recognition. The lattices produced are subsequently joined and rescored to estimate the most probable word sequence. Experiments show that the proposed approach achieves a reduction of more than 5% in WER when tested on English/Malay code-switching speech. In addition, the framework has been shown to be very robust. We also propose an acoustic model adaptation approach, a hybrid of interpolation and merging, that cross-adapts acoustic models of different languages to recognize code-switching speech. The adapted acoustic models show a reduction in WER when they are used for code-switching speech recognition.
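
For readers unfamiliar with the metric reported above, word error rate (WER) is conventionally the word-level edit distance between the recognizer output and a reference transcript, divided by the number of reference words. The sketch below computes it in Python under that standard definition; the function name and the example sentences are illustrative, and the paper's own scoring setup (text normalization, scoring tools) is not described in the abstract.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word is missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative English/Malay code-switched sentence, not from the paper.
    print(word_error_rate("i want to makan nasi", "i want makan nasi"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.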

46 citations

Proceedings Article

Emotion Classification on Indonesian Twitter Dataset

Mei Silviana Saputri¹, Rahmad Mahendra¹, Mirna Adriani¹

University of Indonesia¹

01 Nov 2018

TL;DR: This study builds a publicly available Indonesian Twitter dataset for emotion classification, addressing the lack of standard datasets for under-resourced languages, and conducts feature engineering to determine the best features for the task.

Abstract: The rapid growth of Twitter usage has attracted many researchers to use Twitter data for several purposes, including emotion analysis. However, standard datasets for emotion analysis are scarce for under-resourced languages, especially Indonesian. In this study, we build a publicly available Indonesian Twitter dataset for the emotion classification task. In addition, we conduct feature engineering to determine the best features for emotion classification. The features used in this research are lexicon-based, Bag-of-Words, word embedding, orthography, and Part-Of-Speech (POS) tag features. We test those features on two datasets with different characteristics. F1-score is employed as the evaluation metric. The results of our experiments show that combining all proposed features on our dataset achieves an F1-score of 69.73%, which outperforms the baseline model by 26.64%.

43 citations

Performance

Metrics

Papers: 900

Citations: 3,497

No. of papers from the Conference in previous years
Year	Papers
2022	85
2020	61
2019	81
2018	67
2017	87
2016	85