Journal ArticleDOI

Geographic Named Entity Recognition and Disambiguation in Mexican News using word embeddings

TLDR
This study shows that relationships between geographic and semantic spaces arise when word embedding models are applied to a corpus of documents in Mexican Spanish, and that the resulting models achieve high accuracy for geographic named entity recognition in Spanish.
Abstract
In recent years, dense word embeddings have been widely used for text representation because they can model complex semantic and morphological characteristics of language, such as meaning in specific contexts and applications. In contrast to sparse representations, such as one-hot encodings or frequency counts, word embeddings provide computational advantages and improved results in many natural language processing tasks, such as the automatic extraction of geospatial information. Computer systems capable of discovering geographic information in natural language rely on a complex process called geoparsing. In this work, we explore the use of word embeddings for two NLP tasks, Geographic Named Entity Recognition and Geographic Entity Disambiguation, as an effort to develop the first Mexican Geoparser. Our study shows that relationships between geographic and semantic spaces arise when we apply word embedding models to a corpus of documents in Mexican Spanish. Our models achieved high accuracy for geographic named entity recognition in Spanish.
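
As a rough illustration of how such geographic-semantic relationships can be probed, the sketch below trains word embeddings on a toy, stand-in tokenized corpus of Mexican Spanish news with gensim and inspects the nearest neighbors of a state name; the corpus, tokens, and hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
# A minimal sketch, assuming a tokenized corpus of Mexican Spanish news is
# available as `sentences` (lists of lowercased tokens); everything below
# is illustrative, not the paper's configuration.
from gensim.models import Word2Vec

sentences = [
    ["el", "gobernador", "de", "oaxaca", "visitó", "monterrey"],
    ["fuertes", "lluvias", "en", "chiapas", "y", "oaxaca"],
    # ... a real corpus would contain many thousands of news articles
]

model = Word2Vec(
    sentences,
    vector_size=100,   # dimensionality of the dense embeddings
    window=5,          # context window around each token
    min_count=1,       # keep rare tokens in this toy corpus
    workers=4,
)

# If geographic structure is reflected in semantic space, the nearest
# neighbors of a state name should include other place names.
print(model.wv.most_similar("oaxaca", topn=5))
```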


Citations
Journal ArticleDOI

Bridge inspection named entity recognition via BERT and lexicon augmented machine reading comprehension neural model

TL;DR: A novel lexicon-augmented, machine reading comprehension-based NER neural model is proposed for identifying flat and nested entities in Chinese bridge inspection text; results show that the proposed model outperforms other mainstream NER models on the bridge inspection corpus.
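
For readers unfamiliar with the machine reading comprehension (MRC) framing of NER, the sketch below shows the core idea: each entity type becomes a natural-language query, and entity spans are extracted from the passage as answers. It is a generic illustration with an untrained QA head on a stock checkpoint, not the lexicon-augmented model from the paper; the checkpoint name and example strings are assumptions.

```python
# MRC-style NER sketch: query + passage -> answer span. Nested entities fall
# out naturally because multiple (start, end) pairs can be read off the
# logits of one passage.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForQuestionAnswering.from_pretrained("bert-base-chinese")

query = "找出文本中提到的桥梁构件。"      # "Find the bridge components mentioned."
passage = "检查发现主梁和桥墩存在裂缝。"  # "Inspection found cracks in the girder and piers."

inputs = tokenizer(query, passage, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Spans with high start+end scores are candidate entities (the QA head here
# is randomly initialized, so the output is meaningless until fine-tuned).
start = outputs.start_logits.argmax(-1).item()
end = outputs.end_logits.argmax(-1).item()
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
```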
Journal ArticleDOI

Legal Text Recognition Using LSTM-CRF Deep Learning Model

TL;DR: Parameter learning with the log-likelihood criterion outperforms the maximum-margin criterion and is well suited to the Bi-LSTM-CRF model, which is in turn more suitable for recognizing extended entities.
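
The sketch below shows a minimal Bi-LSTM-CRF trained by maximizing the CRF log-likelihood, the criterion discussed in the TL;DR; it assumes the third-party pytorch-crf package, and all dimensions and tag counts are illustrative.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(hidden_dim, num_tags)  # per-token emission scores
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, tokens, tags):
        emissions = self.fc(self.lstm(self.embed(tokens))[0])
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(emissions, tags)

    def decode(self, tokens):
        emissions = self.fc(self.lstm(self.embed(tokens))[0])
        return self.crf.decode(emissions)  # Viterbi-best tag sequences

model = BiLSTMCRF(vocab_size=5000, num_tags=7)
tokens = torch.randint(0, 5000, (2, 10))  # batch of 2 sentences, 10 tokens
tags = torch.randint(0, 7, (2, 10))
print(model.loss(tokens, tags))           # scalar training loss
```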
Journal ArticleDOI

Chinese Named Entity Recognition in the Geoscience Domain Based on BERT

TL;DR: In this article, a deep learning-based geological named entity recognition model is proposed that obtains character vectors rich in semantic information from the BERT pretrained language model, alleviating the lack of specificity of static word vectors (e.g., word2vec) and improving the extraction of complex geological entities.
Peer ReviewDOI

Chinese Named Entity Recognition in the Geoscience Domain Based on BERT

TL;DR: An integrated deep learning model incorporating BERT, BiGRU, and CRF is constructed; it obtains character vectors rich in semantic information through the BERT pretrained language model to compensate for the lack of specificity of static word vectors and to improve the extraction of complex geological entities.
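
A skeletal version of such a BERT + BiGRU + CRF stack might look as follows; the checkpoint name and layer sizes are assumptions for illustration, and this is not the authors' code. BERT supplies contextual character vectors, a BiGRU re-encodes them, and a CRF scores tag sequences.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf
from transformers import AutoModel

class BertBiGRUCRF(nn.Module):
    def __init__(self, num_tags, hidden_dim=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-chinese")
        self.gru = nn.GRU(self.bert.config.hidden_size, hidden_dim // 2,
                          batch_first=True, bidirectional=True)
        self.fc = nn.Linear(hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        # Contextual character vectors from the pretrained language model.
        hidden = self.bert(input_ids,
                           attention_mask=attention_mask).last_hidden_state
        emissions = self.fc(self.gru(hidden)[0])
        if tags is not None:
            # Training: negative CRF log-likelihood of the gold tags.
            return -self.crf(emissions, tags, mask=attention_mask.bool())
        # Inference: Viterbi decoding over the tag sequence.
        return self.crf.decode(emissions, mask=attention_mask.bool())
```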
Journal ArticleDOI

ACE-ADP: Adversarial Contextual Embeddings Based Named Entity Recognition for Agricultural Diseases and Pests

TL;DR: An adversarial contextual embeddings-based model named ACE-ADP is proposed for named entity recognition in the Chinese agricultural diseases and pests domain; experiments demonstrate that it not only effectively extracts rare entities but also retains a strong ability to predict new entities in new datasets with high accuracy.
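
Adversarial training on embeddings is often implemented as a small gradient-direction perturbation of the embedding matrix (FGM-style); the sketch below shows that generic pattern only, not the exact ACE-ADP algorithm, and the attribute name used to locate the embedding layer is an assumption.

```python
import torch

class FGM:
    """Add an epsilon-scaled gradient perturbation to the embedding weights."""
    def __init__(self, model, embed_name="embed", epsilon=1.0):
        self.model, self.embed_name, self.epsilon = model, embed_name, epsilon
        self.backup = {}

    def attack(self):
        for name, param in self.model.named_parameters():
            if param.requires_grad and self.embed_name in name:
                self.backup[name] = param.data.clone()
                norm = torch.norm(param.grad)
                if norm != 0:
                    # Step in the gradient direction: the worst-case small
                    # perturbation to first order.
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}

# Typical training step:
#   loss.backward(); fgm.attack(); adv_loss.backward(); fgm.restore();
#   optimizer.step()
```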
References
Proceedings ArticleDOI

Glove: Global Vectors for Word Representation

TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
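
The GloVe objective the TL;DR alludes to is a weighted least-squares fit of word-vector dot products to log co-occurrence counts, J = Σ_ij f(X_ij)(w_i·w̃_j + b_i + b̃_j − log X_ij)²; the sketch below spells it out in numpy, with variable names of our choosing.

```python
import numpy as np

def glove_loss(X, W, W_tilde, b, b_tilde, x_max=100.0, alpha=0.75):
    """Weighted least-squares GloVe objective over nonzero co-occurrences."""
    loss = 0.0
    for i, j in zip(*np.nonzero(X)):
        # f(X_ij): clipped power-law weighting that caps frequent pairs.
        weight = min((X[i, j] / x_max) ** alpha, 1.0)
        diff = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(X[i, j])
        loss += weight * diff ** 2
    return loss
```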
Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
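
The "one additional output layer" recipe can be sketched with the Hugging Face transformers library, here for token classification (the NER-style setting relevant to geoparsing); the checkpoint, label count, and example sentence are illustrative.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
# A linear classification layer is placed on top of the pretrained encoder;
# only this head is new, everything else is fine-tuned from the checkpoint.
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=9  # e.g., BIO tags for a small NER scheme
)

inputs = tokenizer("Oaxaca is a state in Mexico", return_tensors="pt")
logits = model(**inputs).logits      # shape: (batch, tokens, num_labels)
print(logits.shape)
```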
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best-performing techniques based on different types of neural networks.
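
At the core of these architectures, skip-gram models the probability of a context word given a center word as a softmax over vector dot products; below is a tiny numpy illustration, with random vectors standing in for trained embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 4                      # vocabulary size, embedding dimension
W_in = rng.normal(size=(V, d))    # center-word vectors
W_out = rng.normal(size=(V, d))   # context-word vectors

def p_context_given_center(center_id):
    scores = W_out @ W_in[center_id]
    exp = np.exp(scores - scores.max())   # numerically stable softmax
    return exp / exp.sum()

# Training maximizes log p(context | center) over (center, context) pairs
# drawn from a sliding window; the CBOW architecture inverts the roles.
print(p_context_given_center(3))
```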
Journal ArticleDOI

Enriching Word Vectors with Subword Information

TL;DR: This paper proposes a new approach based on the skip-gram model in which each word is represented as a bag of character n-grams and a word vector is the sum of these n-gram representations, allowing models to be trained quickly on large corpora and word representations to be computed for words that did not appear in the training data.
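
The subword idea can be sketched in a few lines: extract character n-grams with boundary markers and sum their vectors, which also yields vectors for out-of-vocabulary words. The n-gram table below is a random stand-in for fastText's trained, hashed table.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word, with < and > as boundary markers."""
    marked = f"<{word}>"
    return [marked[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(marked) - n + 1)]

rng = np.random.default_rng(0)
d = 8
ngram_vectors = {}  # in fastText, a hashed table trained with skip-gram

def word_vector(word):
    grams = char_ngrams(word)
    for g in grams:
        ngram_vectors.setdefault(g, rng.normal(size=d))
    return sum(ngram_vectors[g] for g in grams)

print(char_ngrams("where", 3, 4))  # ['<wh', 'whe', 'her', 'ere', 're>', ...]
print(word_vector("geoparser"))    # defined even if unseen during training
```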
Proceedings ArticleDOI

Deep contextualized word representations

TL;DR: This paper introduces a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics) and how these uses vary across linguistic contexts (i.e., it models polysemy).
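
What "contextualized" buys is that the same surface word receives different vectors in different sentences; the sketch below demonstrates this, with a BERT checkpoint standing in for ELMo purely for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

def vector_of(sentence, word):
    """Contextual vector of the first occurrence of `word` in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    idx = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]).index(word)
    return hidden[idx]

v1 = vector_of("She sat by the river bank.", "bank")
v2 = vector_of("He deposited cash at the bank.", "bank")
# Cosine similarity well below 1.0: the two senses get distinct vectors.
print(torch.cosine_similarity(v1, v2, dim=0).item())
```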