Fine-Grained Named Entity Recognition in Legal Documents

doi:10.1007/978-3-030-33220-4_20

Open Access Book Chapter

Elena Leitner, +2 more

pp. 272-287

Abstract:

This paper describes an approach to Named Entity Recognition (NER) in German-language documents from the legal domain. For this purpose, a dataset consisting of German court decisions was developed. The source texts were manually annotated with 19 semantic classes: person, judge, lawyer, country, city, street, landscape, organization, company, institution, court, brand, law, ordinance, European legal norm, regulation, contract, court decision, and legal literature. The dataset consists of approx. 67,000 sentences and contains 54,000 annotated entities. The 19 fine-grained classes were automatically generalised to seven coarse-grained classes (person, location, organization, legal norm, case-by-case regulation, court decision, and legal literature). The dataset therefore includes two annotation variants, coarse-grained and fine-grained. For the NER task, Conditional Random Fields (CRFs) and bidirectional Long Short-Term Memory networks (BiLSTMs) were applied to the dataset as state-of-the-art models. Three different models were developed for each of these two model families and tested with the coarse- and fine-grained annotations. The BiLSTM models achieve the best performance, with an F\(_1\) score of 95.46 for the fine-grained classes and 95.95 for the coarse-grained ones. The CRF models reach a maximum of 93.23 for the fine-grained classes and 93.22 for the coarse-grained ones. The work presented in this paper was carried out under the umbrella of the European project LYNX, which develops a semantic platform that enables the development of various document processing and analysis applications for the legal domain.
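The automatic generalisation from fine-grained to coarse-grained classes can be pictured as a simple label lookup. The following is a minimal sketch, not the authors' implementation: the abstract lists both class sets but does not spell out which fine-grained class belongs to which coarse-grained one, so the grouping below is an assumption inferred from the order in which the classes are listed. The BIO tag format and the names `COARSE_TO_FINE`, `FINE_TO_COARSE`, and `generalise` are likewise hypothetical.

```python
# Assumed grouping of the 19 fine-grained classes into the 7 coarse-grained
# ones, inferred from the order of the class lists in the abstract.
COARSE_TO_FINE = {
    "person": ["person", "judge", "lawyer"],
    "location": ["country", "city", "street", "landscape"],
    "organization": ["organization", "company", "institution", "court", "brand"],
    "legal norm": ["law", "ordinance", "European legal norm"],
    "case-by-case regulation": ["regulation", "contract"],
    "court decision": ["court decision"],
    "legal literature": ["legal literature"],
}

# Invert the grouping so each fine-grained label points to its coarse label.
FINE_TO_COARSE = {
    fine: coarse
    for coarse, fines in COARSE_TO_FINE.items()
    for fine in fines
}


def generalise(tag: str) -> str:
    """Map one tag (assumed BIO format, e.g. 'B-judge') to its coarse form."""
    if tag == "O":  # tokens outside any entity stay unchanged
        return tag
    prefix, _, label = tag.partition("-")  # split at the first hyphen only
    return f"{prefix}-{FINE_TO_COARSE[label]}"


def generalise_sentence(tags: list[str]) -> list[str]:
    """Apply the fine-to-coarse mapping to every tag of a sentence."""
    return [generalise(tag) for tag in tags]
```

Because the mapping is many-to-one, a single manual annotation pass yields both dataset variants: the coarse-grained tags follow mechanically from the fine-grained ones, but not the other way round.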

Citations

An end-to-end joint model for evidence information extraction from court record document

A comparative study of automated legal text classification using random forests and deep learning

A Dataset of German Legal Documents for Named Entity Recognition

Named Entity Recognition in the Romanian Legal Domain
