Open Access · Proceedings Article (DOI)

End-to-End Extraction of Structured Information from Business Documents with Pointer-Generator Networks

TLDR
This paper discusses a new method for training extraction models directly from the textual value of information and shows that it performs competitively with a standard word classifier without requiring costly word-level supervision.
Abstract
The predominant approaches for extracting key information from documents resort to classifiers predicting the information type of each word. However, the word-level ground truth used for learning is expensive to obtain, since it is not naturally produced by the extraction task. In this paper, we discuss a new method for training extraction models directly from the textual value of information. The extracted information of a document is represented as a sequence of tokens in the XML language. We learn to output this representation with a pointer-generator network that alternately copies the document words carrying information and generates the XML tags delimiting the types of information. The ability of our end-to-end method to retrieve structured information is assessed on a large set of business documents. We show that it performs competitively with a standard word classifier without requiring costly word-level supervision.
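The copy-or-generate mechanism described in the abstract can be sketched as a mixture of two distributions: with probability p_gen the model generates from a tag vocabulary, and with probability 1 − p_gen it copies a source word in proportion to the attention weights. A minimal illustration follows; the function and variable names (`pointer_generator_dist`, `p_gen`) are illustrative, not the authors' implementation.

```python
# Minimal sketch of the pointer-generator mixing step: the final output
# distribution is p_gen * P_vocab plus (1 - p_gen) * copy probabilities
# given by the attention weights over the source document tokens.

def pointer_generator_dist(vocab_dist, attention, src_tokens, p_gen):
    """Mix a generation distribution over the (XML-tag) vocabulary with
    copy probabilities over the source document tokens."""
    final = {tok: p_gen * p for tok, p in vocab_dist.items()}
    for attn, tok in zip(attention, src_tokens):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * attn
    return final

# Toy example: the vocabulary holds XML tags, the source holds document
# words that can be copied into the output sequence.
vocab = {"<total>": 0.7, "</total>": 0.3}
attn = [0.9, 0.1]            # attention weights over source words
src = ["42.50", "EUR"]       # document words available for copying
dist = pointer_generator_dist(vocab, attn, src, p_gen=0.5)
# the mixed distribution still sums to 1
```

Because both component distributions sum to one, the convex mixture does as well, so no renormalisation is needed.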



Citations
Book ChapterDOI

ViBERTgrid: A Jointly Trained Multi-modal 2D Document Representation for Key Information Extraction from Documents

TL;DR: Li et al. propose a new multi-modal backbone network that concatenates a BERTgrid (a grid of word embeddings) to an intermediate layer of a CNN whose input is the document image, generating a more powerful grid-based document representation.
Proceedings ArticleDOI

Query-driven Generative Network for Document Information Extraction in the Wild

TL;DR: A novel architecture, the Query-driven Generative Network (QGN), is equipped with two consecutive modules, a Layout Context-aware Module (LCM) and a Structured Generation Module (SGM), to build a more practical DIE paradigm for real-world scenarios in which input document images may contain unknown layouts and keys, and OCR results may be problematic.
Book ChapterDOI

Data-Efficient Information Extraction from Documents with Pre-trained Language Models.

TL;DR: LayoutLM, a pre-trained model for encoding 2D documents, shows high sample-efficiency when fine-tuned on public and real-world Information Extraction (IE) datasets.
Book ChapterDOI

DocReader: Bounding-Box Free Training of a Document Information Extraction Model.

TL;DR: DocReader is an end-to-end neural-network-based information extraction solution that can be trained using solely the document images and the target values to be read, eliminating the need for any annotations beyond what is naturally available in existing human-operated service centres.
References
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and it is proposed to extend it by allowing the model to automatically (soft-)search for the parts of a source sentence relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
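The "soft-search" above amounts to scoring each source position against the decoder state and normalising with a softmax. A minimal sketch follows; a dot-product score stands in for the paper's additive scoring function, and the name `soft_search` is illustrative.

```python
import math

def soft_search(query, keys):
    """Soft alignment: score each source position against the decoder
    query, then normalise the scores with a softmax to obtain attention
    weights (dot-product scoring is used here for brevity)."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)                      # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

weights = soft_search([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
# weights sum to 1; the best-matching source position gets the largest weight
```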
Journal ArticleDOI

The Hungarian method for the assignment problem

TL;DR: This paper has always been one of my favorite children, combining as it does elements of the duality of linear programming and combinatorial tools from graph theory, and it may be of some interest to tell the story of its origin in this article.
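The assignment problem the Hungarian method solves can be stated concisely: choose one column per row of a cost matrix so the total cost is minimal. The brute-force sketch below illustrates the objective only, not Kuhn's polynomial-time algorithm; the function name is illustrative.

```python
from itertools import permutations

def min_cost_assignment(cost):
    """Brute-force the assignment objective that the Hungarian method
    solves in polynomial time: pick one column per row minimising the
    total cost (tractable for tiny matrices only)."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return list(best), sum(cost[i][best[i]] for i in range(n))

cols, total = min_cost_assignment([[4, 1, 3],
                                   [2, 0, 5],
                                   [3, 2, 2]])
# → cols [1, 0, 2], total cost 5
```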
Proceedings ArticleDOI

Effective Approaches to Attention-based Neural Machine Translation

TL;DR: Two attention mechanisms are examined: a global approach that always attends to all source words, and a local one that looks at only a subset of source words at a time; both prove effective on the WMT translation tasks between English and German in both directions.
Posted Content

On the difficulty of training Recurrent Neural Networks

TL;DR: This paper proposes a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing-gradients problem, and empirically validates the hypothesis and the proposed solutions.