Document layout analysis

About: Document layout analysis is a research topic. Over its lifetime, 1462 publications have appeared within this topic, receiving 34021 citations.

Papers published on a yearly basis (chart omitted; the year axis covers 1969 and 1982 to 2023)

Papers

Patent

Method for generating document using tables storing pointers and indexes

James Lee Cooper, Deebitsudo San Suushii Maaku

03 Oct 1983

TL;DR: A flexible, expandable document structure is presented that incorporates information item blocks and indexing blocks related through pointers, together with means for applying visual and informational attributes to document text.

Abstract: A document processing system including a control structure having separated supervisory and document functions. The document functions, including a document buffer and document access control means, are the sole means for accessing documents, and the document function routines are selected from a predetermined library of such routines. The system includes a flexible, expandable document structure incorporating information item blocks and indexing blocks related through pointers, and means for applying visual and informational attributes to document text.

46 citations

Journal Article

A Texture-based Method for Document Segmentation and Classification

Ming-Wei Lin¹, Jules-Raymond Tapamo¹, Baird Ndovie¹

University of KwaZulu-Natal¹

15 Oct 2007

TL;DR: This paper presents a hybrid approach to segmenting and classifying the contents of document images, in which each image is segmented into three types of regions: Graphics, Text and Space.

Abstract: In this paper we present a hybrid approach to segmenting and classifying the contents of document images. A document image is segmented into three types of regions: Graphics, Text and Space. The image of a document is subdivided into blocks, and for each block five GLCM (Grey Level Co-occurrence Matrix) features are extracted. Based on these features, the blocks are clustered into three groups using the K-Means algorithm; connected blocks that belong to the same group are merged. The groups are then classified using pre-learned heuristic rules. Experiments were conducted on scanned newspapers and images from the MediaTeam Document Database.
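
The block-wise texture pipeline described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the five GLCM features, so three common ones (contrast, energy, homogeneity) stand in; the image is assumed to be already quantised to a few grey levels; the farthest-point initialisation of K-Means is a choice made here for determinism; and the merging of connected blocks and the heuristic classification rules are left out. All function names are hypothetical.

```python
def glcm(block, levels=4):
    """Normalised co-occurrence matrix of horizontally adjacent pixel pairs."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in block:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            total += 1
    return [[c / total for c in r] for r in counts]

def features(block, levels=4):
    """Assumed feature set: (contrast, energy, homogeneity) of the GLCM."""
    p = glcm(block, levels)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    return (contrast, energy, homogeneity)

def split_blocks(image, size):
    """Cut a 2-D list of grey levels into size x size blocks, row by row."""
    blocks = []
    for y in range(0, len(image) - size + 1, size):
        for x in range(0, len(image[0]) - size + 1, size):
            blocks.append([row[x:x + size] for row in image[y:y + size]])
    return blocks

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=10):
    """Plain K-Means with deterministic farthest-point initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(dist2(p, c) for c in centres)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centres[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels

# Toy 4x4 textures standing in for the three region types (made-up data).
SPACE = [[0, 0, 0, 0]] * 4
TEXT = [[0, 3, 0, 3]] * 4
GRAPHIC = [[1, 1, 2, 2]] * 4

def tile(rows_of_blocks):
    """Assemble an image from a grid of 4x4 blocks."""
    return [[v for b in blocks for v in b[y]]
            for blocks in rows_of_blocks for y in range(4)]

image = tile([[SPACE, TEXT, GRAPHIC], [GRAPHIC, SPACE, TEXT]])
labels = kmeans([features(b) for b in split_blocks(image, 4)], k=3)
print(labels)  # blocks with the same texture share a cluster label
```

The cluster indices carry no meaning by themselves; in the approach described above, a later rule-based step decides which cluster is Graphics, Text or Space.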

46 citations

Patent

Document reading apparatus having a function of determining effective document region based on a detected data

Noriyuki Okisu¹, Shinya Matsuda¹, Satoshi Nakamura¹, Jun Minakuti¹

Minolta¹

01 Dec 1992

TL;DR: A document reading apparatus that can determine an effective image pickup area containing no object such as an operator's hands or fingers pressing a document, and rectify image data prior to the imaging operation, by making use of differences between the object and the document in chromaticity, luminous density, and the like.

Abstract: A document reading apparatus which can determine an effective image pickup area containing no object such as an operator's hands or fingers pressing a document, and rectify image data prior to the imaging operation, making use of a difference between the object and the document in chromaticity, luminous density, and the like.

45 citations

Patent

Method for graph-based table recognition

M. Armon Rahgozar¹, Robert Cooperman¹

Xerox¹

11 Jan 1996

TL;DR: A method is presented for bottom-up recognition of tables within a document, based on the paradigm of graph rewriting: the document image is transformed into a layout graph whose nodes and edges represent document entities and their interrelations, respectively.

Abstract: The present invention is a method for bottom-up recognition of tables within a document. This method is based on the paradigm of graph rewriting. First, the document image is transformed into a layout graph whose nodes and edges represent document entities and their interrelations, respectively. This graph is subsequently rewritten using a set of rules designed on the basis of a priori document knowledge and general formatting conventions. The resulting graph provides a logical view of the document content. It can be parsed to provide general format analysis information.
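
The idea of rewriting a layout graph bottom-up can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the abstract does not give the rule set, so the two rules here (cells linked by "left_of" merge into a row; rows linked by "above" merge into a table), the node labels, and the function name `rewrite` are all hypothetical.

```python
def rewrite(nodes, edges, relation, labels, new_label):
    """Apply one rewrite rule until it no longer matches.

    Rule: two nodes whose labels are in `labels` and that are linked by
    `relation` are replaced with a single node labelled `new_label`.
    `nodes` maps node id -> label; `edges` is a set of (src, relation, dst).
    """
    changed = True
    while changed:
        changed = False
        for a, rel, b in sorted(edges):
            if rel == relation and nodes[a] in labels and nodes[b] in labels:
                merged = f"({a} {b})"
                nodes = {n: lab for n, lab in nodes.items() if n not in (a, b)}
                nodes[merged] = new_label
                # Redirect edges to the merged node and drop self-loops.
                redirected = set()
                for s, r, d in edges:
                    s = merged if s in (a, b) else s
                    d = merged if d in (a, b) else d
                    if s != d:
                        redirected.add((s, r, d))
                edges = redirected
                changed = True
                break
    return nodes, edges

# Layout graph for a caption followed by a 2x2 grid of cells (made-up data).
nodes = {"caption": "text", "c1": "cell", "c2": "cell", "c3": "cell", "c4": "cell"}
edges = {
    ("caption", "above", "c1"),
    ("c1", "left_of", "c2"),
    ("c3", "left_of", "c4"),
    ("c1", "above", "c3"),
    ("c2", "above", "c4"),
}

# Bottom-up: first group cells into rows, then stack rows into a table.
nodes, edges = rewrite(nodes, edges, "left_of", {"cell", "row"}, "row")
nodes, edges = rewrite(nodes, edges, "above", {"row", "table"}, "table")
print(nodes)
print(edges)
```

After both rules have run, the four cells are collapsed into one node labelled "table", while the caption survives as a separate node that still points at it, which is the kind of logical view the abstract describes.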

45 citations

Journal Article

Skew detection technique for binary document images based on Hough transform

Manjunath Aradhya V N, Hemantha Kumar G, Palaiahnakote Shivakumara

01 Jan 2006 - World Academy of Science, Engineering and Technology, International Journal of Computer, Electrical, Automation, Control and Information Engineering

TL;DR: A novel skew detection method is presented for binary document images; it considers selected characters of the text, which may be subjected to thinning and the Hough transform, to estimate the skew angle accurately.

Abstract: Document image processing has become an increasingly important technology in the automation of office documentation tasks. During document scanning, skew is inevitably introduced into the incoming document image. Since the algorithms for layout analysis and character recognition are generally very sensitive to page skew, skew detection and correction in document images are critical steps before layout analysis. In this paper, a novel skew detection method is presented for binary document images. The method considers selected characters of the text, which may be subjected to thinning and the Hough transform, to estimate the skew angle accurately. Several experiments have been conducted on various types of documents, such as English documents, journals, textbooks, documents in different languages, documents with different fonts, Documents
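
A Hough-style skew estimate can be sketched in a few lines. This is a simplified illustration, not the method of the paper: it skips the character selection and thinning steps, takes a list of foreground pixel coordinates as input, and searches only a small range of near-horizontal angles. The function name, the angle range, the step size and the peakedness score (sum of squared bin counts) are assumptions made here.

```python
import math
from collections import Counter

def estimate_skew(points, max_angle=15.0, step=0.5):
    """Return the candidate angle (degrees) with the most peaked accumulator.

    For each candidate angle, every point votes for the line of that slope
    passing through it, identified by its offset rho = y*cos(t) - x*sin(t).
    When the angle matches the text skew, points on one baseline share a bin.
    """
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        votes = Counter(round(y * cos_t - x * sin_t) for x, y in points)
        # Sum of squared counts rewards a few tall bins over many small ones.
        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

# Synthetic baselines: three text lines, 20 units apart, tilted by 5 degrees.
slope = math.tan(math.radians(5.0))
points = [(x, x * slope + c) for c in (0, 20, 40) for x in range(0, 100, 2)]
skew = estimate_skew(points)
print(skew)
```

The sign of the result depends on the coordinate convention: with image coordinates where y grows downward, a positive angle here corresponds to a clockwise tilt on screen. Rotating the image by the negative of the estimate would then deskew it.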

45 citations

Network Information

Performance metrics

Papers: 1,488
Citations: 35,779

No. of papers in the topic in previous years
Year	Papers
2023	5
2022	19
2021	34
2020	19
2019	14
2018	9
