Topic

Document layout analysis

About: Document layout analysis is a research topic. To date, 1,462 publications have appeared within this topic, receiving 34,021 citations.

Papers

Patent

Document management system with enhanced intelligent document recognition capabilities

Suresh S. Pandian, Thyagarajan Swaminathan, Subramaniyan Neelagandan, Krishna K. Srinivasan, Randal J. Martin

10 Jun 2005

TL;DR: An intelligent document recognition-based document management system includes modules for image capture, image enhancement, image identification, optical character recognition (OCR), data extraction, and quality assurance.

Abstract: An intelligent document recognition-based document management system (Fig. 2) includes modules for image capture (32), image enhancement (32), image identification (34), optical character recognition (36), data extraction (37) and quality assurance (42). The system captures data from electronic documents as diverse as facsimile images, scanned images and images from document management systems. It processes these images and presents the data in, for example, a standard XML format. The document management system processes both structured document images (40) (ones which have a standard format) and unstructured document images (38) (ones which do not have a standard format). The system can extract images directly from a facsimile machine, a scanner or a document management system for processing.

233 citations

Patent

General purpose shape-based layout processing scheme for IC layout modifications

Deepak Agrawal, Fang-Cheng Chang, Hyungjip Kim, Yao-Ting Wang, Myunghoon Yoon

02 Aug 2000

TL;DR: A shape-based system for IC layout processing defines each shape as a set of associated edges in a specified configuration, builds a catalog of shapes, and associates layout processing actions with the various shapes.

Abstract: Layout processing can be applied to an integrated circuit (IC) layout using a shape-based system. A shape can be defined by a set of associated edges in a specified configuration. A catalog of shapes is defined and layout processing actions are associated with the various shapes. Each layout processing action applies a specified layout modification to its associated shape. A shape-based rule system advantageously enables efficient formulation and precise application of layout modifications. Shapes/actions can be provided as defaults, can be retrieved from a remote source, or can be defined by the user. The layout processing actions can be compiled in a bias table. The bias table can include both rule-based and model-based actions, and can also include single-edge shapes for completeness. The scanning of the IC layout can be performed in order of increasing or decreasing complexity, or can be specified by the user. The appropriate layout processing actions are applied to matching portions of the IC layout to form the corrected photomask layout. This process can be sequential or batch mode. Shape and action conflicts can be resolved by marking identified/modified elements or by designing rules for orderly resolution of any inconsistencies or overlaps.

224 citations

Patent

System and method for context-based document retrieval

Shermann Loyall Min, Constantin Lorenzo Tanno, Zachary Mainen, William Russell Softky

28 Jul 2000

TL;DR: A system and method for text-based document retrieval uses information contained in the document collection about the statistics of word relationships (context) to facilitate the specification of search queries and document comparison.

Abstract: A system and method for document retrieval is disclosed. The invention addresses a major problem in text-based document retrieval: rapidly finding a small subset of documents in a large document collection (e.g., Web pages on the Internet) that are relevant to a limited set of query terms supplied by the user. The invention is based on utilizing information contained in the document collection about the statistics of word relationships (“context”) to facilitate the specification of search queries and document comparison. The method consists of first compiling word relationships into a context database that captures the statistics of word proximity and occurrence throughout the document collection. At retrieval time, a search matrix is computed from a set of user-supplied keywords and the context database. For each document in the collection, a similar matrix is computed using the contents of the document and the context database. Document relevance is determined by comparing the similarity of the search and document matrices. The disclosed system therefore retrieves documents with contextual similarity rather than word frequency similarity, simplifying search specification while allowing greater search precision.

221 citations
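
The retrieval flow described in the abstract above (compile a context database of word proximity, derive a query representation and a per-document representation from it, then compare them) can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not specify the matrix construction, so the sketch assumes a fixed co-occurrence window, collapses each "matrix" into a sparse vector, and uses cosine similarity for the comparison. All function names and the `window` parameter are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def build_context_db(docs, window=2):
    """Count how often each word co-occurs with others within `window` tokens."""
    db = defaultdict(Counter)
    for doc in docs:
        tokens = tokenize(doc)
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    db[word][tokens[j]] += 1
    return db


def context_vector(words, db):
    """Sum the context rows of the given words into one sparse vector."""
    vec = Counter()
    for w in words:
        vec.update(db.get(w, {}))
    return vec


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(query, docs, db):
    """Return (score, doc_index) pairs, most contextually similar first."""
    q = context_vector(tokenize(query), db)
    scored = [(cosine(q, context_vector(tokenize(d), db)), i)
              for i, d in enumerate(docs)]
    return sorted(scored, reverse=True)


docs = [
    "the cat sat on the mat",
    "dogs chase cats in the yard",
    "stock prices fell sharply today",
]
db = build_context_db(docs)
ranking = rank("cat", docs, db)
```

In this toy collection, the second document never contains the word "cat" yet still receives a nonzero score, because it shares context words with the query. That is the contextual (rather than word-frequency) similarity the abstract refers to.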

Patent

Document identification by characteristics matching

Roland G. Borrey, Daniel G. Borrey

31 May 1989

TL;DR: The invention identifies scanned documents by comparing global document features against a knowledge base of known document types: it segments the digitized image of a document into physical and logical areas of significance and attempts to label these areas by determining the type of information they contain.

Abstract: This invention relates to an automatic identification method for scanned documents in an electronic document capture and storage system. The invention uses the technique of recognition of global document features compared to a knowledge base of known document types. The system first segments the digitized image of a document into physical and logical areas of significance and attempts to label these areas by determining the type of information they contain, without using OCR techniques. The system then attempts to match the areas segmented to objects described in the knowledge base. The system labels the areas successfully matched then selects the most probable document type based on the areas found within the document. Using computer learning methods, the system is capable of improving its knowledge of the documents it is supposed to recognize, by dynamically modifying the characteristics of its knowledge base thus sharpening its decision making capability.

217 citations
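
The matching step described in the abstract above (compare segmented areas against a knowledge base of known document types, then pick the most probable type) can be sketched roughly as follows. This is an illustrative simplification under stated assumptions, not the patent's procedure: the `Area` fields, the position tolerance `tol`, and the scoring rule (fraction of expected areas found) are all hypothetical, and the knowledge-base learning step is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    kind: str   # hypothetical label, e.g. "logo", "table", "text"
    x: float    # normalized centre position in [0, 1]
    y: float


def matches(found, expected, tol=0.1):
    """An area matches if its kind agrees and its centre is within `tol`."""
    return (found.kind == expected.kind
            and abs(found.x - expected.x) <= tol
            and abs(found.y - expected.y) <= tol)


def score(areas, template):
    """Fraction of the template's expected areas found in the document."""
    if not template:
        return 0.0
    hits = sum(any(matches(a, e) for a in areas) for e in template)
    return hits / len(template)


def identify(areas, knowledge_base):
    """Return the document type whose template best matches the areas."""
    return max(knowledge_base,
               key=lambda name: score(areas, knowledge_base[name]))


# Hypothetical knowledge base: document type -> expected areas.
knowledge_base = {
    "invoice": [Area("logo", 0.1, 0.1), Area("table", 0.5, 0.6)],
    "letter": [Area("text", 0.5, 0.5)],
}
segmented = [Area("logo", 0.12, 0.08), Area("table", 0.5, 0.55)]
best = identify(segmented, knowledge_base)
```

No OCR is involved here, in keeping with the abstract: areas are compared only by their type and position.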

Patent

Reformatting documents using document analysis information

Kathrin Berkner, Christophe Marle, Edward L. Schwartz, Michael J. Gormish (Ricoh)

16 Aug 2007

TL;DR: A method and apparatus for reformatting electronic documents performs layout analysis on an electronic version of a document to locate text zones, assigns attributes for scale and importance to those text zones, and reformats the text based on the attributes to create an image.

Abstract: A method and apparatus for reformatting electronic documents is disclosed. In one embodiment, the method comprises performing layout analysis on an electronic version of a document to locate text zones, assigning attributes for scale and importance to text zones in the electronic version of the document, and reformatting text in the electronic version of the document based on the attributes to create an image.

216 citations

Performance metrics

Papers: 1,488

Citations: 35,779

No. of papers in the topic in previous years
Year	Papers
2023	5
2022	19
2021	34
2020	19
2019	14
2018	9
