Showing papers by "Robert Sablatnig published in 2011"


Proceedings ArticleDOI
18 Sep 2011
TL;DR: Results show that the proposed method is able to locate initials, headings and text areas in ancient manuscripts containing stains, tears and partially faded-out ink sufficiently well.
Abstract: We propose a layout analysis method for historical manuscripts that relies on the part-based identification of layout entities. A layout entity -- such as letters of the text, initials or headings -- is composed of a set of characteristic segments or structures, which is dissimilar for distinct classes in the manuscripts under consideration. This fact is exploited in order to segment a manuscript page into homogeneous regions. Historical documents traditionally involve challenges such as uneven writing support and varying shapes of characters, fluctuating text lines, changing scripts and writing styles, and variance in the layout itself. Hence, a part-based detection of layout entities is proposed using a multi-stage algorithm for the localization of the entities, based on interest points. Results show that the proposed method is able to locate initials, headings and text areas in ancient manuscripts containing stains, tears and partially faded-out ink sufficiently well.

39 citations


Book ChapterDOI
26 Sep 2011
TL;DR: The objective of the current paper is to classify the architectural style of facade windows belonging to the main European architectural periods (Romanesque, Gothic and Renaissance/Baroque) through clustering and learning of local features.
Abstract: Building facade classification by architectural styles allows categorization of large databases of building images into semantic categories belonging to certain historic periods, regions and cultural influences. Image databases sorted by architectural styles permit effective and fast image search for the purposes of content-based image retrieval, 3D reconstruction, 3D city-modeling, virtual tourism and indexing of cultural heritage buildings. Building facade classification is viewed as a task of classifying separate architectural structural elements, like windows, domes, towers, columns, etc., as every architectural style applies certain rules and characteristic forms for the design and construction of the structural parts mentioned. In the context of architectural style classification of building facades, the objective of the current paper is to classify the architectural style of facade windows. Typical windows belonging to the main European architectural periods (Romanesque, Gothic and Renaissance/Baroque) are classified. The approach is based on clustering and learning of local features, applying in the training stage the knowledge that architects use to classify windows of the mentioned architectural styles.

31 citations


01 Jan 2011
TL;DR: This paper reviews state-of-the-art literature on wearable diaries and lifelogging systems, and describes key issues and main challenges of building modern diary systems.
Abstract: Diaries have transformed over the last decade. Originally kept as handwritten text, diaries were joined by photo albums and visual diaries as photography became commonly available. Traditionally intended to remain private, diaries gained the dimension of audience with blogs and Internet communication. However, humans have a limited capacity to record their lives. Lifelogs overcome this limit and collect and store a person’s personal information digitally. This can be done by recording all computer and cell phone activity and mobile context (e.g. GPS), but also by adding multiple wearable sensors such as “always-on” cameras or bio-sensors. The enormous amounts of data created need to be processed in order to be made useful to humans. In this paper, we review state-of-the-art literature on wearable diaries and lifelogging systems, and discuss the key issues and main challenges.
1 Diaries, Blogs and Lifelogs
Humans record their history, from global events that move the world to small everyday life events that they consider important. In this paper, we review the current trends of personal record keeping. We focus on approaches using wearable sensors to passively capture data. First, forms of personal record keeping are discussed, especially how diaries have changed over time, becoming increasingly multimodal. Section 2 describes key issues and main challenges of building modern diary systems. In Section 3 we present selected projects which offer unique diary solutions. We conclude with thoughts on future challenges.
1.1. Traditional diary
A diary is a sequence of entries arranged chronologically, created to report on what has happened over the course of a period of time. Personal diaries usually include the writer’s thoughts and feelings. Originally handwritten in the form of books, the diary transformed from paper to electronic formats. Along with new media, new forms of diaries also developed.
1.1.1 Visual diary
Since photography became commonly available, photos were added to the text in the diary to illustrate events. In some cases the emphasis shifted towards photography altogether and it became common to create photo albums with short text comments. Photos simplified the record keeping and added a further dimension to the records that made the memories seem more real [26]. Sontag [26] goes even further and says that it is the photographic capture of reality that gives us the feeling of the realness of our lives; it helps us reconstruct our personal history. We don’t believe our perception until a photo confirms it [26], as is also illustrated by the popular Internet phrase “pics or it didn’t happen”, which demands photographic proof of an unbelievable story. Artists use visual diaries to sketch drawings of their ideas and to collect images or other media. These serve them as inspiration and as a means to reflect on their artistic growth [1]. In travel diaries the author might even add small souvenirs, brochures, postcards or other nostalgic items to the diary.
∗ This work was supported by FFG, project no. 830043.
1.1.2 Blogs
The most distinct and today probably also most popular form of diary is the blog. While the “paper version” was traditionally intended to remain private, this aspect widely changed as people adopted the new medium to chronicle their lives with the additional dimension of audience. Although bloggers give reasons such as documenting their life, expressing opinions, letting off steam (shouts, feelings and thoughts), inspiration (“to write is to think”), and building communities, they are conscious of their audience and censor the information they publish [21]. Admittedly, the entries serve self-presentation and narcissism rather than the creation of an extensive digital memory [21]. Still, the main point of a traditional diary (public or private) is that the content is actively generated by the user.
However, documenting a life takes discipline and effort and, hence, the amount of information in the traditional diary is constrained either by the capacity of the medium or by that of the writer. This limit forces the individual to filter the content he/she writes down.
1.2. LifeLog
New technologies address this necessity (and difficulty) of limiting the information that a single human can preserve in a diary. Technological advancements today make it possible to explore new ways to capture, collect and store information. First envisioned by Vannevar Bush in 1945 [5], a LifeLog is the notion of capturing and storing a whole lifetime of a person’s personal information digitally, so that it can be retrieved whenever needed. In 2001, this idea was revived by an experiment of Gordon Bell [2], who scanned all of his paperwork, photographs, medical records and other personal data. The initial focus of lifelogging was on desktop applications only; however, it has shifted towards mobile access and capture. Mobile devices are capable of more than just storing: they capture data passively, without the user’s conscious initiative. Lifelog systems collect a variety of signals, drawn from subsets of the following data categories (as described in [7]):
Passive visual capture: Wearable “always-on” cameras automatically take images or videos.
Biometrics: Wearable sensors measure bio-signals, such as heart rate, galvanic skin response, skin temperature or body motion.
Mobile context: Cell phones can provide information such as location cues in the form of GPS data, wireless network presence and GSM location data. Co-present Bluetooth devices may indicate people present nearby.
Mobile activity: Call logs, SMS messages, even email logs, activity on the web and on social network sites can be gathered from mobile phones.
Desktop/laptop computer activity: Basically all PC/laptop activity of the user can be monitored, the time and duration of each task measured, documents saved, etc.
Active capture: Indirect (writing blogs is monitored as computer activity) or direct (adding photos, writing comments and annotating the lifelog’s content) capture.
Such technologies, however, are not limited to the obvious use of personal reminiscence. Various applications featuring at least a subset of these possibilities have been used, e.g. in medical and therapeutic solutions, for security enhancement, or to encourage self-reflection.

21 citations


Proceedings ArticleDOI
01 Jan 2011
TL;DR: A reconstruction methodology for shredded documents is presented in this paper which recognizes characters at the stripes' borders and matches them subsequently, using an Optical Character Recognition (OCR) system that is capable of recognizing partially visible characters by means of local features.
Abstract: Document reconstruction affects different areas such as archeology, philology and forensics. A reconstruction of fragmented writing materials makes it possible to retrieve and analyze the lost content. Due to the complexity of reconstruction, automated algorithms are necessary. A reconstruction methodology for shredded documents is presented in this paper which recognizes characters at the stripes' borders and matches them subsequently. In order to achieve this, an Optical Character Recognition (OCR) system is exploited that is capable of recognizing partially visible characters by means of local features. Thus, no binarization needs to be performed. Preliminary results show the ability of matching shredded documents using the information of cut characters. (6 pages)

19 citations


Proceedings ArticleDOI
18 Sep 2011
TL;DR: The proposed method aims at clustering document snippets, so that an automated clustering of documents can be performed, and shows promising results on a dataset consisting of document snippets with varying shapes, content writing and layout.
Abstract: In general, document image analysis methods are pre-processing steps for Optical Character Recognition (OCR) systems. In contrast, the proposed method aims at clustering document snippets, so that an automated clustering of documents can be performed. To this end, words are classified as printed text, manuscript, or noise, where the third class corrects falsely segmented background elements. Having classified text elements, a layout analysis is carried out which groups words into text lines and paragraphs. A back propagation of the class weights assigned to each word in the first step enables the correction of wrong class labels. The proposed method shows promising results on a dataset consisting of document snippets with varying shapes, content writing and layout. In addition, the system is compared to page segmentation methods of the ICDAR 2009 Page Segmentation Competition.

18 citations


Proceedings ArticleDOI
01 Nov 2011
TL;DR: A detailed analysis of a 4D representation of events generated by a dynamic stereo vision sensor is presented for the recognition of a person's fall, with promising outcomes.
Abstract: This paper presents a detailed analysis of a 4D representation of events, which are generated by a dynamic stereo vision sensor, for the recognition of a person's fall. Dynamic vision detectors consist of self-signaling pixels that autonomously react to scene dynamics and asynchronously generate events upon relative light intensity change. Their complete on-chip redundancy reduction, wide dynamic range and high temporal resolution allow efficient and continuous activity monitoring in natural environments. Using a stereo pair of dynamic vision detectors, it is possible to represent the scene dynamics in a 4D space (including time) at a high temporal resolution. In this work, we performed 100 recordings of scenarios including falls in an indoor environment using this dynamic stereo vision sensor. Seven features have been extracted and analyzed for three types of falls so that robust parameters can be kept for fall recognition. The result of this analysis is shown in this work with promising outcomes.

15 citations


Proceedings Article
01 Aug 2011
TL;DR: A binarization-free layout analysis method for ancient manuscripts is proposed, which identifies and localizes layout entities exploiting their structural similarities on the local level.
Abstract: A binarization-free layout analysis method for ancient manuscripts is proposed, which identifies and localizes layout entities exploiting their structural similarities on the local level. Hence, the textual entities are disassembled into segments, and a part-based detection is done which employs local gradient features known from the field of object recognition, the Scale Invariant Feature Transform (SIFT), to describe these structures. Layout analysis is the first step in the process of document understanding; it identifies regions of interest and, hence, serves as input for other algorithms such as Optical Character Recognition (OCR). Moreover, the document layout allows scholars to establish the spatio-temporal origin, authenticate, or index a document. The layout entities considered in this approach include the body text, embellished initials, plain initials and headings.

8 citations


Proceedings ArticleDOI
18 Sep 2011
TL;DR: The proposed binarization algorithm uses a scale space to avoid the estimation of script size dependent parameters, and the use of integral images for the calculation of the mean, standard deviation and morphological operations allows for an efficient implementation of the method presented.
Abstract: The proposed binarization algorithm uses a scale space to avoid the estimation of script size dependent parameters. Due to the continuous smoothing from fine to coarse scales, noise such as background clutter is suppressed, since coarse scales characterize homogeneous regions of the image. Thus, coarser scales of the scale space can be used as a foreground estimation to apply a weighting scheme robust against noise present in, for instance, carbon copies or ancient and degraded documents. Additionally, the information of filled regions is propagated through the scales. The use of integral images for the calculation of the mean, standard deviation and morphological operations allows for an efficient implementation of the method presented. The binarization of each scale is based on changes of the local intensity as proposed by Su et al.

6 citations
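
The integral-image idea mentioned in the binarization abstract above can be illustrated with a minimal sketch. It only shows how a summed-area table yields the local mean and standard deviation of every window in constant time per pixel; it is not the authors' scale-space algorithm. The function names (`integral_image`, `box_sum`, `local_mean_std`) and the `radius` parameter are hypothetical, chosen for illustration.

```python
import math


def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above (previous row of the table) plus the current row so far.
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, y0, x0, y1, x1):
    """Sum of img[y0:y1][x0:x1] using four table lookups."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]


def local_mean_std(img, radius):
    """Local mean and standard deviation in a (2*radius+1)^2 window, clipped at the borders.

    Hypothetical helper: uses one table for the pixel values and one for their squares.
    """
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    sat_sq = integral_image([[v * v for v in row] for row in img])
    means = [[0.0] * w for _ in range(h)]
    stds = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            n = (y1 - y0) * (x1 - x0)
            mean = box_sum(sat, y0, x0, y1, x1) / n
            var = box_sum(sat_sq, y0, x0, y1, x1) / n - mean * mean
            means[y][x] = mean
            stds[y][x] = math.sqrt(max(var, 0.0))  # guard against tiny negative rounding errors
    return means, stds
```

The cost per pixel is independent of the window size, which is why integral images are attractive when local statistics have to be computed repeatedly, for example once per scale.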


Dissertation
01 Jan 2011
TL;DR: Preliminary results show that the proposed system can handle highly degraded manuscript images with background noise, e.g. stains, tears, and faded characters.
Abstract: In this paper, Slavonic manuscripts from the 11th century written in Glagolitic script are investigated. State-of-the-art optical character recognition methods produce poor results for degraded handwritten document images. This is largely due to a lack of suitable results from basic pre-processing steps such as binarization and image segmentation. Therefore, a new, binarization-free approach is presented that is independent of pre-processing deficiencies. It additionally incorporates local information in order to also recognize fragmented or faded characters. The proposed algorithm consists of two steps: character classification and character localization. First, Scale Invariant Feature Transform (SIFT) features are extracted and classified using support vector machines. On this basis, interest points are clustered according to their spatial information. Then, characters are localized and eventually recognized by a weighted voting scheme of pre-classified local descriptors. Preliminary results show that the proposed system can handle highly degraded manuscript images with background noise, e.g. stains, tears, and faded characters.

4 citations
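
The localization and voting steps named in the abstract above could look roughly like the following sketch. Everything specific here is an assumption made for illustration: the input tuples `(x, y, label, weight)`, the use of a fixed grid (`cell_size`) as a stand-in for spatial clustering, and the use of a classifier confidence as vote weight. The abstract does not specify the actual clustering or weighting used.

```python
from collections import defaultdict


def vote_characters(points, cell_size):
    """Group pre-classified interest points spatially and pick one label per group.

    points: iterable of (x, y, label, weight), where label is the class predicted
            for the local descriptor and weight is an assumed confidence score.
    cell_size: side length of the grid cells used here as a simple stand-in
               for spatial clustering (hypothetical parameter).
    Returns a dict mapping each occupied grid cell to its winning label.
    """
    votes_per_cell = defaultdict(lambda: defaultdict(float))
    for x, y, label, weight in points:
        cell = (int(x // cell_size), int(y // cell_size))
        votes_per_cell[cell][label] += weight  # accumulate weighted votes per class
    return {
        cell: max(votes.items(), key=lambda kv: kv[1])[0]
        for cell, votes in votes_per_cell.items()
    }
```

With this scheme a single confident descriptor can outweigh several weak ones in the same region, which is the point of weighting the votes rather than counting them.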


Proceedings ArticleDOI
20 Oct 2011
TL;DR: From the experimental results on test images, EBSR better preserves the structure of the original image than RBSR, while RBSR is preferred in terms of visual appearance except when there are large motions in consecutive frames.
Abstract: By comparing two classes of Super-Resolution (SR), namely Example-Based Super-Resolution (EBSR) and Reconstruction-Based Super-Resolution (RBSR), we investigate two points. Firstly, which SR technique, EBSR or RBSR, produces an SR image that preserves Structure Similarity (SSIM) to the original image? Secondly, which SR technique produces an SR image that is more appealing to human eyes? To produce the resultant SR image, EBSR predicts the relation between high and low frequencies in an image, whereas RBSR algorithms rely on a sequence of frames. From the experimental results on test images, we find that, compared to RBSR, EBSR better preserves the structure of the original image. Knowing this capability is important for detection and recognition systems. In terms of visual appearance, RBSR is preferred except when there are large motions in consecutive frames. Moreover, the aliasing artifacts cannot be removed by EBSR algorithms.

4 citations
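
As a reminder of what the SSIM measure used in the comparison above computes, here is a minimal sketch of the standard SSIM formula evaluated over a single global window. This is a simplification: SSIM is usually averaged over local sliding windows, and the abstract does not state which variant or parameters the authors used. The function name `global_ssim` and the flattened-list input format are assumptions for illustration; the constants follow the commonly used defaults K1 = 0.01 and K2 = 0.03.

```python
def global_ssim(a, b, dynamic_range=255.0):
    """Single-window SSIM between two equally sized images given as flat lists of intensities."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("images must be non-empty and of equal size")
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((v - mu_a) ** 2 for v in a) / n
    var_b = sum((v - mu_b) ** 2 for v in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n
    # Stabilizing constants with the usual defaults K1 = 0.01, K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator
```

A value close to 1 indicates that the reconstructed image shares luminance, contrast and structure with the reference; strongly anti-correlated images yield negative values.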


Journal ArticleDOI
03 Mar 2011
TL;DR: An automatic character decomposition and primitive extraction dissects the scriptural elements into analysable pieces that are necessary for palaeographic and graphemic analyses, writing tool recognition, text restoration, and optical character recognition.
Abstract: This paper presents an overview of data acquisition and processing procedures of an interdisciplinary project of philologists and image processing experts aiming at the decipherment and reconstruction of damaged manuscripts. The digital raw image data was acquired via multi-spectral imaging. As a preparatory step we developed a method of foreground-background separation (binarisation) especially designed for multi-spectral images of degraded documents. On the basis of the binarised images further applications were developed: an automatic character decomposition and primitive extraction dissects the scriptural elements into analysable pieces that are necessary for palaeographic and graphemic analyses, writing tool recognition, text restoration, and optical character recognition. The results of the relevant procedures can be stored and interrogated in a database application. Furthermore, a semi-automatic page layout analysis provides codicological information on latent page contents (script, ruling, decorations).

18 Dec 2011
TL;DR: In this article, the acquisition and digital processing of multi-spectral images of historic writings is discussed; the manuscript pages examined contain an overwritten historic text and an overlying handwriting that is considerably younger than the overwritten text.
Abstract: Multi-spectral imaging has proven its usefulness for the examination of ancient manuscripts, since it enhances the legibility of vanished or erased writings. Hence, this non-invasive conservation technique facilitates the work of philologists. This paper is concerned with the acquisition and digital processing of multi-spectral images containing historic writings. The manuscript pages examined contain an overwritten historic text and an overlying handwriting, which is considerably younger than the overwritten text. The younger texts are visible under all wavelengths utilized, while the older texts are most legible under ultraviolet illumination. This work presents the efforts taken to make the ancient writings readable. The image acquisition setup is detailed first, and the image processing methods applied are explained in the second part of the document.