Author

Syed Saqib Bukhari

Bio: Syed Saqib Bukhari is an academic researcher at the German Research Centre for Artificial Intelligence. The author has contributed to research topics including optical character recognition and eye tracking, has an h-index of 20, and has co-authored 95 publications receiving 1,150 citations. Previous affiliations of Syed Saqib Bukhari include Kaiserslautern University of Technology.


Papers
Proceedings ArticleDOI
09 Jun 2010
TL;DR: This work introduces connected-component-based classification by training a self-tunable multi-layer perceptron (MLP) classifier to distinguish text from non-text connected components, using shape and context information as a feature vector.
Abstract: Segmentation of a document image into text and non-text regions is an important preprocessing step for a variety of document image analysis tasks, such as OCR improvement and document compression. Most state-of-the-art document image segmentation approaches perform segmentation using pixel-based or zone (block)-based classification. Pixel-based classification approaches are time consuming, whereas block-based methods depend heavily on the accuracy of the block segmentation step. In contrast to these approaches, our segmentation approach introduces connected-component-based classification and therefore does not require block segmentation beforehand. We train a self-tunable multi-layer perceptron (MLP) classifier to distinguish between text and non-text connected components, using shape and context information as a feature vector. Experimental results demonstrate the effectiveness of our proposed algorithm. We evaluated our method on subsets of the UW-III, ICDAR 2009 page segmentation competition, and circuit diagram datasets, and compared its results with the state-of-the-art page segmentation algorithm in Leptonica.
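The idea of classifying connected components rather than pixels or blocks can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the shape features (bounding-box size, aspect ratio, foreground density) and the toy labels are assumptions chosen for the example, and the paper's self-tunable MLP and context features are not reproduced.

```python
# Sketch: connected-component text/non-text classification with an MLP.
# Features and data are illustrative, not the paper's.
import numpy as np
from scipy import ndimage
from sklearn.neural_network import MLPClassifier

def component_features(binary_img):
    """Return one simple shape-feature vector per connected component."""
    labels, n = ndimage.label(binary_img)
    feats = []
    for i, s in enumerate(ndimage.find_objects(labels)):
        comp = labels[s] == i + 1
        h, w = comp.shape
        feats.append([
            h, w,          # bounding-box height and width
            w / h,         # aspect ratio
            comp.mean(),   # foreground density inside the box
        ])
    return np.array(feats)

# Toy page: a small dense blob stands in for text, a large hollow
# frame stands in for a non-text element (e.g. a figure border).
page = np.zeros((64, 64), dtype=bool)
page[5:9, 5:12] = True                       # "text-like" component
page[30:60, 30] = page[30:60, 59] = True     # "non-text-like" frame
page[30, 30:60] = page[59, 30:60] = True

X = component_features(page)
y = [1, 0]  # 1 = text, 0 = non-text (hand-labelled for this toy case)
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X, y)
pred = clf.predict(X)
```

In a real system the classifier would be trained on thousands of labelled components, and each prediction can be refined using the labels of neighbouring components (the "context" part of the feature vector).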

81 citations

Proceedings ArticleDOI
26 Jul 2009
TL;DR: A novel, script-independent textline segmentation approach for handwritten documents that is robust against multi-oriented, touching, and overlapping textlines; it uses a matched-filter-bank approach for smoothing and does not require heuristic post-processing steps for merging or splitting segmented textlines.
Abstract: Handwritten document images contain textlines with multiple orientations, touching and overlapping characters across consecutive textlines, and small inter-line spacing, making textline segmentation a difficult task. In this paper we propose a novel, script-independent textline segmentation approach for handwritten documents that is robust against the above-mentioned problems. We model textline extraction as a general image segmentation task. We compute the central line of parts of textlines using ridges over the smoothed image, then adapt state-of-the-art active contours (snakes) over the ridges, which yields the textline segmentation. Unlike the "Level Set" and "Mumford-Shah model" based handwritten textline segmentation methods, our method uses a matched-filter-bank approach for smoothing and does not require heuristic post-processing steps for merging or splitting segmented textlines. Experimental results demonstrate the effectiveness of the proposed algorithm. We evaluated our algorithm on the ICDAR 2007 handwriting segmentation contest dataset and obtained an accuracy of 96.3%.
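The matched-filter idea behind the smoothing step can be sketched in a few lines. This is a simplified, horizontal-only illustration with assumed filter parameters; the paper's filter bank covers multiple orientations and its ridge detection is more sophisticated than the row-profile proxy used here.

```python
# Sketch: anisotropic Gaussian smoothing joins the characters of a
# textline into one band whose ridge approximates the line's centre.
# Parameters and the toy page are illustrative, not the paper's.
import numpy as np
from scipy import ndimage

def enhance_textlines(binary_img, sigma_y=1.0, sigma_x=8.0):
    """Smooth with a filter that is wide along the writing direction."""
    return ndimage.gaussian_filter(binary_img.astype(float),
                                   sigma=(sigma_y, sigma_x))

# Toy page: two "textlines" made of disconnected character-like blobs.
page = np.zeros((40, 60))
for x in range(2, 58, 6):
    page[8:12, x:x + 3] = 1.0    # line 1
    page[26:30, x:x + 3] = 1.0   # line 2

smooth = enhance_textlines(page)
# Row-wise response maxima mark candidate textline centres
# (a crude stand-in for ridge detection on the smoothed image).
row_profile = smooth.sum(axis=1)
peaks = [y for y in range(1, 39)
         if row_profile[y] >= row_profile[y - 1]
         and row_profile[y] >= row_profile[y + 1]
         and row_profile[y] > 0.5 * row_profile.max()]
```

A full implementation would apply such filters at several orientations and take the maximum response, so that curved or skewed textlines are enhanced as well.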

77 citations

Proceedings ArticleDOI
18 Sep 2012
TL;DR: This work introduces an approach that segments text appearing in page margins (side-notes) from manuscripts with complex layouts, independent of both block segmentation and pixel-level analysis.
Abstract: Page layout analysis is a fundamental step of any document image understanding system. We introduce an approach that segments text appearing in page margins (a.k.a. side-notes text) from manuscripts with complex layout formats. Simple and discriminative features are extracted at the connected-component level, from which robust feature vectors are generated. A multilayer perceptron classifier is used to assign connected components to the relevant class of text. A voting scheme is then applied to refine the resulting segmentation and produce the final classification. In contrast to state-of-the-art segmentation approaches, this method is independent of both block segmentation and pixel-level analysis. The proposed method has been trained and tested on a dataset containing a variety of complex side-notes layout formats, achieving a segmentation accuracy of about 95%.

65 citations

Proceedings ArticleDOI
24 Jan 2011
TL;DR: Modifications to the text/non-text segmentation algorithm presented by Bloomberg are described that result in significant improvements, achieving better segmentation accuracy than the original algorithm on the UW-III, UNLV, ICDAR 2009 page segmentation competition, and circuit diagram datasets.
Abstract: Page segmentation into text and non-text elements is an essential preprocessing step before optical character recognition (OCR). In case of poor segmentation, an OCR classification engine produces garbage characters due to the presence of non-text elements. This paper describes modifications to the text/non-text segmentation algorithm presented by Bloomberg, which is also available in his open-source Leptonica library. The modifications result in significant improvements and achieve better segmentation accuracy than the original algorithm on the UW-III, UNLV, ICDAR 2009 page segmentation competition, and circuit diagram datasets.
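The core of the Bloomberg-style approach is morphological: large solid regions survive an aggressive opening and act as a seed, and binary reconstruction inside a mask then recovers the full non-text regions. The sketch below illustrates that idea with assumed structuring-element sizes; the actual algorithm (and the paper's modifications to it) operates on reduced-resolution images with carefully tuned morphology.

```python
# Sketch: seed/mask morphological separation of text from image regions.
# Structuring-element sizes and the toy page are illustrative.
import numpy as np
from scipy import ndimage

def split_text_nontext(binary_img, seed_size=9, mask_size=3):
    # Aggressive opening: only large solid (image/halftone) regions survive.
    seed = ndimage.binary_opening(binary_img, np.ones((seed_size, seed_size)))
    # Gentle closing: a mask that still contains everything.
    mask = ndimage.binary_closing(binary_img, np.ones((mask_size, mask_size)))
    # Grow the seed within the mask to recover full non-text regions.
    nontext = ndimage.binary_propagation(seed, mask=mask)
    text = binary_img & ~nontext
    return text, nontext

page = np.zeros((50, 80), dtype=bool)
page[5:9, 5:70] = True      # thin "textline" stroke
page[20:45, 20:60] = True   # large solid "image" block
text, nontext = split_text_nontext(page)
```

The thin stroke is erased by the opening and so never enters the seed, while the solid block survives and is reconstructed in full; text is simply what remains.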

62 citations

Proceedings ArticleDOI
28 Aug 2018
TL;DR: This work creates multiple classifiers for document classification and compares their accuracy on raw and processed data; hierarchical classifiers for classes and subclasses are also explored.
Abstract: Any organization working with documents eventually faces the task of classifying large numbers of them, which is the first step toward information retrieval and data mining. Classifying such volumes of documents into multiple classes by hand requires considerable time and labor, so a system that can classify them with acceptable accuracy is of great help in document engineering. We have created multiple classifiers for document classification and compared their accuracy on raw and processed data. We gathered data used in a corporate organization as well as publicly available data for comparison. The data is processed by removing stop-words, and stemming is applied to produce root words. Traditional machine learning techniques, including Naive Bayes, Logistic Regression, Support Vector Machines, Random Forest, and Multi-Layer Perceptron, are used to classify the documents. The classifiers are applied to raw and processed data separately and their accuracy is recorded. In addition, a deep learning technique, the Convolutional Neural Network, is used to classify the data, and its accuracy is compared with that of the traditional machine learning techniques. We also explore hierarchical classifiers for classifying classes and subclasses. The system classifies the data faster and with better accuracy than manual classification. The results are discussed in the results and evaluation section.
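One of the traditional pipelines the paper compares can be sketched in a few lines: stop-word removal, term weighting, and a Naive Bayes classifier. The documents, labels, and settings below are invented for illustration (and stemming is omitted for brevity); they are not the paper's corporate or public datasets.

```python
# Sketch: stop-word removal + TF-IDF + Naive Bayes document classification.
# The toy corpus and labels are illustrative, not the paper's data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = [
    "invoice payment due amount total",       # finance document
    "payment invoice billing account",        # finance document
    "patient diagnosis treatment clinic",     # medical document
    "clinic doctor patient medical record",   # medical document
]
labels = ["finance", "finance", "medical", "medical"]

# stop_words="english" drops words like "for" and "the" before weighting.
clf = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
clf.fit(docs, labels)
pred = clf.predict(["billing amount for the account"])
```

Swapping `MultinomialNB` for `LogisticRegression`, `SVC`, `RandomForestClassifier`, or `MLPClassifier` in the same pipeline is how the traditional techniques in the paper can be compared on identical features.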

59 citations


Cited by
01 Jan 2010
TL;DR: A passive stereo system for capturing the 3D geometry of a face in a single shot under standard light sources is described; standard stereo refinement methods are modified to capture pore-scale geometry, using a qualitative approach that produces visually realistic results.
Abstract: This paper describes a passive stereo system for capturing the 3D geometry of a face in a single shot under standard light sources. The system is low-cost and easy to deploy. Results are submillimeter accurate and commensurate with those from state-of-the-art systems based on active lighting, and the models meet the quality requirements of a demanding domain like the movie industry. Recovered models are shown for captures from both high-end cameras in a studio setting and from a consumer binocular stereo camera, demonstrating scalability across a spectrum of camera deployments, and showing the potential for 3D face modeling to move beyond the professional arena and into the emerging consumer market in stereoscopic photography. Our primary technical contribution is a modification of standard stereo refinement methods to capture pore-scale geometry, using a qualitative approach that produces visually realistic results. The second technical contribution is a calibration method suited to face capture systems. The systemic contribution includes multiple demonstrations of system robustness and quality: capture in a studio setup, capture with a consumer binocular stereo camera, scanning of faces of varying gender, ethnicity, and age, capture of highly transient facial expressions, and scanning a physical mask to provide ground-truth validation.

254 citations

Proceedings ArticleDOI
01 Jul 2017
TL;DR: An end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images, which treats document semantic structure extraction as a pixel-wise segmentation task and proposes a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of the underlying text.
Abstract: We present an end-to-end, multimodal, fully convolutional network for extracting semantic structures from document images. We consider document semantic structure extraction as a pixel-wise segmentation task, and propose a unified model that classifies pixels based not only on their visual appearance, as in the traditional page segmentation task, but also on the content of underlying text. Moreover, we propose an efficient synthetic document generation process that we use to generate pretraining data for our network. Once the network is trained on a large set of synthetic documents, we fine-tune the network on unlabeled real documents using a semi-supervised approach. We systematically study the optimum network architecture and show that both our multimodal approach and the synthetic data pretraining significantly boost the performance.

199 citations

Journal ArticleDOI
TL;DR: A key outcome from this review is the realization of a need to develop standardized methodologies for the performance evaluation of gaze tracking systems and achieve consistency in their specification and comparative evaluation.
Abstract: In this paper, a review is presented for the research on eye gaze estimation techniques and applications, which has progressed in diverse ways over the past two decades. Several generic eye gaze use-cases are identified: desktop, TV, head-mounted, automotive, and handheld devices. Analysis of the literature leads to the identification of several platform specific factors that influence gaze tracking accuracy. A key outcome from this review is the realization of a need to develop standardized methodologies for the performance evaluation of gaze tracking systems and achieve consistency in their specification and comparative evaluation. To address this need, the concept of a methodological framework for practical evaluation of different gaze tracking systems is proposed.

193 citations

Journal ArticleDOI
TL;DR: A selective review of the psychophysiological measures that can be utilized to assess cognitive states in real-world driving environments to advance the development of effective human-machine driving interfaces and driver support systems is provided.
Abstract: As driving functions become increasingly automated, motorists run the risk of becoming cognitively removed from the driving process. Psychophysiological measures may provide added value not captured through behavioral or self-report measures alone. This paper provides a selective review of the psychophysiological measures that can be utilized to assess cognitive states in real-world driving environments. First, the importance of psychophysiological measures within the context of traffic safety is discussed. Next, the most commonly used physiology-based indices of cognitive states are considered as potential candidates relevant for driving research. These include: electroencephalography and event-related potentials, optical imaging, heart rate and heart rate variability, blood pressure, skin conductance, electromyography, thermal imaging, and pupillometry. For each of these measures, an overview is provided, followed by a discussion of the methods for measuring it in a driving context. Drawing from recent empirical driving and psychophysiology research, the relative strengths and limitations of each measure are discussed to highlight each measure's unique value. Challenges and recommendations for valid and reliable quantification from the lab to (less predictable) real-world driving settings are considered. Finally, we discuss measures that may be better candidates for near real-time assessment of motorists' cognitive states in applied settings outside the lab. This review synthesizes the literature on in-vehicle psychophysiological measures to advance the development of effective human-machine driving interfaces and driver support systems.

174 citations