scispace - formally typeset
Author

Imran Malik

Bio: Imran Malik is an academic researcher from the University of the Sciences. The author has contributed to research in the topics Deep learning & Table (database), has an h-index of 1, and has co-authored 2 publications receiving 113 citations.

Papers
Proceedings ArticleDOI
01 Nov 2017
TL;DR: The proposed method works with high precision on document images with varying layouts, including documents, research papers, and magazines, and beats Tesseract's state-of-the-art table detection system by a significant margin.
Abstract: Table detection is a crucial step in many document analysis applications, as tables present essential information to the reader in a structured manner. It is a hard problem due to the varying layouts and encodings of tables. Researchers have proposed numerous techniques for table detection based on layout analysis of documents. Most of these techniques fail to generalize because they rely on hand-engineered features that are not robust to layout variations. In this paper, we present a deep learning-based method for table detection. In the proposed method, document images are first pre-processed. These images are then fed to a Region Proposal Network followed by a fully connected neural network for table detection. The proposed method works with high precision on document images with varying layouts, including documents, research papers, and magazines. We have evaluated it on the publicly available UNLV dataset, where it beats Tesseract's state-of-the-art table detection system by a significant margin.
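A common way to score a table detector like the one described above (a sketch of the general evaluation idea, not necessarily the paper's exact protocol) is to count a predicted box as correct when its intersection-over-union (IoU) with some ground-truth table exceeds a threshold, then report precision over all detections:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision(detections, ground_truth, threshold=0.5):
    """Fraction of detected boxes that match some ground-truth table."""
    if not detections:
        return 0.0
    hits = sum(
        any(iou(d, g) >= threshold for g in ground_truth) for d in detections
    )
    return hits / len(detections)

# One correct detection and one false positive -> precision 0.5
dets = [(10, 10, 110, 60), (300, 300, 350, 330)]
gts = [(12, 8, 108, 62)]
print(precision(dets, gts))  # 0.5
```

The 0.5 IoU threshold is a conventional default; stricter thresholds penalize loose table boundaries more heavily.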

159 citations

Book ChapterDOI
16 Jul 2020
TL;DR: Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) are used to extract the percentage of attentiveness and non-attentiveness of students based on the student emotions in the classroom using deep learning techniques along with computer vision.
Abstract: This research aims to investigate workers in an industrial environment and can be used as an alternative for monitoring operator attention in real time. Detecting the attentiveness and non-attentiveness of people working in an industry could help to identify the weaknesses and strengths of any industrial organization. The human factor is the main and most critical part of any industrial organization. As a special case, we have established how to detect student attention in the classroom using deep learning techniques along with computer vision. A Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) are used to extract the percentage of attentiveness and non-attentiveness of students based on student emotions in the classroom. We used the FER-2013 dataset for this paper. As per the study, humans have a finite number of emotions, so it is straightforward to place some emotions in an attentive (Happy, Anger, Surprised, and Neutral) domain and some in a non-attentive (Sad, Fear, and Disgust) domain. This helps the teacher easily evaluate class attentiveness. It is also an evaluation of the teacher's teaching methodology: if the students are engaged in the lecture, the methodology is working, and if most of the students are not engaged, the teacher needs to revise the methodology in order to engage the class during the lecture.
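The emotion-to-attentiveness grouping described above reduces to a simple mapping once per-student emotion labels are available (here as plain strings; in the paper they would come from the CNN/LSTM classifier). A minimal sketch, using the attentive/non-attentive split stated in the abstract:

```python
# Grouping follows the abstract's split of FER-2013 emotion labels.
ATTENTIVE = {"happy", "anger", "surprised", "neutral"}
NON_ATTENTIVE = {"sad", "fear", "disgust"}

def attentiveness_percentage(emotions):
    """Percentage of emotion labels that fall in the attentive set."""
    if not emotions:
        return 0.0
    attentive = sum(1 for e in emotions if e.lower() in ATTENTIVE)
    return 100.0 * attentive / len(emotions)

classroom = ["happy", "neutral", "sad", "surprised", "fear", "anger"]
print(round(attentiveness_percentage(classroom), 1))  # 66.7
```

With four of six labels in the attentive set, the class would be reported as roughly 66.7% attentive for that frame; averaging over a lecture gives the per-session figure a teacher could act on.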

1 citation


Cited by
Posted Content
TL;DR: The authors developed the PubLayNet dataset for document layout analysis by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central, where typical document layout elements are annotated.
Abstract: Recognizing the layout of unstructured digital documents is an important step when parsing the documents into a structured machine-readable format for downstream applications. Deep neural networks developed for computer vision have proven to be an effective method for analyzing the layout of document images. However, document layout datasets that are currently publicly available are several orders of magnitude smaller than established computer vision datasets. Models have to be trained by transfer learning from a base model that is pre-trained on a traditional computer vision dataset. In this paper, we develop the PubLayNet dataset for document layout analysis by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central. The size of the dataset is comparable to established computer vision datasets, containing over 360 thousand document images, where typical document layout elements are annotated. The experiments demonstrate that deep neural networks trained on PubLayNet accurately recognize the layout of scientific articles. The pre-trained models are also a more effective base model for transfer learning on a different document domain. We release the dataset (this https URL) to support development and evaluation of more advanced models for document layout analysis.
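The automatic-matching idea behind PubLayNet (aligning each XML element's text with the text extracted from the PDF page) can be illustrated with a toy sketch. This is not the authors' pipeline; it only shows the core step of locating an element's text within the page tokens so the covering region can inherit the element's layout label:

```python
from difflib import SequenceMatcher

def match_element(xml_text, page_tokens, min_ratio=0.8):
    """Return the (start, end) token span best matching xml_text, or None."""
    words = xml_text.split()
    n = len(words)
    best_span, best_ratio = None, 0.0
    for i in range(len(page_tokens) - n + 1):
        window = " ".join(page_tokens[i:i + n])
        ratio = SequenceMatcher(None, xml_text, window).ratio()
        if ratio > best_ratio:
            best_span, best_ratio = (i, i + n), ratio
    return best_span if best_ratio >= min_ratio else None

tokens = "Introduction Deep learning has advanced document analysis".split()
print(match_element("Deep learning has advanced", tokens))  # (1, 5)
```

In a real pipeline the matched token span would be mapped to bounding boxes from the PDF extraction, producing an annotation without any manual labeling.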

177 citations

Proceedings ArticleDOI
16 Aug 2019
TL;DR: The PubLayNet dataset for document layout analysis is developed by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central, and it is demonstrated that deep neural networks trained on PubLayNet accurately recognize the layout of scientific articles.
Abstract: Recognizing the layout of unstructured digital documents is an important step when parsing the documents into a structured machine-readable format for downstream applications. Deep neural networks developed for computer vision have proven to be an effective method for analyzing the layout of document images. However, document layout datasets that are currently publicly available are several orders of magnitude smaller than established computer vision datasets. Models have to be trained by transfer learning from a base model that is pre-trained on a traditional computer vision dataset. In this paper, we develop the PubLayNet dataset for document layout analysis by automatically matching the XML representations and the content of over 1 million PDF articles that are publicly available on PubMed Central. The size of the dataset is comparable to established computer vision datasets, containing over 360 thousand document images, where typical document layout elements are annotated. The experiments demonstrate that deep neural networks trained on PubLayNet accurately recognize the layout of scientific articles. The pre-trained models are also a more effective base model for transfer learning on a different document domain. We release the dataset (https://github.com/ibm-aur-nlp/PubLayNet) to support development and evaluation of more advanced models for document layout analysis.

160 citations

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper proposed an architecture based on graph networks as a better alternative to standard neural networks for table recognition, which combines the benefits of convolutional neural network for visual feature extraction and graph networks for dealing with the problem structure.
Abstract: Document structure analysis, such as zone segmentation and table recognition, is a complex problem in document processing and is an active area of research. The recent success of deep learning in solving various computer vision and machine learning problems has not been reflected in document structure analysis since conventional neural networks are not well suited to the input structure of the problem. In this paper, we propose an architecture based on graph networks as a better alternative to standard neural networks for table recognition. We argue that graph networks are a more natural choice for these problems, and explore two gradient-based graph neural networks. Our proposed architecture combines the benefits of convolutional neural networks for visual feature extraction and graph networks for dealing with the problem structure. We empirically demonstrate that our method outperforms the baseline by a significant margin. In addition, we identify the lack of large scale datasets as a major hindrance for deep learning research for structure analysis and present a new large scale synthetic dataset for the problem of table recognition. Finally, we open-source our implementation of dataset generation and the training framework of our graph networks to promote reproducible research in this direction.
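The hybrid design described above pairs CNN-extracted visual features per vertex (e.g. per word or cell) with message passing over a graph whose edges connect spatial neighbours. A pure-Python toy of one mean-aggregation message-passing step (illustrative only; the paper's gradient-based graph networks are learned, not fixed averaging):

```python
def message_pass(features, edges):
    """One mean-aggregation step: each vertex averages itself and neighbours."""
    n = len(features)
    neighbours = {i: [i] for i in range(n)}  # self-loop keeps own feature
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    dim = len(features[0])
    out = []
    for i in range(n):
        acc = [0.0] * dim
        for j in neighbours[i]:
            for d in range(dim):
                acc[d] += features[j][d]
        out.append([x / len(neighbours[i]) for x in acc])
    return out

# Three vertices in a chain 0-1-2 with 2-d visual features.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
print(message_pass(feats, edges))
```

After one step, each vertex's feature mixes in its neighbours', which is what lets structural cues (e.g. row/column alignment encoded in the edges) influence per-cell predictions.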

111 citations

Book ChapterDOI
09 Sep 2019
TL;DR: A saliency-based fully-convolutional neural network performing multi-scale reasoning on visual cues followed by a fully-connected conditional random field (CRF) for localizing tables and charts in digital/digitized documents is proposed.
Abstract: Within the realm of information extraction from documents, detection of tables and charts is particularly needed as they contain a visual summary of the most valuable information contained in a document. For a complete automation of the visual information extraction process from tables and charts, it is necessary to develop techniques that localize them and identify precisely their boundaries. In this paper we aim at solving the table/chart detection task through an approach that combines deep convolutional neural networks, graphical models and saliency concepts. In particular, we propose a saliency-based fully-convolutional neural network performing multi-scale reasoning on visual cues followed by a fully-connected conditional random field (CRF) for localizing tables and charts in digital/digitized documents. Performance analysis, carried out on an extended version of the ICDAR 2013 (with annotated charts as well as tables) dataset, shows that our approach yields promising results, outperforming existing models.
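The localization step in a saliency-based detector like the one above can be pictured as thresholding the saliency map and turning each connected component into a table/chart region proposal. A toy sketch of that box-extraction idea (the CRF refinement stage is omitted, and real saliency maps come from the network, not hand-written grids):

```python
def saliency_to_boxes(saliency, threshold=0.5):
    """Connected components of saliency > threshold -> (x1, y1, x2, y2) boxes."""
    h, w = len(saliency), len(saliency[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if saliency[y][x] > threshold and not seen[y][x]:
                # Flood-fill one component, tracking its bounding box.
                stack, x1, y1, x2, y2 = [(y, x)], x, y, x, y
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    x1, y1 = min(x1, cx), min(y1, cy)
                    x2, y2 = max(x2, cx), max(y2, cy)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] \
                                and saliency[ny][nx] > threshold:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                boxes.append((x1, y1, x2, y2))
    return boxes

sal = [
    [0.0, 0.9, 0.9, 0.0],
    [0.0, 0.9, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.8],
]
print(saliency_to_boxes(sal))  # [(1, 0, 2, 1), (3, 2, 3, 2)]
```

In the paper's pipeline the fully connected CRF sharpens the saliency map before this kind of component extraction, which is what yields the precise table/chart boundaries the abstract emphasizes.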

100 citations