Showing papers by "Michael Blumenstein" published in 2022


Journal ArticleDOI
TL;DR: The authors exploit the property of the DCT for finding significant information in images by selecting multiple channels, and propose a method that studies texture distribution based on statistical measurement to extract features.
Abstract: Text detection from natural scene images is an active research area in computer vision, signal, and image processing because of several real-time applications such as automatic driving and tracking person behaviors during sports or marathon events. In these situations, there is a high probability of missing text information due to the occlusion of different objects/persons while capturing images. Unlike most existing methods, which focus only on text detection and ignore the effect of missing texts, this work detects and predicts missing texts so that the performance of OCR improves. The proposed method exploits the property of the DCT for finding significant information in images by selecting multiple channels. For the chosen DCT channels, the proposed method studies texture distribution based on statistical measurement to extract features. We propose to adopt a Bayesian classifier for categorizing text pixels using the extracted features. Then a deep learning model is proposed for eliminating false positives to improve text detection performance. Further, the proposed method employs a Natural Language Processing (NLP) model for predicting missing text information by using detected and recognized texts. Experimental results on our dataset, which contains texts occluded by objects, show that the proposed method is effective in predicting missing text information. To demonstrate the effectiveness and objectivity of the proposed method, we also tested it on the standard datasets of natural scene images, namely, ICDAR 2017-MLT, Total-Text, and CTW1500.
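
As a rough illustration of the pipeline sketched in this abstract, the following Python snippet selects DCT channels, derives simple statistical texture features, and feeds them to a Bayesian classifier. It is a minimal sketch under stated assumptions (square coefficient blocks, 3x3 local statistics, Gaussian naive Bayes), not the authors' implementation.

    import numpy as np
    from scipy.fft import dctn, idctn
    from scipy.ndimage import uniform_filter
    from sklearn.naive_bayes import GaussianNB

    def dct_channel(gray, band):
        """Keep one square block of DCT coefficients (a simplification of
        true band selection) and map it back to the spatial domain."""
        coeffs = dctn(gray, norm='ortho')
        lo, hi = band
        mask = np.zeros_like(coeffs)
        mask[lo:hi, lo:hi] = 1.0
        return idctn(coeffs * mask, norm='ortho')

    def texture_features(gray, bands=((0, 16), (16, 64), (64, 128))):
        """Per-pixel local mean/variance over several DCT channels."""
        feats = []
        for band in bands:
            ch = dct_channel(gray, band)
            mu = uniform_filter(ch, size=3)                           # local mean
            feats.append(mu)
            feats.append(uniform_filter(ch ** 2, size=3) - mu ** 2)  # local variance
        return np.dstack(feats).reshape(-1, 2 * len(bands))

    # With annotated text masks (assumed available), a Gaussian naive Bayes
    # classifier labels each pixel as text / non-text:
    # clf = GaussianNB().fit(texture_features(train_gray), train_mask.ravel())
    # text_prob = clf.predict_proba(texture_features(test_gray))[:, 1]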

6 citations



Journal ArticleDOI
TL;DR: In this paper, the authors give a detailed introduction to instance segmentation technology based on deep learning, reinforcement learning, and transformers, and discuss its development along with the most common datasets used.
Abstract: In recent years, instance segmentation has become a key research area in computer vision. This technology has been applied in varied applications such as robotics, healthcare, and intelligent driving. Instance segmentation not only detects the location of an object but also marks the edges of each single instance, thereby solving object detection and semantic segmentation concurrently. Our survey gives a detailed introduction to instance segmentation technology based on deep learning, reinforcement learning, and transformers. Further, we discuss its development along with the most common datasets used. We also focus on the different challenges and the future development scope for instance segmentation. This survey will provide a strong reference for future researchers in this field.

4 citations


Proceedings ArticleDOI
24 Jul 2022
TL;DR: This paper first presents the shortcomings of current pose transfer algorithms and then proposes a novel text-based pose transfer technique to address those issues; the technique generates promising results with significant qualitative and quantitative scores in the authors' experiments.
Abstract: In computer vision, human pose synthesis and transfer deal with probabilistic image generation of a person in a previously unseen pose from an already available observation of that person. Though researchers have recently proposed several methods to achieve this task, most of these techniques derive the target pose directly from the desired target image on a specific dataset, making the underlying process challenging to apply in real-world scenarios, as the generation of the target image is the actual aim. In this paper, we first present the shortcomings of current pose transfer algorithms and then propose a novel text-based pose transfer technique to address those issues. We divide the problem into three independent stages: (a) text to pose representation, (b) pose refinement, and (c) pose rendering. To the best of our knowledge, this is one of the first attempts to develop a text-based pose transfer framework, for which we also introduce a new dataset, DF-PASS, created by adding descriptive pose annotations for the images of the DeepFashion dataset. The proposed method generates promising results with significant qualitative and quantitative scores in our experiments.
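
A hypothetical skeleton of the three-stage pipeline named in the abstract (text to pose, pose refinement, pose rendering) might look as follows in PyTorch; every module body here is a placeholder assumption, not the authors' architecture.

    import torch
    import torch.nn as nn

    class TextToPose(nn.Module):          # stage (a): text -> keypoint estimates
        def __init__(self, text_dim=768, n_joints=18):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(text_dim, 256), nn.ReLU(),
                                     nn.Linear(256, n_joints * 2))
            self.n_joints = n_joints
        def forward(self, text_emb):
            return self.mlp(text_emb).view(-1, self.n_joints, 2)

    class PoseRefiner(nn.Module):         # stage (b): shallow linear refinement
        def __init__(self, n_joints=18):
            super().__init__()
            self.lin = nn.Linear(n_joints * 2, n_joints * 2)
        def forward(self, pose):
            flat = pose.flatten(1)
            return (flat + self.lin(flat)).view_as(pose)  # residual correction

    class PoseRenderer(nn.Module):        # stage (c): render person in target pose
        def forward(self, src_image, pose):
            raise NotImplementedError("a conditional generator would go here")

    # text_emb: (B, 768) sentence embedding of the pose description (assumed)
    # pose = PoseRefiner()(TextToPose()(text_emb))
    # image = PoseRenderer()(source_image, pose)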

4 citations


Journal ArticleDOI
TL;DR: A new model based on conformable moments and deep ensemble neural networks for forged handwriting detection in noisy and blurry environments is presented, and experimental results demonstrate that the proposed method outperforms the existing methods in terms of classification rate.
Abstract: Detecting forged handwriting is important in a wide variety of machine learning applications, and it is challenging when the input images are degraded by noise and blur. This article presents a new model based on conformable moments (CMs) and deep ensemble neural networks (DENNs) for forged handwriting detection in noisy and blurry environments. Since CMs involve fractional calculus, with the ability to model nonlinearities and geometric moments while preserving spatial relationships between pixels, fine details in images are preserved. This motivates us to introduce a DENN classifier, which integrates stenographic kernels and spatial features to classify input images as normal (original, clean images), altered (handwriting changed through copy-paste and insertion operations), noisy (noise added to the original image), blurred (blur added to the original image), altered-noisy (noise added to the altered image), and altered-blurred (blur added to the altered image). To evaluate our model, we use a newly introduced dataset, which comprises handwritten words altered at the character level, as well as several standard datasets, namely ACPR 2019, ICPR 2018-FDC, and the IMEI dataset. The first two of these datasets include handwriting samples that are altered at the character and word levels, and the third dataset comprises forged International Mobile Equipment Identity (IMEI) numbers. Experimental results demonstrate that the proposed method outperforms existing methods in terms of classification rate.
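
A minimal sketch of the deep-ensemble idea over the six classes listed in the abstract; the conformable-moment feature extraction is not reproduced, and the head design and ensemble size are assumptions.

    import torch
    import torch.nn as nn

    CLASSES = ['normal', 'altered', 'noisy', 'blurred',
               'altered-noisy', 'altered-blurred']

    class SmallHead(nn.Module):
        """One ensemble member: a small classifier over precomputed features."""
        def __init__(self, feat_dim, n_classes=len(CLASSES)):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                     nn.Linear(128, n_classes))
        def forward(self, x):
            return self.net(x)

    def ensemble_predict(heads, features):
        """Average softmax probabilities across ensemble members."""
        probs = torch.stack([head(features).softmax(dim=-1) for head in heads])
        return probs.mean(dim=0)  # (B, 6)

    # heads = [SmallHead(feat_dim=256) for _ in range(5)]  # 5 members, assumed
    # pred = ensemble_predict(heads, moment_features).argmax(dim=-1)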

2 citations


Proceedings ArticleDOI
06 Jun 2022
TL;DR: This work proposes a novel pipeline to generate and insert contextually relevant person images into an existing scene while preserving the global semantics, achieving high-resolution photo-realistic generation results that preserve the general context of the scene.
Abstract: Person image generation is an intriguing yet challenging problem. However, this task becomes even more difficult under constrained situations. In this work, we propose a novel pipeline to generate and insert contextually relevant person images into an existing scene while preserving the global semantics. More specifically, we aim to insert a person such that the location, pose, and scale of the person being inserted blend in with the existing persons in the scene. Our method uses three individual networks in a sequential pipeline. First, we predict the potential location and the skeletal structure of the new person by conditioning a Wasserstein Generative Adversarial Network (WGAN) on the existing human skeletons present in the scene. Next, the predicted skeleton is refined through a shallow linear network to achieve higher structural accuracy in the generated image. Finally, the target image is generated from the refined skeleton using another generative network conditioned on a given image of the target person. In our experiments, we achieve high-resolution photo-realistic generation results while preserving the general context of the scene. We conclude our paper with multiple qualitative and quantitative benchmarks on the results.
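
The Wasserstein GAN objective used to condition skeleton generation on the scene context could look like the standard WGAN losses below; the generator, critic, and context encoding are placeholders, not the authors' code.

    import torch

    def critic_loss(critic, real_skel, fake_skel, context):
        # Wasserstein critic: score real skeletons high, generated ones low.
        return critic(fake_skel, context).mean() - critic(real_skel, context).mean()

    def generator_loss(critic, fake_skel, context):
        # The generator tries to raise the critic's score of its skeletons.
        return -critic(fake_skel, context).mean()

    # Training loop outline (weight clipping as in the original WGAN formulation):
    # for real_skel, context in loader:
    #     fake_skel = generator(torch.randn(real_skel.size(0), z_dim), context)
    #     c_loss = critic_loss(critic, real_skel, fake_skel.detach(), context)
    #     ... optimize the critic, then clamp its weights to [-0.01, 0.01] ...
    #     g_loss = generator_loss(critic, fake_skel, context)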

2 citations


Journal ArticleDOI
TL;DR: This work proposes a novel deep network that takes two inputs (the grayscale image and the respective encoded text description) to predict the relevant color gamut, and finds that it outperforms the state-of-the-art colorization algorithms both qualitatively and quantitatively.
Abstract: Image colorization is a well-known problem in computer vision. However, due to the ill-posed nature of the task, image colorization is inherently challenging. Though several attempts have been made by researchers to make the colorization pipeline automatic, these processes often produce unrealistic results due to a lack of conditioning. In this work, we attempt to integrate textual descriptions as an auxiliary condition, along with the grayscale image that is to be colorized, to improve the fidelity of the colorization process. To the best of our knowledge, this is one of the first attempts to incorporate textual conditioning in the colorization pipeline. To do so, we propose a novel deep network that takes two inputs (the grayscale image and the respective encoded text description) and tries to predict the relevant color gamut. As the respective textual descriptions contain color information about the objects present in the scene, the text encoding helps to improve the overall quality of the predicted colors. We have evaluated our proposed model using different metrics and found that it outperforms the state-of-the-art colorization algorithms both qualitatively and quantitatively.
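
A minimal sketch of a two-input colorization network in the spirit of the abstract, assuming Lab color space (the network sees the L channel plus a text embedding and predicts the ab channels); layer choices and dimensions are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TextConditionedColorizer(nn.Module):
        def __init__(self, text_dim=768):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
            self.text_proj = nn.Linear(text_dim, 128)
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(256, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 2, 4, stride=2, padding=1), nn.Tanh())

        def forward(self, gray, text_emb):
            f = self.encoder(gray)                       # (B, 128, H/4, W/4)
            t = self.text_proj(text_emb)                 # (B, 128)
            t = t[:, :, None, None].expand(-1, -1, f.size(2), f.size(3))
            return self.decoder(torch.cat([f, t], 1))    # predicted ab channels

    # ab = TextConditionedColorizer()(L_channel, caption_embedding)
    # rgb = lab_to_rgb(torch.cat([L_channel, ab], dim=1))  # hypothetical helper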

2 citations


Journal ArticleDOI
TL;DR: The authors use focused and defocused information in the input images to extract contextual information, fusing the two to estimate cross-covariance and define a linear relationship between them.
Abstract: Classification of photos captured by different photographers is an important and challenging problem in knowledge-based systems and image processing. Monitoring and authenticating images uploaded on social media are essential, and verifying the source is a key piece of evidence. We present a novel framework for classifying photos by different photographers based on the combination of local features and deep learning models. The proposed work uses focused and defocused information in the input images to extract contextual information. The model estimates the weighted gradient and calculates entropy to strengthen the context features. The focused and defocused information is fused to estimate the cross-covariance and define a linear relationship between the two. This relationship results in a feature matrix that is fed to a Knowledge Enforcement Network (KEN) to obtain representative features. Owing to the strong discriminative ability of deep learning models, we employ the lightweight and accurate MobileNetV2. The outputs of KEN and MobileNetV2 are sent to a classifier for photographer classification. Experimental results of the proposed model on our dataset of 46 photographer classes (46,234 images) and publicly available datasets of 41 photographer classes (218,303 images) show that the method outperforms existing techniques by 5%–10% on average. The dataset created for the experiments will be made available upon publication.
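
The fusion step described here, estimating the cross-covariance between focused and defocused features, can be written directly; the snippet below is a sketch, with feature extraction and the KEN module assumed to exist elsewhere.

    import numpy as np

    def cross_covariance(focused, defocused):
        """focused, defocused: (n_samples, d) feature matrices for one image."""
        f = focused - focused.mean(axis=0, keepdims=True)
        g = defocused - defocused.mean(axis=0, keepdims=True)
        return f.T @ g / (f.shape[0] - 1)   # (d, d) cross-covariance matrix

    # The (d, d) matrix would then be flattened and fed to the KEN module,
    # whose output is combined with MobileNetV2 features for classification.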


Journal ArticleDOI
TL;DR: An Omni-Scale block (OS-block) is proposed for 1D-CNNs, in which the kernel sizes are decided by a simple and universal rule: a set of kernel sizes consisting of multiple prime numbers, chosen according to the length of the time series, that can cover the best RF size across different datasets.
Abstract: The Receptive Field (RF) size has been one of the most important factors for One-Dimensional Convolutional Neural Networks (1D-CNNs) on time series classification tasks. Large efforts have been made to choose the appropriate size because it has a huge influence on performance and differs significantly for each dataset. In this paper, we propose an Omni-Scale block (OS-block) for 1D-CNNs, where the kernel sizes are decided by a simple and universal rule. Specifically, it is a set of kernel sizes, consisting of multiple prime numbers chosen according to the length of the time series, that can efficiently cover the best RF size across different datasets. Experimental results show that models with the OS-block achieve performance similar to that of models with the searched optimal RF size, and, owing to this strong ability to capture the optimal RF size, simple 1D-CNN models with the OS-block achieve state-of-the-art performance on four time series benchmarks, including both univariate and multivariate data from multiple domains. Comprehensive analysis and discussion shed light on why the OS-block can capture optimal RF sizes across different datasets. Code is publicly available.
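
An OS-block-style layer can be sketched as parallel 1D convolutions with prime kernel sizes. The selection rule below (kernel size 1 plus all odd primes up to a cap tied to the series length) is a simplification of the paper's rule, which also uses size 2; odd sizes are kept here so branch outputs align without extra padding logic.

    import torch
    import torch.nn as nn

    def primes_up_to(n):
        return [p for p in range(2, n + 1)
                if all(p % q for q in range(2, int(p ** 0.5) + 1))]

    class OSBlock(nn.Module):
        def __init__(self, in_ch, out_ch, series_len):
            super().__init__()
            # Cap on kernel sizes relative to series length: an assumption.
            sizes = [1] + [p for p in primes_up_to(max(3, series_len // 4)) if p > 2]
            self.branches = nn.ModuleList(
                nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in sizes)
        def forward(self, x):                      # x: (B, C, L)
            # Odd kernels with padding k//2 keep length L, so outputs concatenate.
            return torch.cat([b(x) for b in self.branches], dim=1)

    # block = OSBlock(in_ch=1, out_ch=8, series_len=128)
    # y = block(torch.randn(4, 1, 128))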



Journal ArticleDOI
TL;DR: In this paper, a novel method for document age classification at the text line level is presented, which extracts structural, contrast, and spatial features to study degradations at different wavelet decomposition levels.
Abstract: Document age estimation using handwritten text line images is useful for several pattern recognition and artificial intelligence applications, such as forged signature verification, writer identification, gender identification, personality trait identification, and fraudulent document identification. This paper presents a novel method for document age classification at the text line level. For segmenting text lines from handwritten document images, wavelet decomposition is used in a novel way. We explore multiple levels of wavelet decomposition, which introduce increasing blur as the number of levels increases, for detecting word components. The detected components are then used in a direction-guided growing approach with linearity and nonlinearity criteria for segmenting text lines. For the classification of text line images of different ages, inspired by the observation that as the age of a document increases, the quality of its image degrades, the proposed method extracts structural, contrast, and spatial features to study degradations at different wavelet decomposition levels. The specific advantages of DenseNet, namely strong feature propagation, mitigation of the vanishing gradient problem, feature reuse, and a reduced number of parameters, motivated us to use DenseNet121 along with a Multi-Layer Perceptron (MLP) for the classification of text lines of different ages, feeding the features and the original image as input. To demonstrate the efficacy of the proposed model, experiments were conducted on our own as well as standard datasets for both text line segmentation and document age classification. The results show that the proposed method outperforms existing methods for text line segmentation in terms of precision, recall, and F-measure, and for document age classification in terms of average classification rate.
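
The multi-level wavelet effect the abstract exploits, with deeper decomposition levels yielding blurrier approximations that merge characters into word-like blobs, can be illustrated with pywt; the level count and wavelet choice below are assumptions.

    import pywt

    def wavelet_approximations(gray, levels=3, wavelet='haar'):
        """Return the approximation (LL) image at each decomposition level;
        `gray` is a 2D array. Each level is blurrier and half the size."""
        approx, out = gray.astype(float), []
        for _ in range(levels):
            approx, _ = pywt.dwt2(approx, wavelet)  # keep the LL band only
            out.append(approx)
        return out

    # Candidate word components could then be found by thresholding a chosen
    # level and extracting connected components, e.g. with scipy.ndimage.label.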