Showing papers in "Pattern Recognition and Image Analysis in 2022"



DOI
TL;DR: A graph-theory-based segmentation method is proposed because it can flexibly represent any complex structure and is equivalent to the shortest-path problem in the medical imaging field.

6 citations






DOI
TL;DR: Four of the most widely used clustering algorithms are compared by segmenting the leaves of rosette plants, with an ideal theoretical segmentation as the reference, to see how the results differ and which algorithm tends to produce the most accurate segments.

3 citations


DOI
TL;DR: An improved cuckoo search variant based on rough set theory is presented, and the results show that the proposed rough cuckoo search outperforms the other tested nature-inspired optimization algorithms in terms of optimization ability and consistency.

3 citations


DOI
TL;DR: This article studies the problem of classifying images containing text inserts; joint classification by text and image outperforms the image classification algorithm and the text classification algorithm by 8%.

3 citations



Journal ArticleDOI
TL;DR: In this paper, a method for detecting objects in high-resolution images is proposed that is based on representing an image as a set of its copies of decreasing scale, splitting it into blocks with overlap at each level of the image pyramid except for the top one, detecting objects in the blocks, and analyzing objects at the boundaries of adjacent blocks to merge them.
Abstract: A method for detecting objects in high-resolution images is proposed that is based on representing an image as a set of its copies of decreasing scale, splitting it into blocks with overlap at each level of the image pyramid except for the top one, detecting objects in the blocks, and analyzing objects at the boundaries of adjacent blocks to merge them. The number of pyramid layers is determined by the size of the image and the input layer of the convolutional neural network (CNN). At all levels except for the top one, a block splitting is performed, and the use of overlap allows one to improve the correct classification of objects, which are divided into fragments and located in adjacent blocks. The decision to merge such fragments is made based on the analysis of the metric of intersection over union and membership in the same class. The proposed approach is evaluated for 4K and 8K images. To carry out experiments, a database is prepared with objects of two classes, person and vehicle, marked in such images. Networks of the You Only Look Once (YOLO) family of the third and fourth versions are used as CNNs. A quantitative assessment of the detection efficiency of objects is performed using the mAP metric for various combinations of parameters such as the degree of threshold confidence of the CNN and the percentage of intersection of blocks in the hierarchical representation of images. The results of the investigations are presented.
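The block splitting and fragment merging described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`split_into_blocks`, `iou`, `merge_fragments`), the stride rule, the greedy merge order, and the 0.3 IoU threshold are all assumptions made for the example. The CNN detector and the rule for choosing the number of pyramid levels are left out, because the abstract does not specify them.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in full-image coordinates
Detection = Tuple[Box, str]               # (box, class label)


def split_into_blocks(width: int, height: int, block: int, overlap: float) -> List[Tuple[int, int, int, int]]:
    """Cover a width x height image with square blocks that overlap by the given fraction.

    Assumption: the stride is block * (1 - overlap), and the last block in each
    row/column is shifted back so that it ends exactly at the image border.
    """
    stride = max(1, int(block * (1.0 - overlap)))

    def starts(size: int) -> List[int]:
        if size <= block:
            return [0]
        positions = list(range(0, size - block, stride))
        positions.append(size - block)
        return positions

    return [(x, y, min(x + block, width), min(y + block, height))
            for y in starts(height) for x in starts(width)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_fragments(detections: List[Detection], iou_threshold: float = 0.3) -> List[Detection]:
    """Greedily merge detections of the same class whose IoU reaches the threshold.

    Merged fragments are replaced by their bounding union. The threshold value
    is hypothetical; the abstract only says that IoU and class membership are used.
    """
    merged: List[Detection] = []
    for box, cls in detections:
        for i, (mbox, mcls) in enumerate(merged):
            if cls == mcls and iou(box, mbox) >= iou_threshold:
                merged[i] = ((min(box[0], mbox[0]), min(box[1], mbox[1]),
                              max(box[2], mbox[2]), max(box[3], mbox[3])), cls)
                break
        else:
            merged.append((box, cls))
    return merged
```

In use, a detector would be run on each block returned by `split_into_blocks`, its boxes shifted into full-image coordinates, and the combined list passed to `merge_fragments`.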

DOI
TL;DR: The proposed algorithm speeds up GPU computation on the NVIDIA RTX 2080 Super video card by a factor of 4, up to almost 30 frames per second; however, processing frames with the Intel Open Visual Inference and Neural network Optimization (OpenVINO) toolkit achieves similar performance on the CPU without optical flow and speed extrapolation.

DOI
TL;DR: Experimental results show that the proposed WGAN single image inpainting method can mine the global correlation of the image itself better than the compared methods in quantitative as well as qualitative assessments.





DOI
TL;DR: A quantitative assessment of the detection efficiency of objects is performed using the mAP metric for various combinations of parameters such as the degree of threshold confidence of the CNN and the percentage of intersection of blocks in the hierarchical representation of images.

Journal ArticleDOI
TL;DR: Wang et al. proposed a detection and tracking algorithm that uses the tracking-by-detection paradigm and convolutional neural networks (CNNs) to detect and track people.
Abstract: The automatic detection and tracking of appearance and behavior anomalies in video surveillance systems is one of the promising areas for the development and implementation of artificial intelligence. In this paper, we present a formalization of these problems. Based on the proposed generalization, a detection and tracking algorithm that uses the tracking-by-detection paradigm and convolutional neural networks (CNNs) is developed. At the first stage, people are detected using the YOLOv5 CNN and are marked with bounding boxes. Then, their faces in the selected regions are detected and the presence or absence of face masks is determined. Our approach to face-mask detection also uses YOLOv5 as a detector and classifier. For this problem, we generate a training dataset by combining the Kaggle dataset and a modified Wider Face dataset, in which face masks were superimposed on half of the images. To ensure a high accuracy of tracking and trajectory construction, the CNN features of the images are included in a composite descriptor, which also contains geometric and color features, to describe each person detected in the current frame and compare this person with all people detected in the next frame. The results of the experiments are presented, including some examples of frames from processed video sequences with visualized trajectories for loitering and falls.
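The frame-to-frame comparison with a composite descriptor described in this abstract can be illustrated with a small sketch. Everything specific here is an assumption made for the example: the `Person` structure, the weighted sum of Euclidean distances, the greedy one-to-one assignment, and the `max_distance` cutoff. The abstract states only that CNN, geometric, and color features are combined and that each person is compared with all people in the next frame.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Person:
    cnn: List[float]        # CNN feature vector of the detected region
    geometry: List[float]   # e.g. box center or size (hypothetical choice)
    color: List[float]      # e.g. color histogram bins (hypothetical choice)


def _euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def descriptor_distance(p: Person, q: Person,
                        weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of per-component distances (an assumed combination rule)."""
    return (weights[0] * _euclidean(p.cnn, q.cnn)
            + weights[1] * _euclidean(p.geometry, q.geometry)
            + weights[2] * _euclidean(p.color, q.color))


def match_frames(current: List[Person], nxt: List[Person],
                 max_distance: float = 1.0) -> Dict[int, int]:
    """Compare every person in the current frame with every person in the next one.

    Pairs are accepted greedily in order of increasing distance, one-to-one,
    and pairs farther apart than max_distance are left unmatched.
    """
    pairs = sorted((descriptor_distance(p, q), i, j)
                   for i, p in enumerate(current) for j, q in enumerate(nxt))
    matches: Dict[int, int] = {}
    used_next = set()
    for dist, i, j in pairs:
        if dist > max_distance:
            break
        if i in matches or j in used_next:
            continue
        matches[i] = j
        used_next.add(j)
    return matches
```

Chaining the index pairs returned by `match_frames` across consecutive frames gives one trajectory per person, which is the kind of input a loitering or fall analysis would need.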

DOI
TL;DR: This paper aims to explore possible research directions and define new fusion approaches based on ensembling, and shows favorable results, with an increase in accuracy relative to the number of operations needed in training and inference.




DOI
TL;DR: A new cancelable deep feature extraction method (C-PCANet) using chaotic maps is proposed that can effectively provide lightweight and cancelable deep biometric features that can be employed in a variety of high-security applications.


DOI
TL;DR: A detection and tracking algorithm that uses the tracking-by-detection paradigm and convolutional neural networks (CNNs) is developed, which uses YOLOv5 as a detector and classifier for face-mask detection.




DOI
TL;DR: Indoor scenes are classified using three deep learning models, namely ResNet, MobileNet, and EfficientNet, and the influence of activation functions on classification accuracy is explored.