
Showing papers on "3D single-object recognition published in 2023"


Journal ArticleDOI
TL;DR: In this paper, an object recognition model using Tiny YOLO and an FPGA is presented. Object recognition builds on image classification and object localization; object detection combines these two tasks, localizing and classifying each object in an image.
Abstract: The objective of this paper is to build an object recognition model using Tiny YOLO and an FPGA. Object recognition is a computer vision (CV) technique for identifying and detecting objects in images or videos, and it is a key output of deep learning (DL) and machine learning (ML). When humans look at an image or video, we can readily spot people, objects, scenes, and visual details; the aim is to teach a computer to do what comes innately to a human being. Object recognition builds on image classification and object localization, and object detection combines these two tasks to localize and classify objects in an image. Object recognition has become a pivotal technology for driverless cars, security, and traffic surveillance. It is also used in various applications such as disease detection in bio-imaging, industrial inspection, and robotic vision. In a nutshell, object recognition is a staple of the automation industry.
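The split the abstract describes — classification says *what*, localization says *where*, and detection does both — can be made concrete with a minimal sketch. The `Detection` record and the confidence threshold below are illustrative choices of ours, not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # classification result: what the object is
    box: tuple         # localization result: (x, y, w, h) in pixels
    confidence: float  # model's score for this detection

def filter_detections(detections, threshold=0.5):
    """Keep only detections the model is reasonably confident about."""
    return [d for d in detections if d.confidence >= threshold]

raw = [
    Detection("car", (10, 20, 100, 60), 0.92),
    Detection("person", (150, 40, 30, 80), 0.35),
]
kept = filter_detections(raw)
```

A detector such as Tiny YOLO emits many such records per frame; thresholding is the simplest post-processing step before they are displayed or acted on.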

Proceedings ArticleDOI
17 Mar 2023
TL;DR: In this article, the authors discuss current knowledge and suggest future directions for research in facial recognition, covering the technology used in face recognition as well as the methods and databases that build on it.
Abstract: The study of computer vision and pattern recognition is growing because of the various commercial and practical applications of these disciplines. Identification of individuals in a crowd, access control, forensics, and human-computer interaction are only some of the topics studied in these areas. However, analyzing unconstrained face recognition poses ethical issues and privacy concerns. Many recent proposals employ holistic methods, geometric approaches, and local-texture approaches, along with databases such as ORL, FERET, and the AR dataset, to study constrained face recognition. At least some understanding of the 2D setting was achieved, but in highly controlled environments where parameters such as camera angle, lighting, and distance were strictly regulated; recognition performance degraded significantly if the environment changed or the subject smiled or frowned. This critique discusses the technology utilized in face recognition, as well as the methods and databases that utilize this technology. To help guide future research, this article discusses current knowledge and suggests future directions for study in the field of facial recognition.

Proceedings ArticleDOI
05 Apr 2023
TL;DR: In this paper, a real-time object recognition system based on local image features from the Scale-Invariant Feature Transform (SIFT) was proposed; the features are invariant to rotation, image scaling, and translation, and partially invariant to changes in illumination and affine projection.
Abstract: In this paper, we propose a real-time object recognition system that makes use of local image features obtained with the Scale-Invariant Feature Transform (SIFT). The features are invariant to rotation, image scaling, and translation, and partially invariant to changes in illumination and affine projection. Similar characteristics are shared by the inferior temporal cortex neurons involved in object recognition. Features can be detected efficiently using a staged filtering technique in scale space. The image keys encode local geometric information, representing the blurred image gradient at multiple scales and orientation planes. An indexing technique based on nearest-neighbor search (NNS) uses the keys as input to determine object matches. Based on the number of matched descriptors, the object is categorized as recognized or not, and the final output is displayed on a microcontroller-based display unit.
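A common form of this nearest-neighbor matching step is Lowe's ratio test: a query descriptor is accepted only if its nearest gallery descriptor is clearly closer than the second nearest. A toy sketch (the 2-D descriptors are stand-ins for 128-D SIFT vectors, and the 0.8 ratio is a conventional choice, not necessarily the paper's):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptor(query, gallery, ratio=0.8):
    """Return the index of the best gallery match, or None if the
    nearest neighbour is not clearly closer than the second nearest."""
    order = sorted(range(len(gallery)), key=lambda i: euclidean(query, gallery[i]))
    best, second = order[0], order[1]
    if euclidean(query, gallery[best]) < ratio * euclidean(query, gallery[second]):
        return best
    return None

gallery = [(0.0, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(match_descriptor((0.1, 0.0), gallery))   # unambiguous nearest neighbour -> 0
print(match_descriptor((5.05, 5.0), gallery))  # two near-identical candidates -> None
```

Counting how many query descriptors pass this test gives the "number of matched descriptors" on which the recognized / not-recognized decision is based.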

Proceedings ArticleDOI
07 Apr 2023
TL;DR: In this article, an object detection system is proposed that can detect almost any object, given suitable training of the model; the methodology used for object detection is You Only Look Once (YOLO).
Abstract: Object detection has been studied by many researchers for important industrial applications such as detecting road objects for self-driving cars, detecting particular diseases in medical research, and gesture control. Object detection and recognition is especially important for security: because computers and models can work 24/7, they can monitor video surveillance in secure areas. Humans can quickly make out what items are present in photos and pictures, where they are located, and how they interact [1]. Object identification and tracking is a key challenge in computer vision systems, such as visual surveillance and human-computer interaction. Human visual systems are quick and precise, allowing people to handle complicated activities such as driving. With fast, error-free object identification algorithms, computers will be able to drive automobiles, though they will require specialized sensors and auxiliary devices to relay real-time scenarios [1]. Accurate object recognition and image classification approaches are critical for autonomous driving decisions in metropolitan settings, and many large companies are currently working toward this goal. In this report, an object detection system is proposed that can detect almost any object, given suitable training of the model. The methodology proposed for object detection is You Only Look Once (YOLO).
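YOLO's single-pass design — divide the image into a grid, let each cell predict a box relative to itself, keep the confident cells — can be illustrated with a toy decoder. This is a simplified sketch of the idea, not the actual YOLO tensor layout or anchor scheme:

```python
def decode_grid(preds, img_size, grid=3, conf_thresh=0.5):
    """Turn per-cell predictions into absolute boxes.

    preds[row][col] = (x_off, y_off, w, h, confidence), all in [0, 1]:
    offsets are relative to the cell, width/height relative to the image.
    Returns boxes as (x, y, w, h, confidence) in pixels.
    """
    cell = img_size / grid
    boxes = []
    for r in range(grid):
        for c in range(grid):
            x_off, y_off, w, h, conf = preds[r][c]
            if conf < conf_thresh:
                continue
            cx = (c + x_off) * cell  # box centre in pixels
            cy = (r + y_off) * cell
            boxes.append((cx - w * img_size / 2, cy - h * img_size / 2,
                          w * img_size, h * img_size, conf))
    return boxes

empty = (0.0, 0.0, 0.0, 0.0, 0.0)
preds = [[empty] * 3 for _ in range(3)]
preds[1][1] = (0.5, 0.5, 0.4, 0.4, 0.9)  # one confident box in the centre cell
boxes = decode_grid(preds, img_size=300)
```

Because the whole grid is produced by one forward pass, the image really is "looked at" only once, which is what makes the family fast enough for real-time use.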

Journal ArticleDOI
TL;DR: Principal Component Analysis (PCA) is used for face recognition in this paper; the eigenface approach and the Fisherface method are two common face recognition algorithms.
Abstract: Faces are one of the simplest means of determining a person's identity. Face recognition is a unique identification method that uses an individual's traits to determine that individual's identity. The proposed recognition process is divided into two stages: face recognition and object recognition. Unless the item is very close, this procedure is very rapid for humans; the human process is then reproduced and used as a model for facial image recognition, one of the most professionally developed and well-researched biometric procedures. The eigenface approach and the Fisherface method are two common face recognition algorithms. The eigenface approach reduces the dimensionality of the face space using Principal Component Analysis (PCA); the main goal of applying PCA to face recognition is to generate eigenfaces (the face space) by identifying the eigenvector corresponding to the face image's largest eigenvalue. Image processing and security systems are the areas of interest in this research: face recognition integrated into a security system. Keywords: face recognition, security systems, camera, Python
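The core eigenface step — finding the eigenvector with the largest eigenvalue of the covariance of mean-centered face vectors — can be sketched with power iteration on toy 4-pixel "faces". This is illustrative only (real eigenface systems work on full images and keep several eigenvectors, typically via a library eigensolver):

```python
def mean_center(faces):
    n, d = len(faces), len(faces[0])
    mean = [sum(f[i] for f in faces) / n for i in range(d)]
    return [[f[i] - mean[i] for i in range(d)] for f in faces], mean

def top_eigenvector(faces, iters=100):
    """Power iteration on the d x d covariance matrix: repeatedly apply
    C v = sum_k <x_k, v> x_k over the centered faces x_k, renormalising."""
    centered, _ = mean_center(faces)
    d = len(faces[0])
    v = [1.0] * d
    for _ in range(iters):
        w = [0.0] * d
        for x in centered:
            dot = sum(xi * vi for xi, vi in zip(x, v))
            for i in range(d):
                w[i] += dot * x[i]
        norm = sum(wi * wi for wi in w) ** 0.5
        v = [wi / norm for wi in w]
    return v, centered

# Toy "faces": 4-pixel images whose variation is concentrated in pixel 0,
# so the top eigenvector should point along that pixel.
faces = [[10.0, 1.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0],
         [9.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
v, centered = top_eigenvector(faces)
coeffs = [sum(xi * vi for xi, vi in zip(x, v)) for x in centered]
```

Each face's projection coefficient onto the eigenface (`coeffs`) is the compact representation that a recognizer would compare against stored gallery coefficients.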

Proceedings ArticleDOI
19 Apr 2023
TL;DR: In this article, the authors compare and contrast the most widely used methodologies for achieving real-time object detection, including Faster R-CNN, SSD, and YOLO.
Abstract: One of the major technological advances aiming to perceive the real world more accurately through digital images and videos is computer vision. Among the many innovative breakthroughs in this field is object detection, a branch of computer vision that finds instances of specific objects in visual content. The terms object recognition, object identification, and image detection are often used interchangeably with object detection. By identifying objects in images or videos, it enables information systems to "view" their surroundings. This paper compares and contrasts the most extensively used methodologies and approaches for achieving object detection in real time. Faster R-CNN, SSD, and YOLO are three algorithms that simplify the process for programmers to identify an instance in an image. The processes that enable the model's parameters to be trained, calculated, approximated, and quantified are compared with conventional methodologies. The convenience and limitations of each of these techniques are further discussed, analyzed, and tabulated.
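All three detector families being compared rely on intersection-over-union (IoU), both for evaluation and for merging duplicate predictions via non-maximum suppression. A minimal sketch (box format `(x1, y1, x2, y2)` and the 0.5 threshold are conventional choices, not specific to any one detector):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: visit boxes by descending score,
    keeping each one unless it overlaps an already-kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
keep = nms(boxes, scores)  # the second box overlaps the first and is dropped
```

The same IoU function, applied between predictions and ground-truth boxes, also underlies the mAP numbers on which Faster R-CNN, SSD, and YOLO are typically compared.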

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed CNN-TransNet, a novel end-to-end Transformer-based architecture with convolutional neural networks (CNNs) for RGB-D object recognition.
Abstract: Object recognition, one of the main goals of robot vision, is a vital prerequisite for service robots performing domestic tasks. Thanks to the rich sensory information provided by RGB-D sensors, RGB-D-based object recognition has received increasing attention. However, existing works focus on combining RGB and depth data for object recognition while ignoring the influence of depth image quality on recognition performance. Moreover, in real-world scenarios there are many objects with strong similarity from certain observation angles, which makes it challenging for a service robot to recognize objects accurately. In this paper, we propose CNN-TransNet, a novel end-to-end Transformer-based architecture with convolutional neural networks (CNNs) for RGB-D object recognition. To deal with the effect of high inter-class similarity, discriminative multi-modal feature representations are generated by learning and relating multi-modal features at multiple levels. Besides, we employ a multi-modal fusion and projection (MMFP) module to reweight the contribution of each modality, addressing the problem of poor-quality depth images. Our proposed approach achieves state-of-the-art performance on three datasets (the Washington RGB-D Object Dataset, JHUIT-50, and the Object Clutter Indoor Dataset), with accuracies of 95.4%, 98.1%, and 94.7%, respectively. The results demonstrate the effectiveness and superiority of the proposed model in the RGB-D object recognition task.
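The reweighting idea behind modality fusion — shrink the contribution of a low-quality modality before combining features — can be sketched with a softmax over per-modality quality scores. This is purely illustrative: the paper's MMFP module learns its weighting end-to-end, while here the scores are supplied by hand:

```python
import math

def fuse(features, quality_scores):
    """Weighted fusion: a softmax over per-modality quality scores
    decides how much each modality contributes to the fused feature."""
    exps = [math.exp(q) for q in quality_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    d = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(d)]
    return fused, weights

rgb_feat = [1.0, 0.0, 0.0]
depth_feat = [0.0, 1.0, 0.0]
# A noisy depth frame gets a low quality score, shrinking its weight.
fused, weights = fuse([rgb_feat, depth_feat], quality_scores=[2.0, -1.0])
```

With a clean depth frame the two scores would be comparable and the fusion would approach a simple average; the point is that the weighting degrades gracefully rather than letting a bad depth image corrupt the joint feature.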

Book ChapterDOI
01 Jan 2023
TL;DR: In this article, a neural network architecture incorporating the scale-invariant feature transform is proposed to identify the singular points that characterize a person's face.
Abstract: Solving the problem of pattern recognition is one of the areas of research in the field of digital video signal processing. Recognition of a person’s face in a real-time video data stream requires the use of advanced algorithms. Traditional recognition methods include neural network architectures for pattern recognition. To solve the problem of identifying singular points that characterize a person’s face, this paper proposes a neural network architecture that includes the method of scale-invariant feature transformation. Experimental modeling showed an increase in recognition accuracy and a decrease in the time required for training in comparison with the known neural network architecture. Software simulation showed reliable recognition of a person’s face at various angles of head rotation and overlapping of a person’s face. The results obtained can be effectively applied in various video surveillance, control and other systems that require recognition of a person’s face.

Proceedings ArticleDOI
28 Mar 2023
TL;DR: Based on detailed literature research and analysis, the authors provide a comprehensive evaluation of research progress in occluded face recognition, summarize the main challenges, and give an outlook on future research directions.
Abstract: Face recognition has always been the most popular study area in computer vision; it aims to identify different face images and predict the corresponding identity through feature analysis and modeling. In practical applications, a number of variables, including lighting, pose, and clarity, affect facial recognition accuracy. Among them, the most challenging scenario is occluded face recognition, where feature loss, loss of local coherence, and alignment errors greatly inhibit the accuracy and generalization ability of the face model. Based on detailed literature research and analysis, this paper provides a comprehensive evaluation of the research progress of occluded face recognition. Specifically, after introducing classical face recognition technology, we discuss the design ideas, basic frameworks, advantages, and disadvantages of representative occluded face recognition methods from two aspects: robust feature extraction and robust classifiers. Finally, we summarize the main challenges and give an outlook on future research directions for occluded face recognition.

Proceedings ArticleDOI
17 Mar 2023
TL;DR: In this article, a vision-based system recognizes hand sign language and converts it to text, forming structured sentences that are easy to understand; applications include communication systems that bridge the gap for people with speech disabilities.
Abstract: Computer vision is not just a branch of deep learning; it has wide applications such as motion recognition, object recognition, video indexing, video media understanding, and recognition-based intelligence. However, building vision-based systems that produce accurate results remains a challenging field of research. Recent areas of interest are human action recognition and hand gesture recognition techniques using video datasets, still-image datasets, spatiotemporal methods, RGB features, and deep learning methods. Hand action recognition has applications such as communication systems that bridge the gap for people with speech disabilities by using a vision-based system to recognize hand sign language and convert it to text, forming structured sentences that are easy to understand and communicate.

Book ChapterDOI
01 Jan 2023
TL;DR: In this paper, an object recognition approach is proposed that eliminates the "color blindness" of key point extraction methods by combining SIFT, color histograms, and contour detection algorithms.
Abstract: Abstract Object recognition is well known to have a high importance in various fields. Example applications are anomaly detection and object sorting. Common methods for object recognition in images divide into neural and non-neural approaches: Neural-based concepts, e.g. using deep learning techniques, require a lot of training data and involve a resource intensive learning process. Additionally, when working with a small number of images, the development effort increases. Common non-neural feature detection approaches, such as SIFT, SURF or AKAZE, do not require these steps for preparation. They are computationally less expensive and often more efficient than the neural-based concepts. On the downside, these algorithms usually require grey-scale images as an input. Thus, information about the color of the reference image cannot be considered as a determinant for recognition. Our objective is to achieve an object recognition approach by eliminating the “color blindness” of key point extraction methods by using a combination of SIFT, color histograms and contour detection algorithms. This approach is evaluated in context of object recognition on a conveyor belt. In this scenario, objects can only be recorded while passing the camera’s field of vision. The approach is divided into three stages: In the first step, Otsu’s method is applied among other computer vision algorithms to perform automatic edge detection for object localization. Within the subsequent second stage, SIFT extracts key points out of the previously identified region of interest. In the last step, color histograms of the specified region are created to distinguish between objects that feature a high similarity in the extracted key points. Only one image is sufficient to serve as a template. We are able to show that developing and applying a concept with a combination of SIFT, histograms and edge detection algorithms successfully compensates the color blindness of the SIFT algorithm. 
Promising results in the conducted proof of concept are achieved without the need for implementing complex and time-consuming methods.
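The third stage — comparing coarse color histograms to disambiguate objects whose key points look alike — can be sketched as RGB quantization plus histogram intersection. The bin count and the example pixel patches below are our own illustrative choices, not values from the paper:

```python
def color_histogram(pixels, bins_per_channel=2):
    """Coarse normalised RGB histogram: each channel is quantised into
    `bins_per_channel` bins, giving bins_per_channel**3 buckets."""
    n_bins = bins_per_channel ** 3
    hist = [0.0] * n_bins
    step = 256 // bins_per_channel
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel ** 2
               + (g // step) * bins_per_channel
               + (b // step))
        hist[idx] += 1.0
    return [h / len(pixels) for h in hist]

def intersection(h1, h2):
    """Histogram intersection similarity: 1.0 for identical colour
    distributions, 0.0 for fully disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

red_patch = [(250, 10, 10)] * 4
blue_patch = [(10, 10, 250)] * 4
```

Two objects with near-identical SIFT matches but different colors (like the red and blue patches above) produce a low intersection score, which is exactly the tie-breaker the combined approach needs.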

Proceedings ArticleDOI
29 Apr 2023
TL;DR: Zhang et al. as discussed by the authors proposed a machine-vision-based shape feature recognition algorithm for real-time target feature recognition on a Baxter robot, achieving an accuracy of 94.37% with a running time of 11.97 ms.
Abstract: With the rapid development of artificial intelligence and computer technology, machine vision (MV) technology has quietly and deeply affected many aspects of our lives, and Baxter robots are becoming more intelligent and automated. MV technology, and in particular shape feature recognition algorithms, is applied to the Baxter robot system to provide a technical guarantee for accurate, real-time target feature recognition. Under the topic "Baxter Robot Shape Feature Recognition Algorithm Based on MV", this paper focuses on shape feature recognition technology, taking the Baxter robot as an example to study the key techniques of shape feature recognition when the target in the environment is known. The paper designs the Baxter robot vision system, analyzes the system's workflow and functions, and introduces the system modules and related algorithms. The algorithm is used to test the shape characteristics of the target object: the average recognition accuracy is 94.37%, and the running time is 11.97 ms. These results show that the machine-vision-based shape feature recognition algorithm achieves real-time, highly accurate recognition of target object shape features with good stability. Compared to traditional shape feature recognition algorithms, it is more feasible in practical applications and can effectively assist people in completing specific shape feature recognition tasks.
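As an illustration of the kind of descriptor a shape feature recognizer might compute, circularity (4πA/P²) distinguishes compact from elongated shapes on a binary mask. This is a toy sketch of one classical shape feature, not the paper's algorithm:

```python
import math

def area_perimeter(grid):
    """Area = number of filled cells; perimeter = number of filled-cell
    edges adjacent to an empty cell or to the grid border."""
    rows, cols = len(grid), len(grid[0])
    area = perim = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            area += 1
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols) or not grid[nr][nc]:
                    perim += 1
    return area, perim

def circularity(grid):
    """4*pi*A / P**2: close to 1 for compact shapes, lower for elongated ones."""
    a, p = area_perimeter(grid)
    return 4 * math.pi * a / p ** 2

square = [[1] * 4 for _ in range(4)]  # 4x4 block
strip = [[1] * 8 for _ in range(2)]   # 2x8 block with the same area
```

Because circularity depends only on ratios of area and perimeter, it is invariant to translation and (approximately) to scale, which is why features of this family suit real-time known-target recognition.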