
Showing papers in "Journal of Image and Graphics in 2021"


Journal ArticleDOI
Lei Wang, Biao Liu, Shaohua Xu, Ji Pan, Qi Zhou 
TL;DR: A Deep Learning (DL) method is developed to assist radiologists in quickly and accurately labeling and classifying lesions in breast ultrasound images, and high classification accuracy, sensitivity, and specificity are reported for both BI-RADS 4 sub-classification and breast cancer screening.
Abstract: In this paper we developed a Deep Learning (DL) method to assist radiologists in quickly and accurately labeling and classifying lesions in breast ultrasound images. A Faster R-CNN detector was trained to label and classify the lesions according to the Breast Imaging Reporting and Data System (BI-RADS). The initial model was trained on 2000 labeled images. Testing on 6000 images gave poor accuracy, so we developed a second DL model using a 4294-image set from which the BI-RADS 4 images were removed. The second model was then tested on 1000 images and used to classify 1836 BI-RADS 4 images. The results show that classification accuracy, sensitivity, and specificity reach 92.37%, 98.34%, and 82.46%, respectively, when the model classifies BI-RADS 4 images into 4A and 4B, and 98.10%, 97.78%, and 98.13%, respectively, when it is used for breast cancer screening.
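The abstract gives no implementation details; the following is a minimal sketch of how a Faster R-CNN detector of this kind could be fine-tuned with torchvision. The class count, BI-RADS label mapping, and data loader are assumptions, not the authors' code.

```python
# Hedged sketch: fine-tuning a torchvision Faster R-CNN on BI-RADS-labeled
# ultrasound lesion boxes. The label mapping and loader are hypothetical.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Assumed mapping: background + {BI-RADS 2, 3, 4A, 4B, 5} -> 6 classes
NUM_CLASSES = 6

# torchvision >= 0.13; older versions use pretrained=True instead of weights=
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

def train_one_epoch(model, loader, device="cuda"):
    model.train().to(device)
    for images, targets in loader:  # targets: [{"boxes": Tensor[N,4], "labels": Tensor[N]}]
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)   # returns a dict of losses in train mode
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```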

18 citations


Journal ArticleDOI
TL;DR: A novel method for automatic abdominal multi-organ segmentation is proposed that introduces spatial information into supervoxel classification and achieves higher segmentation accuracy than the authors' previous model-based method.
Abstract: Multi-organ segmentation is a critical step in a Computer-Aided Diagnosis (CAD) system. We propose a novel method for automatic abdominal multi-organ segmentation that introduces spatial information into the supervoxel classification process. Supervoxels whose boundaries adhere to anatomical edges are extracted from the images using Simple Linear Iterative Clustering (SLIC). A random forest classifier is then built to predict the label of each supervoxel from its spatial and intensity features. Thirty abdominal CT images are used in a segmentation experiment for the spleen, right kidney, left kidney, and liver. The experimental results show that the proposed method achieves higher segmentation accuracy than our previous model-based method.
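As a rough illustration of the supervoxel-classification idea (not the paper's implementation), the sketch below combines scikit-image's SLIC with a scikit-learn random forest; the feature set, parameters, and label handling are illustrative assumptions.

```python
# Hedged sketch: SLIC supervoxels on a CT volume, per-supervoxel intensity and
# spatial features, random-forest labeling. Feature choices are illustrative.
import numpy as np
from skimage.segmentation import slic          # channel_axis=None needs skimage >= 0.19
from sklearn.ensemble import RandomForestClassifier

def supervoxel_features(volume, sv_labels):
    feats = []
    for s in np.unique(sv_labels):
        mask = sv_labels == s
        zz, yy, xx = np.nonzero(mask)
        feats.append([
            volume[mask].mean(), volume[mask].std(),   # intensity statistics
            zz.mean() / volume.shape[0],               # normalized spatial position
            yy.mean() / volume.shape[1],
            xx.mean() / volume.shape[2],
        ])
    return np.asarray(feats)

def segment(train_volumes, train_organ_maps, test_volume, n_segments=2000):
    X, y = [], []
    for vol, organs in zip(train_volumes, train_organ_maps):   # organs: int label map, 0 = background
        sv = slic(vol, n_segments=n_segments, compactness=0.1, channel_axis=None)  # tune compactness for CT range
        X.append(supervoxel_features(vol, sv))
        y.append([np.bincount(organs[sv == s]).argmax() for s in np.unique(sv)])   # majority organ per supervoxel
    clf = RandomForestClassifier(n_estimators=200).fit(np.vstack(X), np.concatenate(y))
    sv = slic(test_volume, n_segments=n_segments, compactness=0.1, channel_axis=None)
    pred = clf.predict(supervoxel_features(test_volume, sv))
    out = np.zeros_like(sv)
    for lab, s in zip(pred, np.unique(sv)):                    # map supervoxel labels back to voxels
        out[sv == s] = lab
    return out
```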

15 citations


Journal ArticleDOI
TL;DR: A 3D facial micro-expression analysis method based on high-speed structured light is proposed, and a 4D descriptor is introduced to describe the temporal characteristics of facial micro-expressions.
Abstract: Facial micro-expressions play a pivotal role in non-verbal emotional expression, so they can be used in many fields such as criminal interrogation, clinical diagnosis, and animation. Conventional approaches are usually based on 2D image analysis, which has shown its limitations in real applications. In this paper, a 3D-based facial micro-expression analysis method is proposed. A 3D acquisition system based on high-speed structured light is built to capture dense and accurate 3D facial shapes at speeds of up to 300 Hz. A facial micro-expression detection method is put forward to extract the onset and offset of a micro-expression sequence. Finally, a 4D descriptor is introduced to describe the temporal characteristics of facial micro-expressions. Experiments on real human faces verify the feasibility of the proposed system and method.

10 citations


Journal ArticleDOI
TL;DR: A method to generate lesion images with a Conditional Generative Adversarial Network (CGAN) is proposed, and its effectiveness is shown through the accuracy of liver cancer detection from CT images.
Abstract: In abdominal diagnosis, CT images taken under various conditions are visually checked by multiple doctors. Since diagnosing CT images demands considerable time and effort from doctors, a Computer-Aided Diagnosis (CAD) system based on machine learning is expected; it is, however, difficult to collect a large number of case images for machine learning. In this paper, we propose a method to generate lesion images with a Conditional Generative Adversarial Network (CGAN) and show the effectiveness of the proposed method through the accuracy of liver cancer detection from CT images. A CGAN that generates pseudo lesion images is trained with real lesion images labeled as "edge" or "non-edge" of the liver. We confirmed that the proposed method achieved a detection rate of 0.85 with 0.20 false positives per case, a detection accuracy higher than that of a conventional method.
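A minimal sketch of a conditional generator of the kind described, written in PyTorch. The architecture, latent size, and patch size are assumptions; the paper's network design is not given here.

```python
# Hedged sketch: a conditional GAN generator that maps noise plus a binary
# condition ("edge" vs "non-edge" of the liver) to a small pseudo lesion patch.
import torch
import torch.nn as nn

class LesionGenerator(nn.Module):
    def __init__(self, z_dim=100, n_classes=2, patch=64):
        super().__init__()
        self.embed = nn.Embedding(n_classes, n_classes)    # condition embedding
        self.net = nn.Sequential(
            nn.Linear(z_dim + n_classes, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 512), nn.ReLU(inplace=True),
            nn.Linear(512, patch * patch), nn.Tanh(),      # pixel values in [-1, 1]
        )
        self.patch = patch

    def forward(self, z, labels):
        x = torch.cat([z, self.embed(labels)], dim=1)
        return self.net(x).view(-1, 1, self.patch, self.patch)

# usage: sample 8 pseudo lesion patches for condition class 1 ("edge", assumed)
g = LesionGenerator()
fake = g(torch.randn(8, 100), torch.ones(8, dtype=torch.long))
```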

9 citations


Journal ArticleDOI
TL;DR: A review of recent research methods for multi-person anomalous action recognition is presented, which analyzes the computational approaches in depth and briefly describes popular datasets.
Abstract: In recent years, computer vision technology capable of detecting human behavior has attracted more and more attention. Although it has been widely used in many applications, accurate and effective human motion recognition is still a challenge in the field of computer vision. This paper presents a review of recent research methods for multi-person anomalous action recognition. The article analyzes the computational approaches in depth, briefly describes popular datasets, discusses unresolved issues, and provides new ideas for future research.

5 citations


Journal ArticleDOI
TL;DR: This paper proposes a method to generate training data that is effective for learning outlier-correction models, which correct outliers in pose estimation results.
Abstract: Human pose estimation has been an active research topic for decades, with immediate applications in tasks such as action understanding. Although accurate pose estimation is an important requirement, joint occlusion and the varied postures of a person often result in deviated pose predictions. In this paper, we aim to correct such outliers in pose estimation results, and propose a method to generate training data that is effective for learning outlier-correction models.

4 citations


Journal ArticleDOI
TL;DR: In this research, the face detection APIs from five of the top public cloud vendors of facial recognition software have been tested and evaluated to establish which vendor performs the best for accuracy and to find any significant differences between the vendor APIs.
Abstract: The ability to process human face information is crucial in many areas of government, business, and social media. Facial recognition enables businesses to provide services that include security, robotics, analysis, human resources, mobile applications, and user interfaces. Users can access their accounts and sign off transactions online just by taking a ‘selfie’. Machine Learning algorithms have been developed for face detection in media such as picture images. To recognise a face, the camera software must first detect it and identify the features before making an identification. Face detection is the first step of face recognition. In this research, the face detection APIs from five of the top public cloud vendors of facial recognition software have been tested and evaluated to establish which vendor performs best for accuracy and to find any significant differences between the vendor APIs. The attributes tested were ‘Gender’ and ‘Age’. Surprisingly, the vendors Amazon Rekognition, IBM, and FaceX offered the age attribute only as a range value rather than committing to an exact age, which immediately diminishes the accuracy of their respective APIs. The research exposes weaknesses in API accuracy by testing the resilience of the vendor APIs against degraded images. Azure was the overall winner, with Rekognition in second place, Kairos in third, IBM in fourth, and FaceX in last place.
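For context, a hedged sketch of one of the tested calls: AWS Rekognition's DetectFaces via boto3, which returns gender and an age range rather than a single age (the limitation noted above). The file name and the range-to-age scoring below are illustrative assumptions, not the study's evaluation code.

```python
# Hedged sketch: querying one vendor API (AWS Rekognition) for age and gender.
import boto3

rekognition = boto3.client("rekognition", region_name="us-east-1")

def detect_age_gender(image_bytes):
    resp = rekognition.detect_faces(
        Image={"Bytes": image_bytes},
        Attributes=["ALL"],          # request age, gender, and other attributes
    )
    results = []
    for face in resp["FaceDetails"]:
        age = face["AgeRange"]       # {"Low": ..., "High": ...} -- a range, not a point estimate
        results.append({
            "gender": face["Gender"]["Value"],
            "age_mid": (age["Low"] + age["High"]) / 2,  # one way to score a range against a true age
        })
    return results

with open("selfie.jpg", "rb") as f:   # hypothetical test image
    print(detect_age_gender(f.read()))
```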

4 citations


Journal ArticleDOI
TL;DR: This paper investigates eight binary descriptors and eight interest point detectors, and evaluates the performance of pairwise combinations of detectors and descriptors under different image transformations.
Abstract: Detecting image correspondences by feature matching forms the basis of numerous computer vision applications. Several detectors and descriptors have been presented in the past, addressing the efficient generation of features from interest points (keypoints) in an image. In this paper, we investigate eight binary descriptors (AKAZE, BoostDesc, BRIEF, BRISK, FREAK, LATCH, LUCID, and ORB) and eight interest point detectors (AGAST, AKAZE, BRISK, FAST, HarrisLaplace, KAZE, ORB, and StarDetector). We decouple the detection and description phases to analyze the interest point detectors and then evaluate the performance of pairwise combinations of detectors and descriptors. We conducted experiments on a standard dataset and analyzed the comparative performance of each method under different image transformations. We observed that: (1) the FAST, AGAST, and ORB detectors were faster and detected more keypoints; (2) the AKAZE and KAZE detectors performed better under photometric changes, while ORB was more robust against geometric changes; (3) in general, descriptors performed better when paired with the KAZE and AKAZE detectors; (4) the BRIEF, LUCID, and ORB descriptors were relatively faster; and (5) none of the descriptors did particularly well under geometric transformations; only BRISK, FREAK, and AKAZE showed reasonable resilience.
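The decoupled evaluation can be reproduced with OpenCV along the following lines; the detector/descriptor subset, image files, and matcher settings are illustrative, and several of the paper's methods (e.g. FREAK, BRIEF, StarDetector) require the opencv-contrib package.

```python
# Hedged sketch: decouple keypoint detection from binary description, then
# match with Hamming distance, as in the pairwise evaluation described above.
import cv2

detectors = {
    "FAST": cv2.FastFeatureDetector_create(),
    "AGAST": cv2.AgastFeatureDetector_create(),
    "ORB": cv2.ORB_create(),
    "BRISK": cv2.BRISK_create(),
    "AKAZE": cv2.AKAZE_create(),
    # "StarDetector": cv2.xfeatures2d.StarDetector_create(),  # needs opencv-contrib
}
descriptors = {
    "ORB": cv2.ORB_create(),
    "BRISK": cv2.BRISK_create(),
    # "FREAK": cv2.xfeatures2d.FREAK_create(),                # needs opencv-contrib
    # "BRIEF": cv2.xfeatures2d.BriefDescriptorExtractor_create(),
}

def match_pair(img1, img2, det, desc):
    kp1 = det.detect(img1, None)             # detection decoupled from description
    kp2 = det.detect(img2, None)
    kp1, d1 = desc.compute(img1, kp1)
    kp2, d2 = desc.compute(img2, kp2)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    return matcher.match(d1, d2)

img1 = cv2.imread("img1.png", cv2.IMREAD_GRAYSCALE)   # hypothetical image pair
img2 = cv2.imread("img2.png", cv2.IMREAD_GRAYSCALE)
for dn, det in detectors.items():
    for sn, desc in descriptors.items():
        print(dn, "+", sn, len(match_pair(img1, img2, det, desc)), "matches")
```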

3 citations


Journal ArticleDOI
TL;DR: In this paper, the influence of colour psychology on the production of media façades is analyzed to enhance their narrative ability, and the common features of colour composition are expressed through time-of-day and seasonal changes.
Abstract: In this paper, we propose to interrelate the colour composition design of media façade scenes with colour psychology, in order to reach a more expressive form and richer artistic value and to enhance their narrative ability. This can also relieve the visual fatigue caused by the excessive colour of media façades. With its advent, media façade technology is becoming widely used as a ubiquitous technique in the media field. Owing to the influence of the external environment, content is mainly displayed to the audience through transformations of basic colour patterns. However, narrative media façades are often endowed with a certain artistic value and are thus more acceptable to the public. Therefore, how to use a single colour transformation to express a narrative as delicately as a film is a problem worth exploring. Aiming at this situation, this paper analyzes the influence of colour psychology on the production of media façades to enhance their narrative ability. Through the comparison of two cases with the same attributes at different times, the common features of colour composition are expressed in time-of-day and seasonal changes.

3 citations


Journal ArticleDOI
TL;DR: This study aims to realize the autonomous movement of a robot using a camera-based navigation system, instead of expensive external sensors such as light detection and ranging, and focuses on improving the accuracy of intersection detection and recognition.
Abstract: This study is an attempt to realize the autonomous movement of a robot using a camera-based navigation system instead of expensive external sensors such as light detection and ranging. The present implementation of our approach consists of road following, intersection detection, and intersection recognition, using the results of semantic segmentation. In this study, we focus on improving the accuracy of intersection detection and recognition. Classifiers for these tasks are constructed using deep neural networks. We evaluated the proposed classifiers using three-dimensional computer graphics generated with the CARLA simulator and the Ikuta dataset, which is composed of actual images that we took. The experimental results demonstrate that the proposed system can detect and recognize intersections accurately; the F-measure exceeded 0.96 for detection, and the actual images were recognized and classified with perfect accuracy.

3 citations


Journal Article
TL;DR: In this paper, a lane detection model based on vision and spatial distribution is proposed to eliminate the influence of the environment on lane detection and to provide accurate lane information for the development of intelligent driving systems.
Abstract: Objective: Intelligent connected vehicles are an important direction in intelligent transportation in China. In the development of intelligent networked vehicle systems, the detection of lane markings in complex environments is a key link. The safety of drug delivery, meal transport, and medical waste recovery can be guaranteed if unmanned driving and intelligent connected vehicle technology can be applied to epidemic prevention and control, especially during the COVID-19 epidemic, and the frequency of contact between medical staff and patients and the risk of cross-infection can be reduced. However, current lane detection algorithms are mostly based on visual feature information, such as color, gray level, and edges, so detection accuracy is greatly affected by the environment. This makes it difficult for existing lane detection algorithms to meet the performance requirements of intelligent connected vehicles. The length, width, and direction of lanes have strong regularity, with serialized and structurally associated characteristics that are not affected by visibility, weather, or obstacles, whereas vision-based lane detection is accurate only in clear, unobstructed scenes. For this reason, a lane detection model based on vision and spatial distribution is proposed to eliminate the influence of the environment on lane detection. Our research can provide accurate lane information for the development of intelligent driving systems.

Method: When a traffic image is transformed into a bird's-eye view, its original scale changes and the lane spacing becomes short. The You Only Look Once v3 (YOLO v3) algorithm has significant advantages in speed and accuracy for detecting small objects, so it is used as the lane detector in this study. However, the distribution density of lanes is greater in the longitudinal direction than in the horizontal direction. The network structure of YOLO v3 is therefore improved by increasing the vertical detection density to reduce the influence of the change in aspect ratio on target detection: the image is divided into S × 2S grids during lane detection, and the resulting YOLO v3 (S × 2S) is suitable for lane detection. However, the YOLO v3 (S × 2S) algorithm ignores the spatial information of lanes, and its accuracy is poor under poor lighting and vehicle occlusion. Bidirectional Gated Recurrent Unit-Lane (BGRU-L), a lane detection model based on the lane distribution law, is proposed by considering that the spatial distribution of lanes is unaffected by the environment; it improves the generalization ability of lane detection in complex scenes. This study combines visual information and the spatial distribution relationship to avoid the large errors of a single lane detector and to effectively reduce the uncertainty of the system. A confidence-based Dempster-Shafer (D-S) algorithm is used to fuse the detection results of YOLO v3 (S × 2S) and BGRU-L to guarantee the output of the optimal lane position.

Result: The Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset is a commonly used traffic dataset that includes scenes such as sunny, cloudy, highway, and urban roads. Scenes under complicated conditions, such as rain, tunnels, and night, are added to ensure coverage; in this study, scenes from the game Euro Truck Simulator 2 (ETS2) are used as a supplementary dataset. ETS2 is divided into two categories, a conventional scene set ETS2_conv (sunny, cloudy) and a comprehensive scene set ETS2_comp (sunny, cloudy, night, rain, and tunnel), to accurately evaluate the effectiveness of the algorithm. On the KITTI dataset, the accuracy of YOLO v3 (S × 2S) improves with the increase in detection grid density, reaching a mean average precision (mAP) of 88.39%. BGRU-L uses the spatial distribution relationship of the lane sequence to detect lane locations, with an mAP of 76.14%. The reliability-based D-S algorithm is used to fuse the lane detection results of YOLO v3 (S × 2S) and BGRU-L, raising the final mAP of lane detection to 90.28%. On the ETS2 dataset, the mAP values in the ETS2_conv and ETS2_comp scenarios are 92.49% and 91.73%, respectively, using the lane detection model that combines visual information and spatial distribution relationships.

Conclusion: This study explores detection schemes based on machine vision and the spatial distribution relationship of lanes to address the difficulty of accurately detecting lanes in complex scenes. On the basis of the uneven distribution density of lanes in the bird's-eye view, the obtained model, YOLO v3 (S × 2S), is made suitable for detecting small targets with large aspect ratios by increasing the grid density of the YOLO v3 model; experimental results show that YOLO v3 (S × 2S) is significantly more accurate than YOLO v3 for lane detection. A lane detection model based only on visual information has limitations and cannot achieve high-precision detection in complex scenes. However, the length, width, and direction of lanes have strong regularity and exhibit serialized, structurally correlated characteristics. BGRU-L, a lane prediction model based on the spatial distribution of lanes, is unaffected by the environment and generalizes well in rain, night, tunnel, and other scenarios. This study uses the confidence-based D-S algorithm to fuse the detection results of YOLO v3 (S × 2S) and BGRU-L, avoiding the large errors that may exist in a single lane detection model and effectively reducing the uncertainty of the system. The results of lane detection in complex scenes can meet the requirements of intelligent vehicles. © 2021, Editorial and Publishing Board of Journal of Image and Graphics. All rights reserved.
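As an illustration of the fusion step only (not the paper's exact formulation), the sketch below applies Dempster's rule of combination to mass functions derived from two detectors' confidences; the mass-assignment scheme and the numbers are placeholder assumptions.

```python
# Hedged sketch: confidence-based Dempster-Shafer fusion of two lane detectors,
# e.g. YOLO v3 (S x 2S) and BGRU-L. Each detector assigns belief mass to "lane",
# "not lane", and the full frame of discernment (its residual uncertainty).
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule: m(A) = sum_{B∩C=A} m1(B)m2(C) / (1 - K)."""
    combined, conflict = {}, 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + pa * pb
        else:
            conflict += pa * pb                      # K: total conflicting mass
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

LANE, NOT_LANE = frozenset({"lane"}), frozenset({"not_lane"})
THETA = LANE | NOT_LANE                              # frame of discernment

def mass_from_confidence(conf, reliability):
    # split a detector's confidence into belief and residual uncertainty (assumed scheme)
    return {LANE: conf * reliability,
            NOT_LANE: (1 - conf) * reliability,
            THETA: 1 - reliability}

m_yolo = mass_from_confidence(conf=0.80, reliability=0.88)   # placeholder values
m_bgru = mass_from_confidence(conf=0.60, reliability=0.76)
fused = dempster_combine(m_yolo, m_bgru)
print(fused[LANE], fused.get(THETA, 0.0))
```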

Journal ArticleDOI
TL;DR: The goal was to observe to what extent the Health Information Technology Acceptance Model patterns and outlines EU use of AT, and whether the obtained model can serve as an initial quantitative working tool for designers using the EU design method.
Abstract: ‘Extreme Users’ (EU) is a design method in Human Computer Interaction, which allows user-centered design in design groups. ‘Acceptance Models’ is a theory in Information Systems, which models how users accept and use technology. We conducted a study to explore the relationships among the factors influencing Extreme Athletes in the acceptance and use of Activity Trackers (AT). The data was collected from a cross-sectional survey conducted using a self-selected convenience sample of 206 participants. The research rendered an exploration and an examination of the factors affecting trail-running athletes, and the results were analyzed using several statistical techniques including Structural Equation Analysis. Our goal was to observe to what extent the Health Information Technology Acceptance Model patterns and outlines EU use of AT. This contribution, to the best of our knowledge, is new, given that the obtained model can serve as an initial quantitative working tool for designers using the EU design method.

Journal ArticleDOI
TL;DR: It is revealed that the intrinsic connections among iris texture features, captured at the large feature scale of neural networks, are highly valuable for effectively eliminating the interference of textured contact lenses.
Abstract: Iris recognition systems face a new challenge from the variety of textured contact lenses, which can change the appearance of iris texture. To deal with this challenge, conventional methods use the Gray-Gradient Matrix and the Gray-Level Run-Length Matrix (GLRLM) to extract iris texture features and use Support Vector Machines (SVM) for authenticity classification. These methods attend only to the statistical values of the feature matrices; they ignore the details of texture features and the inherent connections between those details. This paper reveals that the intrinsic connections among iris texture features, captured at the large feature scale of neural networks, are highly valuable for effectively eliminating the interference of textured contact lenses. On this premise, we propose a novel iris anti-counterfeiting detection method based on an improved Gray-Level Co-occurrence Matrix (Modified-GLCM) combined with a binary classification neural network. The experimental results show that the proposed method outperforms conventional texture analysis methods based on feature statistics as well as the best result of LivDet-Iris 2017. Moreover, we analyze and verify the potential threat of iris adversarial samples to iris presentation attack detection algorithms through iris texture extraction.
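For orientation, a minimal sketch of the conventional GLCM-plus-classifier baseline that the paper improves on, using scikit-image and scikit-learn; the Modified-GLCM and the paper's network are not reproduced, and the quantization level, GLCM parameters, and classifier are assumptions.

```python
# Hedged sketch: gray-level co-occurrence features from a normalized iris strip
# fed to a binary classifier (real iris vs. textured contact lens).
import numpy as np
from skimage.feature import graycomatrix, graycoprops   # skimage >= 0.19 naming
from sklearn.neural_network import MLPClassifier

def glcm_features(iris_strip, levels=64):
    img = (iris_strip / 256.0 * levels).astype(np.uint8)       # quantize 8-bit gray levels
    glcm = graycomatrix(img, distances=[1, 2],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

def train_liveness(strips, labels):                            # labels: 1 = textured lens, 0 = real
    X = np.stack([glcm_features(s) for s in strips])
    return MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500).fit(X, labels)
```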