Journal ArticleDOI

An improved Yolov3 based on dual path network for cherry tomatoes detection

12 Jul 2021-Journal of Food Process Engineering (John Wiley & Sons, Ltd)-Vol. 44, Iss: 10
About: This article was published in the Journal of Food Process Engineering on 2021-07-12 and has received 20 citations to date.
Citations
Journal ArticleDOI
31 Jan 2022-Agronomy
TL;DR: This paper proposes deep learning models (SSD MobileNet v2 and YOLOv4) to detect tomatoes efficiently, and compares those systems with a proposed histogram-based HSV colour-space model for classifying each tomato and determining its ripening stage, using two acquired image datasets.
Abstract: The harvesting operation is a recurring task in the production of any crop, making it an excellent candidate for automation. In protected horticulture, one of the crops with high added value is the tomato; however, its robotic harvesting is still far from maturity. That said, the development of an accurate fruit detection system is a crucial step towards achieving fully automated robotic harvesting. Deep Learning (DL) and detection frameworks like the Single Shot MultiBox Detector (SSD) or You Only Look Once (YOLO) are more robust and accurate alternatives that respond better to highly complex scenarios. DL can readily be used to detect tomatoes, but when classification is also intended, the task becomes harder and demands a huge amount of data. Therefore, this paper proposes DL models (SSD MobileNet v2 and YOLOv4) to detect tomatoes efficiently and compares those systems with a proposed histogram-based HSV colour-space model that classifies each tomato and determines its ripening stage, using two acquired image datasets. Regarding detection, both models obtained promising results, with the YOLOv4 model standing out with an F1-Score of 85.81%. For the classification task, YOLOv4 was again the best model, with a Macro F1-Score of 74.16%. The HSV colour-space model outperformed the SSD MobileNet v2 model, obtaining results similar to the YOLOv4 model, with a Balanced Accuracy of 68.10%.
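
As a concrete illustration of the histogram-based HSV approach described above, the sketch below classifies a detected tomato crop by its dominant hue. It is a minimal Python sketch assuming OpenCV and NumPy are available; the hue thresholds and stage names are illustrative placeholders, not the paper's actual values.

# Minimal sketch: histogram-based ripeness classification in HSV colour space.
# Assumes OpenCV (cv2) and NumPy; thresholds are illustrative, not the paper's.
import cv2
import numpy as np

def ripeness_stage(bgr_crop, mask=None):
    """Classify a detected tomato crop by its dominant hue."""
    hsv = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV)
    # Histogram over the hue channel (OpenCV hue range is 0..179).
    hist = cv2.calcHist([hsv], [0], mask, [180], [0, 180]).ravel()
    dominant_hue = int(np.argmax(hist))
    # Illustrative thresholds: red wraps around 0/179 on OpenCV's hue circle.
    if dominant_hue <= 10 or dominant_hue >= 160:
        return "ripe"        # red
    elif dominant_hue <= 25:
        return "turning"     # orange
    else:
        return "unripe"      # green/yellow

In a full pipeline, the crop would come from a detector's bounding box (SSD or YOLO above), so detection and colour-based classification stay decoupled.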

27 citations

Journal ArticleDOI
TL;DR: In this article, an improved YOLOv5 deep learning algorithm is used to propose a visual identification method for the growth forms of apples in the orchard, achieving high accuracy and real-time performance, with the mAP reaching 98.4% and an F1 value of 0.928.

13 citations

Journal ArticleDOI
TL;DR: Wang et al. proposed a method for detecting young tomato fruits against a near-color background, based on an improved Faster R-CNN with an attention mechanism; the results show that the mean Average Precision (mAP) of the proposed method reaches 98.46%, and the average detection time per image is only 0.084 s.
Abstract: Information about young tomato fruits has an important impact on monitoring fruit growth, early control of pests and diseases, and yield estimation. It is of great significance for the timely removal of young fruits with abnormal growth status, improving fruit quality, and maintaining high and stable yields. Young tomato fruits are similar in color to stems and leaves, and there are interfering factors such as overlapping fruits, occlusion by stems and leaves, and lighting variation. In order to improve the detection accuracy and efficiency for young tomato fruits, this paper proposes a method for detecting them against a near-color background, based on an improved Faster R-CNN with an attention mechanism. First, ResNet50 is used as the feature-extraction backbone, and the extracted feature map is refined through a Convolutional Block Attention Module (CBAM). Then, a Feature Pyramid Network (FPN) is used to integrate high-level semantic features into low-level detailed features to enhance the model's sensitivity to scale. Finally, Soft Non-Maximum Suppression (Soft-NMS) is used to reduce the missed-detection rate for overlapping fruits. The results show that the mean Average Precision (mAP) of the proposed method reaches 98.46%, and the average detection time per image is only 0.084 s, enabling real-time and accurate detection of young tomato fruits. The research shows that this method can efficiently identify young tomato fruits and provides a better solution for detecting fruits against a near-color background.
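
To make the Soft-NMS step concrete, here is a minimal NumPy sketch of the Gaussian-decay variant: instead of discarding boxes that overlap a higher-scoring detection, their scores are decayed in proportion to overlap, which helps retain genuinely overlapping fruits. The sigma and score-threshold defaults are illustrative, not the paper's settings.

# Minimal sketch: Gaussian Soft-NMS over (N, 4) boxes in (x1, y1, x2, y2) format.
import numpy as np

def box_area(b):
    return (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])

def iou_one_to_many(box, boxes):
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    return inter / (box_area(box) + box_area(boxes) - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Decay, rather than discard, detections that overlap a kept box."""
    boxes, scores = boxes.astype(float).copy(), scores.astype(float).copy()
    kept_boxes, kept_scores = [], []
    while boxes.shape[0] > 0:
        i = int(np.argmax(scores))
        kept_boxes.append(boxes[i])
        kept_scores.append(scores[i])
        ious = iou_one_to_many(boxes[i], boxes)
        scores = scores * np.exp(-(ious ** 2) / sigma)  # Gaussian penalty
        keep = (np.arange(boxes.shape[0]) != i) & (scores > score_thresh)
        boxes, scores = boxes[keep], scores[keep]
    return np.array(kept_boxes), np.array(kept_scores)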

10 citations

Journal ArticleDOI
TL;DR: Wang et al. proposed an improved Mask R-CNN for visual recognition of cherry tomatoes, which achieved an accuracy of 93.76% for fruit recognition; figures of 11.53% and 15.34% are also reported for stem recognition.

9 citations

Journal ArticleDOI
08 Jul 2022-Agronomy
TL;DR: The proposed SE-YOLOv3-MobileNetV1 model distinguishes tomatoes at four maturity stages, with the mean average precision for tomato reaching 97.5%, and provides a reference for tomato maturity classification by tomato-harvesting robots.
Abstract: The maturity level of a tomato is a key factor in tomato picking, directly determining the transportation distance, storage time, and market freshness of the postharvest tomato. In view of the lack of studies on tomato maturity classification in natural greenhouse environments, this paper proposes an SE-YOLOv3-MobileNetV1 network to classify four tomato maturity stages. The proposed maturity classification model is improved in terms of speed and accuracy: (1) Speed: depthwise separable convolution is used. (2) Accuracy: Mosaic data augmentation, the K-means clustering algorithm, and the Squeeze-and-Excitation attention module are used. To verify the detection performance, the proposed model is compared with current mainstream models, such as YOLOv3, YOLOv3-MobileNetV1, and YOLOv5, in terms of accuracy and speed. The SE-YOLOv3-MobileNetV1 model is able to distinguish tomatoes at four maturity stages, and its mean average precision for tomato reaches 97.5%. The proposed model is 278.6 ms and 236.8 ms faster per detection than the YOLOv3 and YOLOv5 models, respectively. In addition, the proposed model is considerably lighter than YOLOv3 and YOLOv5, which meets the needs of embedded development and provides a reference for tomato maturity classification by tomato-harvesting robots.
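
For illustration, the sketch below combines the two speed/accuracy ingredients named above: a MobileNetV1-style depthwise separable convolution followed by a Squeeze-and-Excitation block. It is a minimal Python sketch assuming PyTorch; the channel sizes and reduction ratio are illustrative, not the paper's configuration.

# Minimal sketch: depthwise separable convolution with an SE attention block.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        # Squeeze: global average pool; excitation: per-channel gates in (0, 1).
        w = x.mean(dim=(2, 3))
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)
        return x * w

class SEDepthwiseSeparable(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride, 1,
                                   groups=in_ch, bias=False)
        # Pointwise: 1x1 conv mixes channels, replacing a full 3x3 conv cheaply.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn1, self.bn2 = nn.BatchNorm2d(in_ch), nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)
        self.se = SEBlock(out_ch)

    def forward(self, x):
        x = self.act(self.bn1(self.depthwise(x)))
        x = self.act(self.bn2(self.pointwise(x)))
        return self.se(x)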

8 citations

References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place in the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
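
The core reformulation is easy to show in code. Below is a minimal sketch of one residual block, assuming PyTorch: the stacked layers learn a residual function F(x) with reference to the input, and the block outputs F(x) + x through an identity shortcut. Layer sizes are illustrative.

# Minimal sketch: a basic residual block with an identity shortcut.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Identity shortcut: the layers only need to learn the residual F(x),
        # so gradients flow through the addition even in very deep stacks.
        return self.act(self.body(x) + x)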

123,388 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper introduces the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion, alleviating the vanishing-gradient problem, strengthening feature propagation, encouraging feature reuse, and substantially reducing the number of parameters.
Abstract: Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections—one between each layer and its subsequent layer—our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet.
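
The connectivity pattern can be sketched directly: within a dense block, each layer takes the concatenation of all preceding feature maps as input, which is where the L(L+1)/2 direct connections come from. A minimal Python sketch assuming PyTorch follows; the growth rate and layer count are illustrative.

# Minimal sketch: a dense block where every layer sees all earlier features.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_ch, growth_rate=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            # Input width grows by `growth_rate` channels per preceding layer.
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_ch + i * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_ch + i * growth_rate, growth_rate,
                          3, padding=1, bias=False)))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Each layer consumes the concatenation of all earlier feature maps.
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)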

27,821 citations

Proceedings ArticleDOI
27 Jun 2016
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
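
To make the regression framing concrete, the sketch below decodes a YOLO-style output tensor: an S x S grid where each cell regresses B boxes (x, y, w, h, confidence) plus C conditional class probabilities, and the class-specific score is confidence times class probability. The shapes follow the paper (S=7, B=2, C=20); coordinate decoding and thresholding are simplified, and the random tensor merely stands in for a real network output.

# Minimal sketch: interpreting a YOLO-style S x S x (B*5 + C) output grid.
import numpy as np

S, B, C = 7, 2, 20
pred = np.random.rand(S, S, B * 5 + C)  # stand-in for the network output

detections = []
for row in range(S):
    for col in range(S):
        cell = pred[row, col]
        class_probs = cell[B * 5:]          # Pr(class | object) for this cell
        for b in range(B):
            x, y, w, h, conf = cell[b * 5: b * 5 + 5]
            # Class-specific score = box confidence * class probability.
            scores = conf * class_probs
            cls = int(np.argmax(scores))
            detections.append((row, col, b, cls, float(scores[cls])))
# A real pipeline would threshold `detections` and apply NMS afterwards.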

27,256 citations

Journal ArticleDOI
TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Abstract: State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features—using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
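
As a concrete picture of the RPN head, the sketch below slides a small network over the shared feature map: a 3x3 convolution followed by two sibling 1x1 convolutions that predict, at each spatial position, objectness scores and box regression offsets for k anchors. It is a minimal Python sketch assuming PyTorch; k = 9 anchors and 512 channels follow the paper's VGG-16 setting, the rest is illustrative.

# Minimal sketch: the RPN head over shared backbone features.
import torch.nn as nn

class RPNHead(nn.Module):
    def __init__(self, in_ch=512, k=9):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, 512, 3, padding=1)
        self.act = nn.ReLU(inplace=True)
        # Sibling 1x1 heads: 2 scores (object vs. background) and 4 box
        # regression offsets per anchor, at every spatial position.
        self.objectness = nn.Conv2d(512, 2 * k, 1)
        self.bbox_deltas = nn.Conv2d(512, 4 * k, 1)

    def forward(self, features):
        x = self.act(self.conv(features))
        return self.objectness(x), self.bbox_deltas(x)

Top-scoring anchors, after applying the predicted offsets, become the region proposals consumed by the Fast R-CNN detection stage.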

26,458 citations
