Author

Bipul Neupane

Bio: Bipul Neupane is an academic researcher from Sirindhorn International Institute of Technology. The author has contributed to research in topics including computer science and artificial intelligence, has an h-index of 2, and has co-authored 5 publications receiving 39 citations.

Papers
Journal ArticleDOI
17 Oct 2019-PLOS ONE
TL;DR: A deep learning (DL) based method to precisely detect and count banana plants on a farm, exclusive of other plants, using high-resolution RGB aerial images collected from an unmanned aerial vehicle (UAV).
Abstract: The production of banana—one of the most highly consumed fruits—is strongly affected by the loss of a certain number of banana plants in an early phase of vegetation. This affects the ability of farmers to forecast and estimate banana production. In this paper, we propose a deep learning (DL) based method to precisely detect and count banana plants on a farm, exclusive of other plants, using high-resolution RGB aerial images collected from an unmanned aerial vehicle (UAV). An attempt to detect the plants on the normal RGB images resulted in less than 78.8% recall for our sample images of a commercial banana farm in Thailand. To improve this result, we use three image processing methods—Linear Contrast Stretch, Synthetic Color Transform and Triangular Greenness Index—to enhance the vegetative properties of the orthomosaic, generating multiple variants of it. We then separately train a parameter-optimized convolutional neural network (CNN) on manually interpreted banana plant samples from each image variant to produce multiple detection results over our region of interest. 96.4%, 85.1% and 75.8% of plants were correctly detected on three datasets collected at altitudes of 40, 50 and 60 meters over the same farm. Results obtained from combinations of the altitude variants are also discussed later in the research, in an attempt to find a better altitude combination for UAV data collection for banana plant detection. The results showed that merging the detection results of the 40- and 50-meter datasets could recover the plants missed by each other, increasing recall up to 99%.
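The enhancement step can be illustrated with a minimal sketch, assuming NumPy arrays of orthomosaic bands. The simplified RGB form of the Triangular Greenness Index used here (green - 0.39*red - 0.61*blue) is one common approximation, not necessarily the paper's exact formulation, and the percentile-based contrast stretch is likewise illustrative:

```python
import numpy as np

def linear_contrast_stretch(band, low_pct=2, high_pct=98):
    """Stretch a band so the given percentiles map onto [0, 255]."""
    lo, hi = np.percentile(band, [low_pct, high_pct])
    stretched = (band.astype(float) - lo) / max(hi - lo, 1e-9)
    return np.clip(stretched * 255, 0, 255).astype(np.uint8)

def triangular_greenness_index(rgb):
    """TGI approximated from RGB bands: green - 0.39*red - 0.61*blue.
    Higher values indicate stronger vegetative (green) signal."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    return g - 0.39 * r - 0.61 * b
```

Each transform yields a new image variant on which a separate detector can be trained, which is the spirit of the multi-variant approach the abstract describes.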

81 citations

Journal ArticleDOI
TL;DR: In this paper, a review and meta-analysis of deep learning-based semantic segmentation is presented, with a focus on urban remote sensing images.
Abstract: Availability of very high-resolution remote sensing images and advancement of deep learning methods have shifted the paradigm of image classification from pixel-based and object-based methods to deep learning-based semantic segmentation. This shift demands a structured analysis and revision of the current status on the research domain of deep learning-based semantic segmentation. The focus of this paper is on urban remote sensing images. We review and perform a meta-analysis to juxtapose recent papers in terms of research problems, data source, data preparation methods including pre-processing and augmentation techniques, training details on architectures, backbones, frameworks, optimizers, loss functions and other hyper-parameters and performance comparison. Our detailed review and meta-analysis show that deep learning not only outperforms traditional methods in terms of accuracy, but also addresses several challenges previously faced. Further, we provide future directions of research in this domain.

56 citations

Journal ArticleDOI
01 May 2022-Sensors
TL;DR: This work trained and applied transfer learning-based fine-tuning on several state-of-the-art YOLO (You Only Look Once) networks and proposed a multi-vehicle tracking algorithm that obtains the per-lane count, classification, and speed of vehicles in real time.
Abstract: Accurate vehicle classification and tracking are increasingly important subjects for intelligent transport systems (ITSs) and for planning that utilizes precise location intelligence. Deep learning (DL) and computer vision are intelligent methods; however, accurate real-time classification and tracking come with problems. We tackle three prominent problems (P1, P2, and P3): the need for a large training dataset (P1), the domain-shift problem (P2), and coupling a real-time multi-vehicle tracking algorithm with DL (P3). To address P1, we created a training dataset of nearly 30,000 samples from existing cameras with seven classes of vehicles. To tackle P2, we trained and applied transfer learning-based fine-tuning on several state-of-the-art YOLO (You Only Look Once) networks. For P3, we propose a multi-vehicle tracking algorithm that obtains the per-lane count, classification, and speed of vehicles in real time. The experiments showed that accuracy doubled after fine-tuning (71% vs. up to 30%). Based on a comparison of four YOLO networks, coupling the YOLOv5-large network to our tracking algorithm provided a trade-off between overall accuracy (95% vs. up to 90%), loss (0.033 vs. up to 0.036), and model size (91.6 MB vs. up to 120.6 MB). The implications of these results are in spatial information management and sensing for intelligent transport planning.
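The tracking stage (P3) can be sketched as a minimal nearest-centroid matcher in pure Python; the class, parameters, and thresholds here are illustrative, not the paper's implementation:

```python
import math

class CentroidTracker:
    """Minimal nearest-centroid multi-object tracker (illustrative only)."""
    def __init__(self, max_dist=50.0):
        self.max_dist = max_dist  # max matching distance in pixels
        self.tracks = {}          # track id -> last centroid (x, y)
        self.next_id = 0

    def update(self, detections):
        """Greedily match detections to tracks; return {track_id: centroid}."""
        assigned = {}
        unmatched = list(detections)
        for tid, prev in list(self.tracks.items()):
            if not unmatched:
                break
            best = min(unmatched, key=lambda c: math.dist(prev, c))
            if math.dist(prev, best) <= self.max_dist:
                assigned[tid] = best
                unmatched.remove(best)
        for c in unmatched:  # any leftover detection starts a new track
            assigned[self.next_id] = c
            self.next_id += 1
        self.tracks = dict(assigned)
        return assigned

def speed_kmh(p0, p1, fps, metres_per_pixel):
    """Speed from pixel displacement between consecutive frames."""
    return math.dist(p0, p1) * metres_per_pixel * fps * 3.6
```

Per-lane counting would then reduce to testing which lane polygon contains each tracked centroid; the metres-per-pixel scale must come from camera calibration.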

16 citations

Proceedings ArticleDOI
01 Jul 2020
TL;DR: This study uses a deep neural network (DNN) to extract the face and facial landmarks, followed by calculation of the Eye Aspect Ratio, the Mouth Aspect Ratio, and pose estimation using 3D location and intrinsic parameters, to detect whether or not the driver is focused on driving.
Abstract: Automated analysis of a driver's attention to driving is crucial to reduce traffic accidents. Real-time detection of the initial signs of drowsiness or distraction of a driver is not yet fully reliable. This study is therefore oriented towards improving this problem by strategically detecting some of the necessary parameters, such as eye closure, yawning, and the head orientation of the driver. We use a deep neural network (DNN) to extract the face and facial landmarks, followed by calculation of the Eye Aspect Ratio (EAR), the Mouth Aspect Ratio (MAR), and pose estimation using 3D location and intrinsic parameters, to detect whether or not the driver is focused on driving. Our experimental results show better real-time performance than traditional methods in many aspects, which is why we further deploy the algorithm as a mobile application.
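The Eye Aspect Ratio follows the standard six-landmark formulation, EAR = (|p2-p6| + |p3-p5|) / (2*|p1-p4|), which drops toward zero as the eye closes. A minimal sketch; the drowsiness threshold and frame count below are illustrative, not the paper's values:

```python
import math

def eye_aspect_ratio(eye):
    """EAR from six eye landmarks (p1..p6, dlib-style ordering):
    (|p2-p6| + |p3-p5|) / (2 * |p1-p4|)."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)

def is_drowsy(ear_values, threshold=0.2, min_consecutive=3):
    """Flag drowsiness when EAR stays below a threshold for several
    consecutive frames (threshold and count are illustrative)."""
    run = 0
    for ear in ear_values:
        run = run + 1 if ear < threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

The Mouth Aspect Ratio used for yawn detection is computed analogously from the mouth landmarks, with vertical distances divided by the horizontal mouth width.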

4 citations

Journal ArticleDOI
TL;DR: The overall method of fine-tuning the modified U-Net reduces the number of training parameters by 300 times and training time by 2.5 times while preserving the precision of segmentation.
Abstract: Earth observation data, including very high-resolution (VHR) imagery from satellites and unmanned aerial vehicles (UAVs), are the primary sources for highly accurate building footprint segmentation and extraction. However, with the increase in spatial resolution, smaller objects are prominently visible in the images, and using intelligent approaches like deep learning (DL) suffers from several problems. In this paper, we outline four prominent problems with DL-based methods (P1, P2, P3, and P4): (P1) lack of contextual features, (P2) requirement of a large training dataset, (P3) the domain-shift problem, and (P4) computational expense. In tackling P1, we modify a commonly used DL architecture called U-Net to increase the contextual feature information. Likewise, for P2 and P3, we use transfer learning to fine-tune the DL model on a smaller dataset, utilising the knowledge previously gained from a larger dataset. For P4, we study the trade-off between the network's performance and computational expense with reduced training parameters and optimum learning rates. Our experiments on a case study from the City of Melbourne show that the modified U-Net is more robust than the original U-Net and SegNet, and that the dataset we develop is significantly more robust than an existing benchmark dataset. Furthermore, the overall method of fine-tuning the modified U-Net reduces the number of training parameters by 300 times and training time by 2.5 times while preserving the precision of segmentation.
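The large reduction in trainable parameters comes from freezing pre-trained layers so that only a small part of the network is updated during fine-tuning. A back-of-the-envelope sketch with hypothetical layer sizes (not the paper's exact architecture; the actual ratio depends on which layers are kept trainable):

```python
def conv_params(in_ch, out_ch, kernel=3):
    """Weights plus biases of a 2D convolution: out_ch * (in_ch * k * k + 1)."""
    return out_ch * (in_ch * kernel * kernel + 1)

# Hypothetical U-Net-like channel progression (not the paper's network).
encoder = [(3, 64), (64, 128), (128, 256), (256, 512)]
decoder = [(512, 256), (256, 128), (128, 64)]

# Full model: encoder + decoder convolutions + a final 1x1 classification layer.
total = sum(conv_params(i, o) for i, o in encoder + decoder) \
        + conv_params(64, 1, kernel=1)
# Freezing encoder and decoder leaves only the tiny head trainable.
trainable = conv_params(64, 1, kernel=1)
print(f"total={total:,}, trainable after freezing={trainable:,}")
```

Even in this toy example the trainable count shrinks by orders of magnitude, which is the mechanism behind both the parameter and training-time savings reported above.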

3 citations


Cited by
Journal ArticleDOI
TL;DR: This review introduces the principles of CNN and distils why they are particularly suitable for vegetation remote sensing, including considerations about spectral resolution, spatial grain, different sensors types, modes of reference data generation, sources of existing reference data, as well as CNN approaches and architectures.
Abstract: Identifying and characterizing vascular plants in time and space is required in various disciplines, e.g. in forestry, conservation and agriculture. Remote sensing has emerged as a key technology revealing both spatial and temporal vegetation patterns. Harnessing the ever-growing streams of remote sensing data for the increasing demands of vegetation assessment and monitoring requires efficient, accurate and flexible methods for data analysis. In this respect, the use of deep learning methods is trend-setting, enabling high predictive accuracy while learning the relevant data features independently in an end-to-end fashion. Very recently, a series of studies have demonstrated that the deep learning method of Convolutional Neural Networks (CNN) is very effective at representing spatial patterns, enabling the extraction of a wide array of vegetation properties from remote sensing imagery. This review introduces the principles of CNN and distils why they are particularly suitable for vegetation remote sensing. The main part synthesizes current trends and developments, including considerations about spectral resolution, spatial grain, different sensor types, modes of reference data generation, sources of existing reference data, as well as CNN approaches and architectures. The literature review showed that CNN can be applied to various problems, including the detection of individual plants or the pixel-wise segmentation of vegetation classes, while numerous studies have evinced that CNN outperform shallow machine learning methods. Several studies suggest that the ability of CNN to exploit spatial patterns particularly enhances the value of very high spatial resolution data. The modularity of common deep learning frameworks allows high flexibility in the adaptation of architectures, from which especially multi-modal or multi-temporal applications can benefit.
An increasing availability of techniques for visualizing features learned by CNNs will contribute not only to interpreting such models but also to learning from them, improving our understanding of remotely sensed signals of vegetation. Although CNNs have not been around for long, it seems clear that they will usher in a new era of vegetation remote sensing.

473 citations

Journal ArticleDOI
TL;DR: Combining high-resolution satellite imagery with advanced machine learning (ML) models through the use of mobile apps could help detect and classify banana plants and provide more information on their overall health status, and has high potential to provide a decision support system for major banana diseases in Africa.
Abstract: Front-line remote sensing tools, coupled with machine learning (ML), have a significant role in crop monitoring and disease surveillance. Crop type classification and disease early warning systems are among the remote sensing applications that provide precise, timely, and cost-effective information at different spatial, temporal, and spectral resolutions. To our knowledge, most disease surveillance systems focus on single-sensor solutions and lag in integrating multiple information sources. Moreover, monitoring larger landscapes using unmanned aerial vehicles (UAVs) is challenging; therefore, combining high-resolution satellite imagery with advanced ML models through the use of mobile apps could help detect and classify banana plants and provide more information on their overall health status. In this study, we classified banana under mixed-complex African landscapes through pixel-based classifications and ML models derived from multi-level satellite images (Sentinel 2, PlanetScope and WorldView-2) and UAV (MicaSense RedEdge) platforms. Our pixel-based classification from a random forest (RF) model using combined features of vegetation indices (VIs) and principal component analysis (PCA) showed up to 97% overall accuracy (OA), with less than 10% omission and commission errors (OE and CE) and a Kappa coefficient of 0.96 on high-resolution multispectral images. We used UAV-RGB aerial images from fields in DR Congo and the Republic of Benin to develop a mixed-model system combining an object detection model (RetinaNet) and a custom classifier for simultaneous banana localization and disease classification. Their accuracies were tested using different performance metrics.
Our UAV-RGB mixed model revealed that the developed object detection and classification model successfully classified healthy and diseased plants with 99.4%, 92.8%, 93.3% and 90.8% accuracy for the four classes: banana bunchy top disease (BBTD), Xanthomonas wilt of banana (BXW), healthy banana clusters, and individual banana plants, respectively. These aerial image-based ML models have high potential to provide a decision support system for major banana diseases in Africa.
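The vegetation-index features that feed such pixel-based classifiers can be sketched as follows; NDVI is one commonly used VI, and the band names and feature layout here are illustrative rather than the study's exact pipeline:

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)

def stack_features(bands, indices):
    """Flatten raw bands and derived indices into a (pixels, features)
    matrix, the usual input shape for a pixel-based classifier such as
    a random forest."""
    layers = list(bands) + list(indices)
    return np.stack([layer.ravel() for layer in layers], axis=1)
```

The resulting matrix (optionally reduced with PCA) is what a random-forest model would be fit on, one row per pixel.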

92 citations

Journal ArticleDOI
TL;DR: Wang et al. trained a mask region-based convolutional neural network (Mask R-CNN) to detect the tree crown and height of Chinese fir in a plantation, using imagery from a DJI Phantom 4 Multispectral unmanned aerial vehicle.
Abstract: Tree crown and height are primary tree measurements in forest inventory. Convolutional neural networks (CNNs) are a class of neural networks that can be used in forest inventory; however, no prior studies have developed a CNN model to detect tree crown and height simultaneously. This study is the first of its kind to explore training a mask region-based convolutional neural network (Mask R-CNN) for automatically and concurrently detecting discontinuous tree crown and height of Chinese fir (Cunninghamia lanceolata (Lamb) Hook) in a plantation. A DJI Phantom 4 Multispectral unmanned aerial vehicle (UAV) was used to obtain high-resolution images of the study site, Shunchang County, China. Tree crown and height of Chinese fir were manually delineated and derived from this UAV imagery. A portion of the ground-truthed tree height values was used as a test set, and the remaining measurements were used as the model training data. Six different band combinations and derivations of the UAV imagery were used to detect tree crown and height, respectively (Multi band-DSM, RGB-DSM, NDVI-DSM, Multi band-CHM, RGB-CHM, and NDVI-CHM combinations). The Mask R-CNN model with the NDVI-CHM combination achieved superior performance: the accuracy of individual tree-crown detection for Chinese fir was considerable (F1 score = 84.68%), the Intersection over Union (IoU) of tree crown delineation was 91.27%, and tree height estimates were highly correlated with the height from UAV imagery (R2 = 0.97, RMSE = 0.11 m, rRMSE = 4.35%) and field measurement (R2 = 0.87, RMSE = 0.24 m, rRMSE = 9.67%). Results demonstrate that an input image with a CHM achieves higher accuracy of tree crown delineation and tree height assessment than an image with a DSM. The accuracy and efficiency of Mask R-CNN give it great potential to assist the application of remote sensing in forests.
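Two quantities in the abstract are easy to make concrete: a canopy height model (CHM) is the digital surface model (DSM) minus the bare-earth digital terrain model (DTM), and rRMSE is RMSE expressed relative to the mean observed value. A minimal sketch, assuming NumPy raster arrays:

```python
import numpy as np

def canopy_height_model(dsm, dtm):
    """CHM = DSM - DTM: surface elevation minus bare-earth terrain,
    leaving only above-ground object (e.g. canopy) height."""
    return dsm - dtm

def rrmse(predicted, observed):
    """Relative RMSE (%): RMSE divided by the mean of the observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rmse = np.sqrt(np.mean((predicted - observed) ** 2))
    return 100.0 * rmse / observed.mean()
```

This normalization is why the paper can report tree-height errors of a few percent regardless of absolute stand height.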

59 citations

Journal ArticleDOI
TL;DR: In this paper, a review and meta-analysis of deep learning-based semantic segmentation is presented, with a focus on urban remote sensing images.
Abstract: Availability of very high-resolution remote sensing images and advancement of deep learning methods have shifted the paradigm of image classification from pixel-based and object-based methods to deep learning-based semantic segmentation. This shift demands a structured analysis and revision of the current status on the research domain of deep learning-based semantic segmentation. The focus of this paper is on urban remote sensing images. We review and perform a meta-analysis to juxtapose recent papers in terms of research problems, data source, data preparation methods including pre-processing and augmentation techniques, training details on architectures, backbones, frameworks, optimizers, loss functions and other hyper-parameters and performance comparison. Our detailed review and meta-analysis show that deep learning not only outperforms traditional methods in terms of accuracy, but also addresses several challenges previously faced. Further, we provide future directions of research in this domain.

56 citations

Journal ArticleDOI
TL;DR: In this paper, a review of DL-based single image super-resolution (SISR) methods for optical remote sensing images is presented, covering DL models, commonly used remote sensing datasets, loss functions, and performance evaluation metrics.

52 citations