
Showing papers on "Distance transform published in 2021"


Journal ArticleDOI
TL;DR: In this article, a multi-task deep learning distance predictor (DeepDist) based on new residual convolutional network architectures was proposed to simultaneously predict real-value inter-residue distances and classify them into multiple distance intervals.
Abstract: Driven by deep learning, inter-residue contact/distance prediction has been significantly improved and has substantially enhanced ab initio protein structure prediction. Currently, most distance prediction methods classify inter-residue distances into multiple distance intervals instead of directly predicting real-value distances, and the output of the former has to be converted into real-value distances to be used in tertiary structure prediction. To explore the potential of predicting real-value inter-residue distances, we develop a multi-task deep learning distance predictor (DeepDist) based on new residual convolutional network architectures to simultaneously predict real-value inter-residue distances and classify them into multiple distance intervals. Tested on 43 CASP13 hard domains, DeepDist achieves comparable performance in real-value distance prediction and multi-class distance prediction. The average mean square error (MSE) of DeepDist's real-value distance prediction is 0.896 Å² when filtering out predicted distances ≥ 16 Å, which is lower than the 1.003 Å² of DeepDist's multi-class distance prediction. When distance predictions are converted into contact predictions at an 8 Å threshold (the standard threshold in the field), the precision of the top L/5 and L/2 contact predictions of DeepDist's multi-class distance prediction is 79.3% and 66.1%, respectively, higher than the 78.6% and 64.5% of its real-value distance prediction and the best results in the CASP13 experiment. DeepDist can predict inter-residue distances well and improves binary contact prediction over existing state-of-the-art methods. Moreover, the predicted real-value distances can be used directly to reconstruct protein tertiary structures better than multi-class distance predictions due to the lower MSE. Finally, we demonstrate that predicting the real-value distance map and the multi-class distance map at the same time performs better than predicting real-value distances alone.
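As a concrete illustration of the contact-conversion step described above, here is a minimal sketch (not DeepDist's actual evaluation code) that turns a predicted real-value distance matrix into top-L/5 contact precision, assuming an (L, L) distance matrix in Å and a binary ground-truth contact map defined at the 8 Å threshold:

```python
import numpy as np

def top_contact_precision(pred_dist, true_contacts, top_frac=0.2, min_sep=6):
    """Precision of the top-L/5 predicted contacts (top_frac = 0.2).

    pred_dist:     (L, L) predicted inter-residue distances in angstroms.
    true_contacts: (L, L) binary ground-truth map (pairs closer than 8 angstroms).
    min_sep:       ignore residue pairs separated by fewer positions in sequence.
    """
    L = pred_dist.shape[0]
    i, j = np.triu_indices(L, k=min_sep)
    order = np.argsort(pred_dist[i, j])     # shortest predicted distance first
    n_top = max(1, int(top_frac * L))       # top L/5 pairs
    keep = order[:n_top]
    return true_contacts[i[keep], j[keep]].mean()
```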

28 citations


Book ChapterDOI
27 Sep 2021
TL;DR: Wang et al. propose a multi-level boundary shape and distance aware joint learning framework, named BSDA-Net, for FAZ segmentation and diagnostic classification from OCTA images.
Abstract: Optical coherence tomography angiography (OCTA) is a novel non-invasive imaging technique that allows visualizations of vasculature and the foveal avascular zone (FAZ) across retinal layers. Clinical research suggests that the morphology and contour irregularity of the FAZ are important biomarkers of various ocular pathologies. Therefore, precise segmentation of the FAZ is of great clinical interest. Also, there is no existing research reporting that FAZ features can improve the performance of deep diagnostic classification networks. In this paper, we propose a novel multi-level boundary shape and distance aware joint learning framework, named BSDA-Net, for FAZ segmentation and diagnostic classification from OCTA images. Two auxiliary branches, namely boundary heatmap regression and signed distance map reconstruction branches, are constructed in addition to the segmentation branch to improve the segmentation performance, resulting in more accurate FAZ contours and fewer outliers. Moreover, both low-level and high-level features from the aforementioned three branches, including shape, size, boundary, and signed directional distance map of the FAZ, are fused hierarchically with features from the diagnostic classifier. Through extensive experiments, the proposed BSDA-Net is found to yield state-of-the-art segmentation and classification results on the OCTA-500, OCTAGON, and FAZID datasets.
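To make the signed distance map reconstruction branch concrete, here is a minimal sketch of how such a regression target can be built from a binary FAZ mask; the sign convention (negative inside, positive outside) and the clipping/normalization are assumptions, not necessarily BSDA-Net's exact choices:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask, clip=20.0):
    """Signed Euclidean distance map of a binary FAZ mask.

    Negative inside the region, positive outside, zero on the boundary;
    clipped and normalized so it can serve as a regression target.
    """
    mask = mask.astype(bool)
    inside = distance_transform_edt(mask)    # distance to background, inside pixels
    outside = distance_transform_edt(~mask)  # distance to foreground, outside pixels
    sdm = outside - inside
    return np.clip(sdm, -clip, clip) / clip
```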

22 citations


Journal ArticleDOI
TL;DR: Through a set of computational and optimization efficiencies, the approach can be applied to complex images composed of many overlapping regions, and it shows superior accuracy and flexibility in ellipse recognition relative to other methods.
Abstract: Recognition of overlapping objects is required in many applications in the field of computer vision. Examples include cell segmentation, bubble detection and bloodstain pattern analysis. This paper presents a method to identify overlapping objects by approximating them with ellipses. The method is intended to be applied to complex-shaped regions which are believed to be composed of one or more overlapping objects. The method has two primary steps. First, a pool of candidate ellipses is generated by applying the Euclidean distance transform on a compressed image, and the pool is filtered by an overlaying method. Second, the concave points on the contour of the region of interest are extracted by polygon approximation to divide the contour into segments. Then, the optimal ellipses are selected from among the candidates by choosing a minimal subset that best fits the identified segments. We propose the use of the adjusted Rand index, commonly applied in clustering, to compare the fitting result with ground truth. Through a set of computational and optimization efficiencies, we are able to apply our approach to complex images composed of many overlapping regions. Experimental results on a synthetic data set, two types of cell images and bloodstain patterns show superior accuracy and flexibility of our method in ellipse recognition, relative to other methods.
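A minimal sketch of the first step (candidate seeding from the Euclidean distance transform) and of scoring a fitted labelling with the adjusted Rand index; the peak-based seeding and the min_dist parameter are illustrative assumptions rather than the paper's exact procedure:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt
from skimage.feature import peak_local_max
from sklearn.metrics import adjusted_rand_score

def candidate_centres(region_mask, min_dist=5):
    """Seed candidate ellipse centres at local maxima of the EDT,
    i.e. where the overlapping region is locally widest."""
    edt = distance_transform_edt(region_mask)
    peaks = peak_local_max(edt, min_distance=min_dist,
                           labels=region_mask.astype(int))
    return peaks, edt[tuple(peaks.T)]   # (row, col) centres and radius estimates

# Comparing a fitted per-pixel labelling with ground truth, as the paper
# proposes with the adjusted Rand index:
# ari = adjusted_rand_score(gt_labels.ravel(), fit_labels.ravel())
```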

17 citations


Journal ArticleDOI
TL;DR: The design of grade-separated road and railway crossings can be complicated and time-consuming, since the intersection point as well as the road and rail alignments should jointly optimize an objective, as mentioned in this paper.

Abstract: The design of grade-separated road and railway crossings can be complicated and time-consuming since the intersection point as well as the road and rail alignments should jointly optimize an objective...

13 citations


Journal ArticleDOI
TL;DR: In this article, a 3D-DT-based concurrent optimization method is proposed for railway alignment and station locations in a complex mountainous region, which can find high-quality alternatives satisfying multiple coupling constraints.
Abstract: The design of railway alignment and station locations involves two intertwined problems, which makes it a complex and time-consuming task. Especially in mountainous regions, the large 3-dimensional (3D) search spaces, complex terrain conditions, coupling constraints and infinite numbers of potential alternatives of this problem pose many challenges. However, most current optimization methods emphasize either alignment optimization or station locations optimization independently. Only a few methods consider coordinated optimization of alignment and stations, but optimize them sequentially. This paper proposes a concurrent optimization method based on a 3-dimensional distance transform algorithm (3D-DT) to solve this problem. It includes the following components: (1) To optimize the location of stations within specified spacing intervals, a novel perceptual search strategy is proposed and incorporated into the basic 3D-DT optimization process. (2) A combined-alignment-station 3D search neighboring mask is developed and employed to search for both the alignment and stations. In order to implement the perceptual process, two additional kinds of 3D reverse perceptual neighboring masks are also developed and employed in the algorithm. (3) Multiple coupling constraints between alignment and stations are also formulated and addressed during the search process. In this study, the effectiveness of the method is verified through a real-world case study in a complex mountainous region. The optimization results show that the proposed method can find high-quality alternatives satisfying multiple coupling constraints.

11 citations


Journal ArticleDOI
TL;DR: The experimental results show that the proposed method for human action recognition is comparable to other state-of-the-art human action recognition methods.
Abstract: Human action recognition based on silhouette images has wide applications in computer vision, human-computer interaction and intelligent surveillance. It is a challenging task due to the complex nature of human actions. In this paper, a human action recognition method is proposed which is based on the distance transform and entropy features of human silhouettes. In the first stage, background subtraction is performed by applying a correlation-coefficient-based frame difference technique to extract silhouette images. In the second stage, distance transform based features and entropy features are extracted from the silhouette images. The distance transform based features and entropy features provide the shape and local variation information. These features are given as input to neural networks to recognize various human actions. The proposed method is tested on three different datasets, viz., Weizmann, KTH and UCF50, and obtains accuracies of 92.5%, 91.4% and 80%, respectively. The experimental results show that the proposed method for human action recognition is comparable to other state-of-the-art human action recognition methods.
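The paper's exact feature construction is not spelled out above, but the following sketch shows one plausible way to combine distance-transform and entropy features from a binary silhouette; the 8x8 grid, bin count and normalization are assumptions:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt
from scipy.stats import entropy

def silhouette_features(silhouette, grid=(8, 8), bins=16):
    """Concatenate distance-transform (shape) and entropy (local variation)
    features from a binary silhouette, block-pooled on a coarse grid so the
    feature vector has a fixed length for the neural network."""
    dt = distance_transform_edt(silhouette)
    dt = dt / (dt.max() + 1e-8)                    # scale invariance
    h, w = silhouette.shape
    gh, gw = h // grid[0], w // grid[1]
    feats = []
    for r in range(grid[0]):
        for c in range(grid[1]):
            block = dt[r*gh:(r+1)*gh, c*gw:(c+1)*gw]
            hist, _ = np.histogram(block, bins=bins, range=(0, 1))
            p = hist / max(hist.sum(), 1)
            feats.extend([block.mean(), entropy(p + 1e-12)])
    return np.asarray(feats)
```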

10 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, an adaptive spherical matching method is proposed to account for each input fisheye camera's resolving power concerning spherical distortion, and a fast inter-scale bilateral cost volume filtering method is used to refine distance in noisy and textureless regions with optimal complexity of O(n).
Abstract: A set of cameras with fisheye lenses has been used to capture a wide field of view. The traditional scan-line stereo algorithms based on epipolar geometry are directly inapplicable to this non-pinhole camera setup due to the optical characteristics of fisheye lenses; hence, existing complete 360° RGB-D imaging systems have rarely achieved real-time performance. In this paper, we introduce an efficient sphere-sweeping stereo that can run directly on multiview fisheye images without requiring additional spherical rectification. Our main contributions are: First, we introduce an adaptive spherical matching method that accounts for each input fisheye camera’s resolving power concerning spherical distortion. Second, we propose a fast inter-scale bilateral cost volume filtering method that refines distance in noisy and textureless regions with optimal complexity of O(n). It enables real-time dense distance estimation while preserving edges. Lastly, the fisheye color and distance images are seamlessly combined into a complete 360° RGB-D image via fast inpainting of the dense distance map. We demonstrate an embedded 360° RGB-D imaging prototype composed of a mobile GPU and four fisheye cameras. Our prototype is capable of capturing complete 360° RGB-D videos with a resolution of two megapixels at 29 fps. Results demonstrate that our real-time method outperforms traditional omnidirectional stereo and learning-based omnidirectional stereo in terms of accuracy and performance.

9 citations


Journal ArticleDOI
Zheng Tao, Duan Zhizhao, Wang Jin, Lu Guodong, Shengjie Li, Yu Zhiyong 
15 Feb 2021-Sensors
TL;DR: In this article, a new approach is proposed to obtain the semantic labels of 2D lidar room maps by combining distance transform watershed-based pre-segmentation with a skillfully designed neural network for lidar information sampling classification.
Abstract: Semantic segmentation of room maps is an essential issue in mobile robots' execution of tasks. In this work, a new approach is proposed to obtain the semantic labels of 2D lidar room maps by combining distance transform watershed-based pre-segmentation with a skillfully designed neural network for lidar information sampling classification. In order to label the room maps with high efficiency, high precision and high speed, we have designed a low-power and high-performance method, which can be deployed on low-computing-power Raspberry Pi devices. In the training stage, a lidar is simulated to collect the lidar detection line maps of each point in the manually labelled map, and then we use these line maps and the corresponding labels to train the designed neural network. In the testing stage, the new map is first pre-segmented into simple cells with the distance transform watershed method, and then we classify the lidar detection line maps with the trained neural network. Optimized areas of sparse sampling points are proposed, using the distance transform result generated in the pre-segmentation process, to prevent sampling points selected in boundary regions from influencing the results of semantic labeling. A prototype mobile robot was developed to verify the proposed method; its feasibility, validity, robustness and high efficiency were verified by a series of tests. The proposed method achieved high recall and precision scores: the mean recall is 0.965 and the mean precision is 0.943.
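A minimal sketch of the distance-transform-watershed pre-segmentation step, assuming a binary grid map with free space marked 1; the min_distance seed spacing is an illustrative assumption:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def distance_transform_watershed(free_space, min_distance=10):
    """Pre-segment a binary grid map (free space = 1) into simple cells:
    seeds at local maxima of the EDT, watershed on the inverted EDT."""
    dist = ndi.distance_transform_edt(free_space)
    peaks = peak_local_max(dist, min_distance=min_distance,
                           labels=free_space.astype(int))
    markers = np.zeros_like(free_space, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    cells = watershed(-dist, markers, mask=free_space.astype(bool))
    return cells, dist        # the returned dist also drives seed-area selection
```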

8 citations


Proceedings ArticleDOI
22 Mar 2021
TL;DR: Wang et al. propose a deep end-to-end network that uses a single encoder and two parallel decoders, performing mask prediction alongside distance map estimation.
Abstract: Building extraction from very high resolution (VHR) imagery plays an important role in urban planning, disaster management, navigation, updating geographic databases, and several other geospatial applications. The automatic extraction of buildings from satellite images presents a considerable challenge due to the complexity of building shapes. Compared with traditional building extraction approaches, deep learning networks have shown outstanding performance in this task by using both high-level and low-level feature maps. Recently, many deep networks derived from U-Net have been extensively used in various building segmentation tasks. However, in most cases, U-Net produces coarse and non-smooth segmentations with many discontinuities. To improve and refine the performance of the U-Net network, we propose a deep end-to-end network that uses a single encoder and two parallel decoders, performing mask prediction alongside distance map estimation. The distance map aids in ensuring smoothness in the segmentation predictions. We also propose a new joint loss function for the proposed architecture. Experimental results based on public International Society for Photogrammetry and Remote Sensing (ISPRS) datasets with only (RGB) images demonstrated that the proposed framework can significantly improve the quality of building segmentation.
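The paper's joint loss is not given above; the sketch below shows one natural form for such an objective, assuming a binary cross-entropy segmentation term plus an MSE distance-map term with weight lam (PyTorch):

```python
import torch
import torch.nn.functional as F

def joint_loss(mask_logits, mask_gt, dist_pred, dist_gt, lam=0.5):
    """Joint objective for a single-encoder / dual-decoder network:
    BCE on the building mask plus an MSE penalty on the predicted
    distance map; lam balances the two terms (assumed value)."""
    seg = F.binary_cross_entropy_with_logits(mask_logits, mask_gt)
    dist = F.mse_loss(dist_pred, dist_gt)
    return seg + lam * dist
```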

8 citations


Journal ArticleDOI
TL;DR: Experiments indicate that the guidance of boundary information in the semantic segmentation algorithm can effectively reduce the interference of non-weld zones.

Abstract: The segmentation of the welding zone is vulnerable to image quality and target shape, which complicates the radiographic inspection of water-cooled pipes. A boundary-aware semantic segmentation algorithm is proposed to improve the precision and robustness of the welding zone inspection system. First, a convolutional neural network and atrous spatial pyramid pooling are employed to generate the feature map of the input image. Second, a boundary distance field (BDF) regression module is designed to predict the distance fields of the inner and outer boundaries, which reflect the continuous boundary information of the welding zone. Then, a spatial attention weight is calculated from the regressed boundary distance fields, and the calculated weight is fused into the feature map to enhance the significance of the welding zone. Finally, the segmentation mask is predicted based on the fused feature map. Experiments indicate that the guidance of boundary information in the semantic segmentation algorithm can effectively reduce the interference of non-weld zones. The segmentation accuracy of the welding zone is 93.70%, and the average detection time of a single image is less than 11 ms.
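How the distance fields become an attention weight is described only qualitatively above; the sketch below shows one plausible mapping, where the weight is high near the regressed boundaries, and the Gaussian form and sigma are assumptions, not taken from the paper:

```python
import numpy as np

def boundary_attention(dist_inner, dist_outer, sigma=10.0):
    """Map regressed inner/outer boundary distance fields to a spatial
    attention weight that is high near the weld boundaries and decays
    away from them (illustrative form only)."""
    d = np.minimum(dist_inner, dist_outer)       # distance to the nearest boundary
    return np.exp(-(d ** 2) / (2 * sigma ** 2))  # in (0, 1], peaks on the boundaries
```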

8 citations


Proceedings ArticleDOI
15 Feb 2021
TL;DR: Wang et al. propose an intestinal region reconstruction method from CT volumes of ileus cases, which utilizes a 3D U-Net to estimate a distance map that is high only at the centerlines of the intestines, in order to obtain regions around the centerlines.
Abstract: This paper proposes an intestinal region reconstruction method from CT volumes of ileus cases. Binarized intestine segmentation results often contain incorrect contacts or loops. We utilize the 3D U-Net to estimate the distance map, which is high only at the centerlines of the intestines, to obtain regions around the centerlines. The watershed algorithm is utilized with local maxima of the distance maps as seeds to obtain “intestine segments”. Those intestine segments are connected as graphs, in order to remove incorrect contacts and loops and to extract “intestine paths”, which represent how the intestines run. Experimental results using 19 CT volumes showed that our proposed method properly estimated intestine paths. These results were intuitively visualized for understanding the shape of the intestines and finding obstructions.

Journal ArticleDOI
TL;DR: The proposed hand gesture image classification system captures real-time hand gestures, physical movements of the human hand, as digital images and recognizes them against a pre-stored set of hand gestures.
Abstract: Purpose This paper aims to propose a novel methodology for classifying gestures using the support vector machine (SVM) classification method. Initially, the Red Green Blue color hand gesture image is converted into a YCbCr image in the preprocessing stage, and then the palm with finger region is segmented by a threshold process. Then, the distance transformation method is applied on the segmented palm-with-finger image. Further, the center point (centroid) of the palm region is detected, and the fingertips are detected using the SVM classification algorithm based on the detected centroids of the detected palm region. Design/methodology/approach A gesture is a physical indication of the body to convey information. Though any bodily movement can be considered a gesture, it generally originates from the movement of the hand or face or a combination of both. Combined gestures are quite complex and difficult for a machine to classify. This paper proposes a novel methodology for classifying such gestures using the SVM classification method. The proposed hand gesture image classification system is applied and tested on the “Jochen Triesch,” “Sebastien Marcel” and “11Khands” hand gesture image data sets to evaluate the efficiency of the proposed system. The performance of the proposed system is analyzed with respect to sensitivity, specificity, accuracy and recognition rate, and the simulation results on these data sets are compared with those of conventional methods. Findings The distance transform method is used to detect the center point of the segmented palm region. The proposed hand gesture detection methodology achieves 96.5% sensitivity, 97.1% specificity, 96.9% accuracy and a 99.3% recognition rate on the “Jochen Triesch” data set; 94.6% sensitivity, 95.4% specificity, 95.3% accuracy and a 97.8% recognition rate on the “Sebastien Marcel” data set; and 97% sensitivity, 98% specificity, 98.1% accuracy and a 98.8% recognition rate on the “11Khands” data set. The recognition times are 0.52 s on “Jochen Triesch,” 0.71 s on “Sebastien Marcel” and 0.22 s on “11Khands” images. The proposed methodology thus has the shortest recognition time on the “11Khands” data set, which makes that data set very suitable for real-time hand gesture applications with multi-background environments. Originality/value The modern world requires more automated systems for improving our daily routine activities in an efficient manner. Present-day technology provides touch-screen methods for operating or controlling many devices or machines, with or without wired connections. This also has an impact on automated vehicles, which can be operated without any interfacing with the driver. This is possible through a hand gesture recognition system. This hand gesture recognition system captures real-time hand gestures, physical movements of the human hand, as digital images and recognizes them against a pre-stored set of hand gestures.
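A minimal sketch of the distance-transform step that locates the palm centroid, assuming a binary segmented hand mask; taking the global EDT maximum as the centre is the standard reading of this step, not code from the paper:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def palm_centre(hand_mask):
    """Locate the palm centre as the point of the segmented hand mask that
    lies deepest inside the region, i.e. the global maximum of the EDT."""
    dt = distance_transform_edt(hand_mask)
    centre = np.unravel_index(np.argmax(dt), dt.shape)
    return centre, dt[centre]   # (row, col) and the inscribed-circle radius
```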

Journal ArticleDOI
TL;DR: In this article, the authors proposed a spatial feature vector based on the distance transform of the pixels with respect to the dominant edges in the input hyperspectral images (HSIs) from remote sensing data.
Abstract: Pixel-wise classification of hyperspectral images (HSIs) from remote sensing data is a common approach for extracting information about scenes. In recent years, approaches based on deep learning techniques have gained wide applicability. An HSI dataset can be viewed either as a collection of images, each one captured at a different wavelength, or as a collection of spectra, each one associated with a specific point (pixel). Enhanced classification accuracy is enabled if the spectral and spatial information are combined in the input vector. This allows simultaneous classification according to spectral type but also according to geometric relationships. In this study, we propose a novel spatial feature vector which improves accuracies in pixel-wise classification. Our proposed feature vector is based on the distance transform of the pixels with respect to the dominant edges in the input HSI. In other words, we allow the location of pixels within geometric subdivisions of the dataset to modify the contribution of each pixel to the spatial feature vector. Moreover, we use the extended multi-attribute profile (EMAP) features to add more geometric features to the proposed spatial feature vector. We have performed experiments with three hyperspectral datasets. In addition to the Salinas and University of Pavia datasets, which are commonly used in HSI research, we include samples from our Surrey BC dataset. Our proposed method compares favorably to traditional algorithms as well as to some recently published deep learning-based algorithms.
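A minimal sketch of an edge-based distance-transform spatial feature for one band; Canny here stands in for whatever dominant-edge detector the paper actually uses, and sigma is an assumed parameter:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt
from skimage.feature import canny

def edge_distance_feature(band, sigma=2.0):
    """Per-pixel Euclidean distance to the nearest dominant edge: 0 on the
    edges themselves, growing toward the interior of each geometric
    subdivision, so a pixel's position within its region modulates its
    contribution to the spatial feature vector."""
    edges = canny(band.astype(float), sigma=sigma)
    return distance_transform_edt(~edges)
```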

Journal ArticleDOI
23 Feb 2021
TL;DR: The Log-Gaussian Process Implicit Surface (Log-GPIS) is introduced, a continuous and probabilistic mapping representation suitable for surface reconstruction and local navigation, in which the regularised Eikonal equation is solved simply by applying the logarithmic transformation to a GPIS formulation to recover the accurate Euclidean distance field (EDF) and, at the same time, the implicit surface.
Abstract: In this letter, we introduce the Log-Gaussian Process Implicit Surface (Log-GPIS), a novel continuous and probabilistic mapping representation suitable for surface reconstruction and local navigation. Our key contribution is the realisation that the regularised Eikonal equation can be simply solved by applying the logarithmic transformation to a GPIS formulation to recover the accurate Euclidean distance field (EDF) and, at the same time, the implicit surface. To derive the proposed representation, Varadhan's formula is exploited to approximate the non-linear Eikonal partial differential equation (PDE) of the EDF by the logarithm of a linear PDE. We show that members of the Matérn covariance family directly satisfy this linear PDE. The proposed approach does not require post-processing steps to recover the EDF. Moreover, unlike sampling-based methods, Log-GPIS does not use sample points inside and outside the surface, as the derivative of the covariance allows direct estimation of the surface normals and distance gradients. We benchmarked the proposed method on simulated and real data against state-of-the-art mapping frameworks that also aim at recovering both the surface and a distance field. Our experiments show that Log-GPIS produces the most accurate results for the EDF and comparable results for surface reconstruction, and its computation time still allows online operation.
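The key identity can be stated compactly. A sketch of the transformation, with t the regularisation parameter (notation assumed, not copied from the letter):

```latex
% Substituting v = e^{-d/t} into the regularised Eikonal equation for the
% distance d turns it into a linear, screened-Poisson-type PDE in v.
% Varadhan's formula then recovers the Euclidean distance field:
\[
  d(\mathbf{x}) \;=\; \lim_{t \to 0^{+}} \bigl( -\,t \,\log v_{t}(\mathbf{x}) \bigr),
\]
% so in practice the EDF is read off as d \approx -t \log v for a small fixed t.
```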

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, a real-time multiview shape-from-shading (SfS) method is proposed to acquire high-detail geometry by bridging volumetric fusion and multi-view SfS in two steps.
Abstract: Multiview shape-from-shading (SfS) has achieved high-detail geometry, but its computation is expensive for solving a multiview registration and an ill-posed inverse rendering problem. Therefore, it has been mainly used for offline methods. Volumetric fusion enables real-time scanning using a conventional RGB-D camera, but its geometry resolution has been limited by the grid resolution of the volumetric distance field and depth registration errors. In this paper, we propose a real-time scanning method that can acquire high-detail geometry by bridging volumetric fusion and multiview SfS in two steps. First, we propose the first real-time acquisition of photometric normals stored in texture space to achieve high-detail geometry. We also introduce geometry-aware texture mapping, which progressively refines geometric registration between the texture space and the volumetric distance field by means of normal texture, achieving real-time multiview SfS. We demonstrate our scanning of high-detail geometry using an RGB-D camera at ~20 fps. Results verify that the geometry quality of our method is strongly competitive with that of offline multiview SfS methods.

Journal ArticleDOI
TL;DR: In this paper, a depth-based Euclidean distance field mapping strategy is integrated with a rapidly-exploring random tree to construct a collision-avoidance system, which has robust performance at high flight speeds in challenging dynamic environments.

Abstract: Collision avoidance is a crucial research topic in robotics. Designing a collision-avoidance algorithm is still a challenging and open task, because of the requirements for navigating in unstructured and dynamic environments using the limited payload and computing resources on board micro aerial vehicles. This article presents a novel depth-based collision-avoidance method for aerial robots, enabling high-speed flights in dynamic environments. First, a depth-based Euclidean distance field mapping algorithm is developed. Then, the proposed Euclidean distance field mapping strategy is integrated with a rapidly-exploring random tree to construct a collision-avoidance system. The experimental results show that the proposed collision-avoidance algorithm has robust performance at high flight speeds in challenging dynamic environments, and that it can perform faster collision-avoidance maneuvers than state-of-the-art algorithms (the average computing time of the collision maneuver is 25.4 ms, while the minimum computing time is 10.4 ms); the average computing time is six times faster than that of one baseline algorithm. Additionally, fully autonomous flight experiments were conducted to validate the presented collision-avoidance approach.
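A minimal sketch of the two ingredients as they combine in such a system: an EDF built from an occupancy grid, and a clearance check for a tree edge. Grid-cell coordinates, the resolution value and the robot radius are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def build_edf(occupancy, resolution=0.1):
    """Euclidean distance field (metres) from a binary occupancy grid
    (1 = obstacle): each free cell stores its distance to the nearest obstacle."""
    return distance_transform_edt(occupancy == 0) * resolution

def edge_is_safe(edf, p, q, robot_radius=0.3, step=1.0):
    """Collision check for an RRT edge: sample along the segment (in grid
    cells) and require clearance above the robot radius at every sample."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    n = max(2, int(np.linalg.norm(q - p) / step))
    for t in np.linspace(0.0, 1.0, n):
        r, c = np.round(p + t * (q - p)).astype(int)
        if edf[r, c] <= robot_radius:
            return False
    return True
```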

Posted Content
16 Feb 2021
TL;DR: Zhang et al. propose a Reciprocal Distance Transform (R-DT) map for dense crowd counting and people localization, which accurately describes people's locations without overlap between nearby heads in dense regions.

Abstract: In this paper, we propose a novel map for dense crowd counting and people localization. Most crowd counting methods utilize convolution neural networks (CNN) to regress a density map, achieving significant progress recently. However, these regression-based methods are often unable to provide a precise location for each person, attributed to two crucial reasons: 1) the density map consists of a series of blurry Gaussian blobs, 2) severe overlaps exist in the dense region of the density map. To tackle this issue, we propose a novel Reciprocal Distance Transform (R-DT) map for crowd counting. Compared with the density maps, the R-DT maps accurately describe the people's locations, without overlap between nearby heads in dense regions. We simultaneously implement crowd counting and people localization with a simple network by replacing density maps with R-DT maps. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art localization-based methods in crowd counting and people localization tasks, achieving very competitive performance compared with the regression-based methods in counting tasks. In addition, the proposed method achieves a good generalization performance under cross dataset validation, which further verifies the effectiveness of the R-DT map. The code and models are available at this https URL.
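A minimal sketch of a reciprocal distance-transform training target: distance of every pixel to its nearest annotated head, then a reciprocal so the response peaks sharply at head locations and stays separated between nearby heads. The constant c is an assumption, not the paper's value:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def reciprocal_dt_map(shape, head_points, c=1.0):
    """R-DT-style target: 1/c exactly at each annotated head, decaying
    with Euclidean distance so nearby heads remain distinguishable."""
    heads = np.zeros(shape, dtype=bool)
    heads[tuple(np.asarray(head_points).T)] = True
    d = distance_transform_edt(~heads)   # 0 at heads, grows away from them
    return 1.0 / (d + c)
```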

Journal ArticleDOI
TL;DR: This work proposes to aid the learning process by adding structural information, via a specific distance transform, to the input image data in order to handle cases with a limited number of training samples, and validates the approach on two image datasets corresponding to two different tasks.

Posted Content
TL;DR: In this paper, a bottom-up differentiable relaxation of the process of drawing points, lines and curves into a pixel raster is proposed, which allows for drawing operations to be composed in ways that can mimic the physical reality of drawing rather than being tied to modern computer graphics.
Abstract: We present a bottom-up differentiable relaxation of the process of drawing points, lines and curves into a pixel raster. Our approach arises from the observation that rasterising a pixel in an image given parameters of a primitive can be reformulated in terms of the primitive's distance transform, and then relaxed to allow the primitive's parameters to be learned. This relaxation allows end-to-end differentiable programs and deep networks to be learned and optimised and provides several building blocks that allow control over how a compositional drawing process is modelled. We emphasise the bottom-up nature of our proposed approach, which allows for drawing operations to be composed in ways that can mimic the physical reality of drawing rather than being tied to, for example, approaches in modern computer graphics. With the proposed approach we demonstrate how sketches can be generated by directly optimising against photographs and how auto-encoders can be built to transform rasterised handwritten digits into vectors without supervision. Extensive experimental results highlight the power of this approach under different modelling assumptions for drawing tasks.
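As an illustration of the core idea (rasterisation reformulated through a primitive's distance transform and then relaxed so the primitive's parameters receive gradients), here is a minimal PyTorch sketch for a line segment; the Gaussian relaxation and sigma are assumptions, not the paper's exact formulation:

```python
import torch

def rasterise_segment(p0, p1, H, W, sigma=1.0):
    """Differentiable rasterisation sketch: each pixel's intensity is a
    smooth function of its distance to the segment, so gradients flow
    back to the endpoint parameters p0, p1 (tensors of shape (2,))."""
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32), indexing="ij")
    pix = torch.stack([xs, ys], dim=-1)             # (H, W, 2) pixel coordinates
    d01 = p1 - p0
    t = ((pix - p0) @ d01) / (d01 @ d01 + 1e-8)     # projection onto the segment
    t = t.clamp(0.0, 1.0).unsqueeze(-1)
    closest = p0 + t * d01
    dist2 = ((pix - closest) ** 2).sum(-1)          # squared distance transform
    return torch.exp(-dist2 / (2 * sigma ** 2))     # soft, differentiable stroke

# p0 = torch.tensor([5., 5.], requires_grad=True)   # endpoints become learnable
# image = rasterise_segment(p0, torch.tensor([50., 40.]), 64, 64)
```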


Posted Content
TL;DR: In this article, a sharpness loss regularized generative adversarial network was proposed to generate histopathology images with clear nuclei contours for overlapped and touching nuclei.
Abstract: Existing deep learning-based approaches for histopathology image analysis require large annotated training sets to achieve good performance, but annotating histopathology images is slow and resource-intensive. Conditional generative adversarial networks have been applied to generate synthetic histopathology images to alleviate this issue, but current approaches fail to generate clear contours for overlapped and touching nuclei. In this study, we propose a sharpness loss regularized generative adversarial network to synthesize realistic histopathology images. The proposed network uses a normalized nucleus distance map rather than the binary mask to encode nuclei contour information. The proposed sharpness loss enhances the contrast of nuclei contour pixels. The proposed method is evaluated using four image quality metrics and segmentation results on two public datasets. Both quantitative and qualitative results demonstrate that the proposed approach can generate realistic histopathology images with clear nuclei contours.
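A minimal sketch of a normalized nucleus distance map built from an instance-labelled mask; normalizing each instance's EDT to [0, 1] is the natural reading of "normalized", though the paper's exact encoding may differ:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def normalized_nucleus_distance_map(instance_labels):
    """Encode nuclei as a distance map instead of a binary mask: within each
    nucleus the value rises from 0 at the contour to 1 at the deepest
    interior point, which keeps touching nuclei separable."""
    out = np.zeros(instance_labels.shape, dtype=np.float32)
    for lbl in np.unique(instance_labels):
        if lbl == 0:                         # background
            continue
        m = instance_labels == lbl
        d = distance_transform_edt(m)
        out[m] = d[m] / (d.max() + 1e-8)     # per-instance normalization
    return out
```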

Journal ArticleDOI
TL;DR: This paper presents a fast image blending approach for combining a set of registered images into a composite mosaic with no visible seams and minimal texture distortion on mobile phones using run-length encoding scheme.
Abstract: This paper presents a fast image blending approach for combining a set of registered images into a composite mosaic with no visible seams and minimal texture distortion on mobile phones. A unique seam image is generated using a two-pass nearest distance transform, which is independent of the order of input images and has good scalability. Each individual mask can be extracted from this seam image quickly. To improve blending speed and reduce memory usage when building high-resolution image mosaics on mobile phones, the seam image and mask images are compressed using run-length encoding, and all the subsequent mask operations are built on the run-length encoding scheme. Moreover, single instruction multiple data (SIMD) instructions are used in Gaussian and Laplacian pyramid construction to improve the blending speed further. The use of run-length encoding for mask processing leads to reduced memory requirements and a compact storage of the mask data, and the use of the SIMD instruction set achieves better parallelism and faster execution speed on mobile phones.
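A minimal sketch of row-wise run-length encoding for binary masks, the representation the paper builds its mask operations on; the (value, length) pair format is an assumption about the encoding, not taken from the paper:

```python
def rle_encode_row(row):
    """Run-length encode one non-empty row of a binary mask as
    (value, length) pairs; a whole mask is a list of encoded rows."""
    runs, count, prev = [], 1, row[0]
    for v in row[1:]:
        if v == prev:
            count += 1
        else:
            runs.append((prev, count))
            prev, count = v, 1
    runs.append((prev, count))
    return runs

def rle_decode_row(runs):
    """Inverse of rle_encode_row."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out
```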

Book ChapterDOI
22 Oct 2021
TL;DR: In this paper, a topological and semantic segmentation algorithm is proposed to divide a grid map into single rooms or similar meaningful semantic units with a collision-free path to connect them.
Abstract: The current tendency in mobile robot indoor navigation is to move from representing the environment as a geometric grid map to a topological and semantic map closer to the way humans reason. The topological and semantic map enables a robot to understand the environment. This paper presents a topological and semantic segmentation algorithm that divides a grid map into single rooms or similar meaningful semantic units with a collision-free path to connect them. First, a topological map is built based on the distance transform of the grid map. Then a semantic map is built based on the distance transform of the grid map and a circular kernel. Finally, we filter and prune the topological map by merging the nodes which represent the same room. The segmentation performance of the proposed framework is verified on multiple maps. The experimental results show that the proposed method can accurately segment rooms and generate topological semantic maps.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a functional-based hybrid representation called HFRep for modeling volumetric heterogeneous objects, which allows for obtaining a continuous smooth distance field in Euclidean space and preserves the advantages of the conventional representations based on scalar fields without their drawbacks.
Abstract: Heterogeneous object modelling is an emerging area where geometric shapes are considered in concert with their internal physically-based attributes. This paper describes a novel theoretical and practical framework for modelling volumetric heterogeneous objects on the basis of a novel unifying functionally-based hybrid representation called HFRep. This new representation allows for obtaining a continuous smooth distance field in Euclidean space and preserves the advantages of the conventional representations based on scalar fields of different kinds without their drawbacks. We systematically describe the mathematical and algorithmic basics of HFRep. The steps of the basic algorithm are presented in detail for both geometry and attributes. To solve some problematic issues, we have suggested several practical solutions, including a new algorithm for solving the eikonal equation on hierarchical grids. Finally, we show the practicality of the approach by modelling several representative heterogeneous objects, including those of a time-variant nature.

Posted Content
TL;DR: Zhang et al. propose a novel Focal Inverse Distance Transform (FIDT) map for dense crowd localization and counting, which accurately describes people's locations without overlap between nearby heads in dense regions.
Abstract: In this paper, we propose a novel map for dense crowd localization and crowd counting. Most crowd counting methods utilize convolution neural networks (CNN) to regress a density map, achieving significant progress recently. However, these regression-based methods are often unable to provide a precise location for each person, attributed to two crucial reasons: 1) the density map consists of a series of blurry Gaussian blobs, 2) severe overlaps exist in the dense region of the density map. To tackle this issue, we propose a novel Focal Inverse Distance Transform (FIDT) map for crowd localization and counting. Compared with the density maps, the FIDT maps accurately describe the people's location, without overlap between nearby heads in dense regions. We simultaneously implement crowd localization and counting by regressing the FIDT map. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art localization-based methods in crowd localization tasks, achieving very competitive performance compared with the regression-based methods in counting tasks. In addition, the proposed method presents strong robustness for the negative samples and extremely dense scenes, which further verifies the effectiveness of the FIDT map. The code and models are available at this https URL.
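Relative to the reciprocal map sketched earlier, a focal inverse variant sharpens the decay away from heads by letting the exponent grow with distance. The functional form and the values of alpha, beta and c below are assumptions for illustration, not the paper's published constants:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def focal_inverse_dt_map(shape, head_points, alpha=0.02, beta=0.75, c=1.0):
    """FIDT-style target sketch: response is 1/c exactly at heads and
    decays faster than a plain reciprocal far from them, because the
    exponent alpha*d + beta increases with the distance d."""
    heads = np.zeros(shape, dtype=bool)
    heads[tuple(np.asarray(head_points).T)] = True
    d = distance_transform_edt(~heads)
    return 1.0 / (np.power(d, alpha * d + beta) + c)
```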

Journal ArticleDOI
TL;DR: A C++/Python package is proposed for 3D mechanical simulations of granular geomaterials, seen as collections of complex-shaped particles in contact interaction with one another, together with a Fast Marching Method to construct a distance field for a wide class of surfaces.

Proceedings ArticleDOI
15 Feb 2021
TL;DR: A distance ordinal regression loss is proposed for improved nuclei instance segmentation in digitized tissue specimen images, adopting a distance-decreasing discretization strategy and recasting the distance prediction problem as an ordinal regression problem.
Abstract: In digital pathology, nuclei segmentation still remains a challenging task due to the high heterogeneity and variability in the characteristics of nuclei, in particular, the clustered and overlapping nuclei. We propose a distance ordinal regression loss for an improved nuclei instance segmentation in digitized tissue specimen images. A convolutional neural network with two decoder branches is built. The first decoder branch conducts the nuclear pixel prediction and the second branch predicts the distance to the nuclear center, which is utilized to identify the nuclear boundary and to separate out overlapping nuclei. Adopting a distance-decreasing discretization strategy, we recast the problem of the distance prediction as an ordinal regression problem. To evaluate the proposed method, we conduct experiments on multiple independent multi-tissue histology image datasets. The experimental results on the multi-tissue datasets demonstrate the effectiveness of the proposed model.
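One plausible reading of the distance-decreasing discretization plus ordinal encoding, sketched below; the quadratic bin spacing and the cumulative binary targets illustrate the general technique, not the paper's exact recipe:

```python
import numpy as np

def ordinal_distance_targets(dist_to_centre, n_bins=8, d_max=1.0):
    """Discretize normalized distance-to-nuclear-centre values into bins
    whose widths shrink toward small distances, then encode each pixel as
    the cumulative 'distance exceeds bin edge' vector used in ordinal
    regression (one binary target per interior edge)."""
    # Quadratically spaced edges -> finer bins at small distances.
    edges = d_max * (np.linspace(0.0, 1.0, n_bins + 1) ** 2)[1:-1]
    # Ordinal encoding: target[..., k] = 1 if dist > edges[k].
    return (dist_to_centre[..., None] > edges).astype(np.float32)
```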

Book ChapterDOI
01 Jan 2021
TL;DR: In this paper, a differentiable renderer was proposed to generate 3D models from real 2D fluoroscopy images of the pelvis, which is an ideal anatomical structure for patient registration.
Abstract: Many minimally invasive interventional procedures still rely on 2D fluoroscopic imaging. Generating a patient-specific 3D model from these X-ray data would improve the procedural workflow, e.g., by providing assistance functions such as automatic positioning. To accomplish this, two things are required: first, a statistical shape model of the human anatomy, and second, a differentiable X-ray renderer. We propose a differentiable renderer by deriving the distance travelled by a ray inside mesh structures to generate a distance map. To demonstrate its functioning, we use it to simulate X-ray images from human shape models. Then we show its application by solving the inverse problem, namely reconstructing 3D models from real 2D fluoroscopy images of the pelvis, which is an ideal anatomical structure for patient registration. This is accomplished by an iterative optimization strategy using gradient descent. With the majority of the pelvis being in the fluoroscopic field of view, we achieve a mean Hausdorff distance of 30 mm between the reconstructed model and the ground truth segmentation.

Posted Content
TL;DR: In this article, a pipeline for parametric wireframe extraction from densely sampled point clouds is presented, which detects corners, constructs curve segmentation, and builds a topological graph fitted to the wireframe.
Abstract: We present a pipeline for parametric wireframe extraction from densely sampled point clouds. Our approach processes a scalar distance field that represents proximity to the nearest sharp feature curve. In intermediate stages, it detects corners, constructs curve segmentation, and builds a topological graph fitted to the wireframe. As an output, we produce parametric spline curves that can be edited and sampled arbitrarily. We evaluate our method on 50 complex 3D shapes and compare it to the novel deep learning-based technique, demonstrating superior quality.

Posted Content
TL;DR: In this paper, a continuous representation called Keypoint Distance Field (KDF) is proposed for projected 2D keypoint locations, where each element stores the 2D Euclidean distance between the corresponding image pixel and a specified projected two-dimensional keypoint, and a fully convolutional neural network is used to regress the KDF for each keypoint.
Abstract: We present KDFNet, a novel method for 6D object pose estimation from RGB images. To handle occlusion, many recent works have proposed to localize 2D keypoints through pixel-wise voting and solve a Perspective-n-Point (PnP) problem for pose estimation, which achieves leading performance. However, such voting process is direction-based and cannot handle long and thin objects where the direction intersections cannot be robustly found. To address this problem, we propose a novel continuous representation called Keypoint Distance Field (KDF) for projected 2D keypoint locations. Formulated as a 2D array, each element of the KDF stores the 2D Euclidean distance between the corresponding image pixel and a specified projected 2D keypoint. We use a fully convolutional neural network to regress the KDF for each keypoint. Using this KDF encoding of projected object keypoint locations, we propose to use a distance-based voting scheme to localize the keypoints by calculating circle intersections in a RANSAC fashion. We validate the design choices of our framework by extensive ablation experiments. Our proposed method achieves state-of-the-art performance on Occlusion LINEMOD dataset with an average ADD(-S) accuracy of 50.3% and TOD dataset mug subset with an average ADD accuracy of 75.72%. Extensive experiments and visualizations demonstrate that the proposed method is able to robustly estimate the 6D pose in challenging scenarios including occlusion.
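Following the abstract's definition, a minimal sketch of a KDF for one projected keypoint (the array layout and (x, y) coordinate convention are assumptions):

```python
import numpy as np

def keypoint_distance_field(H, W, keypoint_xy):
    """KDF for one projected 2D keypoint: an (H, W) array whose entries are
    the Euclidean distance from each pixel to the keypoint."""
    ys, xs = np.mgrid[0:H, 0:W]
    kx, ky = keypoint_xy
    return np.sqrt((xs - kx) ** 2 + (ys - ky) ** 2)

# Each predicted distance defines a circle of candidate keypoint positions
# around its pixel; intersecting circles from many pixels in a RANSAC
# fashion localizes the keypoint, as described in the abstract.
```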