
Showing papers on "Kernel (image processing) published in 2022"


Journal ArticleDOI
TL;DR: This work proposes a rapider OpenPose model (ROpenPose) to solve the posture detection problem of astronauts in a space capsule in a weightless environment and uses MobileNets instead of VGG-19 to achieve lighter calculations while ensuring the accuracy of model recognition.
Abstract: This article proposes a rapider OpenPose model (ROpenPose) to solve the posture-detection problem of astronauts working weightless in a space capsule. The ROpenPose model has three innovations: 1) It uses MobileNets instead of VGG-19 to achieve lighter computation while ensuring the accuracy of model recognition. 2) Three small convolution kernels replace the large convolution kernel of the original OpenPose, which significantly reduces the computational complexity of the model. 3) Through parameter sharing in the convolution process, the original two-branch structure is changed to a single-branch structure, which markedly improves the calculation speed of the model. A residual structure is introduced to mitigate the risk of vanishing gradients. Deploying ROpenPose greatly improves astronaut-detection efficiency while maintaining high detection performance, thereby enabling real-time monitoring of operating attitude. Experimental results show that ROpenPose runs faster than a number of existing state-of-the-art models while delivering comparable detection performance.
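
The paper's exact layer configuration is not given here, but the receptive-field argument behind innovation 2) is easy to illustrate: three stacked 3×3 convolutions cover the same 7×7 window as one large kernel at roughly half the parameter cost (27C² vs. 49C² weights for C channels). A minimal PyTorch sketch with hypothetical channel counts:

```python
import torch
import torch.nn as nn

# A 7x7 kernel sees a 7x7 window; three stacked 3x3 kernels reach the
# same receptive field (3 -> 5 -> 7) with fewer weights and extra
# nonlinearities in between.
large = nn.Conv2d(64, 64, kernel_size=7, padding=3)

stacked = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
)

x = torch.randn(1, 64, 56, 56)
assert large(x).shape == stacked(x).shape  # same spatial output size
```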

17 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose a dynamic tile size scheme that adaptively distributes the computational data across GPU threads to improve GPU utilization and hide memory access latency, together with reuse algorithms that reduce the number of memory operations performed along the width and height dimensions of the 2D convolution.
Abstract: The depthwise separable convolution is commonly seen in convolutional neural networks (CNNs), and is widely used to reduce the computation overhead of a standard multi-channel 2D convolution. Existing implementations of depthwise separable convolutions target accelerating model training with large batch sizes, where a large number of samples is processed at once. Such approaches are inadequate for small-batch-sized model training and for the typical scenario of model inference, where the model takes in a few samples at once. This article aims to bridge this gap by optimizing depthwise separable convolutions for the GPU architecture. We achieve this by designing two novel algorithms that improve the column and row reuse of the convolution operation, reducing the number of memory operations performed along the width and height dimensions of the 2D convolution. Our approach employs a dynamic tile size scheme to adaptively distribute the computational data across GPU threads, improving GPU utilization and hiding memory access latency. We apply our approach on two GPU platforms, an NVIDIA RTX 2080Ti GPU and an embedded NVIDIA Jetson AGX Xavier GPU, and two data types, 32-bit floating point (FP32) and 8-bit integer (INT8). We compared our approach against cuDNN, which is heavily tuned for the NVIDIA GPU architecture. Experimental results show that our approach delivers over 2× (up to 3×) performance improvement over cuDNN. We show that, when using a moderate batch size, our approach reduces the end-to-end training time of MobileNet and EfficientNet by 9.7 and 7.3 percent on average, respectively, and reduces the end-to-end inference time of MobileNet and EfficientNet by 12.2 and 11.6 percent, respectively.
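
The article's GPU kernels are not reproduced here, but the operation being optimized is standard. A minimal PyTorch sketch, with hypothetical channel sizes, of a depthwise separable convolution next to its dense counterpart:

```python
import torch
import torch.nn as nn

# A standard conv uses C_in * C_out * K * K weights; the depthwise
# separable factorization splits it into a per-channel KxK depthwise
# conv plus a 1x1 pointwise conv, cutting compute and memory traffic.
c_in, c_out, k = 32, 64, 3

standard = nn.Conv2d(c_in, c_out, kernel_size=k, padding=1)

depthwise_separable = nn.Sequential(
    nn.Conv2d(c_in, c_in, kernel_size=k, padding=1, groups=c_in),  # depthwise
    nn.Conv2d(c_in, c_out, kernel_size=1),                         # pointwise
)

x = torch.randn(1, c_in, 28, 28)
assert standard(x).shape == depthwise_separable(x).shape
```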

16 citations


Journal ArticleDOI
TL;DR: This contribution reduces the precision of weakly correlated locations to single- or half-precision based on distance, and exploits mathematical structure to migrate MLE to a three-precision approximation that takes advantage of contemporary architectures offering BLAS3-like operations in a single instruction that are extremely fast for reduced precision.
Abstract: Geostatistical modeling, one of the prime motivating applications for exascale computing, is a technique for predicting desired quantities from geographically distributed data, based on statistical models and optimization of parameters. Spatial data are assumed to possess properties of stationarity or non-stationarity via a kernel fitted to a covariance matrix. A primary workhorse of stationary spatial statistics is Gaussian maximum log-likelihood estimation (MLE), whose central data structure is a dense, symmetric positive definite covariance matrix of the dimension of the number of correlated observations. Two essential operations in MLE are the application of the inverse and the evaluation of the determinant of the covariance matrix. These can be rendered through the Cholesky decomposition and triangular solution. In this contribution, we reduce the precision of weakly correlated locations to single- or half-precision based on distance. We thus exploit mathematical structure to migrate MLE to a three-precision approximation that takes advantage of contemporary architectures offering BLAS3-like operations in a single instruction that are extremely fast for reduced precision. We illustrate application-expected accuracy worthy of double precision from a majority half-precision computation, in a context where uniform single precision is by itself insufficient. In tackling the complexity and imbalance caused by the mixing of three precisions, we deploy the PaRSEC runtime system. PaRSEC delivers on-demand casting of precisions while orchestrating tasks and data movement in a multi-GPU distributed-memory environment within a tile-based Cholesky factorization. Application-expected accuracy is maintained while achieving speedups of up to 1.59× by mixing FP64/FP32 operations on 1536 nodes of HAWK or 4096 nodes of Shaheen II, and up to 2.64× by mixing FP64/FP32/FP16 operations on 128 nodes of Summit, relative to FP64-only operations. This translates into up to 4.5, 4.7, and 9.1 (mixed) PFlop/s sustained performance, respectively, demonstrating a synergistic combination of exascale architecture, dynamic runtime software, and algorithmic adaptation applied to challenging environmental problems.
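
The mixed-precision tiling and PaRSEC orchestration are beyond a short example, but the two essential MLE operations named above both fall out of a single Cholesky factorization. A minimal double-precision NumPy sketch, where the exponential covariance kernel and its range parameter are placeholder choices:

```python
import numpy as np
from scipy.spatial.distance import cdist

# Gaussian log-likelihood of observations z with covariance Sigma:
#   -1/2 * (z' Sigma^{-1} z + log det Sigma + n log 2 pi).
# One Cholesky factorization yields both Sigma^{-1} z (via a triangular
# solve) and log det Sigma (from the diagonal of the factor).
rng = np.random.default_rng(0)
locs = rng.uniform(size=(500, 2))          # observation locations
z = rng.standard_normal(500)               # observations

sigma = np.exp(-cdist(locs, locs) / 0.1)   # placeholder exponential kernel
L = np.linalg.cholesky(sigma)              # Sigma = L L'
u = np.linalg.solve(L, z)                  # triangular solve for L u = z
logdet = 2.0 * np.sum(np.log(np.diag(L)))
loglik = -0.5 * (u @ u + logdet + len(z) * np.log(2 * np.pi))
```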

15 citations


Journal ArticleDOI
TL;DR: GeoConv, as mentioned in this paper, embeds 3D manifold information into 2D convolutions, with kernel weighting negatively correlated with geodesic distances on a coarsely reconstructed 3D morphable face model.

10 citations


Journal ArticleDOI
TL;DR: In this article, a progressive kernel pruning scheme with salient mapping of input-output channels is proposed, in which the input and output channels are considered simultaneously during pruning to improve the performance of the pruned model.
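
The TL;DR alone does not spell out the salience measure, so the sketch below shows only the common L1-norm baseline that channel-pruning criteria of this kind refine; the layer sizes and pruning ratio are hypothetical:

```python
import torch
import torch.nn as nn

# Baseline filter pruning: score each output filter by its L1 norm and
# zero out the weakest fraction. The paper's salient mapping of
# input-output channels is a more refined criterion than this.
def prune_filters_l1(conv: nn.Conv2d, ratio: float) -> None:
    with torch.no_grad():
        scores = conv.weight.abs().sum(dim=(1, 2, 3))  # one score per filter
        n_prune = int(ratio * scores.numel())
        idx = torch.argsort(scores)[:n_prune]          # weakest filters
        conv.weight[idx] = 0.0
        if conv.bias is not None:
            conv.bias[idx] = 0.0

layer = nn.Conv2d(64, 128, kernel_size=3, padding=1)
prune_filters_l1(layer, ratio=0.3)  # zero out 30% of the filters
```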

7 citations


Journal ArticleDOI
TL;DR: Adaptive Dilation Convolutional Neural Networks (ADCNN), a lightweight extension that allows convolutional kernels to adjust their dilation value based on image content at the pixel level, is proposed.
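
ADCNN's per-pixel dilation selection is not reproduced here; in standard frameworks the dilation value is fixed per layer, but its effect on the receptive field is easy to see. A small PyTorch sketch with hypothetical sizes:

```python
import torch
import torch.nn as nn

# Dilation d spreads a 3x3 kernel's taps over a (2d+1) x (2d+1) window
# at the same parameter cost; setting padding=d preserves spatial size.
x = torch.randn(1, 16, 32, 32)
for d in (1, 2, 4):
    conv = nn.Conv2d(16, 16, kernel_size=3, padding=d, dilation=d)
    print(d, conv(x).shape)  # always (1, 16, 32, 32)
```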

7 citations


Journal ArticleDOI
TL;DR: In this paper, a generalized fractional differential (GFD) mask was proposed for image enhancement and image denoising applications; it incorporates various fractional-order kernels such as the power-law kernel, exponential kernel, and Mittag-Leffler kernel, corresponding to the Riemann-Liouville (RL)/Caputo, Caputo-Fabrizio (CF), and Atangana-Baleanu (AB) fractional-order derivatives, respectively.
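
The paper's exact GFD mask construction is not given here; as one common power-law construction, a fractional-order differential mask can be built from Grünwald-Letnikov coefficients. A short NumPy sketch, with the order v and mask length chosen for illustration:

```python
import numpy as np

# Gruenwald-Letnikov coefficients c_k = (-1)^k * binom(v, k), via the
# recurrence c_k = (1 - (v + 1) / k) * c_{k-1}; the first few values
# form a 1-D fractional-order differential mask.
def gl_coeffs(v: float, n: int) -> np.ndarray:
    c = np.empty(n)
    c[0] = 1.0
    for k in range(1, n):
        c[k] = (1.0 - (v + 1.0) / k) * c[k - 1]
    return c

print(gl_coeffs(0.5, 4))  # [1.0, -0.5, -0.125, -0.0625]
```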

6 citations


Journal ArticleDOI
TL;DR: A variational model for single-image blind restoration is proposed that utilizes a non-convex, non-smooth quaternion FTV with an Lp quasinorm to constrain the image and an L1-norm to constrain the blur kernel.
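
The quaternion structure (used for color images) is omitted here, but the overall shape of such a blind-restoration energy can be sketched in generic notation; FTV is rendered below as a fractional-order total variation, and the weights λ and μ are assumed symbols, not the paper's:

```latex
% Generic blind-restoration energy: data fidelity, an Lp quasinorm
% (0 < p < 1) on a fractional-order gradient of the image u, and an
% L1 term on the blur kernel k, with blurry input b.
\min_{u,\,k} \; \tfrac{1}{2}\,\lVert k \ast u - b \rVert_2^2
  + \lambda\,\lVert \nabla^{\alpha} u \rVert_p^p
  + \mu\,\lVert k \rVert_1
```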

2 citations


Journal ArticleDOI
TL;DR: In this paper, an encoder-decoder-based framework, called piece-wise kernel encoding network (PKNet), is proposed for missing data imputation of the vegetation index (VI) curves derived from time-series image data.
Abstract: The high spatial, spectral, and temporal resolutions of the Vegetation and Environment monitoring New Micro-Satellite (VENµS) satellite data facilitate field-level phenological analysis of crops. This study proposes deep learning (DL) based approaches to resolve the issues prevalent in crop phenology-based fingerprint estimation at field level using VENµS satellite data. An encoder-decoder-based framework, called piece-wise kernel encoding network (PKNet), is proposed for missing-data imputation of the vegetation index (VI) curves derived from time-series image data. PKNet adopts interpolation-based convolution, dynamic time warping (DTW) based layer formulation, and imputation-specific constraints for optimal smoothing of the irregularly sampled VI curves. In addition, PKNet learns kernel parameters dynamically. A variational encoding framework, called dynamic-projection-based generalization network (DPGNet), is proposed to generalize the pixel-level VI curves and synthesize a representative VI curve for a given field. DPGNet is more effective than the use of multiple moments, as it is resilient to outliers and learns a normally distributed latent space with a small number of samples. The current research also proposes a classifier, called dynamic time warping based capsule network (DTCapsNet), which learns a discriminative latent space and accurately models the VI curve features. DTCapsNet considers the time-series nature of the input using DTW-based convolution layers. The feature characterization improves generalizability and gives good results, even with a limited number of training samples. Experiments using ground-truth information and satellite images, acquired over two farms in Israel, show that the proposed frameworks give better results than commonly used existing approaches.
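
The differentiable DTW-based layers in PKNet and DTCapsNet are not reproduced here, but they build on the classic dynamic-programming alignment, sketched below for two short hypothetical VI curves:

```python
import numpy as np

# Classic DTW distance between two 1-D sequences: D[i, j] holds the
# cheapest cumulative cost of aligning a[:i] with b[:j].
def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

vi_a = np.array([0.2, 0.4, 0.7, 0.6, 0.3])  # hypothetical VI curves
vi_b = np.array([0.2, 0.3, 0.5, 0.7, 0.5, 0.3])
print(dtw_distance(vi_a, vi_b))
```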

2 citations


Journal ArticleDOI
TL;DR: In this paper, the comparison between the Sinc method combined with double exponential (DE) transformations and approximation by means of the differential transform method (DTM) for nonlinear Hammerstein integral equations is considered.
Abstract: Here, the comparison between the Sinc method in combination with double exponential (DE) transformations and approximation by means of the differential transform method (DTM) for nonlinear Hammerstein integral equations is considered. A convergence analysis is presented. Effectiveness is examined from various aspects, such as run time, different norms, and condition number, and the results are plotted graphically. Both schemes perform well in practice, but for separable kernels the DTM solution is more accurate and faster.
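
For reference, a nonlinear Hammerstein integral equation of the second kind has the generic form below, with kernel k and nonlinearity ψ; the separable case that favors DTM is the one where the kernel factors into a finite sum:

```latex
u(x) = f(x) + \int_a^b k(x,t)\,\psi\bigl(t, u(t)\bigr)\,dt,
\qquad x \in [a,b],
\qquad\text{separable kernel: } k(x,t) = \sum_{i=1}^{n} a_i(x)\, b_i(t).
```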

1 citation


Journal ArticleDOI
TL;DR: In this paper, the authors developed new fast methods, based on the H2 hierarchical matrix representation, for computing the kernel sums that arise in simulations of large numbers of small, spherical particles in a Stokes flow, for both open and periodic boundary conditions.

Journal ArticleDOI
Xianyu Ge, Jieqing Tan, Li Zhang, Jing Liu, Dandan Hu
TL;DR: It is shown that GC regularization is also effective for blind image deblurring when combined with the L0-norm of image gradients; the Gaussian curvature filter (GCF) and the half-quadratic splitting strategy are adopted to solve the resulting optimization problem.
Abstract: This paper presents a blind image deblurring algorithm that utilizes the Gaussian curvature (GC) of the image surface. GC is an intrinsic geometric feature related to the developability of the surface. In recent years, numerous variational models based on GC for image denoising and image reconstruction have been proposed. In this paper, we show that GC regularization is also effective for blind image deblurring when combined with the L0-norm of image gradients. By minimizing the combined regularization, our algorithm gradually preserves sharp edges and removes detrimental structures and noise in the intermediate latent images. The sharp latent images can then accurately guide the estimation of the blur kernel. However, a complicated optimization problem arises once the proposed regularization is involved. Traditional diffusion methods for minimizing the GC regularization not only converge slowly but also require the differentiability of signals. Moreover, the L0-norm of image gradients is non-convex. Consequently, the Gaussian curvature filter (GCF) and the half-quadratic splitting strategy are adopted to solve the optimization problem. Extensive experimental results show that the proposed deblurring method achieves state-of-the-art results on benchmark datasets and performs favorably on real-world blurry images.
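
The Gaussian curvature filter itself is not reproduced here, but the quantity being regularized is the Gaussian curvature of the image surface z = f(x, y), which a short NumPy sketch can evaluate directly from finite differences:

```python
import numpy as np

# Gaussian curvature of the graph surface z = f(x, y):
#   K = (f_xx * f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2.
# This computes the raw GC map; the paper minimizes a GC regularizer
# via the Gaussian curvature filter rather than this direct formula.
def gaussian_curvature(f: np.ndarray) -> np.ndarray:
    fx, fy = np.gradient(f)
    fxx, fxy = np.gradient(fx)
    _, fyy = np.gradient(fy)
    return (fxx * fyy - fxy ** 2) / (1.0 + fx ** 2 + fy ** 2) ** 2

img = np.random.rand(64, 64)   # placeholder image
K = gaussian_curvature(img)
```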

Journal ArticleDOI
TL;DR: In this article, the authors proposed an automatic frame work for detecting COVID-19 at the early stage using chest X-ray image and achieved 99.6% accuracy in detecting the virus at its early stage.
Abstract: This research article proposes an automatic frame work for detecting COVID -19 at the early stage using chest X-ray image. It is an undeniable fact that coronovirus is a serious disease but the early detection of the virus present in human bodies can save lives. In recent times, there are somany research solutions that have been presented for early detection, but there is still a lack in need of right and even rich technology for its early detection. The proposed deep learning model analysis the pixels of every image and adjudges the presence of virus. The classifier is designed in such a way so that, it automatically detects the virus present in lungs using chest image. This approach uses an image texture analysis technique called granulometric mathematical model. Selected features are heuristically processed for optimization using novel multi scaling deep learning called light weight residual-atrous spatial pyramid pooling (LightRES-ASPP-Unet) Unet model. The proposed deep LightRES-ASPPUnet technique has a higher level of contracting solution by extracting major level of image features. Moreover, the corona virus has been detected using high resolution output. In the framework, atrous spatial pyramid pooling (ASPP) method is employed at its bottom level for incorporating the deep multi scale features in to the discriminative mode. The architectural working starts from the selecting the features from the image using granulometric mathematical model and the selected features are optimized using LightRESASPP- Unet. ASPP in the analysis of images has performed better than the existing Unet model. The proposed algorithm has achieved 99.6% of accuracy in detecting the virus at its early stage. © 2022 Tech Science Press. All rights reserved.
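
The paper's granulometric model is not specified in detail here; as a hedged illustration, a basic granulometry applies morphological openings at increasing scales and records the image mass removed at each step (the pattern spectrum), a classic texture descriptor. A SciPy sketch with hypothetical sizes:

```python
import numpy as np
from scipy import ndimage

# Granulometry: open the image with growing structuring elements and
# track how much intensity each opening removes; the differences form
# the pattern spectrum used as a texture descriptor.
def pattern_spectrum(img: np.ndarray, max_radius: int) -> np.ndarray:
    sums = []
    for r in range(max_radius + 1):
        se = np.ones((2 * r + 1, 2 * r + 1), dtype=bool)  # square element
        sums.append(ndimage.grey_opening(img, footprint=se).sum())
    return -np.diff(np.array(sums))  # mass removed at each scale

img = np.random.rand(128, 128)  # placeholder texture image
print(pattern_spectrum(img, max_radius=5))
```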