Author

Cheng Chen

Other affiliations: University of Iowa
Bio: Cheng Chen is an academic researcher from Google. The author has contributed to research in the topics Codec and Data compression, has an h-index of 11, and has co-authored 33 publications receiving 467 citations. Previous affiliations of Cheng Chen include the University of Iowa.

Papers
Proceedings ArticleDOI
24 Jun 2018
TL;DR: A brief technical overview of key coding techniques in AV1 is provided, along with a preliminary compression performance comparison against VP9 and HEVC.
Abstract: AV1 is an emerging open-source and royalty-free video compression format, jointly developed and finalized in early 2018 by the Alliance for Open Media (AOMedia) industry consortium. The main goal of AV1 development is to achieve substantial compression gain over state-of-the-art codecs while maintaining practical decoding complexity and hardware feasibility. This paper provides a brief technical overview of key coding techniques in AV1, along with a preliminary compression performance comparison against VP9 and HEVC.

260 citations
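As a concrete illustration of the kind of compression comparison described above, the sketch below encodes one clip with the AV1 and VP9 encoders through ffmpeg and compares output sizes. The input file name, the CRF value, and the assumption that ffmpeg was built with libaom and libvpx are illustrative choices, not details from the paper.

```python
# Hypothetical comparison: encode the same clip with libaom-AV1 and libvpx-VP9
# via ffmpeg and compare the resulting file sizes. Assumes an ffmpeg build that
# includes both encoders and a local test clip "input.y4m" (both are assumptions,
# not taken from the paper).
import os
import subprocess

CLIP = "input.y4m"   # placeholder test sequence
CRF = 30             # illustrative constant-quality setting

jobs = {
    "av1.mkv": ["-c:v", "libaom-av1", "-crf", str(CRF), "-b:v", "0"],
    "vp9.webm": ["-c:v", "libvpx-vp9", "-crf", str(CRF), "-b:v", "0"],
}

for out_name, codec_args in jobs.items():
    # -y overwrites existing outputs; encoder speed settings are left at defaults
    subprocess.run(["ffmpeg", "-y", "-i", CLIP, *codec_args, out_name], check=True)
    print(f"{out_name}: {os.path.getsize(out_name)} bytes")
```

Note that a fair codec comparison matches decoded quality (for example equal PSNR, or a BD-rate average) rather than equal CRF, since CRF scales are not comparable across encoders.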

Journal ArticleDOI
26 Feb 2021
TL;DR: A technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility is provided.
Abstract: The AV1 video compression format is developed by the Alliance for Open Media consortium. It achieves more than a 30% reduction in bit rate compared to its predecessor VP9 for the same decoded video quality. This article provides a technical overview of the AV1 codec design that enables the compression performance gains with considerations for hardware feasibility.

95 citations
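Bitrate savings of this kind are conventionally reported as a Bjøntegaard-delta rate (BD-rate), which averages the bitrate difference between two codecs over a range of quality levels. Below is a minimal sketch of the standard BD-rate calculation; the rate/PSNR points are invented for illustration and are not from the article.

```python
# Bjontegaard-delta rate (BD-rate): average bitrate difference (%) between two
# codecs over overlapping quality levels, computed from per-codec (bitrate, PSNR)
# points. The sample numbers below are invented purely for illustration.
import numpy as np

def bd_rate(ref_rates, ref_psnr, test_rates, test_psnr):
    """Percent bitrate change of 'test' vs 'ref' at equal PSNR (negative = savings)."""
    log_ref, log_test = np.log(ref_rates), np.log(test_rates)
    # Fit log-rate as a cubic polynomial of PSNR for each codec
    p_ref = np.polyfit(ref_psnr, log_ref, 3)
    p_test = np.polyfit(test_psnr, log_test, 3)
    # Integrate both fits over the overlapping PSNR interval
    lo = max(min(ref_psnr), min(test_psnr))
    hi = min(max(ref_psnr), max(test_psnr))
    int_ref = np.polyval(np.polyint(p_ref), [lo, hi])
    int_test = np.polyval(np.polyint(p_test), [lo, hi])
    mean_ref = (int_ref[1] - int_ref[0]) / (hi - lo)
    mean_test = (int_test[1] - int_test[0]) / (hi - lo)
    return (np.exp(mean_test - mean_ref) - 1.0) * 100.0

# Illustrative rate-distortion points (kbps, dB) for a reference and a test codec
vp9_rates, vp9_psnr = [800, 1500, 3000, 6000], [34.0, 36.5, 39.0, 41.5]
av1_rates, av1_psnr = [550, 1050, 2100, 4300], [34.0, 36.5, 39.0, 41.5]
print(f"BD-rate: {bd_rate(vp9_rates, vp9_psnr, av1_rates, av1_psnr):.1f}%")  # about -30%
```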

Journal ArticleDOI
Dakai Jin, Krishna S. Iyer, Cheng Chen, Eric A. Hoffman, Punam K. Saha
TL;DR: A new robust and efficient curve skeletonization algorithm for three-dimensional (3-D) elongated fuzzy objects is presented, using a minimum cost path approach that avoids spurious branches without requiring post-pruning.

57 citations
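The abstract's key building block is a minimum cost path search through the fuzzy object. The toy 2-D sketch below shows only that building block (Dijkstra's algorithm on a grid whose step cost falls where the fuzzy membership is high, so the cheapest path hugs the object's interior); it is not the authors' 3-D skeletonization algorithm, and all names and parameters are illustrative.

```python
# Toy illustration of the minimum-cost-path building block: Dijkstra's algorithm
# on a 2-D grid where moving through high-membership (deep interior) pixels is
# cheap, so the cheapest path tends to run along the object's centerline.
import heapq
import numpy as np

def min_cost_path(membership, start, goal):
    """Return the minimum-cost 4-connected path between two pixels."""
    rows, cols = membership.shape
    cost = 1.0 / (membership + 1e-6)          # cheap where membership is high
    dist = np.full(membership.shape, np.inf)
    prev = {}
    dist[start] = 0.0
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            break
        if d > dist[r, c]:
            continue                          # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr, nc]
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    path, node = [], goal
    while node != start:                      # walk predecessors back to the start
        path.append(node)
        node = prev[node]
    return [start] + path[::-1]

# Example: a bright horizontal bar in a dark image; the path follows the bar
img = np.zeros((5, 9)); img[2, :] = 1.0
print(min_cost_path(img, (2, 0), (2, 8)))
```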

Journal ArticleDOI
TL;DR: Machine learning classifiers are compared for predicting osteoporotic bone fractures, highlighting the imaging features and anatomical regions that contribute most to prediction performance.
Abstract: BACKGROUND A current challenge in osteoporosis is identifying patients at risk of bone fracture. PURPOSE To identify the machine learning classifiers that best predict osteoporotic bone fractures and, from the data, to highlight the imaging features and the anatomical regions that contribute most to prediction performance. STUDY TYPE Prospective (cross-sectional) case-control study. POPULATION Thirty-two women with prior fragility bone fractures, of mean age = 61.6 and body mass index (BMI) = 22.7 kg/m2, and 60 women without fractures, of mean age = 62.3 and BMI = 21.4 kg/m2. FIELD STRENGTH/SEQUENCE 3D FLASH at 3T. ASSESSMENT Quantitative MRI outcomes by software algorithms. Mechanical and topological microstructural parameters of the trabecular bone were calculated for five femoral regions and added to the vector of features together with bone mineral density measurement, fracture risk assessment tool (FRAX) score, and personal characteristics such as age, weight, and height. We fitted 15 classifiers using 200 randomized cross-validation datasets. STATISTICAL TESTS Data: Kolmogorov-Smirnov test for normality. Model performance: sensitivity, specificity, precision, accuracy, F1-score, receiver operating characteristic (ROC) curve. Two-sided t-test, with P < 0.05 for statistical significance. RESULTS The top three performing classifiers were RUS-boosted trees (performing best with head data, F1 = 0.64 ± 0.03), logistic regression, and linear discriminant analysis (both best with trochanteric datasets, F1 = 0.65 ± 0.03 and F1 = 0.67 ± 0.03, respectively). A permutation of these classifiers comprised the best three performers for four out of five anatomical datasets. After averaging across all the anatomical datasets, the score for the best performer, the boosted trees, was F1 = 0.63 ± 0.03 for the All-features dataset, F1 = 0.52 ± 0.05 for the no-MRI dataset, and F1 = 0.48 ± 0.06 for the no-FRAX dataset. DATA CONCLUSION Of the many classifiers, RUS-boosted trees, logistic regression, and linear discriminant analysis were best at predicting osteoporotic fracture. Both MRI and FRAX independently add value in identifying osteoporotic fractures. The femoral head, greater trochanter, and inter-trochanter anatomical regions within the proximal femur yielded the better F1-scores for the best three classifiers. LEVEL OF EVIDENCE 2 TECHNICAL EFFICACY Stage 2. J. Magn. Reson. Imaging 2019;49:1029-1038.

53 citations
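The evaluation protocol in the abstract, fitting several classifiers and comparing cross-validated F1 scores, can be sketched with scikit-learn as below. The data are synthetic stand-ins for the MRI/FRAX feature vectors, and a plain gradient-boosting model stands in for RUS-boosted trees (a RUSBoost implementation is available separately in the imbalanced-learn package).

```python
# Sketch of the evaluation protocol: fit several classifiers on a feature vector
# and compare cross-validated F1 scores. The data are synthetic; the real study
# used MRI-derived trabecular-bone features, BMD, FRAX, and personal data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# 92 subjects (roughly 32 fracture cases, 60 controls), illustrative feature count
X, y = make_classification(n_samples=92, n_features=20, weights=[0.65, 0.35],
                           random_state=0)

classifiers = {
    "boosted trees (stand-in for RUS-boost)": GradientBoostingClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "linear discriminant": LinearDiscriminantAnalysis(),
}

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=40, random_state=0)  # ~200 splits
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, scoring="f1", cv=cv)
    print(f"{name}: F1 = {scores.mean():.2f} +/- {scores.std():.2f}")
```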

Journal ArticleDOI
23 Feb 2020
TL;DR: A technical overview of key coding techniques in AV1 is provided and the coding performance gains are validated by video compression tests performed with the libaom AV1 encoder against the libvpx VP9 encoder.
Abstract: In 2018, the Alliance for Open Media (AOMedia) finalized its first video compression format, AV1, which was jointly developed by an industry consortium of leading video technology companies. The main goal of AV1 is to provide an open-source and royalty-free video coding format that substantially outperforms the state-of-the-art codecs available on the market in compression efficiency while maintaining practical decoding complexity, and that is optimized for hardware feasibility and scalability on modern devices. To give detailed insights into how the targeted performance and feasibility are realized, this paper provides a technical overview of key coding techniques in AV1. In addition, the coding performance gains are validated by video compression tests performed with the libaom AV1 encoder against the libvpx VP9 encoder. A preliminary comparison with two leading HEVC encoders, x265 and HM, and with the VVC reference software is also conducted on AOM's common test set and an open 4K set.

44 citations
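Encoder tests like the ones described above report quality against bitrate, most commonly via PSNR. A minimal frame-level PSNR sketch for 8-bit frames follows; the frames are random placeholders rather than decoded video.

```python
# Frame-level PSNR for 8-bit frames, the most common objective quality metric in
# such encoder comparisons. The frames below are random stand-ins, not real video.
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-sized frames."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)        # placeholder frame
dec = np.clip(ref.astype(int) + np.random.randint(-3, 4, ref.shape), 0, 255)
print(f"PSNR: {psnr(ref, dec):.2f} dB")
```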


Cited by
Journal ArticleDOI
TL;DR: Wang et al. proposed a semi-supervised deep learning approach to recover high-resolution (HR) CT images from low-resolution (LR) counterparts by enforcing cycle-consistency in terms of the Wasserstein distance.
Abstract: In this paper, we present a semi-supervised deep learning approach to accurately recover high-resolution (HR) CT images from low-resolution (LR) counterparts. Specifically, with the generative adversarial network (GAN) as the building block, we enforce the cycle-consistency in terms of the Wasserstein distance to establish a nonlinear end-to-end mapping from noisy LR input images to denoised and deblurred HR outputs. We also include the joint constraints in the loss function to facilitate structural preservation. In this process, we incorporate deep convolutional neural network (CNN), residual learning, and network in network techniques for feature extraction and restoration. In contrast to the current trend of increasing network depth and complexity to boost the imaging performance, we apply a parallel $1\times1$ CNN to compress the output of the hidden layer and optimize the number of layers and the number of filters for each convolutional layer. The quantitative and qualitative evaluation results demonstrate that our proposed model is accurate, efficient and robust for super-resolution (SR) image restoration from noisy LR input images. In particular, we validate our composite SR networks on three large-scale CT datasets, and obtain promising results as compared to the other state-of-the-art methods.

257 citations
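A distinctive design choice in the abstract is the parallel 1x1 CNN used to compress the hidden-layer output instead of simply deepening the network. The sketch below shows that channel-compression idea in PyTorch; the channel counts are illustrative and not taken from the paper.

```python
# Minimal PyTorch sketch of the 1x1-convolution "network in network" idea used to
# compress a hidden layer's channel dimension; channel counts are illustrative.
import torch
import torch.nn as nn

class ChannelCompressor(nn.Module):
    def __init__(self, in_channels: int = 64, out_channels: int = 16):
        super().__init__()
        # A 1x1 convolution mixes channels at each pixel without touching spatial size
        self.compress = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.compress(x))

features = torch.randn(1, 64, 128, 128)   # hidden-layer feature maps (placeholder)
compressed = ChannelCompressor()(features)
print(compressed.shape)                    # torch.Size([1, 16, 128, 128])
```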

Journal ArticleDOI
TL;DR: In this article, a semi-supervised deep learning approach was proposed to recover high-resolution (HR) CT images from low-resolution (LR) counterparts by enforcing the cycle-consistency in terms of the Wasserstein distance to establish a nonlinear end-to-end mapping from noisy LR input images to denoised and deblurred HR outputs.
Abstract: Computed tomography (CT) is widely used in screening, diagnosis, and image-guided therapy for both clinical and research purposes. Since CT involves ionizing radiation, an overarching thrust of related technical research is development of novel methods enabling ultrahigh quality imaging with fine structural details while reducing the X-ray radiation. In this paper, we present a semi-supervised deep learning approach to accurately recover high-resolution (HR) CT images from low-resolution (LR) counterparts. Specifically, with the generative adversarial network (GAN) as the building block, we enforce the cycle-consistency in terms of the Wasserstein distance to establish a nonlinear end-to-end mapping from noisy LR input images to denoised and deblurred HR outputs. We also include the joint constraints in the loss function to facilitate structural preservation. In this deep imaging process, we incorporate deep convolutional neural network (CNN), residual learning, and network in network techniques for feature extraction and restoration. In contrast to the current trend of increasing network depth and complexity to boost the CT imaging performance, which limit its real-world applications by imposing considerable computational and memory overheads, we apply a parallel $1\times1$ CNN to compress the output of the hidden layer and optimize the number of layers and the number of filters for each convolutional layer. Quantitative and qualitative evaluations demonstrate that our proposed model is accurate, efficient and robust for super-resolution (SR) image restoration from noisy LR input images. In particular, we validate our composite SR networks on three large-scale CT datasets, and obtain promising results as compared to the other state-of-the-art methods.

242 citations

A. Jain
01 Sep 1976
TL;DR: The Karhunen-Loeve transform for a class of signals is proven to be a set of periodic sine functions, and this Karhunen-Loeve series expansion can be obtained via an FFT algorithm, which could be useful in data compression and other mean-square signal processing applications.
Abstract: The Karhunen-Loeve transform for a class of signals is proven to be a set of periodic sine functions, and this Karhunen-Loeve series expansion can be obtained via an FFT algorithm. The resulting fast algorithm could be useful in data compression and other mean-square signal processing applications.

211 citations
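The linear-algebra fact behind this result can be checked numerically. Roughly, Jain's class of signals corresponds to a boundary-adjusted first-order Markov model whose relevant matrix is tridiagonal Toeplitz, and any symmetric tridiagonal Toeplitz matrix is diagonalized exactly by a sampled sine basis, which is what makes a fast (FFT-based) KLT possible. The values of N and rho below are illustrative choices, not values from the paper.

```python
# Numerical check: a symmetric tridiagonal Toeplitz matrix (standing in for the
# boundary-adjusted first-order Markov model) is diagonalized exactly by a sine
# basis, so its KLT reduces to a fast sine transform. N and rho are illustrative.
import numpy as np

N, rho = 32, 0.95
n = np.arange(N)

# Tridiagonal Toeplitz matrix: (1 + rho^2) on the diagonal, -rho off the diagonal
Q = (1 + rho**2) * np.eye(N) - rho * (np.eye(N, k=1) + np.eye(N, k=-1))

# Its eigenvectors, i.e. the KLT basis for this class of signals
_, eigvecs = np.linalg.eigh(Q)

# Discrete sine basis: s_k[n] = sin(pi * (k+1) * (n+1) / (N+1)), columns normalized
sine_basis = np.sin(np.pi * np.outer(n + 1, n + 1) / (N + 1))
sine_basis /= np.linalg.norm(sine_basis, axis=0)

# Each eigenvector should coincide (up to sign) with one sine basis vector
match = np.abs(eigvecs.T @ sine_basis).max(axis=1)
print(f"worst |inner product| with sine basis: {match.min():.6f}")  # ~1.000000
```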

Posted Content
TL;DR: CompressAI is presented, a platform that provides custom operations, layers, models, and tools to research, develop, and evaluate end-to-end image and video compression codecs; it currently targets still-picture compression and is intended to be extended to the video compression domain soon.
Abstract: This paper presents CompressAI, a platform that provides custom operations, layers, models, and tools to research, develop, and evaluate end-to-end image and video compression codecs. In particular, CompressAI includes pre-trained models and evaluation tools to compare learned methods with traditional codecs. Multiple state-of-the-art models for learned end-to-end compression have thus been reimplemented in PyTorch and trained from scratch. We also report objective comparison results using PSNR and MS-SSIM metrics versus bit rate, using the Kodak image dataset as the test set. Although this framework currently implements models for still-picture compression, it is intended to be extended to the video compression domain soon.

175 citations
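A hedged usage sketch of the platform is below, following CompressAI's documented model-zoo interface; the function and dictionary keys are taken from the project's documentation and may differ across versions, and the image path is a placeholder.

```python
# Hedged sketch: evaluate a pretrained learned image codec from CompressAI's model
# zoo on one image, reporting bits per pixel and PSNR. API names follow the
# project's documentation; the image path is a placeholder.
import math
import torch
from PIL import Image
from torchvision import transforms
from compressai.zoo import bmshj2018_factorized   # pretrained factorized-prior model

net = bmshj2018_factorized(quality=3, pretrained=True).eval()

img = Image.open("kodim01.png").convert("RGB")     # e.g. a Kodak test image
x = transforms.ToTensor()(img).unsqueeze(0)        # shape (1, 3, H, W), values in [0, 1]

with torch.no_grad():
    out = net(x)                                   # returns x_hat and latent likelihoods

num_pixels = x.shape[2] * x.shape[3]
bpp = sum(torch.log(l).sum() for l in out["likelihoods"].values()) / (-math.log(2) * num_pixels)
mse = torch.mean((x - out["x_hat"].clamp(0, 1)) ** 2)
psnr = 10 * math.log10(1.0 / mse.item())
print(f"bpp = {bpp.item():.3f}, PSNR = {psnr:.2f} dB")
```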

Journal ArticleDOI
TL;DR: A skeletonization algorithm and a convolutional neural network (CNN) reduce the impact of shooting angle and environment on recognition performance and improve the accuracy of gesture recognition in complex environments.
Abstract: In the field of human-computer interaction, vision-based gesture recognition methods are widely studied. However, their recognition performance depends to a large extent on the recognition algorithm. A skeletonization algorithm and a convolutional neural network (CNN) are used to reduce the impact of shooting angle and environment on the recognition results and to improve the accuracy of gesture recognition in complex environments. To handle the influence of shooting angle on recognition of the same gesture, the skeletonization algorithm is optimized based on a layer-by-layer stripping concept, so that the key node information in the hand skeleton diagram is extracted. The gesture direction is determined by the spatial coordinate axes of the hand. Based on this, gesture segmentation is implemented to overcome the influence of the environment on the recognition results. To further improve the accuracy of gesture recognition, the ASK gesture database is used to train the convolutional neural network model. The experimental results show that, compared with the SVM method, dictionary learning + sparse representation, the plain CNN method, and other methods, the proposed approach achieves a recognition rate of 96.01%.

136 citations
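As a toy stand-in for the CNN classification stage described above, the PyTorch sketch below classifies small single-channel images such as binarized hand-skeleton masks. The architecture, input size, and number of gesture classes are illustrative placeholders, not the network from the paper.

```python
# Toy CNN classifier for small single-channel gesture images; architecture and
# class count are illustrative placeholders, not the paper's network.
import torch
import torch.nn as nn

class GestureCNN(nn.Module):
    def __init__(self, num_classes: int = 24):       # placeholder number of gestures
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)                          # (B, 32, 16, 16) for 64x64 input
        return self.classifier(x.flatten(1))

# A batch of 64x64 single-channel images, e.g. binarized hand-skeleton masks
batch = torch.randn(8, 1, 64, 64)
logits = GestureCNN()(batch)
print(logits.shape)                                   # torch.Size([8, 24])
```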