Author

Huiyu Duan

Bio: Huiyu Duan is an academic researcher from Shanghai Jiao Tong University. The author has contributed to research in topics: Computer science & Engineering. The author has an h-index of 8, and has co-authored 16 publications receiving 242 citations.
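For reference, the h-index reported above is the largest h such that the author has h papers with at least h citations each. A minimal Python sketch of the computation, using hypothetical citation counts rather than the author's actual per-paper data:

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    cites = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(cites, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical per-paper citation counts (illustrative, not real data).
print(h_index([98, 91, 61, 58, 43, 12, 9, 8, 8, 7, 5, 3, 2, 1, 1, 0]))  # -> 8
```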

Papers
Proceedings ArticleDOI
27 May 2018
TL;DR: A perceptual omnidirectional image quality assessment (IQA) study, motivated by the importance of providing a good quality of experience in VR environments, which yields new observations that differ from traditional IQA.
Abstract: Omnidirectional images and videos can provide an immersive experience of real-world scenes in a Virtual Reality (VR) environment. We present a perceptual omnidirectional image quality assessment (IQA) study in this paper, since it is extremely important to provide a good quality of experience in the VR environment. We first establish an omnidirectional IQA (OIQA) database, which includes 16 source images and 320 distorted images degraded by 4 commonly encountered distortion types, namely JPEG compression, JPEG2000 compression, Gaussian blur and Gaussian noise. Then a subjective quality evaluation study is conducted on the OIQA database in the VR environment. Considering that humans can only see a part of the scene at any one moment in the VR environment, visual attention becomes extremely important. Thus we also track head and eye movement data during the quality rating experiments. The original and distorted omnidirectional images, subjective quality ratings, and the head and eye movement data together constitute the OIQA database. State-of-the-art full-reference (FR) IQA measures are tested on the OIQA database, and some new observations different from traditional IQA are made. The OIQA database will be released to facilitate further research.

98 citations
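As a rough illustration of how the four distortion types described above are typically generated for such a database (the exact codec settings and distortion levels used for OIQA are not specified here, so all parameters below are placeholders), a minimal OpenCV sketch:

```python
import cv2
import numpy as np

def distort(img, kind, level):
    """Apply one of the four OIQA distortion types. Parameter values are
    illustrative placeholders, not the levels used in the actual database."""
    if kind == "jpeg":
        ok, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, level])
        return cv2.imdecode(buf, cv2.IMREAD_COLOR)
    if kind == "jp2k":
        ok, buf = cv2.imencode(".jp2", img,
                               [cv2.IMWRITE_JPEG2000_COMPRESSION_X1000, level])
        return cv2.imdecode(buf, cv2.IMREAD_COLOR)
    if kind == "blur":
        return cv2.GaussianBlur(img, (0, 0), sigmaX=level)
    if kind == "noise":
        noisy = img.astype(np.float32) + np.random.normal(0, level, img.shape)
        return np.clip(noisy, 0, 255).astype(np.uint8)
    raise ValueError(kind)

src = cv2.imread("omnidirectional_source.png")  # hypothetical source image
jpeg_heavy = distort(src, "jpeg", 20)           # heavy JPEG compression
```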

Journal ArticleDOI
TL;DR: This paper builds a compressed VR image quality (CVIQ) database and proposes a multi-channel convolutional neural network (CNN) for blind 360-degree image quality assessment (MC360IQA), which achieves the best performance among state-of-the-art full-reference and no-reference image quality assessment (IQA) models on the CVIQ database and other available 360-degree IQA databases.
Abstract: 360-degree images/videos have been increasing dramatically in recent years. Their omnidirectional view results in high resolutions, which makes 360-degree images/videos difficult to transmit and store. To deal with this problem, video coding technologies are used to compress the omnidirectional content, but they introduce compression distortion. Therefore, it is important to study how popular coding technologies affect the quality of 360-degree images. In this paper, we present a study on both subjective and objective quality assessment of compressed virtual reality (VR) images. We first build a compressed VR image quality (CVIQ) database including 16 reference images and 528 compressed ones produced with three prevailing coding technologies. Then, we propose a multi-channel convolutional neural network (CNN) for blind 360-degree image quality assessment (MC360IQA). To be consistent with the visual content seen in the VR device, we project each 360-degree image into six viewport images, which are adopted as inputs of the proposed model. MC360IQA consists of two parts: a multi-channel CNN and an image quality regressor. The multi-channel CNN includes six parallel hyper-ResNet34 networks, where the hyper structure is used to incorporate the features from intermediate layers. The image quality regressor fuses the features and regresses them to final scores. The experimental results show that our model achieves the best performance among state-of-the-art full-reference (FR) and no-reference (NR) image quality assessment (IQA) models on the CVIQ database and other available 360-degree IQA databases.

91 citations
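A much-simplified PyTorch sketch of the six-viewport, multi-channel idea follows. It replaces the paper's hyper-ResNet34 branches with plain ResNet34 feature extractors that share weights, so it illustrates the architecture rather than faithfully reimplementing MC360IQA:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet34

class MultiChannelIQA(nn.Module):
    """Sketch of a six-viewport multi-channel IQA network. Unlike the paper's
    hyper-ResNet34, this uses only final ResNet34 features, and the six
    branches share weights for brevity."""
    def __init__(self):
        super().__init__()
        backbone = resnet34(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop fc
        self.regressor = nn.Sequential(
            nn.Linear(6 * 512, 512), nn.ReLU(), nn.Linear(512, 1))

    def forward(self, viewports):            # viewports: (B, 6, 3, H, W)
        feats = [self.features(viewports[:, i]).flatten(1) for i in range(6)]
        return self.regressor(torch.cat(feats, dim=1)).squeeze(1)

model = MultiChannelIQA()
scores = model(torch.randn(2, 6, 3, 224, 224))  # predicted quality scores
```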

Journal ArticleDOI
05 Jan 2018-PLOS ONE
TL;DR: The overall performance of the proposed technique is acceptable for BR and HR estimation during nighttime, with almost all data points lying within their respective 95% limits of agreement in the Bland-Altman analysis.
Abstract: To achieve simultaneous and unobtrusive breathing rate (BR) and heart rate (HR) measurements during nighttime, we leverage a far-infrared imager and an infrared camera equipped with an IR-Cut lens and an infrared lighting array to develop a dual-camera imaging system. A custom-built cascade face classifier, containing the conventional AdaBoost model and a fully convolutional network trained on 32K images, was used to detect the face region in registered infrared images. The region of interest (ROI), inclusive of the mouth and nose regions, was then confirmed by discriminative regression and coordinate conversions of three selected landmarks. Subsequently, a tracking algorithm based on spatio-temporal context learning was applied to follow the ROI in the thermal video, and the raw signal was synchronously extracted. Finally, a custom-made time-domain signal analysis approach was developed for the determination of BR and HR. A dual-mode sleep video database, including videos obtained in environments where illumination intensity ranged from 0 to 3 lux, was constructed to evaluate the effectiveness of the proposed system and algorithms. In linear regression analysis, a coefficient of determination (R²) of 0.831 was observed between the measured BR and the reference BR, and this value was 0.933 for HR measurement. In addition, the Bland-Altman plots of BR and HR demonstrated that almost all data points were located within their respective 95% limits of agreement. Consequently, the overall performance of the proposed technique is acceptable for BR and HR estimation during nighttime.

61 citations
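A minimal sketch of the final step above, estimating BR from a raw ROI intensity trace by band-pass filtering and peak counting. This is a generic approach under assumed parameters, not the paper's exact time-domain algorithm:

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def breathing_rate(signal, fs):
    """Estimate breathing rate (breaths/min) from a raw ROI intensity trace.
    Simple band-pass + peak-counting sketch, not the paper's exact method."""
    # Typical adult breathing lies roughly in 0.1-0.5 Hz (6-30 breaths/min).
    b, a = butter(3, [0.1, 0.5], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, signal)
    peaks, _ = find_peaks(filtered, distance=fs * 2)  # >= 2 s between breaths
    duration_min = len(signal) / fs / 60.0
    return len(peaks) / duration_min

fs = 25  # assumed thermal video frame rate
t = np.arange(0, 60, 1 / fs)
trace = np.sin(2 * np.pi * 0.25 * t) + 0.1 * np.random.randn(t.size)
print(breathing_rate(trace, fs))  # ~15 breaths/min for a 0.25 Hz signal
```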

Proceedings ArticleDOI
18 Jun 2019
TL;DR: Based on this dataset, researchers can analyze the visual traits of children with ASD, design specialized visual attention models to promote research in related fields, and build models to identify individuals with ASD.
Abstract: Social difficulties are the hallmark features of Autism Spectrum Disorder (ASD) and can lead to atypical visual attention towards stimuli. Eye movements encode rich information about the attention and psychological factors of an individual, which could help to characterize the traits of ASD. Learning the atypical eye movements of individuals with ASD towards various stimuli is important and has many application scenarios. However, due to the lack of open datasets, research in this sense is still limited. In this work, we present an open dataset of eye movements of children with Autism Spectrum Disorder. It consists of 300 natural scene images and the corresponding eye movement data collected from 14 children with ASD and 14 healthy controls. In particular, fixation maps and scanpaths are available in the dataset. Based on this dataset, researchers can analyze the visual traits of children with ASD and design specialized visual attention models to promote research in related fields, as well as models to identify individuals with ASD. The dataset can be accessed at http://doi.org/10.5281/zenodo.2647418

58 citations
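A brief sketch of how a continuous fixation map can be rendered from discrete fixation points of the kind this dataset provides; the coordinates and smoothing bandwidth below are hypothetical:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_map(fixations, height, width, sigma=25):
    """Render a continuous fixation (saliency) map from discrete fixation
    points, as typically done for eye-tracking datasets. `fixations` is a
    list of (x, y) pixel coordinates; sigma is a smoothing width in pixels."""
    fmap = np.zeros((height, width), dtype=np.float64)
    for x, y in fixations:
        if 0 <= int(y) < height and 0 <= int(x) < width:
            fmap[int(y), int(x)] += 1.0
    fmap = gaussian_filter(fmap, sigma=sigma)
    return fmap / fmap.max() if fmap.max() > 0 else fmap

# Hypothetical fixations of one child on a 600x800 image.
salmap = fixation_map([(400, 300), (410, 310), (120, 500)], 600, 800)
```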

Proceedings ArticleDOI
Huiyu Duan, Guangtao Zhai, Xiaokang Yang, Li Duo, Wenhan Zhu
01 May 2017
TL;DR: A new database, the Immersive Video Quality Assessment Database 2017 (IVQAD 2017), intended for immersive video quality assessment in a virtual reality environment, containing 10 raw videos and 150 distorted videos.
Abstract: This paper presents a new database, the Immersive Video Quality Assessment Database 2017 (IVQAD 2017), intended for immersive video quality assessment in a virtual reality environment. Video quality assessment (VQA) plays an important role in video research fields. Nowadays, virtual reality technology is widely used, and playing videos in virtual reality visual systems is becoming more and more popular. However, existing research in the VQA field mainly focuses on traditional videos. In this paper, we build the IVQAD, which contains 10 raw videos and 150 distorted videos. Bit rate, frame rate and resolution were considered as quality degradation factors. All the videos were encoded with MPEG-4. Subjects were asked to assess the videos in a virtual reality environment, and mean opinion scores (MOS) were computed from their ratings. Using IVQAD 2017, researchers can explore the influence of resolution, video compression and video packet loss on the quality of immersive videos.

43 citations
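A minimal sketch of computing MOS from raw subjective ratings; the z-score outlier screen below is a simplification of standard subject-rejection procedures (e.g., ITU-R BT.500), not necessarily the procedure used for IVQAD:

```python
import numpy as np

def mos(ratings, z_thresh=2.0):
    """Mean opinion score per video from a subjects-by-videos rating matrix.
    Uses a simple per-video z-score screen for outlier ratings, not the full
    ITU-R BT.500 subject-rejection procedure."""
    r = np.asarray(ratings, dtype=np.float64)      # shape: (subjects, videos)
    mean, std = r.mean(axis=0), r.std(axis=0) + 1e-9
    r = np.where(np.abs(r - mean) <= z_thresh * std, r, np.nan)  # drop outliers
    return np.nanmean(r, axis=0)

# Hypothetical ratings from 4 subjects on 3 videos (5-point scale).
print(mos([[4, 5, 2], [5, 5, 1], [4, 4, 2], [1, 5, 2]]))
```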


Cited by
Journal ArticleDOI
TL;DR: This survey provides a general overview of classical algorithms and recent progress in the field of perceptual image quality assessment, and describes the performance of state-of-the-art quality measures for visual signals.
Abstract: Perceptual quality assessment plays a vital role in visual communication systems owing to the existence of quality degradations introduced in various stages of visual signal acquisition, compression, transmission and display. Quality assessment for visual signals can be performed subjectively and objectively, and objective quality assessment is usually preferred owing to its high efficiency and easy deployment. A large number of subjective and objective visual quality assessment studies have been conducted during recent years. In this survey, we give an up-to-date and comprehensive review of these studies. Specifically, the frequently used subjective image quality assessment databases are first reviewed, as they serve as the validation set for the objective measures. Second, the objective image quality assessment measures are classified and reviewed according to the applications and the methodologies utilized in the quality measures. Third, the performances of the state-of-the-art quality measures for visual signals are compared with an introduction of the evaluation protocols. This survey provides a general overview of classical algorithms and recent progress in the field of perceptual image quality assessment.

281 citations
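The evaluation protocols mentioned above usually boil down to correlating objective scores with subjective MOS. A minimal sketch using SciPy (published protocols often additionally fit a nonlinear logistic mapping before computing PLCC, omitted here):

```python
from scipy.stats import pearsonr, spearmanr

def evaluate_iqa(predicted, mos):
    """Standard IQA evaluation: Pearson linear correlation (PLCC) for
    prediction accuracy and Spearman rank-order correlation (SROCC) for
    monotonicity between objective scores and subjective MOS."""
    plcc, _ = pearsonr(predicted, mos)
    srocc, _ = spearmanr(predicted, mos)
    return plcc, srocc

# Hypothetical objective scores and MOS for five images.
print(evaluate_iqa([0.9, 0.7, 0.45, 0.3, 0.1], [4.8, 4.1, 3.0, 2.2, 1.1]))
```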

Journal ArticleDOI
TL;DR: This article reviews both datasets and visual attention modelling approaches for 360° video/image, surveys subjective and objective visual quality assessment, and overviews compression approaches, which utilize either spherical characteristics or visual attention models.
Abstract: Nowadays, 360° video/image has become increasingly popular and drawn great attention. The spherical viewing range of 360° video/image accounts for huge amounts of data, which poses challenges to 360° video/image processing in solving the bottlenecks of storage, transmission, etc. Accordingly, recent years have witnessed the explosive emergence of works on 360° video/image processing. In this article, we review the state-of-the-art works on 360° video/image processing from the aspects of perception, assessment and compression. First, this article reviews both datasets and visual attention modelling approaches for 360° video/image. Second, we survey the related works on both subjective and objective visual quality assessment (VQA) of 360° video/image. Third, we overview the compression approaches for 360° video/image, which utilize either the spherical characteristics or visual attention models. Finally, we summarize this overview article and outline future research trends in 360° video/image processing.

191 citations
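As one concrete example of a method that exploits spherical characteristics, WS-PSNR weights each pixel of an equirectangular frame by the cosine of its latitude, so that over-sampled polar regions contribute less. A minimal single-channel sketch:

```python
import numpy as np

def ws_psnr(ref, dist, max_val=255.0):
    """WS-PSNR for equirectangular frames: PSNR with each pixel weighted by
    cos(latitude), a common way to account for the sphere-to-plane
    over-sampling near the poles (single-channel sketch)."""
    h, w = ref.shape
    rows = np.arange(h)
    weights = np.cos((rows + 0.5 - h / 2) * np.pi / h)        # per-row weight
    wmap = np.tile(weights[:, None], (1, w))
    wmse = np.sum(wmap * (ref.astype(np.float64) - dist) ** 2) / np.sum(wmap)
    return 10 * np.log10(max_val ** 2 / wmse)

ref = np.random.randint(0, 256, (512, 1024)).astype(np.float64)
dist = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255)
print(ws_psnr(ref, dist))
```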

Posted Content
TL;DR: A novel and efficient Respiratory Simulation Model (RSM) is proposed to bridge the gap between the large amount of training data required and the scarce real-world data, and has great potential to be extended to large-scale applications such as public places, sleep scenarios, and office environments.
Abstract: Research significance: The extended version of this paper has been accepted by the IEEE Internet of Things Journal (DOI: 10.1109/JIOT.2020.2991456); please cite the journal version. During the epidemic prevention and control period, our study can be helpful in prognosis, diagnosis and screening for patients infected with COVID-19 (the novel coronavirus) based on breathing characteristics. According to the latest clinical research, the respiratory pattern of COVID-19 is different from the respiratory patterns of flu and the common cold. One significant symptom that occurs in COVID-19 is tachypnea: people infected with COVID-19 have more rapid respiration. Our study can be utilized to distinguish various respiratory patterns, and our device can preliminarily be put to practical use. Demo videos of this method working in situations with one subject and two subjects can be downloaded online. Research details: Accurate detection of unexpected abnormal respiratory patterns of people in a remote and unobtrusive manner has great significance. In this work, we innovatively capitalize on a depth camera and deep learning to achieve this goal. The challenges in this task are twofold: the amount of real-world data is not enough to train a deep model, and the intra-class variation of different types of respiratory patterns is large while the inter-class variation is small. In this paper, considering the characteristics of actual respiratory signals, a novel and efficient Respiratory Simulation Model (RSM) is first proposed to bridge the gap between the large amount of training data required and the scarce real-world data. The proposed deep model and the modeling ideas have great potential to be extended to large-scale applications such as public places, sleep scenarios, and office environments.

166 citations
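The RSM details are given in the paper itself; the sketch below only illustrates the general idea of synthesizing labeled respiratory waveforms (e.g., normal eupnea vs. rapid tachypnea) when real recordings are scarce. The rates and noise levels are assumptions, not RSM parameters:

```python
import numpy as np

def simulate_breathing(pattern, fs=20, seconds=30):
    """Synthesize a labeled respiratory waveform, illustrating the general
    idea of a respiratory simulation model: generate abundant labeled
    training signals when real recordings are scarce. Rates and noise are
    illustrative assumptions, not the paper's RSM parameters."""
    rates = {"eupnea": 0.25, "tachypnea": 0.60}   # breaths per second (Hz)
    t = np.arange(0, seconds, 1 / fs)
    rate = rates[pattern] * (1 + 0.05 * np.random.randn())   # subject variation
    depth = 1.0 + 0.1 * np.sin(2 * np.pi * 0.02 * t)         # slow depth drift
    return depth * np.sin(2 * np.pi * rate * t) + 0.05 * np.random.randn(t.size)

# Build a small synthetic, labeled training set.
dataset = [(simulate_breathing(p), p)
           for p in ("eupnea", "tachypnea") for _ in range(100)]
```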

Journal ArticleDOI
TL;DR: This paper builds a compressed VR image quality (CVIQ) database and proposes a multi-channel convolutional neural network (CNN) for blind 360-degree image quality assessment (MC360IQA), which achieves the best performance among state-of-the-art full-reference and no-reference image quality assessment (IQA) models on the CVIQ database and other available 360-degree IQA databases.
Abstract: 360-degree images/videos have been increasing dramatically in recent years. Their omnidirectional view results in high resolutions, which makes 360-degree images/videos difficult to transmit and store. To deal with this problem, video coding technologies are used to compress the omnidirectional content, but they introduce compression distortion. Therefore, it is important to study how popular coding technologies affect the quality of 360-degree images. In this paper, we present a study on both subjective and objective quality assessment of compressed virtual reality (VR) images. We first build a compressed VR image quality (CVIQ) database including 16 reference images and 528 compressed ones produced with three prevailing coding technologies. Then, we propose a multi-channel convolutional neural network (CNN) for blind 360-degree image quality assessment (MC360IQA). To be consistent with the visual content seen in the VR device, we project each 360-degree image into six viewport images, which are adopted as inputs of the proposed model. MC360IQA consists of two parts: a multi-channel CNN and an image quality regressor. The multi-channel CNN includes six parallel hyper-ResNet34 networks, where the hyper structure is used to incorporate the features from intermediate layers. The image quality regressor fuses the features and regresses them to final scores. The experimental results show that our model achieves the best performance among state-of-the-art full-reference (FR) and no-reference (NR) image quality assessment (IQA) models on the CVIQ database and other available 360-degree IQA databases.

91 citations

Posted Content
TL;DR: Key studies conducted with the aid of DL networks to distinguish ASD are investigated, and important challenges in the automated detection and rehabilitation of ASD are presented.
Abstract: Accurate diagnosis of Autism Spectrum Disorder (ASD) is essential for its management and rehabilitation. Non-invasive neuroimaging techniques provide disease markers and may be leveraged to aid ASD diagnosis. Structural and functional neuroimaging techniques provide physicians substantial information about the structure (anatomy and structural connectivity) and function (activity and functional connectivity) of the brain. Due to the intricate structure and function of the brain, diagnosing ASD with neuroimaging data without exploiting artificial intelligence (AI) techniques is extremely challenging. AI techniques comprise traditional machine learning (ML) approaches and deep learning (DL) techniques. Conventional ML methods employ various feature extraction and classification techniques, but in DL, feature extraction and classification are accomplished intelligently and integrally. In this paper, studies conducted with the aid of DL networks to distinguish ASD are investigated. Rehabilitation tools that support ASD patients utilizing DL networks are also assessed. Finally, we present important challenges in the automated detection and rehabilitation of ASD.

87 citations