Author

Li-Jia Dong

Bio: Li-Jia Dong is an academic researcher from Huaqiao University. The author has contributed to research in topics: Artificial intelligence & Feature (linguistics). The author has co-authored 2 publications.

Papers
Journal Article
TL;DR: In this article, a learning and fusion network of multiple hidden substages is proposed to assess athletic performance by segmenting videos into five substages by a temporal semantic segmentation, and a fully-connected-network-based hidden regression model is built to predict the score of each substage, fusing these scores into the overall score.
Abstract: Many of the existing methods for action quality assessment implement single-stage score regression networks that lack pertinence and rationality for the evaluation task. In this work, our target is to find a reasonable action quality assessment method for sports competitions that conforms to objective evaluation rules and field experience. To achieve this goal, three assessment scenarios, i.e., the overall-score-guided scenario, execution-score-guided scenario, and difficulty-level-based overall-score-guided scenario, are defined. A learning and fusion network of multiple hidden substages is proposed to assess athletic performance by segmenting videos into five substages by a temporal semantic segmentation. The feature of each video segment is extracted from the five feature backbone networks with shared weights, and a fully-connected-network-based hidden regression model is built to predict the score of each substage, fusing these scores into the overall score. We evaluate the proposed method on the UNLV-Diving dataset. The comparison results show that the proposed method based on objective evaluation rules of sports competitions outperforms the regression model directly trained on the overall score. The proposed multiple-substage network is more accurate than the single-stage score regression network and achieves state-of-the-art performance by leveraging objective evaluation rules and field experience that are beneficial for building an accurate and reasonable action quality assessment model.

7 citations
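
The fusion step described in the abstract above can be illustrated with a minimal sketch. The abstract says only that five substage scores are fused into the overall score; the summation and the optional difficulty multiplier used here are assumptions made for illustration, and the function name `fuse_substage_scores` is hypothetical.

```python
from typing import Sequence


def fuse_substage_scores(substage_scores: Sequence[float],
                         difficulty: float = 1.0) -> float:
    """Fuse per-substage scores into one overall score.

    Assumption: fusion is a plain sum scaled by a difficulty level.
    The paper may use a different (e.g. learned) fusion.
    """
    if len(substage_scores) != 5:
        raise ValueError("expected one score for each of the five substages")
    return difficulty * sum(substage_scores)


# Hypothetical substage scores for one dive with difficulty level 2.0.
overall = fuse_substage_scores([1.0, 2.0, 3.0, 4.0, 5.0], difficulty=2.0)
```

In the paper each substage score would come from a regression model applied to the features of that video segment; here the scores are supplied directly.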

Journal Article
TL;DR: This work proposes a new motion assessment system based on a light camera, in which motion assessment is treated as a pattern regression problem over skeleton joint trajectories, and a regression model is built to assess motion quality.
Abstract: Most existing motion assessment systems use contact sensors, infrared sensors, or depth sensors, and few works provide a solution based on a digital camera. To solve this problem, the authors propose a new motion assessment system based on a light camera. In this work, motion assessment is regarded as a pattern regression problem over skeleton joint trajectories. Firstly, the system uses the camera to capture image sequences, and a pose estimation method is used to obtain the body skeleton from each image. Secondly, because motion frequency differs from person to person, the image sequences differ in length, and so do the joint trajectories. The Fourier transform is applied to normalise the trajectories, and its coefficients are used as the joint trajectory feature. Finally, a regression model is built to assess the motion quality. Experimental results and discussion on action video data are used to verify the effectiveness of the system.

1 citation
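
The length-normalisation idea in the abstract above, turning joint trajectories of different lengths into features of the same size through Fourier coefficients, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fourier_feature`, the choice of coefficient magnitudes, the division by sequence length, and the default of eight coefficients are all assumptions.

```python
import cmath
from typing import List, Sequence


def fourier_feature(trajectory: Sequence[float], num_coeffs: int = 8) -> List[float]:
    """Return the magnitudes of the first `num_coeffs` DFT coefficients.

    Dividing by the sequence length keeps features comparable across
    trajectories of different lengths (an assumption of this sketch).
    """
    n = len(trajectory)
    if n == 0:
        raise ValueError("trajectory must not be empty")
    features = []
    for k in range(num_coeffs):
        # Direct evaluation of the k-th discrete Fourier coefficient.
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(trajectory)) / n
        features.append(abs(coeff))
    return features


# Two trajectories of one joint coordinate with different frame counts
# map to feature vectors of the same length.
short_feature = fourier_feature([1.0] * 10, num_coeffs=4)
long_feature = fourier_feature([1.0] * 25, num_coeffs=4)
```

A full system would compute such a vector for every coordinate of every skeleton joint and concatenate them before regression.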

Journal Article
TL;DR: This paper proposes a Gaussian guided frame sequence encoder network for action quality assessment (AQA), in which the image feature of each video frame is extracted by a ResNet model.
Abstract: Can a computer evaluate an athlete's performance automatically? Many action quality assessment (AQA) methods have been proposed in recent years. Limited by the randomness of video sampling and the simple strategy of model training, the performance of the existing AQA methods can still be further improved. To achieve this goal, a Gaussian guided frame sequence encoder network is proposed in this paper. In the proposed method, the image feature of each video frame is extracted by a ResNet model. Then, a frame sequence encoder network is applied to model temporal information and generate an action quality feature. Finally, a fully connected network is designed to predict the action quality score. To train the proposed method effectively, inspired by the final score calculation rule in the Olympic Games, a Gaussian loss function is employed to compute the error between the predicted score and the label score. The proposed method is implemented on the AQA-7 and MTL-AQA datasets. The experimental results confirm that, compared with the state-of-the-art methods, the proposed method achieves better performance. Detailed ablation experiments are conducted to verify the effectiveness of each component in the module.
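
The abstract above does not give the form of its Gaussian loss. As a rough illustration of how a Gaussian-shaped penalty between a predicted score and a label score might look, here is a sketch under stated assumptions: the formula `1 - exp(-(pred - label)^2 / (2 * sigma^2))`, the function name `gaussian_loss`, and the `sigma` parameter are hypothetical choices, not the paper's definition.

```python
import math


def gaussian_loss(predicted: float, label: float, sigma: float = 1.0) -> float:
    """Gaussian-shaped error between a predicted and a label score.

    Assumed form: 0 when the prediction equals the label, approaching 1
    as the prediction moves away from it. `sigma` sets the tolerance.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    squared_error = (predicted - label) ** 2
    return 1.0 - math.exp(-squared_error / (2.0 * sigma ** 2))


exact = gaussian_loss(85.5, 85.5)   # prediction matches the label
near = gaussian_loss(85.0, 85.5)    # small error, small loss
far = gaussian_loss(60.0, 85.5)     # large error, loss close to 1
```

Unlike a squared error, this form saturates for large errors, so a single badly mispredicted sample has a bounded effect on the total.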
Journal Article
TL;DR: In this article, a label-reconstruction-based pseudo-subscore learning (PSL) method is proposed for action quality assessment in sporting events, where the overall score of an action is not only regarded as a quality label but also used as a feature of the training set.
Abstract: Most existing action quality assessment (AQA) methods provide only an overall quality score for the input video and lack an evaluation of each substage of the movement process; thus, these methods cannot provide detailed feedback for users. Moreover, the existing datasets do not provide labels for substage quality assessment. To address these problems, in this work, a new label-reconstruction-based pseudo-subscore learning (PSL) method is proposed for AQA in sporting events. In the proposed method, the overall score of an action is not only regarded as a quality label but also used as a feature of the training set. A label-reconstruction-based learning algorithm is built to generate pseudo-subscore labels for the training set. Moreover, based on the pseudo-subscore labels and overall score labels, a multi-substage AQA model is fine-tuned from the PSL model to predict the action quality score of each substage and the overall score for an athlete. Several ablation experiments are performed to verify the effectiveness of each module. The experimental results show that our approach achieves state-of-the-art performance.

Cited by
Journal Article
TL;DR: In this article, a dataset for vision-based autonomous functional movement screen (FMS) is presented from 45 human subjects of different ages (18-59 years old) executing the following movements: deep squat, hurdle step, in-line lunge, shoulder mobility, active straight raise, trunk stability push-up and rotary stability.
Abstract: This paper presents a dataset for vision-based autonomous Functional Movement Screen (FMS) collected from 45 human subjects of different ages (18-59 years old) executing the following movements: deep squat, hurdle step, in-line lunge, shoulder mobility, active straight raise, trunk stability push-up and rotary stability. Specifically, shoulder mobility was performed only once by different subjects, while the other movements were repeated for three episodes each. Each episode was saved as one record and was annotated from 0 to 3 by three FMS experts. The main strength of our database is twofold. One is the multimodal data provided, including color images, depth images, quaternions, 3D human skeleton joints and 2D pixel trajectories of 32 joints. The other is the multiview data collected from the two synchronized Azure Kinect sensors in front of and on the side of the subjects. Finally, our dataset contains a total of 1812 recordings, with 3624 episodes. The size of the dataset is 190 GB. This dataset provides the opportunity for automatic action quality evaluation of FMS.

7 citations

Journal Article
TL;DR: In this paper, a dataset for vision-based autonomous functional movement screen (FMS) is presented from 45 human subjects of different ages (18-59 years old) executing the following movements: deep squat, hurdle step, in-line lunge, shoulder mobility, active straight raise, trunk stability push-up and rotary stability.
Abstract: This paper presents a dataset for vision-based autonomous Functional Movement Screen (FMS) collected from 45 human subjects of different ages (18-59 years old) executing the following movements: deep squat, hurdle step, in-line lunge, shoulder mobility, active straight raise, trunk stability push-up and rotary stability. Specifically, shoulder mobility was performed only once by different subjects, while the other movements were repeated for three episodes each. Each episode was saved as one record and was annotated from 0 to 3 by three FMS experts. The main strength of our database is twofold. One is the multimodal data provided, including color images, depth images, quaternions, 3D human skeleton joints and 2D pixel trajectories of 32 joints. The other is the multiview data collected from the two synchronized Azure Kinect sensors in front of and on the side of the subjects. Finally, our dataset contains a total of 1812 recordings, with 3624 episodes. The size of the dataset is 190 GB. This dataset provides the opportunity for automatic action quality evaluation of FMS.

6 citations

Journal Article
TL;DR: This paper proposes a skeleton-based deep pose feature learning method to automatically evaluate complicated activities in long-duration sports videos, such as figure skating and artistic gymnastics.

3 citations

Journal Article
TL;DR: This paper explores the future development path of college health education and its impact on students' sports exercise by combining an artificial intelligence (AI) algorithm with intelligent robotics technology to acquire and analyze students' sports exercise behaviors.
Abstract: This study aims to explore the future development path of the college health education and health education's impact on students' sports exercise. Specifically, artificial intelligence (AI) algorithm is combined with intelligent robotics technology to acquire and analyze students' sports exercise behaviors. As a result, a new development model is formulated for college health education. First, it explores students' sports exercise and health education situation in Chinese higher institutions and uncovers the underlying problems. Then it puts forward the corresponding modification suggestions. Second, the AI algorithm and the Kinect sensor-mounted intelligent robot capture the human skeleton features to obtain smooth skeleton joint points data. At the same time, a visual perception human motion recognition (HMR) algorithm is established based on the Hidden Markov Model (HMM). Afterward, the proposed HMM-based HMR algorithm is used to recognize students' sports exercise motions by analyzing human motion skeleton images. The experimental outcomes suggest that the maximum reconstruction error of the HMR algorithm is 10 mm, and the compression ratio is between 5 and 10; the HMR rate is more than 96%. Compared with similar algorithms, the proposed visual perception HMR algorithm depends less on the number of training samples. It can achieve a high recognition rate given only a relatively few samples. Therefore, the proposed (AI + intelligent robot)-enabled HMM-based HMR algorithm can effectively identify the behavior characteristics of students in sports exercise. This study can provide a reference for exploring college students' health education development path.

1 citation
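
HMM-based motion recognition of the kind described in the abstract above is commonly done by scoring an observation sequence under one HMM per motion class and picking the most likely class. The sketch below shows that idea with the standard forward algorithm over discrete observation symbols. It is an illustration under assumptions, not the paper's algorithm: the model parameters, the class names, and the quantisation of skeleton features into symbols are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# An HMM as (initial probabilities, transition matrix, emission matrix).
HMM = Tuple[List[float], List[List[float]], List[List[float]]]


def sequence_likelihood(hmm: HMM, observations: Sequence[int]) -> float:
    """Forward algorithm: probability of the observation sequence under the HMM."""
    if not observations:
        raise ValueError("observations must not be empty")
    start, trans, emit = hmm
    n_states = len(start)
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)


def recognise(models: Dict[str, HMM], observations: Sequence[int]) -> str:
    """Return the name of the motion class whose HMM best explains the sequence."""
    return max(models, key=lambda name: sequence_likelihood(models[name], observations))


# Hypothetical single-state models over two quantised pose symbols (0 and 1).
models = {
    "squat": ([1.0], [[1.0]], [[0.9, 0.1]]),
    "jump": ([1.0], [[1.0]], [[0.1, 0.9]]),
}
predicted_motion = recognise(models, [0, 0, 1])
```

For long sequences a practical implementation would work with log-probabilities or rescale `alpha` at each step to avoid numerical underflow.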

Journal Article
TL;DR: This paper proposes a Gaussian guided frame sequence encoder network for action quality assessment (AQA), in which the image feature of each video frame is extracted by a ResNet model.
Abstract: Can a computer evaluate an athlete's performance automatically? Many action quality assessment (AQA) methods have been proposed in recent years. Limited by the randomness of video sampling and the simple strategy of model training, the performance of the existing AQA methods can still be further improved. To achieve this goal, a Gaussian guided frame sequence encoder network is proposed in this paper. In the proposed method, the image feature of each video frame is extracted by a ResNet model. Then, a frame sequence encoder network is applied to model temporal information and generate an action quality feature. Finally, a fully connected network is designed to predict the action quality score. To train the proposed method effectively, inspired by the final score calculation rule in the Olympic Games, a Gaussian loss function is employed to compute the error between the predicted score and the label score. The proposed method is implemented on the AQA-7 and MTL-AQA datasets. The experimental results confirm that, compared with the state-of-the-art methods, the proposed method achieves better performance. Detailed ablation experiments are conducted to verify the effectiveness of each component in the module.