
Showing papers in "PeerJ in 2022"


Journal ArticleDOI
12 Apr 2022-PeerJ
TL;DR: By comparing the proposed method with other existing methods tested on the same database, this work addresses the lack of in-depth interactive information and interpretability in model design, which should inspire future improvement and study of natural language reasoning.
Abstract: The whole sentence representation reasoning process simultaneously comprises a sentence representation module and a semantic reasoning module. This paper combines a multi-layer semantic representation network with a deep fusion matching network to overcome the limitations of considering only a sentence representation module or a reasoning model. It proposes a joint optimization method based on multi-layer semantics, called the Semantic Fusion Deep Matching Network (SCF-DMN), to explore the influence of sentence representation and reasoning models on reasoning performance. Experiments on text entailment recognition tasks show that the joint optimization representation reasoning method performs better than existing methods. The sentence representation optimization module and the improved optimization reasoning model each promote reasoning performance when used individually. However, the optimization of the reasoning model has a more significant impact on the final reasoning results. Furthermore, comparing each module’s performance reveals a mutual constraint between the sentence representation module and the reasoning model. This constraint restricts overall performance, so reasoning performance does not superpose linearly. Overall, compared with other existing methods tested on the same database, the proposed method addresses the lack of in-depth interactive information and interpretability in model design, which should inspire future improvement and study of natural language reasoning.

73 citations


Journal ArticleDOI
健 韩
21 Mar 2022-PeerJ
TL;DR: A review of the state of the art in deep learning for computational bioacoustics, in which the author offers a subjective but principled roadmap for computational bioacoustics with deep learning, in order to make the most of future developments in AI and informatics.
Abstract: Animal vocalisations and natural soundscapes are fascinating objects of study, and contain valuable evidence about animal behaviours, populations and ecosystems. They are studied in bioacoustics and ecoacoustics, with signal processing and analysis an important component. Computational bioacoustics has accelerated in recent decades due to the growth of affordable digital sound recording devices, and to huge progress in informatics such as big data, signal processing and machine learning. Methods are inherited from the wider field of deep learning, including speech and image processing. However, the tasks, demands and data characteristics are often different from those addressed in speech or music analysis. There remain unsolved problems, and tasks for which evidence is surely present in many acoustic signals, but not yet realised. In this paper I perform a review of the state of the art in deep learning for computational bioacoustics, aiming to clarify key concepts and identify and analyse knowledge gaps. Based on this, I offer a subjective but principled roadmap for computational bioacoustics with deep learning: topics that the community should aim to address, in order to make the most of future developments in AI and informatics, and to use audio data in answering zoological and ecological questions.

51 citations


Journal ArticleDOI
25 Feb 2022-PeerJ
TL;DR: Markerless methods increase motion capture data versatility, enabling datasets to be re-analyzed using updated pose estimation algorithms and may even provide clinicians with the capability to collect data while patients are wearing normal clothing.
Abstract: Background Markerless motion capture has the potential to perform movement analysis with reduced data collection and processing time compared to marker-based methods. This technology is now starting to be applied for clinical and rehabilitation applications and therefore it is crucial that users of these systems understand both their potential and limitations. This literature review aims to provide a comprehensive overview of the current state of markerless motion capture for both single camera and multi-camera systems. Additionally, this review explores how practical applications of markerless technology are being used in clinical and rehabilitation settings, and examines the future challenges and directions markerless research must explore to facilitate full integration of this technology within clinical biomechanics. Methodology A scoping review is needed to examine this emerging broad body of literature and determine where gaps in knowledge exist; this is key to developing motion capture methods that are cost effective and practically relevant to clinicians, coaches and researchers around the world. Literature searches were performed to examine studies that report accuracy of markerless motion capture methods, explore current practical applications of markerless motion capture methods in clinical biomechanics and identify gaps in our knowledge that are relevant to future developments in this area. Results Markerless methods increase motion capture data versatility, enabling datasets to be re-analyzed using updated pose estimation algorithms and may even provide clinicians with the capability to collect data while patients are wearing normal clothing. While markerless temporospatial measures generally appear to be equivalent to marker-based motion capture, joint center locations and joint angles are not yet sufficiently accurate for clinical applications.
Pose estimation algorithms are approaching error rates similar to those of marker-based motion capture; however, without comparison to a gold standard, such as bi-planar videoradiography, the true accuracy of markerless systems remains unknown. Conclusions Current open-source pose estimation algorithms were never designed for biomechanical applications; therefore, the datasets on which they have been trained are inconsistently and inaccurately labelled. Improvements to the labelling of open-source training data, as well as assessment of markerless accuracy against gold standard methods, will be vital next steps in the development of this technology.

49 citations


Journal ArticleDOI
20 Jan 2022-PeerJ
TL;DR: A detailed survey on the latest developments in spam text detection and classification in social media involving Machine Learning, Deep Learning, and text-based approaches is presented.
Abstract: The presence of spam content in social media is tremendously increasing, and therefore the detection of spam has become vital. Spam content increases as people extensively use social media, e.g., Facebook, Twitter, YouTube, and e-mail. The time people spend on social media keeps growing, especially during the pandemic. Users receive many text messages through social media and cannot recognize the spam content in these messages. Spam messages contain malicious links, apps, fake accounts, fake news, reviews, rumors, etc. To improve social media security, the detection and control of spam text are essential. This paper presents a detailed survey on the latest developments in spam text detection and classification in social media. The various techniques involved in spam detection and classification, involving Machine Learning, Deep Learning, and text-based approaches, are discussed in this paper. We also present the challenges encountered in the identification of spam, along with its control mechanisms and the datasets used in existing work on spam detection.
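The classical machine-learning pipeline surveyed here (bag-of-words features plus a probabilistic classifier) can be sketched in a few lines; the toy messages and labels below are invented for illustration and are not from the survey:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy corpus (invented): a handful of short spam and ham messages
texts = [
    "win free money now", "claim your free prize money",
    "free cash win win", "meeting moved to monday",
    "please review the quarterly report", "lunch at noon tomorrow",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Bag-of-words features + multinomial Naive Bayes, a common spam baseline
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
clf = MultinomialNB().fit(X, labels)

print(clf.predict(vectorizer.transform(["free money prize"])))  # ['spam']
```

Real systems replace the toy corpus with large labelled datasets and often swap the classifier for the deep-learning models the survey discusses.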

31 citations


Journal ArticleDOI
20 Jan 2022-PeerJ
TL;DR: This article showed that having few random effects levels does not strongly influence the parameter estimates or uncertainty around those estimates for fixed effects terms, at least in the case presented here, and that the coverage probability of fixed effects estimates is sample size dependent.
Abstract: As linear mixed-effects models (LMMs) have become a widespread tool in ecology, the need to guide the use of such tools is increasingly important. One common guideline is that one needs at least five levels of the grouping variable associated with a random effect. Having so few levels makes the estimation of the variance of random effects terms (such as ecological sites, individuals, or populations) difficult, but it need not muddy one’s ability to estimate fixed effects terms—which are often of primary interest in ecology. Here, I simulate datasets and fit simple models to show that having few random effects levels does not strongly influence the parameter estimates or uncertainty around those estimates for fixed effects terms—at least in the case presented here. Instead, the coverage probability of fixed effects estimates is sample size dependent. LMMs including low-level random effects terms may come at the expense of increased singular fits, but this did not appear to influence coverage probability or RMSE, except in low sample size (N = 30) scenarios. Thus, it may be acceptable to use fewer than five levels of random effects if one is not interested in making inferences about the random effects terms (i.e. when they are ‘nuisance’ parameters used to group non-independent data), but further work is needed to explore alternative scenarios. Given the widespread accessibility of LMMs in ecology and evolution, future simulation studies and further assessments of these statistical methods are necessary to understand the consequences both of violating and of routinely following simple guidelines.
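The simulation logic described in this abstract can be approximated in a few lines. The sketch below, with invented parameter values, uses a plain least-squares fit as a stand-in for the LMM's fixed-effects part; it is not the author's actual simulation code, but it illustrates the claim that a fixed-effect slope can be recovered even with only three grouping levels:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate grouped data with only THREE levels of the grouping variable
# (below the "five levels" rule of thumb). True fixed effects: intercept 2.0,
# slope 0.5; group-level random intercepts have SD 1.0 (all values invented).
n_groups, n_per_group = 3, 50
group_effects = rng.normal(0.0, 1.0, n_groups)
g = np.repeat(np.arange(n_groups), n_per_group)
x = rng.normal(size=n_groups * n_per_group)
y = 2.0 + 0.5 * x + group_effects[g] + rng.normal(0.0, 0.5, n_groups * n_per_group)

# A plain least-squares fit (standing in for the LMM's fixed-effects part):
# the slope estimate lands close to its true value despite the few levels.
slope, intercept = np.polyfit(x, y, 1)
print(round(slope, 2))
```

A full replication would fit an actual mixed model (e.g. lme4 in R) and repeat the simulation across many sample sizes, as the paper does.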

28 citations


Journal ArticleDOI
03 Mar 2022-PeerJ
TL;DR: This study compares eleven machine learning algorithms to determine which predicts thyroid risk most accurately and shows that the ANN classifier, with an F1-score of 0.957, outperforms the other algorithms.
Abstract: Thyroid disease is the general concept for a medical problem that prevents one’s thyroid from producing enough hormones. Thyroid disease can affect everyone—men, women, children, adolescents, and the elderly. Thyroid disorders are detected by blood tests, which are notoriously difficult to interpret due to the enormous amount of data necessary to forecast results. For this reason, this study compares eleven machine learning algorithms to determine which one predicts thyroid risk most accurately. This study utilizes the Sick-euthyroid dataset, acquired from the University of California, Irvine’s machine learning repository, for this purpose. Since the dataset is highly imbalanced, with most samples belonging to a single class, the accuracy score alone does not reliably indicate predictive performance. Thus, the evaluation combines accuracy and recall ratings. Additionally, the F1-score produces a single value that balances precision and recall when an uneven class distribution exists. Finally, the F1-score is utilized to evaluate the performance of the employed machine learning algorithms, as it is one of the most effective output measurements for unbalanced classification problems. The experiment shows that the ANN classifier, with an F1-score of 0.957, outperforms the other algorithms.
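Since the abstract leans on the F1-score for imbalanced classes, here is a minimal sketch of how precision, recall and F1 combine; the toy labels are invented for illustration:

```python
import numpy as np

def f1_from_counts(y_true, y_pred):
    """Precision, recall and F1 for the positive class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy imbalanced labels: 6 of 8 samples are positive
p, r, f1 = f1_from_counts([1, 1, 1, 0, 1, 0, 1, 1],
                          [1, 0, 1, 0, 1, 1, 1, 1])
print(round(f1, 3))  # 0.833
```

Because F1 is the harmonic mean of precision and recall, a classifier that simply predicts the majority class everywhere scores poorly even when its raw accuracy looks high.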

28 citations


Journal ArticleDOI
03 Jun 2022-PeerJ
TL;DR: A formulaic language identification model based on GCN fusing associated information in which each sentence is constructed into a graph in which the nodes are part-of-speech features and semantic features of the words in the sentence and the edges between nodes are constructed according to mutual information and dependency syntactic relation.
Abstract: Formulaic language is a general term for ready-made structures in a language. It usually has fixed grammatical structure, stable language expression meaning and specific use context. The use of formulaic language can coordinate sentence generation in the process of writing and communication, and can significantly improve the idiomaticity and logic of machine translation, intelligent question answering and so on. New formulaic language is generated almost every day, and how to accurately identify them is a topic worthy of research. To this end, this article proposes a formulaic language identification model based on GCN fusing associated information. The innovation is that each sentence is constructed into a graph in which the nodes are part-of-speech features and semantic features of the words in the sentence and the edges between nodes are constructed according to mutual information and dependency syntactic relation. On this basis, the graph convolutional neural network is adopted to extract the associated information between words to mine deeper grammatical features. Therefore, it can improve the accuracy of formulaic language identification. The experimental results show that the model in this article is superior to the classical formulaic language identification model in terms of accuracy, recall and F1-score. It lays a foundation for the follow-up research of formulaic language identification tasks.
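The mutual-information edges described above can be illustrated with a minimal pointwise-mutual-information (PMI) computation over sentence-level co-occurrence; the toy corpus below is invented and stands in for real part-of-speech and dependency features:

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus of tokenised sentences (invented for illustration)
sentences = [
    ["in", "terms", "of"], ["in", "terms", "of"],
    ["in", "front", "of"], ["terms", "and", "conditions"],
]

# Count sentence-level co-occurrence of unordered word pairs
pair_counts = Counter(frozenset(p) for s in sentences
                      for p in combinations(set(s), 2))
n_sents = len(sentences)

def pmi(w1, w2):
    """Pointwise mutual information of two words co-occurring in a sentence:
    log( p(w1, w2) / (p(w1) * p(w2)) ). Positive PMI suggests an edge."""
    p_joint = pair_counts[frozenset((w1, w2))] / n_sents
    p1 = sum(1 for s in sentences if w1 in s) / n_sents
    p2 = sum(1 for s in sentences if w2 in s) / n_sents
    return math.log(p_joint / (p1 * p2))

print(round(pmi("in", "of"), 3))  # positive: "in ... of" co-occur often
```

In the paper's setting, word pairs with high PMI (or a dependency relation) would receive an edge in the per-sentence graph fed to the GCN.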

27 citations


Journal ArticleDOI
29 Apr 2022-PeerJ
TL;DR: Experimental results show that the proposed improved firefly algorithm has significant potential for tackling the machine learning hyper-parameter optimisation challenge and can improve the classification accuracy and average precision of network intrusion detection systems.
Abstract: The research proposed in this article presents a novel improved version of the widely adopted firefly algorithm and its application for tuning and optimising XGBoost classifier hyper-parameters for network intrusion detection. One of the greatest issues in the domain of network intrusion detection systems is the relatively high false positive and false negative rates. The proposed study addresses this challenge by using an XGBoost classifier optimised with the improved firefly algorithm. Following established practice in the modern literature, the proposed improved firefly algorithm was first validated on 28 well-known CEC2013 benchmark instances, and a comparative analysis with the original firefly algorithm and other state-of-the-art metaheuristics was conducted. Afterwards, the devised method was adopted and tested for XGBoost hyper-parameter optimisation, and the tuned classifier was tested on the widely used benchmark NSL-KDD dataset and the more recent UNSW-NB15 dataset for network intrusion detection. The obtained experimental results show that the proposed metaheuristic has significant potential for tackling the machine learning hyper-parameter optimisation challenge and that it can be used to improve the classification accuracy and average precision of network intrusion detection systems.
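For readers unfamiliar with the base metaheuristic, a bare-bones firefly algorithm (not the paper's improved variant) minimising a toy objective might look like this; all parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Simple benchmark objective to minimise (optimum at the origin)."""
    return float(np.sum(x ** 2))

# Minimal firefly algorithm: brighter (lower-cost) fireflies attract dimmer
# ones; attractiveness decays exponentially with squared distance.
n, dim, iters = 20, 2, 100
alpha, beta0, gamma = 0.2, 1.0, 1.0          # step size, base attraction, decay
pos = rng.uniform(-5, 5, (n, dim))

best_x, best_f = None, float("inf")
for _ in range(iters):
    cost = np.array([sphere(p) for p in pos])
    for i in range(n):
        for j in range(n):
            if cost[j] < cost[i]:            # j is brighter: move i toward j
                r2 = np.sum((pos[i] - pos[j]) ** 2)
                beta = beta0 * np.exp(-gamma * r2)
                pos[i] += beta * (pos[j] - pos[i]) + alpha * (rng.random(dim) - 0.5)
        f = sphere(pos[i])
        if f < best_f:                        # track the best solution seen
            best_x, best_f = pos[i].copy(), f
    pos = np.clip(pos, -5, 5)

print(best_f < 1.0)  # the swarm should get close to the optimum
```

In the paper's setting, each "position" would encode XGBoost hyper-parameters and the objective would be cross-validated detection performance rather than the sphere function.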

25 citations


Journal ArticleDOI
18 Apr 2022-PeerJ
TL;DR: This research work proposes a real-time method for detecting eye blinks in a video sequence, and results show the suggested approach is more efficient than the state-of-the-art technique.
Abstract: Blink detection is an important technique in a variety of settings, including facial movement analysis and signal processing. However, automatic blink detection is very challenging because of the speed at which blinks occur. This research work proposes a real-time method for detecting eye blinks in a video sequence. Automatic facial landmark detectors are trained on a real-world dataset and demonstrate exceptional resilience to a wide range of environmental factors, including lighting conditions, facial emotions, and head position. For each video frame, the proposed method calculates the facial landmark locations and extracts the vertical distance between the eyelids from those positions. Our results show that the recognized landmarks are sufficiently accurate to determine the degree of eye opening and closing consistently. The proposed algorithm estimates the facial landmark positions and extracts a single scalar quantity, the Modified Eye Aspect Ratio (Modified EAR), characterizing eye closeness in each frame. Finally, blinks are detected by thresholding the Modified EAR and identifying blinks as a pattern of EAR values in a short temporal window. Results on a typical dataset show that the suggested approach is more efficient than the state-of-the-art technique.
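The classical Eye Aspect Ratio underlying the paper's Modified EAR (the exact modification is not given in the abstract) can be sketched as follows, with invented landmark coordinates for an open and a nearly closed eye:

```python
import numpy as np

def eye_aspect_ratio(pts):
    """Classical EAR from six eye landmarks p1..p6:
    (|p2-p6| + |p3-p5|) / (2 * |p1-p4|).
    p1/p4 are the eye corners; p2,p3 / p6,p5 are upper / lower eyelid points."""
    p1, p2, p3, p4, p5, p6 = (np.asarray(p, dtype=float) for p in pts)
    vert = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horiz = np.linalg.norm(p1 - p4)
    return vert / (2.0 * horiz)

# Invented landmark coordinates for illustration
open_eye   = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.05), (2, 0.05), (3, 0), (2, -0.05), (1, -0.05)]

print(round(eye_aspect_ratio(open_eye), 2),
      round(eye_aspect_ratio(closed_eye), 2))  # 0.67 0.03
```

A blink then shows up as a brief dip of the EAR below a threshold across a few consecutive frames, which is the temporal-window pattern the abstract describes.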

21 citations


Journal ArticleDOI
19 Jul 2022-PeerJ
TL;DR: In this paper, the state-of-the-art research directions in self-supervised learning approaches for image data are reviewed, with a concentration on their applications in the field of medical imaging analysis.
Abstract: The scarcity of high-quality annotated medical imaging datasets is a major problem that collides with machine learning applications in the field of medical imaging analysis and impedes its advancement. Self-supervised learning is a recent training paradigm that enables learning robust representations without the need for human annotation, which can be considered an effective solution for the scarcity of annotated medical data. This article reviews the state-of-the-art research directions in self-supervised learning approaches for image data, with a concentration on their applications in the field of medical imaging analysis. The article covers a set of the most recent self-supervised learning methods from the computer vision field, as they are applicable to medical imaging analysis, and categorizes them as predictive, generative, and contrastive approaches. Moreover, the article covers 40 of the most recent research papers in the field of self-supervised learning in medical imaging analysis, aiming to shed light on recent innovations in the field. Finally, the article concludes with possible future research directions in the field.

21 citations


Journal ArticleDOI
04 Feb 2022-PeerJ
TL;DR: A UAV-based offloading strategy is proposed in which, first, the IoT devices are dynamically clustered considering the limited energy of UAVs and task delays, and second, the UAV hovers over each cluster head to process the offloaded tasks.
Abstract: Internet of Things (IoT) tasks are offloaded to servers located at the edge network to improve the power consumption of IoT devices and the execution times of tasks. However, deploying edge servers can be difficult or even impossible in hostile terrain or emergency areas where the network is down. Therefore, edge servers are mounted on unmanned aerial vehicles (UAVs) to support task offloading in such scenarios. The challenge is that the UAV has limited energy while IoT tasks are delay-sensitive. In this paper, a UAV-based offloading strategy is proposed in which, first, the IoT devices are dynamically clustered considering the limited energy of UAVs and task delays, and second, the UAV hovers over each cluster head to process the offloaded tasks. The optimization problem of dynamically determining the optimal number of clusters and specifying the member tasks of each cluster is modeled as a mixed-integer, nonlinear constrained optimization. A discrete differential evolution (DDE) algorithm with new mutation and crossover operators is proposed for the formulated optimization problem and compared with the particle swarm optimization (PSO) and genetic algorithm (GA) meta-heuristics. Further, the ant colony optimization (ACO) algorithm is employed to identify the shortest path over the cluster heads for the UAV to traverse. The simulation results validate the effectiveness of the proposed offloading strategy in terms of task delays and UAV energy consumption.
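The paper's path-planning step uses ACO; as a much simpler stand-in, a greedy nearest-neighbour tour over hypothetical cluster-head coordinates illustrates the shortest-path-over-heads idea:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical cluster-head coordinates (invented; the paper derives these
# from its dynamic clustering step and routes over them with ACO).
heads = rng.uniform(0, 100, (6, 2))

def nearest_neighbour_tour(points, start=0):
    """Visit every point once, always hopping to the closest unvisited one.
    A greedy heuristic, not ACO: fast but with no optimality guarantee."""
    unvisited = set(range(len(points)))
    tour = [start]
    unvisited.remove(start)
    while unvisited:
        cur = points[tour[-1]]
        nxt = min(unvisited, key=lambda k: np.linalg.norm(points[k] - cur))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

tour = nearest_neighbour_tour(heads)
print(sorted(tour) == list(range(len(heads))))  # every head visited once
```

ACO improves on this by letting many artificial "ants" build tours probabilistically and reinforcing short tours with pheromone, which typically escapes the greedy heuristic's local traps.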

Journal ArticleDOI
07 Jan 2022-PeerJ
TL;DR: This paper presents a framework in which different machine learning classification schemes are employed to detect various types of network attack categories, and shows that classification accuracy improves when the Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance problem.
Abstract: The expeditious growth of the World Wide Web and the rampant flow of network traffic have resulted in a continuous increase of network security threats. Cyber attackers seek to exploit vulnerabilities in network architecture to steal valuable information or disrupt computer resources. Network Intrusion Detection System (NIDS) is used to effectively detect various attacks, thus providing timely protection to network resources from these attacks. To implement NIDS, a stream of supervised and unsupervised machine learning approaches is applied to detect irregularities in network traffic and to address network security issues. Such NIDSs are trained using various datasets that include attack traces. However, due to the advancement in modern-day attacks, these systems are unable to detect the emerging threats. Therefore, NIDS needs to be trained and developed with a modern comprehensive dataset which contains contemporary common and attack activities. This paper presents a framework in which different machine learning classification schemes are employed to detect various types of network attack categories. Five machine learning algorithms: Random Forest, Decision Tree, Logistic Regression, K-Nearest Neighbors and Artificial Neural Networks, are used for attack detection. This study uses a dataset published by the University of New South Wales (UNSW-NB15), a relatively new dataset that contains a large amount of network traffic data with nine categories of network attacks. The results show that the classification models achieved the highest accuracy of 89.29% by applying the Random Forest algorithm. Further improvement in the accuracy of classification models is observed when Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance problem. After applying the SMOTE, the Random Forest classifier showed an accuracy of 95.1% with 24 selected features from the Principal Component Analysis method.
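The SMOTE idea referenced above (synthesising minority samples by interpolating between nearest minority-class neighbours) can be sketched without the imbalanced-learn library; this is a bare-bones illustration with invented toy data, not the study's implementation:

```python
import numpy as np

rng = np.random.default_rng(7)

def smote_like(X_min, n_new, k=3):
    """Generate synthetic minority samples by interpolating each randomly
    picked sample with one of its k nearest minority-class neighbours
    (a bare-bones sketch of the SMOTE idea)."""
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=2)
    np.fill_diagonal(d, np.inf)                 # a point is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]   # k nearest neighbours per sample
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))            # random minority sample
        j = neighbours[i, rng.integers(k)]      # random nearby neighbour
        u = rng.random()                        # interpolation factor in [0, 1)
        out.append(X_min[i] + u * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = rng.normal(size=(10, 2))                # toy minority class
synthetic = smote_like(X_min, n_new=15)
print(synthetic.shape)  # (15, 2)
```

Production code would typically use `imblearn.over_sampling.SMOTE`, which adds per-class sampling strategies and edge-case handling this sketch omits.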

Journal ArticleDOI
15 Mar 2022-PeerJ
TL;DR: CIAlign is a customisable command-line tool for trimming multiple sequence alignments (MSAs) with multiple visualisation options; it aims to give intervention power to the user by offering various options and outputs graphical representations of the alignment before and after processing.
Abstract: Background Throughout biology, multiple sequence alignments (MSAs) form the basis of much investigation into biological features and relationships. These alignments are at the heart of many bioinformatics analyses. However, sequences in MSAs are often incomplete or very divergent, which can lead to poor alignment and large gaps. This slows down computation and can impact conclusions without being biologically relevant. Cleaning the alignment by removing common issues such as gaps, divergent sequences, large insertions and deletions and poorly aligned sequence ends can substantially improve analyses. Manual editing of MSAs is very widespread but is time-consuming and difficult to reproduce. Results We present a comprehensive, user-friendly MSA trimming tool with multiple visualisation options. Our highly customisable command line tool aims to give intervention power to the user by offering various options, and outputs graphical representations of the alignment before and after processing to give the user a clear overview of what has been removed. The main functionalities of the tool include removing regions of low coverage due to insertions, removing gaps, cropping poorly aligned sequence ends and removing sequences that are too divergent or too short. The thresholds for each function can be specified by the user and parameters can be adjusted to each individual MSA. CIAlign is designed with an emphasis on solving specific and common alignment problems and on providing transparency to the user. Conclusion CIAlign effectively removes problematic regions and sequences from MSAs and provides novel visualisation options. This tool can be used to fine-tune alignments for further analysis and processing. The tool is aimed at anyone who wishes to automatically clean up parts of an MSA and those requiring a new, accessible way of visualising large MSAs.
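One CIAlign-style cleaning step, removing alignment columns dominated by gaps, can be sketched as follows (a simplification for illustration; CIAlign's actual functions, thresholds and options differ):

```python
def remove_gappy_columns(seqs, max_gap_frac=0.5):
    """Drop alignment columns whose gap ('-') fraction exceeds the threshold.
    `seqs` is a list of equal-length aligned sequence strings."""
    n_cols = len(seqs[0])
    keep = [c for c in range(n_cols)
            if sum(s[c] == "-" for s in seqs) / len(seqs) <= max_gap_frac]
    return ["".join(s[c] for c in keep) for s in seqs]

# Toy alignment: the third column is mostly gaps and gets removed
msa = ["AC-GT",
       "AC-GA",
       "ACCG-"]
print(remove_gappy_columns(msa))  # ['ACGT', 'ACGA', 'ACG-']
```

The real tool combines several such passes (divergent-sequence removal, end cropping, short-sequence filtering) with user-set thresholds and before/after visualisations.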

Journal ArticleDOI
17 Feb 2022-PeerJ
TL;DR: PCAtest is an R package that implements permutation-based statistical tests to evaluate the overall significance of a PCA, the significance of each PC axis, and of contributions of each observed variable to the significant axes.
Abstract: Principal Component Analysis (PCA) is one of the most broadly used statistical methods for the ordination and dimensionality-reduction of multivariate datasets across many scientific disciplines. Trivial PCs can be estimated from data sets without any correlational structure among the original variables, and traditional criteria for selecting non-trivial PC axes are difficult to implement, partially subjective or based on ad hoc thresholds. PCAtest is an R package that implements permutation-based statistical tests to evaluate the overall significance of a PCA, the significance of each PC axis, and of contributions of each observed variable to the significant axes. Based on simulation and empirical results, I encourage R users to routinely apply PCAtest to test the significance of their PCA before proceeding with the direct interpretation of PC axes and/or the utilization of PC scores in subsequent evolutionary and ecological analyses.
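The permutation-test idea behind PCAtest can be sketched in NumPy (a simplified illustration, not the package's implementation): shuffle each variable independently to destroy correlations, then compare the observed first eigenvalue against the resulting null distribution:

```python
import numpy as np

rng = np.random.default_rng(3)

def perm_test_first_eigenvalue(X, n_perm=199):
    """Permutation test for overall PCA significance: permute each column
    independently (breaking correlations but keeping marginals) and compare
    the first eigenvalue of the correlation matrix against the null."""
    def lambda1(M):
        return np.linalg.eigvalsh(np.corrcoef(M, rowvar=False))[-1]
    observed = lambda1(X)
    exceed = 0
    for _ in range(n_perm):
        Xp = np.column_stack([rng.permutation(col) for col in X.T])
        if lambda1(Xp) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero
    return observed, (exceed + 1) / (n_perm + 1)

# Toy data: two strongly correlated variables plus an independent one
z = rng.normal(size=100)
X = np.column_stack([z + 0.1 * rng.normal(size=100),
                     z + 0.1 * rng.normal(size=100),
                     rng.normal(size=100)])
lam, p = perm_test_first_eigenvalue(X)
print(p < 0.05)  # the correlational structure is significant
```

PCAtest extends this scheme with per-axis and per-variable tests, but the column-shuffling null is the common core.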

Journal ArticleDOI
31 Jan 2022-PeerJ
TL;DR: This work proposes a set of modifications that utilize temporal information from previous frames and provide new neural network architectures for monocular depth estimation in a self-supervised manner and shows that proposed modifications can be an effective tool for exploiting temporal information in a depth prediction pipeline.
Abstract: Depth estimation has been an essential task for many computer vision applications, especially in autonomous driving, where safety is paramount. Depth can be estimated not only with traditional supervised learning but also via a self-supervised approach that relies on camera motion and does not require ground truth depth maps. Recently, major improvements have been introduced to make self-supervised depth prediction more precise. However, most existing approaches still focus on single-frame depth estimation, even in the self-supervised setting. Since most methods can operate with frame sequences, we believe that the quality of current models can be significantly improved with the help of information about previous frames. In this work, we study different ways of integrating recurrent blocks and attention mechanisms into a common self-supervised depth estimation pipeline. We propose a set of modifications that utilize temporal information from previous frames and provide new neural network architectures for monocular depth estimation in a self-supervised manner. Our experiments on the KITTI dataset show that proposed modifications can be an effective tool for exploiting temporal information in a depth prediction pipeline.

Journal ArticleDOI
22 Apr 2022-PeerJ
TL;DR: A set of baseline classifiers is built, including machine learning algorithms (Random Forest (RF), Decision Tree (J48), Sequential Minimal Optimization (SMO), AdaBoostM1, and Bagging), deep learning algorithms (Convolutional Neural Networks (1D-CNN), Long Short-Term Memory (LSTM), and LSTM with CNN features), and a transformer-based baseline (BERT).
Abstract: Urdu is a widely used language in South Asia and worldwide. While there are similar datasets available in English, we created the first multi-label emotion dataset consisting of 6,043 tweets and six basic emotions in the Urdu Nastalíq script. A multi-label (ML) classification approach was adopted to detect emotions from Urdu. The morphological and syntactic structure of Urdu makes it a challenging problem for multi-label emotion detection. In this paper, we build a set of baseline classifiers such as machine learning algorithms (Random forest (RF), Decision tree (J48), Sequential minimal optimization (SMO), AdaBoostM1, and Bagging), deep-learning algorithms (Convolutional Neural Networks (1D-CNN), Long short-term memory (LSTM), and LSTM with CNN features) and transformer-based baseline (BERT). We used a combination of text representations: stylometric-based features, pre-trained word embedding, word-based n-grams, and character-based n-grams. The paper highlights the annotation guidelines, dataset characteristics and insights into different methodologies used for Urdu based emotion classification. We present our best results using micro-averaged F1, macro-averaged F1, accuracy, Hamming loss (HL) and exact match (EM) for all tested methods.
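Two of the reported multi-label metrics, Hamming loss and exact match, are easy to state precisely; the toy label matrices below are invented for illustration:

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    """Fraction of individual labels predicted incorrectly."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    return float(np.mean(Y_true != Y_pred))

def exact_match(Y_true, Y_pred):
    """Fraction of samples whose full label set is predicted perfectly."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    return float(np.mean((Y_true == Y_pred).all(axis=1)))

# Two samples, three emotion labels each (invented binary indicator matrices)
Y_true = [[1, 0, 1], [0, 1, 0]]
Y_pred = [[1, 0, 0], [0, 1, 0]]
print(round(hamming_loss(Y_true, Y_pred), 3), exact_match(Y_true, Y_pred))
```

Exact match is the stricter metric: one wrong label anywhere in a sample costs the whole sample, whereas Hamming loss only charges for that single label.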

Journal ArticleDOI
06 Apr 2022-PeerJ
TL;DR: This systematic review provides a comprehensive assessment of the video violence detection problems described in state-of-the-art research, presents public datasets for testing the performance of video-based violence detection methods, and compares their results.
Abstract: We investigate and analyze methods for violence detection in this study to fully characterize the present state and anticipate the emerging trends of violence detection research. In this systematic review, we provide a comprehensive assessment of the video violence detection problems that have been described in state-of-the-art studies. This work aims to address the state-of-the-art methods in video violence detection, the datasets used to develop and train real-time video violence detection frameworks, and to discuss and identify open issues in the given problem. In this study, we analyzed 80 research papers, selected from 154 papers after the identification, screening, and eligibility phases. As research sources, we used five digital libraries and three high-ranked computer vision conferences covering works published between 2015 and 2021. We begin by briefly introducing the core ideas and problems of video-based violence detection; after that, we divide current techniques into three categories based on their methodologies: conventional methods, end-to-end deep learning-based methods, and machine learning-based methods. Finally, we present public datasets for testing the performance of video-based violence detection methods and compare their results. In addition, we summarize the open issues in video violence detection and evaluate its future tendencies.

Journal ArticleDOI
31 Mar 2022-PeerJ
TL;DR: Results indicated web-based, online/computer-delivered interventions were effective or at least partially effective at decreasing depression, anxiety, stress and eating disorder symptoms; however, evidence for digital mental health interventions using virtual reality and relaxation or exposure-based therapy was inconclusive.
Abstract: Background Poor mental health among university students remains a pressing public health issue. Over the past few years, digital health interventions have been developed and considered promising in increasing psychological wellbeing among university students. Therefore, this umbrella review aims to synthesize evidence on digital health interventions targeting university students and to evaluate their effectiveness. Methods A systematic literature search was performed in April 2021 searching PubMed, Psychology and Behavioural Science Collection, Web of Science, ERIC, and Scopus for systematic reviews and meta-analyses on digital mental health interventions targeting university students. The review protocol was registered in the International Prospective Register of Systematic Reviews PROSPERO [CRD42021234773]. Results The initial literature search resulted in 806 records, of which seven remained after duplicates were removed and records were evaluated against the inclusion criteria. Effectiveness was reported and categorized into the following six delivery types: (a) web-based, online/computer-delivered interventions, (b) computer-based Cognitive Behavior Therapy (CBT), (c) mobile applications and short message service, (d) virtual reality interventions, (e) skills training, and (f) relaxation and exposure-based therapy. Results indicated web-based, online/computer-delivered interventions were effective or at least partially effective at decreasing depression, anxiety, stress and eating disorder symptoms. This was similar for skills-training interventions, CBT-based interventions and mobile applications. However, evidence for digital mental health interventions using virtual reality and relaxation or exposure-based therapy was inconclusive. Due to the variation in study settings and inconsistencies in reporting, effectiveness was greatly dependent on the delivery format, targeted mental health problem and target group.
Conclusion The findings provide evidence for the beneficial effect of digital mental health interventions for university students. However, this review calls for a more systematic approach in testing and reporting the effectiveness of digital mental health interventions.

Journal ArticleDOI
04 Jan 2022-PeerJ
TL;DR: In this article, a recurrent neural network was used to predict ground reaction force (GRF) waveforms across a range of running speeds and slopes from accelerometer data; the predicted versus measured GRF waveforms had an average ± SD RMSE of 0.16 ± 0.04 BW and a relative RMSE of 6.4 ± 1.5% across all conditions and subjects.
Abstract: Ground reaction forces (GRFs) are important for understanding human movement, but their measurement is generally limited to a laboratory environment. Previous studies have used neural networks to predict GRF waveforms during running from wearable device data, but these predictions are limited to the stance phase of level-ground running. A method of predicting the normal (perpendicular to running surface) GRF waveform using wearable devices across a range of running speeds and slopes could allow researchers and clinicians to predict kinetic and kinematic variables outside the laboratory environment. We sought to develop a recurrent neural network capable of predicting continuous normal (perpendicular to surface) GRFs across a range of running speeds and slopes from accelerometer data. Nineteen subjects ran on a force-measuring treadmill at five slopes (0°, ±5°, ±10°) and three speeds (2.5, 3.33, 4.17 m/s) per slope with sacral- and shoe-mounted accelerometers. We then trained a recurrent neural network to predict normal GRF waveforms frame-by-frame. The predicted versus measured GRF waveforms had an average ± SD RMSE of 0.16 ± 0.04 BW and relative RMSE of 6.4 ± 1.5% across all conditions and subjects. The recurrent neural network predicted continuous normal GRF waveforms across a range of running speeds and slopes with greater accuracy than neural networks implemented in previous studies. This approach may facilitate predictions of biomechanical variables outside the laboratory in near real-time and improves the accuracy of quantifying and monitoring external forces experienced by the body when running.
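The error metrics quoted above are straightforward to reproduce. The sketch below uses hypothetical normal-GRF samples in body-weight (BW) units and one common definition of relative RMSE (normalized by the peak-to-peak range of the measured signal); the paper's exact normalization may differ.

```python
import math

def rmse(predicted, measured):
    """Root-mean-square error between two equal-length waveforms."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)

def relative_rmse(predicted, measured):
    """RMSE as a percentage of the peak-to-peak range of the measured signal
    (one common definition; the paper's normalization may differ)."""
    span = max(measured) - min(measured)
    return 100.0 * rmse(predicted, measured) / span

# Hypothetical normal GRF samples (BW) over one stance phase
measured  = [0.0, 0.8, 1.6, 2.4, 2.0, 1.2, 0.4, 0.0]
predicted = [0.1, 0.9, 1.5, 2.3, 2.1, 1.1, 0.5, 0.1]

print(round(rmse(predicted, measured), 3))           # → 0.1 (BW)
print(round(relative_rmse(predicted, measured), 1))  # → 4.2 (%)
```

Reporting the absolute RMSE in BW alongside a relative RMSE makes errors comparable across subjects of different body masses and across conditions with different peak forces.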

Journal ArticleDOI
18 May 2022-PeerJ
TL;DR: A systematic review of the role of Machine learning in Lockdown Exam Management Systems was conducted by evaluating 135 studies over the last five years and concluded with issues and challenges that machine learning imposes on the examination system.
Abstract: Examinations or assessments play a vital role in every student’s life; they determine their future and career paths. The COVID-19 pandemic has left adverse impacts in all areas, including academia. Regular classroom learning and face-to-face real-time examinations were not feasible if widespread infection was to be avoided and safety ensured. During these desperate times, technological advancements stepped in to help students continue their education without academic breaks. Machine learning is key to this digital transformation of schools and colleges from in-person to online mode; it made online learning and examination during lockdown possible. In this article, a systematic review of the role of machine learning in lockdown exam management systems was conducted by evaluating 135 studies from the last five years. The significance of machine learning across the entire exam cycle, from pre-exam preparation through the conduct of examinations to evaluation, was studied and discussed. Unsupervised and supervised machine learning algorithms were identified and categorized for each process. The primary aspects of examinations, such as authentication, scheduling, proctoring, and cheating or fraud detection, are investigated in detail from a machine learning perspective. Key capabilities, such as prediction of at-risk students, adaptive learning, and monitoring of students, are integrated for a fuller understanding of the role of machine learning in exam preparation, followed by its management of the post-examination process. Finally, this review concludes with the issues and challenges that machine learning poses for examination systems, and these issues are discussed along with solutions.

Journal ArticleDOI
14 Jun 2022-PeerJ
TL;DR: A review of the effects of microplastics on the health of different animals is presented in this paper, where the authors searched PubMed and Scopus over the period 01/2010 to 09/2021 for peer-reviewed scientific publications focused on (1) environmental pollution with microplastics; (2) uptake of microplastics by humans; and (3) the impact of microplastics on animal health.
Abstract: The environmental pollution by microplastics is a global problem arising from the extensive production and use of plastics. Small particles of different plastics, measuring less than 5 mm in diameter, are found in water, air, soil, and various living organisms around the globe. Humans constantly inhale and ingest these particles. The associated health risks raise major concerns and require dedicated evaluation. In this review we systematize and summarize the effects of microplastics on the health of different animals. The article would be of interest to ecologists, experimental biologists, environmental physicians, and all those concerned with anthropogenic environmental changes. We searched PubMed and Scopus from the period of 01/2010 to 09/2021 for peer-reviewed scientific publications focused on (1) environmental pollution with microplastics; (2) uptake of microplastics by humans; and (3) the impact of microplastics on animal health. The number of published studies considering the effects of microplastic particles on aquatic organisms is considerable. In aquatic invertebrates, microplastics cause a decline in feeding behavior and fertility, slow down larval growth and development, increase oxygen consumption, and stimulate the production of reactive oxygen species. In fish, the microplastics may cause structural damage to the intestine, liver, gills, and brain, while affecting metabolic balance, behavior, and fertility; the degree of these harmful effects depends on the particle sizes and doses, as well as the exposure parameters. The corresponding data for terrestrial mammals are less abundant: only 30 papers found in PubMed and Scopus deal with the effects of microplastics in laboratory mice and rats; remarkably, about half of these papers were published in 2021, indicating the growing interest of the scientific community in this issue.
The studies demonstrate that in mice and rats microplastics may also cause biochemical and structural damage with noticeable dysfunctions of the intestine, liver, and excretory and reproductive systems. Microplastics pollute the seas and negatively affect the health of aquatic organisms. The data obtained in laboratory mice and rats suggest a profound negative influence of microplastics on human health. However, given significant variation in plastic types, particle sizes, doses, models, and modes of administration, the available experimental data are still fragmentary and controversial.

Journal ArticleDOI
14 Jan 2022-PeerJ
TL;DR: This work provides a large-scale analysis of software usage and citation practices facilitated through an unprecedented knowledge graph of software mentions and affiliated metadata generated through supervised information extraction models trained on a unique gold standard corpus and applied to more than 3 million scientific articles.
Abstract: Science across all disciplines has become increasingly data-driven, leading to additional needs with respect to software for collecting, processing and analysing data. Thus, transparency about software used as part of the scientific process is crucial to understand provenance of individual research data and insights, is a prerequisite for reproducibility and can enable macro-analysis of the evolution of scientific methods over time. However, missing rigor in software citation practices renders the automated detection and disambiguation of software mentions a challenging problem. In this work, we provide a large-scale analysis of software usage and citation practices facilitated through an unprecedented knowledge graph of software mentions and affiliated metadata generated through supervised information extraction models trained on a unique gold standard corpus and applied to more than 3 million scientific articles. Our information extraction approach distinguishes different types of software and mentions, disambiguates mentions and outperforms the state-of-the-art significantly, leading to the most comprehensive corpus of 11.8 M software mentions that are described through a knowledge graph consisting of more than 300 M triples. Our analysis provides insights into the evolution of software usage and citation patterns across various fields, ranks of journals, and impact of publications. While this is, to the best of our knowledge, the most comprehensive analysis of software use and citation to date, all data and models are shared publicly to facilitate further research into scientific use and citation of software.
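The knowledge graph described above stores facts as subject-predicate-object triples. A minimal in-memory sketch of that representation follows; the entity and predicate names are hypothetical illustrations, not the paper's actual schema.

```python
# Minimal in-memory triple store: each fact is a (subject, predicate, object) tuple.
triples = set()

def add(s, p, o):
    triples.add((s, p, o))

def query(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# Hypothetical software-mention facts extracted from one article
add("article:123", "mentions", "software:scikit-learn")
add("software:scikit-learn", "hasVersion", "1.0")
add("article:123", "publishedInField", "computer_science")

# Which software does article:123 mention?
print(query("article:123", "mentions", None))
```

At the paper's scale (300 M triples), the same pattern-matching queries would run against a dedicated triple store or graph database rather than a Python set, but the data model is the same.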

Journal ArticleDOI
22 Feb 2022-PeerJ
TL;DR: The majority of HLA alleles and haplotypes were not significantly affected by the mutations, suggesting the maintenance of effective T-cell immunity against the Omicron and Delta variants.
Abstract: The T-cell immune response is a major determinant of effective SARS-CoV-2 clearance. Here, using the recently developed T-CoV bioinformatics pipeline (https://t-cov.hse.ru) we analyzed the peculiarities of the viral peptide presentation for the Omicron, Delta and Wuhan variants of SARS-CoV-2. First, we showed the absence of significant differences in the presentation of SARS-CoV-2-derived peptides by the most frequent HLA class I/II alleles and the corresponding HLA haplotypes. Then, the analysis was limited to the set of peptides originating from the Spike proteins of the considered SARS-CoV-2 variants. The major finding was the destructive effect of the Omicron mutations on the PINLVRDLPQGFSAL peptide, which was the only tight binder from the Spike protein for the HLA-DRB1*03:01 allele and some associated haplotypes. Specifically, we predicted a dramatic decline in the binding affinity between HLA-DRB1*03:01 and this peptide after the N211 deletion, L212I substitution and EPE 212-214 insertion. The computational prediction was experimentally validated by ELISA using the corresponding thioredoxin-fused peptides and recombinant HLA-DR molecules. Another finding was the significant reduction in the number of tightly binding Spike peptides for the HLA-B*07:02 HLA class I allele (for both the Omicron and Delta variants). Overall, the majority of HLA alleles and haplotypes were not significantly affected by the mutations, suggesting the maintenance of effective T-cell immunity against the Omicron and Delta variants. Finally, we introduced the Omicron variant to the T-CoV portal and added the functionality of haplotype-level analysis to it.

Journal ArticleDOI
25 Apr 2022-PeerJ
TL;DR: This article reviews state-of-the-art works in both years 2020 and 2021 by considering AI-guided tools to analyze cough sound for COVID-19 screening primarily based on machine learning algorithms.
Abstract: For COVID-19, the need for robust, inexpensive, and accessible screening becomes critical. Even though symptoms present differently, cough is still taken as one of the primary symptoms in severe and non-severe infections alike. For mass screening in resource-constrained regions, artificial intelligence (AI)-guided tools have progressively contributed to detect/screen COVID-19 infections using cough sounds. Therefore, in this article, we review state-of-the-art works from both 2020 and 2021 considering AI-guided tools that analyze cough sound for COVID-19 screening, primarily based on machine learning algorithms. In our study, we used the PubMed Central repository and Web of Science with the keywords: (Cough OR Cough Sounds OR Speech) AND (Machine learning OR Deep learning OR Artificial intelligence) AND (COVID-19 OR Coronavirus). For a better meta-analysis, we screened for appropriate datasets (size and source), algorithmic factors (both shallow learning and deep learning models) and corresponding performance scores. Further, in order not to miss up-to-date experimental research-based articles, we also included articles from outside PubMed and Web of Science, but pre-print articles were strictly avoided as they are not peer-reviewed.

Journal ArticleDOI
21 Feb 2022-PeerJ
TL;DR: The ML techniques and the methods used in this research study can be used by instructors/administrators to optimize learning content and provide informed guidance to students, thus improving their learning experience and making it exciting and adaptive.
Abstract: The Corona Virus Disease 2019 (COVID-19) pandemic has increased the importance of Virtual Learning Environments (VLEs), prompting students to study from their homes. Every day a tremendous amount of data is generated when students interact with VLEs to perform different activities and access learning material. To make the generated data useful, it must be processed and managed by a suitable machine learning (ML) algorithm. ML algorithms’ applications are manifold, with Education Data Mining (EDM) and Learning Analytics (LA) as their major fields. ML algorithms are commonly used to process raw data to discover hidden patterns and construct a model to make future predictions, such as predicting students’ performance, dropouts, engagement, etc. However, in a VLE, it is important to select the most applicable ML algorithm to achieve the best results. In this study, we aim to improve the performance of those ML and deep learning (DL) algorithms that yield inferior results in terms of accuracy, precision, recall, and F1 score. Several ML algorithms were applied to the Open University Learning Analytics (OULA) dataset to reveal which one offers the best results on these metrics. Two popular ML algorithms, Decision Tree (DT) and Feed-Forward Neural Network (FFNN), provided unsatisfactory results. They were therefore selected and experimented on with various techniques such as grid search cross-validation, adaptive boosting, extreme gradient boosting, early stopping, feature engineering, and dropping inactive neurons to improve their performance scores. Moreover, we also determined the feature weights/importance in predicting students’ study performance, leading to the design and development of an adaptive learning system.
The ML techniques and the methods used in this research study can be used by instructors/administrators to optimize learning content and provide informed guidance to students, thus improving their learning experience and making it exciting and adaptive.
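Grid search, one of the tuning techniques mentioned above, exhaustively evaluates every combination of hyperparameter values and keeps the best-scoring one. A minimal sketch in plain Python follows; the parameter grid and the scoring function are hypothetical stand-ins for a real model and cross-validation loop.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination in the grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # in practice: mean cross-validation score
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical scorer: pretend deeper trees help up to depth 5, then overfit
def mock_cv_score(params):
    return -abs(params["max_depth"] - 5) - 0.01 * params["min_samples_leaf"]

grid = {"max_depth": [3, 5, 7, 10], "min_samples_leaf": [1, 5, 10]}
best, score = grid_search(grid, mock_cv_score)
print(best)  # → {'max_depth': 5, 'min_samples_leaf': 1}
```

In practice the scoring function would wrap a k-fold cross-validation of the DT or FFNN on the training split, which is what library implementations such as grid-search cross-validation automate.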

Journal ArticleDOI
23 Mar 2022-PeerJ
TL;DR: This work applied culture-independent shotgun metagenomics to thoroughbred equine faecal samples, delivering novel insights into the bacterial, archaeal and bacteriophage components of the horse gut microbiome.
Abstract: Background The horse plays crucial roles across the globe, including in horseracing, as a working and companion animal and as a food animal. The horse hindgut microbiome makes a key contribution in turning a high fibre diet into body mass and horsepower. However, despite its importance, the horse hindgut microbiome remains largely undefined. Here, we applied culture-independent shotgun metagenomics to thoroughbred equine faecal samples to deliver novel insights into this complex microbial community. Results We performed metagenomic sequencing on five equine faecal samples to construct 123 high- or medium-quality metagenome-assembled genomes from Bacteria and Archaea. In addition, we recovered nearly 200 bacteriophage genomes. We document surprising taxonomic diversity, encompassing dozens of novel or unnamed bacterial genera and species, to which we have assigned new Candidatus names. Many of these genera are conserved across a range of mammalian gut microbiomes. Conclusions Our metagenomic analyses provide new insights into the bacterial, archaeal and bacteriophage components of the horse gut microbiome. The resulting datasets provide a key resource for future high-resolution taxonomic and functional studies on the equine gut microbiome.

Journal ArticleDOI
05 Jan 2022-PeerJ
TL;DR: In this paper , the authors summarize the current evidence on the origin, genetics, morphological, volatile compounds, and organoleptic characteristics of the CCN 51 clone and the impact of this popular clone on the current and future cacao industry.
Abstract: Many decades of improvement in cacao have helped to obtain cultivars with tolerance to diseases, adaptability to different edaphoclimatic conditions, and higher yields. In Ecuador, as a result of several breeding programs, the clone CCN 51 was obtained, which gradually expanded through the cacao-producing regions of Ecuador, Colombia, Brazil and Peru. Recognized for its high yield and adaptability to different regions and environments, it has become one of the most popular clones for breeding programs and cultivation around the world. This review aims to summarize the current evidence on the origin, genetics, morphology, volatile compounds, and organoleptic characteristics of this clone. Physiological evidence, production dynamics, and floral biology are also included to explain the high yield of CCN 51. Thus, characteristics such as osmotic adjustment, long pollen longevity, and fruit formation are further discussed and associated with high production at the end of the dry period. Finally, the impact of this popular clone on the current and future cacao industry will be discussed, highlighting the major challenges for flavor enhancement and its relevance as a platform for the identification of novel genetic markers for cultivar improvement in breeding programs.

Journal ArticleDOI
11 Feb 2022-PeerJ
TL;DR: In this article , a new set of stabilizing switched velocity-based continuous controllers was derived using the Lyapunov-based Control Scheme (LbCS) from the category of classical approaches where switching of these nonlinear controllers is invoked by a new rule.
Abstract: Robotic arms play an indispensable role in multiple sectors such as manufacturing, transportation and healthcare to improve human livelihoods and make possible their endeavors and innovations, which further enhance the quality of our lives. This paper considers such a robotic arm comprising n revolute links and a prismatic end-effector, where the articulated arm is anchored in a restricted workspace. A new set of stabilizing switched velocity-based continuous controllers was derived using the Lyapunov-based Control Scheme (LbCS) from the category of classical approaches, where switching of these nonlinear controllers is invoked by a new rule. The switched controllers enable the end-effector of the robotic arm to navigate autonomously via a series of landmarks, known as hierarchal landmarks, and finally converge to its equilibrium state. The interaction of the inherent attributes of LbCS, namely the safeness, shortness and smoothness of paths for motion planning, brings about cost and time efficiency of the controllers. The stability of the switched system was proven using Branicky’s stability criteria for switched systems based on multiple Lyapunov functions and was numerically validated using the classical fourth-order Runge–Kutta (RK4) method. Finally, computer simulation results are presented to show the effectiveness of the continuous time-invariant velocity-based controllers.
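The RK4 scheme used for the numerical validation advances the state of an ODE dx/dt = f(t, x) by combining four slope evaluations per step. A minimal generic implementation is sketched below, verified on a hypothetical scalar test system (dx/dt = -x, exact solution e^(-t)), not the paper's controller dynamics.

```python
import math

def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Hypothetical test system dx/dt = -x with known solution x(t) = e^(-t)
f = lambda t, x: -x
x, t, h = 1.0, 0.0, 0.01
while t < 1.0 - 1e-12:       # integrate from t = 0 to t = 1
    x = rk4_step(f, t, x, h)
    t += h
print(abs(x - math.exp(-1.0)) < 1e-9)  # → True: RK4's O(h^4) accuracy
```

Because the global error of RK4 scales as O(h^4), a step size of 0.01 already reproduces the exact solution to better than nine decimal places here, which is why it is a standard choice for validating controller simulations.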

Journal ArticleDOI
27 Jan 2022-PeerJ
TL;DR: In this paper, a comparison of the unsaturated fatty acid profiles of beef and pork raised on one of the regenerative farms to a regional health-promoting brand and conventional meat from local supermarkets found higher levels of omega-3 fats.
Abstract: Several independent comparisons indicate regenerative farming practices enhance the nutritional profiles of crops and livestock. Measurements from paired farms across the United States indicate differences in soil health and crop nutrient density between fields worked with conventional (synthetically-fertilized and herbicide-treated) or regenerative practices for 5 to 10 years. Specifically, regenerative farms that combined no-till, cover crops, and diverse rotations—a system known as Conservation Agriculture—had higher soil organic matter levels and soil health scores, and produced crops with higher levels of certain vitamins, minerals, and phytochemicals. In addition, crops from two regenerative no-till vegetable farms, one in California and the other in Connecticut, had higher levels of phytochemicals than values reported previously from New York supermarkets. Moreover, a comparison of wheat from adjacent regenerative and conventional no-till fields in northern Oregon found a higher density of mineral micronutrients in the regenerative crop. Finally, a comparison of the unsaturated fatty acid profile of beef and pork raised on one of the regenerative farms to a regional health-promoting brand and conventional meat from local supermarkets found higher levels of omega-3 fats and a more health-beneficial ratio of omega-6 to omega-3 fats. Despite small sample sizes, all three crop comparisons show differences in micronutrient and phytochemical concentrations that suggest soil health is an underappreciated influence on nutrient density, particularly for phytochemicals not conventionally considered nutrients but nonetheless relevant to chronic disease prevention. Likewise, regenerative grazing practices produced meat with a better fatty acid profile than conventional and regional health-promoting brands.
Together these comparisons offer preliminary support for the conclusion that regenerative soil-building farming practices can enhance the nutritional profile of conventionally grown plant and animal foods.

Journal ArticleDOI
13 Oct 2022-PeerJ
TL;DR: Palmscan, as mentioned in this paper, identifies the polymerase palmprint of RNA viruses, a segment of the palm sub-domain robustly delineated by well-conserved catalytic motifs.
Abstract: RNA viruses encoding a polymerase gene (riboviruses) dominate the known eukaryotic virome. High-throughput sequencing is revealing a wealth of new riboviruses known only from sequence, precluding classification by traditional taxonomic methods. Sequence classification is often based on polymerase sequences, but standardised methods to support this approach are currently lacking. To address this need, we describe the polymerase palmprint, a segment of the palm sub-domain robustly delineated by well-conserved catalytic motifs. We present an algorithm, Palmscan, which identifies palmprints in nucleotide and amino acid sequences; PALMdb, a collection of palmprints derived from public sequence databases; and palmID, a public website implementing palmprint identification, search, and annotation. Together, these methods demonstrate a proof-of-concept workflow for high-throughput characterisation of RNA viruses, paving the path for the continued rapid growth in RNA virus discovery anticipated in the coming decade.
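The general idea of locating a motif-delineated segment can be illustrated with a toy regular-expression scan. Only the "GDD" core used for motif C below is a real, well-known RdRp catalytic motif; the motif A pattern and the toy sequence are hypothetical placeholders and do not reflect Palmscan's actual motif models or scoring.

```python
import re

# Simplified motif patterns (placeholders, not Palmscan's real models):
MOTIF_A = re.compile(r"D.{4}D")   # hypothetical: two aspartates four residues apart
MOTIF_C = re.compile(r"GDD")      # canonical Gly-Asp-Asp catalytic core of motif C

def find_palmprint(seq):
    """Return the segment from the start of motif A through the end of motif C,
    or None if either motif is missing."""
    a = MOTIF_A.search(seq)
    if a is None:
        return None
    c = MOTIF_C.search(seq, a.end())  # motif C must lie downstream of motif A
    if c is None:
        return None
    return seq[a.start():c.end()]

# Toy amino-acid sequence with both motifs embedded
toy = "MAKDQSWTDLRVNGDDPQRS"
print(find_palmprint(toy))  # → DQSWTDLRVNGDD
```

Real palmprint identification relies on position-specific scoring models rather than fixed regular expressions, but the output is the same kind of object: a motif-bounded sub-sequence that can serve as a standardized barcode for sequence-based classification.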