Journal Article

Activity identification in modular construction using audio signals and machine learning

01 Nov 2020-Automation in Construction (Elsevier)-Vol. 119, pp 103361
TL;DR: The results of this study demonstrate the potential of the proposed system for automated monitoring and data collection in modular construction factories, in conjunction with other activity recognition frameworks based on computer vision (CV) and/or inertial measurement units (IMU).
About: This article was published in Automation in Construction on 2020-11-01. It has received 25 citations to date. The article focuses on the topics: Audio signal & Feature vector.
Citations
Journal Article
TL;DR: The study identifies research gaps at each SC stage of each OSC type to incentivize future studies to consider more environmental and social sustainability factors in OSC-SC models.

47 citations

Journal Article
TL;DR: A machine learning methodology was developed to train and evaluate 13 classifiers using artificial features extracted from raw accelerometer data segments, indicating that accelerometers can be used to create a robust system that automatically recognises large sets of construction worker activities.
Abstract: Automated construction worker activity classification has the potential to benefit not only worker performance in terms of productivity and safety, but also overall project management and control. The activity-level knowledge and indicators that can be extracted from this process may support project decision making, aiding in project schedule adjustment, resource management and construction site control, among others. Previous works on this topic focused on the collection and classification of worker acceleration data using wearable accelerometers and supervised machine learning algorithms, respectively. However, most of these studies consider small sets of activities performed in an instructed manner, which can lead to higher accuracy results than those expected in a real construction scenario. To this end, this paper builds on the results of these past studies by covering a larger set of complex construction activities than the current state of the art, while avoiding the need to instruct test subjects on how and when to perform each activity. As such, a machine learning methodology was developed to train and evaluate 13 classifiers using artificial features extracted from raw accelerometer data segments. An experimental study was carried out in the form of a realistic activity circuit to recognise ten different activities: gearing up; hammering; masonry; painting; roughcasting; sawing; screwing; sitting; standing still; and walking. Most activities are a cluster of simpler tasks (e.g. masonry includes fetching, transporting, and laying bricks). Activities were initially separated and tested in three different activity groups, before all activities were assessed together. It was found that a segment length of 6 s, with a 75% overlap, enhanced classifier performance. Feature selection was carried out to reduce the algorithm's running time. A nested cross-validation approach was used for hyperparameter tuning and for classifier training and testing. User-dependent and user-independent approaches (differing in whether the system must undergo an additional training phase for each new user) were evaluated. Results indicate that accelerometers can be used to create a robust system that automatically recognises large sets of construction worker activities. The K-Nearest Neighbours and Gradient Boosting algorithms were selected, according to their performance, for the user-dependent and user-independent scenarios, respectively. In both cases, the classifiers showed balanced accuracies above 84% for their respective approaches and test groups. Results also indicate that a user-dependent approach using task groups provides the highest accuracy.

27 citations
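
The segmentation step described in the accelerometer abstract above (6 s segments with a 75% overlap) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sliding_windows`, the 10 Hz sample rate, and the toy signal are all assumptions, since the abstract does not state the sampling frequency.

```python
def sliding_windows(samples, sample_rate_hz, window_s=6.0, overlap=0.75):
    """Split a sequence into fixed-length, overlapping segments.

    window_s and overlap default to the values reported in the abstract;
    sample_rate_hz is supplied by the caller (assumed, not from the source).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    window = int(round(window_s * sample_rate_hz))
    if window <= 0:
        raise ValueError("window must contain at least one sample")
    # Step between window starts: 25% of the window for a 75% overlap.
    step = max(1, int(round(window * (1.0 - overlap))))
    return [samples[start:start + window]
            for start in range(0, len(samples) - window + 1, step)]


# Hypothetical 15 s recording sampled at 10 Hz (150 samples).
signal = list(range(150))
segments = sliding_windows(signal, sample_rate_hz=10)
```

With these assumed numbers each segment holds 60 samples and consecutive segments start 15 samples apart; trailing samples that do not fill a whole window are dropped.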

Journal Article
TL;DR: This in-depth review of state-of-the-art deep-learning applications in visual data analytics for construction project management identifies six major fields and fifty-two subfields of construction management where deep-learning-based visual data analytics methods have been applied, and proposes a generalized workflow for applying these methods.

21 citations

Journal Article
TL;DR: A Deep Neural Network (DNN) model for estimating the productivity of excavators and establishing a productivity measure for benchmarking them was developed and tested. The results indicate that the accuracy attained is adequate, and that the proposed approach is more accurate in a highly mechanised environment.

19 citations

Journal Article
TL;DR: In this paper, a metaheuristic optimization algorithm combined with efficient machine learning models was proposed to assist civil engineers in estimating scour depth at bridge piers. It achieved mean absolute percentage errors of 7.127%, 29.195% and 13.131% in predicting scour depths on a laboratory dataset, a field dataset, and a complex pier foundations dataset, respectively.

7 citations
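
The scour-depth entry above reports its results as mean absolute percentage error (MAPE). As a reminder of what that metric measures, here is a small sketch using the standard definition. The function name `mape` and the depth values are made up for illustration and are not data from the cited paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observed and estimated scour depths (metres).
observed = [1.0, 2.0, 4.0]
estimated = [1.1, 1.8, 4.0]
error_pct = mape(observed, estimated)
```

Because each error is divided by the observed value, MAPE weights a 0.1 m miss on a shallow scour hole more heavily than the same miss on a deep one.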

References
Journal Article
Li Li1, Jian Yao1, Renping Xie1, Menghan Xia1, Wei Zhang2 
22 Dec 2016-Sensors
TL;DR: Experimental results on a large set of challenging street-view panoramic images captured from the real world illustrate that the proposed system is capable of creating high-quality panoramas.
Abstract: In this paper, we propose a unified framework to generate a pleasant and high-quality street-view panorama by stitching multiple panoramic images captured from cameras mounted on a mobile platform. Our proposed framework comprises four major steps: image warping, color correction, optimal seam line detection and image blending. Since the input images are captured without a precisely common projection center, from scenes whose depth differences with respect to the cameras vary, such images cannot be precisely aligned in geometry. Therefore, an efficient image warping method based on the dense optical flow field is first proposed to greatly suppress the influence of large geometric misalignment. Then, to lessen the influence of photometric inconsistencies caused by illumination variations and different exposure settings, we propose an efficient color correction algorithm that matches extreme points of histograms to greatly decrease color differences between warped images. After that, the optimal seam lines between adjacent input images are detected via the graph cut energy minimization framework. Finally, the Laplacian pyramid blending algorithm is applied to further eliminate the stitching artifacts along the optimal seam lines. Experimental results on a large set of challenging street-view panoramic images captured from the real world illustrate that the proposed system is capable of creating high-quality panoramas.

863 citations

Proceedings Article
01 Jan 2007
TL;DR: An overview of the set of features, related, among others, to timbre, tonality, rhythm or form, that can be extracted with the MIRtoolbox, an integrated set of functions written in Matlab dedicated to the extraction of musical features from audio files.
Abstract: We present the MIRtoolbox, an integrated set of functions written in Matlab, dedicated to the extraction of musical features from audio files. The design is based on a modular framework: the different algorithms are decomposed into stages, formalized using a minimal set of elementary mechanisms, and integrating different variants proposed by alternative approaches (including new strategies we have developed) that users can select and parametrize. This paper offers an overview of the set of features, related, among others, to timbre, tonality, rhythm or form, that can be extracted with the MIRtoolbox. One particular analysis is provided as an example. The toolbox also includes functions for statistical analysis, segmentation and clustering. Particular attention has been paid to the design of a syntax that offers both simplicity of use and transparent adaptiveness to a multiplicity of possible input types. Each feature extraction method can accept as argument an audio file, or any preliminary result from intermediary stages of the chain of operations. Also, the same syntax can be used for analyses of single audio files, batches of files, series of audio segments, multi-channel signals, etc. For that purpose, the data and methods of the toolbox are organised in an object-oriented architecture.
© 2007 Austrian Computer Society (OCG)
1 MOTIVATION AND APPROACH
MIRtoolbox is a Matlab toolbox dedicated to the extraction of musically-related features from audio recordings. It has been designed in particular with the objective of enabling the computation of a large range of features from databases of audio files, which can be subjected to statistical analyses. Few software packages have been proposed in this area. One particularity of our own approach lies in the use of the Matlab computing environment, which offers good visualisation capabilities and gives access to a large variety of other toolboxes. In particular, the MIRtoolbox makes use of functions available in public-domain toolboxes such as the Auditory Toolbox [6], NetLab [5] and SOMtoolbox [10]. Other toolboxes, such as the Statistics toolbox or the Neural Network toolbox from MathWorks, can be directly used for further analyses of the features extracted by MIRtoolbox without having to export the data from one software package to another.
Such a computational framework, because of its general objectives, could be useful to the research community in Music Information Retrieval (MIR), but also for educational purposes. For that reason, particular attention has been paid to the ease of use of the toolbox. In particular, complex analytic processes can be designed using a very simple syntax, whose expressive power comes from the use of an object-oriented paradigm.
The different musical features extracted from the audio files are highly interdependent: in particular, as can be seen in figure 1, some features are based on the same initial computations. In order to improve the computational efficiency, it is important to avoid redundant computations of these common components. Each of these intermediary components, and the final musical features, are therefore considered as building blocks that can be freely articulated with one another. Besides, in keeping with the objective of optimal ease of use of the toolbox, each building block has been conceived in a way that it can adapt to the type of input data. For instance, the computation of the MFCCs can be based on the waveform of the initial audio signal, or on intermediary representations such as the spectrum or the mel-scale spectrum (see Fig. 1). Similarly, autocorrelation is computed for different ranges of delays depending on the type of input data (audio waveform, envelope, spectrum). This decomposition of all feature extraction algorithms into a common set of building blocks has the advantage of offering a synthetic overview of the different approaches studied in this domain of research.
2 FEATURE EXTRACTION
2.1 Feature overview
Figure 1 shows an overview of the main features implemented in the toolbox. All the different processes start from the audio signal (on the left) and form a chain of operations proceeding to the right. Each musical feature is related to one of the musical dimensions traditionally defined in music theory. Boldface characters highlight features related to pitch and tonality. Bold italics indicate features related to rhythm. Simple italics highlight a large set of features that can be associated with timbre and dynamics. Among them, all the operators in grey italics can be [...]
[Figure 1: overview of the main features implemented in the toolbox. Recoverable labels: audio signal waveform, zero-crossing rate, RMS energy, envelope, low energy rate, attack slope, attack time, envelope autocorrelation, tempo, onsets.]

677 citations
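
Two of the low-level features named in the MIRtoolbox entry above, zero-crossing rate and RMS energy, are simple enough to sketch directly. The code below is plain Python written for illustration under textbook definitions. It is not the toolbox's Matlab implementation, and the function names and toy frames are assumptions.

```python
import math


def rms_energy(frame):
    """Root-mean-square amplitude of one audio frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Toy frames: a sign-alternating (noisy-looking) frame and a constant one.
noisy = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
steady = [0.5] * 8
noisy_features = (rms_energy(noisy), zero_crossing_rate(noisy))
steady_features = (rms_energy(steady), zero_crossing_rate(steady))
```

In an activity-recognition setting, features like these would typically be computed per frame and stacked into a feature vector before classification.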

Journal Article
TL;DR: This work broadly examines types of feature selection and defines Relief-based algorithms (RBAs). It introduces the original Relief algorithm and associated concepts, emphasizing the intuition behind how it works, how the feature weights it generates can be interpreted, and why it is sensitive to feature interactions without evaluating combinations of features.

659 citations
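
To make the idea of Relief feature weights in the entry above concrete, here is a minimal sketch of the original two-class Relief update: each feature is rewarded when it differs between an instance and its nearest neighbour of the other class (the "miss") and penalised when it differs from its nearest neighbour of the same class (the "hit"). It is a simplification under stated assumptions: it visits every instance instead of sampling at random, uses a Manhattan distance on range-normalised features, and runs on a made-up dataset. The name `relief_weights` is hypothetical.

```python
def relief_weights(X, y):
    """Relief-style feature weights for numeric features and two classes.

    Assumes every class has at least two instances.
    """
    n, d = len(X), len(X[0])
    # Range of each feature, used to normalise differences to [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        return sum(diff(j, a, b) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: distance(X[i], X[k]))
        miss = min(misses, key=lambda k: distance(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n
    return weights


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
```

On this toy data the informative feature receives a positive weight and the noise feature a negative one, which is the usual reading of Relief scores: higher means more useful for separating the classes.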

Journal Article
TL;DR: An empirical feature analysis for audio environment characterization is performed, and the matching pursuit algorithm is proposed for obtaining effective time-frequency features that yield higher recognition accuracy for environmental sounds.
Abstract: The paper considers the task of recognizing environmental sounds for the understanding of a scene or context surrounding an audio sensor. A variety of features have been proposed for audio recognition, including the popular Mel-frequency cepstral coefficients (MFCCs), which describe the audio spectral shape. Environmental sounds, such as the chirping of insects and the sound of rain, which are typically noise-like with a broad flat spectrum, may include strong temporal-domain signatures. However, only a few temporal-domain features have previously been developed to characterize such diverse audio signals. Here, we perform an empirical feature analysis for audio environment characterization and propose to use the matching pursuit (MP) algorithm to obtain effective time-frequency features. The MP-based method utilizes a dictionary of atoms for feature selection, resulting in a flexible, intuitive and physically interpretable set of features. The MP-based feature is adopted to supplement the MFCC features to yield higher recognition accuracy for environmental sounds. Extensive experiments are conducted to demonstrate the effectiveness of these joint features for unstructured environmental sound classification, including listening tests to study human recognition capabilities. Our recognition system has been shown to produce performance comparable to that of human listeners.

626 citations
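
The matching pursuit (MP) step mentioned in the environmental-sound entry above is a greedy decomposition: at each iteration, pick the dictionary atom most correlated with the current residual, record its coefficient, and subtract its contribution. The sketch below illustrates only the generic algorithm. The tiny orthonormal dictionary and the signal are assumptions chosen so the result is easy to check by hand; the cited paper's dictionary and feature construction are not reproduced here.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Greedy matching pursuit; atoms are assumed to have unit norm."""
    residual = list(signal)
    picks = []  # (atom index, coefficient) in order of selection
    for _ in range(n_iter):
        # Inner product of the residual with every atom.
        corr = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(corr[k]))
        coef = corr[best]
        # Remove the selected atom's contribution from the residual.
        residual = [r - coef * a for r, a in zip(residual, atoms[best])]
        picks.append((best, coef))
    return picks, residual


# Hypothetical unit-norm dictionary of three Haar-like atoms.
atoms = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
]
# Signal built as 3 * atoms[0] + 1 * atoms[2].
signal = [2.0, 1.0, 2.0, 1.0]
picks, residual = matching_pursuit(signal, atoms, n_iter=2)
```

The selected atom indices and coefficients are what an MP-based feature extractor would summarise, for example through statistics of the chosen atoms' parameters.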