Author

Peter S. Popolo

Bio: Peter S. Popolo is an academic researcher from Montclair State University. The author has contributed to research on the topics Phonation and Vocal folds. The author has an h-index of 8 and has co-authored 17 publications receiving 806 citations. Previous affiliations of Peter S. Popolo include Denver Center for the Performing Arts and University of Iowa.

Papers
Journal Article
TL;DR: Large F0 and SPL variations in speech affected the dose measures, suggesting that accumulation of phonation time alone is insufficient. The derived safety limits for vocalization will likely require refinement based on more detailed knowledge of the differences in hand and vocal fold tissue morphology and their response to vibrational stress, and on the effect of recovery of the vocal fold tissue during voicing pauses.
Abstract: To measure the exposure to self-induced tissue vibration in speech, three vocal doses were defined and described: distance dose, which accumulates the distance that tissue particles of the vocal folds travel in an oscillatory trajectory; energy dissipation dose, which accumulates the total amount of heat dissipated over a unit volume of vocal fold tissues; and time dose, which accumulates the total phonation time. These doses were compared to a previously used vocal dose measure, the vocal loading index, which accumulates the number of vibration cycles of the vocal folds. Empirical rules for viscosity and vocal fold deformation were used to calculate all the doses from the fundamental frequency (F0) and sound pressure level (SPL) values of speech. Six participants were asked to read in normal, monotone, and exaggerated speech and the doses associated with these vocalizations were calculated. The results showed that large F0 and SPL variations in speech affected the dose measures, suggesting that accumulation of phonation time alone is insufficient. The vibration exposure of the vocal folds in normal speech was related to the industrial limits for hand-transmitted vibration, in which the safe distance dose was derived to be about 500 m. This limit was found rather low for vocalization; it was related to a comparable time dose of about 17 min of continuous vocalization, or about 35 min of continuous reading with normal breathing and unvoiced segments. The voicing pauses in normal speech and dialogue effectively prolong the safe time dose. The derived safety limits for vocalization will likely require refinement based on a more detailed knowledge of the differences in hand and vocal fold tissue morphology and their response to vibrational stress, and on the effect of recovery of the vocal fold tissue during voicing pauses.

250 citations

Journal Article
TL;DR: Results indicated that the bifurcations occur more often in phonations with F0–F1 crossovers, suggesting that nonlinear source–filter coupling is partly responsible for source instabilities.
Abstract: Nonlinear source–filter coupling has been demonstrated in computer simulations, in excised larynx experiments, and in physical models, but not in a consistent and unequivocal way in natural human phonations. Eighteen subjects (nine adult males and nine adult females) performed three vocal exercises that represented a combination of various fundamental frequency and formant glides. The goal of this study was to pinpoint the proportion of source instabilities that are due to nonlinear source–tract coupling. It was hypothesized that vocal fold vibration is maximally destabilized when F0 crosses F1, where the acoustic load changes dramatically. A companion paper provides the theoretical underpinnings. Expected manifestations of a source–filter interaction were sudden frequency jumps, subharmonic generation, or chaotic vocal fold vibrations that coincide with F0–F1 crossovers. Results indicated that the bifurcations occur more often in phonations with F0–F1 crossovers, suggesting that nonlinear source–filter coupling is partly responsible for source instabilities. Furthermore it was observed that male subjects show more bifurcations in phonations with F0–F1 crossovers, presumably because in normal speech they are less likely to encounter these crossovers as much as females and hence have less practice in suppressing unwanted instabilities.

164 citations

Journal Article
TL;DR: The results indicate that the mean SPL of voiced speech can be estimated with accuracy better than ±2.8 dB in 95% of the cases when the subjects are individually calibrated, which makes the accelerometer an interesting sensor for SPL measurement of speech when microphones are problematic to use.
Abstract: How accurately can sound pressure levels (SPLs) of speech be estimated from skin vibration of the neck? Measurements using a small accelerometer were carried out in 27 subjects (10 males and 17 females) who read Rainbow and Marvin Williams passages in soft, comfortable, and loud voice, while skin acceleration levels (SALs) and SPLs were simultaneously registered and analyzed every 30 ms. The results indicate that the mean SPL of voiced speech can be estimated with accuracy better than ±2.8 dB in 95% of the cases when the subjects are individually calibrated. This makes the accelerometer an interesting sensor for SPL measurement of speech when microphones are problematic to use (e.g., noisy environments or in voice dosimetry). The estimates of equivalent SPL, which is the logarithm of averaged relative energy of voiced speech, were found to be up to 1.5 dB less accurate than the mean SPL. The estimation accuracy for instantaneous SPLs was worse than for the mean and equivalent SPLs (on average ±6 and ±5 dB for males and females, respectively).
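The distinction the abstract draws between mean SPL and equivalent SPL can be illustrated with a minimal sketch. It assumes the usual decibel convention (relative energy = 10^(SPL/10)) and a list of per-frame SPL values for voiced frames only; the function names and sample values are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def mean_spl(spl_db: Sequence[float]) -> float:
    """Arithmetic mean of the per-frame SPL values (in dB)."""
    return sum(spl_db) / len(spl_db)


def equivalent_spl(spl_db: Sequence[float]) -> float:
    """Level (in dB) of the averaged relative energy of the frames.

    Assumes the standard decibel convention: energy ratio = 10 ** (dB / 10).
    """
    mean_energy = sum(10 ** (level / 10) for level in spl_db) / len(spl_db)
    return 10 * math.log10(mean_energy)


# Hypothetical voiced-frame levels, for illustration only.
frames_db = [60.0, 80.0]
print(mean_spl(frames_db))        # arithmetic average of the dB values
print(equivalent_spl(frames_db))  # dominated by the louder frame
```

Because energy is averaged before the logarithm is taken, the equivalent SPL is weighted toward the loudest frames and is never lower than the mean SPL of the same frames.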

144 citations

Journal Article
TL;DR: An experimental method for quantifying the amount of voicing over time is described in a tutorial manner and a new procedure for obtaining calibrated sound pressure levels (SPL) of speech from a head-mounted microphone is offered.
Abstract: An experimental method for quantifying the amount of voicing over time is described in a tutorial manner. A new procedure for obtaining calibrated sound pressure levels (SPL) of speech from a head-mounted microphone is offered. An algorithm for voicing detection (kv) and fundamental frequency (F0) extraction from an electroglottographic signal is described. The extracted values of SPL, F0, and kv are used to derive five vocal doses: the time dose (total voicing time), the cycle dose (total number of vocal fold oscillatory cycles), the distance dose (total distance travelled by the vocal folds in an oscillatory path), the energy dissipation dose (total amount of heat energy dissipated in the vocal folds) and the radiated energy dose (total acoustic energy radiated from the mouth). The doses measure the vocal load and can be used for studying the effects of vocal fold tissue exposure to vibration.
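Three of the doses listed in the abstract can be sketched as frame-by-frame accumulations. This is an illustration under stated assumptions, not the authors' implementation: the frame duration, the per-frame amplitude input, and the factor of four (a point on the fold is assumed to travel about four amplitudes per oscillation cycle) are assumptions, and the function name `vocal_doses` is hypothetical. The energy dissipation and radiated energy doses are left out because they depend on empirical rules this listing does not reproduce.

```python
from typing import Sequence, Tuple


def vocal_doses(
    f0_hz: Sequence[float],
    voiced: Sequence[bool],
    amplitude_m: Sequence[float],
    frame_s: float,
) -> Tuple[float, float, float]:
    """Accumulate time, cycle, and distance doses over analysis frames.

    f0_hz       -- fundamental frequency per frame (Hz)
    voiced      -- voicing decision per frame (the kv flag)
    amplitude_m -- assumed vibration amplitude per frame (m); hypothetical input
    frame_s     -- frame duration (s); an assumption of this sketch
    """
    time_dose = 0.0      # total voicing time, in seconds
    cycle_dose = 0.0     # total number of oscillatory cycles
    distance_dose = 0.0  # total path length travelled, in metres
    for f0, kv, amp in zip(f0_hz, voiced, amplitude_m):
        if not kv:
            continue  # unvoiced frames contribute nothing to any dose
        time_dose += frame_s
        cycle_dose += f0 * frame_s
        # Assumption: about four amplitudes of travel per cycle.
        distance_dose += 4.0 * amp * f0 * frame_s
    return time_dose, cycle_dose, distance_dose


# Hypothetical frames: two voiced (100 Hz and 200 Hz), one unvoiced.
t, c, d = vocal_doses(
    f0_hz=[100.0, 200.0, 0.0],
    voiced=[True, True, False],
    amplitude_m=[0.001, 0.001, 0.0],
    frame_s=0.5,
)
print(t, c, d)
```

Keeping the voicing flag separate from F0 mirrors the abstract's separation of voicing detection (kv) from F0 extraction, so pauses are excluded from every dose.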

112 citations

Journal Article
TL;DR: This article deals with the adaptation of a commercially available Pocket PC for use as a voice dosimeter, a wearable device that measures the vocal dose of teachers or other individuals on the job, at home, and elsewhere during the course of an entire day.
Abstract: This article deals with the adaptation of a commercially available Pocket PC for use as a voice dosimeter, a wearable device that measures the vocal dose of teachers or other individuals on the job...

99 citations


Cited by
Journal Article
TL;DR: A new measure of dysphonia, pitch period entropy (PPE), is introduced, which is robust to many uncontrollable confounding effects including noisy acoustic environments and normal, healthy variations in voice frequency, and is well suited to telemonitoring applications.
Abstract: In this paper, we present an assessment of the practical value of existing traditional and nonstandard measures for discriminating healthy people from people with Parkinson's disease (PD) by detecting dysphonia. We introduce a new measure of dysphonia, pitch period entropy (PPE), which is robust to many uncontrollable confounding effects including noisy acoustic environments and normal, healthy variations in voice frequency. We collected sustained phonations from 31 people, 23 with PD. We then selected ten highly uncorrelated measures, and an exhaustive search of all possible combinations of these measures finds four that in combination lead to overall correct classification performance of 91.4%, using a kernel support vector machine. In conclusion, we find that nonstandard methods in combination with traditional harmonics-to-noise ratios are best able to separate healthy from PD subjects. The selected nonstandard methods are robust to many uncontrollable variations in acoustic environment and individual subjects, and are thus well suited to telemonitoring applications.

816 citations

Journal Article
Ingo R. Titze
TL;DR: In this article, a theory of interaction between the source of sound in phonation and the vocal tract filter is developed, where the degree of interaction is controlled by the cross-sectional area of the laryngeal vestibule (epilarynx tube), which raises the inertive reactance of the supraglottal vocal tract.
Abstract: A theory of interaction between the source of sound in phonation and the vocal tract filter is developed. The degree of interaction is controlled by the cross-sectional area of the laryngeal vestibule (epilarynx tube), which raises the inertive reactance of the supraglottal vocal tract. Both subglottal and supraglottal reactances can enhance the driving pressures of the vocal folds and the glottal flow, thereby increasing the energy level at the source. The theory predicts that instabilities in vibration modes may occur when harmonics pass through formants during pitch or vowel changes. Unlike in most musical instruments (e.g., woodwinds and brasses), a stable harmonic source spectrum is not obtained by tuning harmonics to vocal tract resonances, but rather by placing harmonics into favorable reactance regions. This allows for positive reinforcement of the harmonics by supraglottal inertive reactance (and to a lesser degree by subglottal compliant reactance) without the risk of instability. The traditional linear source–filter theory is encumbered with possible inconsistencies in the glottal flow spectrum, which is shown to be influenced by interaction. In addition, the linear theory does not predict bifurcations in the dynamical behavior of vocal fold vibration due to acoustic loading by the vocal tract.

378 citations

Journal Article
Erkki Vilkman
TL;DR: On the basis of epidemiological and acoustic-physiological research, the presence of risk to vocal health can be substantiated and loading-related physiological changes (adaptation) may play a role in the occupational risk.
Abstract: A well-functioning voice is an essential tool for one third of the labour force. Vocal demands vary to a great extent between the different voice and speech professions. In professions with heavy vocal loading (e.g. school and kindergarten teachers), occupational voice disorders threatening working ability are common. Vocal loading is a combination of prolonged voice use and additional loading factors (e.g. background noise, acoustics, air quality) affecting the fundamental frequency, type and loudness of phonation or the vibratory characteristics of the vocal folds as well as the external frame of the larynx. The prevention and treatment of occupational voice disorders calls for improved occupational safety and health (OSH) arrangements for voice and speech professionals. On the basis of epidemiological and acoustic-physiological research, the presence of risk to vocal health can be substantiated. From the point of view of the physical load on the vocal apparatus, loading-related physiological changes (adaptation) may play a role in the occupational risk. Environmental factors affect vocal loading changes. In teaching professions, the working environment is shared with children, who benefit from amendments of OSH legislation concerning their teachers.

358 citations

01 Jan 1999
TL;DR: A modified video camera is used for high-speed visualization of vocal fold vibration: the camera selects one active horizontal line (transverse to the glottis) from the whole laryngeal image, and the successive line images are presented in real time on a commercial TV monitor.
Abstract: A digital technique for high-speed visualization of vibration, called videokymography, was developed and applied to the vocal folds. The system uses a modified video camera able to work in two modes: high-speed (nearly 8000 images/s) and standard (50 images/s in CCIR norm). In the high-speed mode, the camera selects one active horizontal line (transverse to the glottis) from the whole laryngeal image. The successive line images are presented in real time on a commercial TV monitor, filling each video frame from top to bottom. The system makes it possible to observe left-right asymmetries, open quotient, propagation of mucosal waves, movement of the upper and, in the closing phase, the lower margins of the vocal folds, etc. The technique is suitable for further processing and quantification of recorded vibration.

297 citations

Journal Article
TL;DR: A review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality is provided.
Abstract: As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed.

233 citations