
Showing papers on "Audio signal processing published in 2009"


Journal ArticleDOI
TL;DR: An empirical feature analysis for audio environment characterization is performed, and the matching pursuit algorithm is used to obtain effective time-frequency features that yield higher recognition accuracy for environmental sounds.
Abstract: The paper considers the task of recognizing environmental sounds for the understanding of a scene or context surrounding an audio sensor. A variety of features have been proposed for audio recognition, including the popular Mel-frequency cepstral coefficients (MFCCs), which describe the audio spectral shape. Environmental sounds, such as chirpings of insects and sounds of rain, which are typically noise-like with a broad flat spectrum, may include strong temporal-domain signatures. However, only a few temporal-domain features have previously been developed to characterize such diverse audio signals. Here, we perform an empirical feature analysis for audio environment characterization and propose to use the matching pursuit (MP) algorithm to obtain effective time-frequency features. The MP-based method utilizes a dictionary of atoms for feature selection, resulting in a flexible, intuitive and physically interpretable set of features. The MP-based features are adopted to supplement the MFCC features and yield higher recognition accuracy for environmental sounds. Extensive experiments are conducted to demonstrate the effectiveness of these joint features for unstructured environmental sound classification, including listening tests to study human recognition capabilities. Our recognition system has been shown to produce performance comparable to that of human listeners.
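A minimal sketch of the matching pursuit idea described above: greedily pick, from a dictionary of unit-norm atoms, the atom most correlated with the current residual, then subtract its projection. The tiny hand-built cosine dictionary and all names here are illustrative assumptions, not the paper's actual Gabor atom set.

```python
import math

def make_atom(freq, n=64):
    """Unit-norm cosine atom; real MP dictionaries typically use Gabor atoms."""
    a = [math.cos(2 * math.pi * freq * i / n) for i in range(n)]
    norm = math.sqrt(sum(v * v for v in a))
    return [v / norm for v in a]

def matching_pursuit(signal, dictionary, n_atoms=2):
    """Greedily pick the atoms most correlated with the residual."""
    residual = list(signal)
    picks = []
    for _ in range(n_atoms):
        # inner product of the residual with every atom
        corrs = [sum(r * a for r, a in zip(residual, atom))
                 for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        coef = corrs[best]
        picks.append((best, coef))
        # subtract the projection onto the chosen atom
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
    return picks, residual

dictionary = [make_atom(f) for f in (2, 5, 9, 13)]
# synthetic "environmental sound": mostly atom 1 plus a bit of atom 3
signal = [3.0 * a + 0.5 * b for a, b in zip(dictionary[1], dictionary[3])]
picks, residual = matching_pursuit(signal, dictionary)
```

The selected atom indices and coefficients form the interpretable time-frequency feature set that the paper pairs with MFCCs.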

626 citations


Proceedings ArticleDOI
08 Dec 2009
TL;DR: A novel open-source affect and emotion recognition engine, which integrates all necessary components in one highly efficient software package, and which can be used for batch processing of databases.
Abstract: Various open-source toolkits exist for speech recognition and speech processing. These toolkits have brought a great benefit to the research community, i.e. speeding up research. Yet, no such freely available toolkit exists for automatic affect recognition from speech. We herein introduce a novel open-source affect and emotion recognition engine, which integrates all necessary components in one highly efficient software package. The components include audio recording and audio file reading, state-of-the-art paralinguistic feature extraction and plugable classification modules. In this paper we introduce the engine and extensive baseline results. Pre-trained models for four affect recognition tasks are included in the openEAR distribution. The engine is tailored for multi-threaded, incremental on-line processing of live input in real-time, however it can also be used for batch processing of databases.

408 citations


Patent
06 Jul 2009
TL;DR: In this paper, an orientation sensor detects an orientation of the speaker array and provides an orientation signal, which may be provided according to selection of display orientation, shape of touch input, image recognition of the listener, or the like.
Abstract: A device that provides an audio output includes a speaker array mechanically fixed to the device. The speaker array includes at least three speakers. An orientation sensor detects an orientation of the speaker array and provides an orientation signal. An audio receiver receives a number of audio signals that include spatial position information. An audio processor is coupled to the speakers, the orientation sensor, and the audio receiver. The audio processor receives the audio signals and the orientation signal, and selectively routes the audio signals to the speakers according to the spatial position information and the orientation signal such that the spatial position information is perceptible to a listener. The orientation signal may be provided by a compass, an accelerometer, an inertial sensor, or other device. The orientation signal may be provided according to selection of display orientation, shape of touch input, image recognition of the listener, or the like.

260 citations


Book
21 Mar 2009
TL;DR: An introduction to pitch estimation is given and a number of statistical methods for pitch estimation are presented, which include both single- and multi-pitch estimators based on statistical approaches, filtering methods based on both static and optimal adaptive designs, and subspace methods based on the principles of subspace orthogonality and shift-invariance.
Abstract: Periodic signals can be decomposed into sets of sinusoids having frequencies that are integer multiples of a fundamental frequency. The problem of finding such fundamental frequencies from noisy observations is important in many speech and audio applications, where it is commonly referred to as pitch estimation. These applications include analysis, compression, separation, enhancement, automatic transcription and many more. In this book, an introduction to pitch estimation is given and a number of statistical methods for pitch estimation are presented. The basic signal models and associated estimation theoretical bounds are introduced, and the properties of speech and audio signals are discussed and illustrated. The presented methods include both single- and multi-pitch estimators based on statistical approaches, like maximum likelihood and maximum a posteriori methods, filtering methods based on both static and optimal adaptive designs, and subspace methods based on the principles of subspace orthogonality and shift-invariance. The application of these methods to analysis of speech and audio signals is demonstrated using both real and synthetic signals, and their performance is assessed under various conditions and their properties discussed. Finally, the estimators are compared in terms of computational and statistical efficiency, generalizability and robustness. Table of Contents: Fundamentals / Statistical Methods / Filtering Methods / Subspace Methods / Amplitude Estimation
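A toy single-pitch estimator in the spirit of the statistical methods surveyed above: score each candidate fundamental by summing spectral magnitude at its first few harmonics (for sinusoids in white noise this approximates a maximum-likelihood estimator). The candidate grid, harmonic count and synthetic signal are illustrative assumptions.

```python
import math, cmath

def spectral_magnitude(x, freq, fs):
    """|DFT| of x evaluated at an arbitrary frequency (Goertzel-style sum)."""
    return abs(sum(v * cmath.exp(-2j * math.pi * freq * n / fs)
                   for n, v in enumerate(x)))

def estimate_pitch(x, fs, candidates, n_harmonics=3):
    """Return the candidate f0 whose harmonics carry the most energy."""
    def score(f0):
        return sum(spectral_magnitude(x, h * f0, fs)
                   for h in range(1, n_harmonics + 1))
    return max(candidates, key=score)

fs = 4000
# harmonic signal with a 100 Hz fundamental and two overtones
x = [math.sin(2 * math.pi * 100 * n / fs)
     + 0.5 * math.sin(2 * math.pi * 200 * n / fs)
     + 0.3 * math.sin(2 * math.pi * 300 * n / fs)
     for n in range(400)]
f0 = estimate_pitch(x, fs, candidates=range(60, 320, 20))
```

Summing over several harmonics is what distinguishes pitch estimation from plain peak picking: the fundamental wins because all of its harmonics contribute.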

221 citations


Patent
Hyun Jin Park1, Kwokleung Chan1
18 Nov 2009
TL;DR: In this article, uses of an enhanced sidetone signal in an active noise cancellation operation are described.
Abstract: Uses of an enhanced sidetone signal in an active noise cancellation operation are disclosed.

202 citations


Patent
Kwokleung Chan1
23 Oct 2009
TL;DR: In this article, the authors estimate the proximity of an audio source by transforming audio signals from a plurality of sensors to the frequency domain, and the amplitudes of the transformed audio signals are then determined.
Abstract: Estimating the proximity of an audio source (14, 15) is accomplished by transforming audio signals from a plurality of sensors (18, 20) to frequency domain. The amplitudes of the transformed audio signals are then determined. The proximity of the audio source is determined based on a comparison of the frequency domain amplitudes. This estimation permits a device (16) to differentiate between relatively distant audio sources (14) and audio sources (15) at close proximity to the device. The technique can be applied to mobile handsets, such as cellular phones or PDAs, hands-free headsets, and other audio input devices. Devices taking advantage of this "close proximity" detection are better able to suppress background noise and deliver an improved user experience.
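A hedged sketch of the idea in this patent: transform the signals from two sensors to the frequency domain and compare their amplitudes; a nearby source produces a much larger inter-microphone level difference than a distant one. The ratio threshold and synthetic signals below are invented for illustration.

```python
import math, cmath

def dft_mags(x):
    """Naive DFT magnitudes for the lower half of the spectrum."""
    n = len(x)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(x)))
            for k in range(n // 2)]

def is_close(mic1, mic2, ratio_threshold=2.0):
    """True if the inter-sensor spectral level ratio suggests a near source."""
    e1 = sum(dft_mags(mic1)) + 1e-12
    e2 = sum(dft_mags(mic2)) + 1e-12
    return max(e1, e2) / min(e1, e2) > ratio_threshold

n, f = 64, 5
tone = [math.sin(2 * math.pi * f * i / n) for i in range(n)]
near_pair = (tone, [0.3 * v for v in tone])   # big level difference
far_pair = ([0.9 * v for v in tone], tone)    # nearly equal levels

close = is_close(*near_pair)
far = is_close(*far_pair)
```

A device could use this flag to suppress background noise when the talker is close, as the abstract describes.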

201 citations


Patent
David G. Conroy1, Barry J. Corlett1, Aram Lindahl1, Steve Schell1, Neil David Warren1 
22 Jul 2009
TL;DR: In this article, a media processing system and device with improved power usage characteristics, improved audio functionality and improved media security is provided, including an audio processing subsystem that operates independently of the host processor for long periods of time.
Abstract: A media processing system and device with improved power usage characteristics, improved audio functionality and improved media security is provided. Embodiments of the media processing system include an audio processing subsystem that operates independently of the host processor for long periods of time, allowing the host processor to enter a low power state while the audio data is being processed. Other aspects of the media processing system provide for enhanced audio effects such as mixing stored audio samples into real-time telephone audio. Still other aspects of the media processing system provide for improved media security due to the isolation of decrypted audio data from the host processor.

179 citations


Patent
Chad G. Seguin1
21 Jan 2009
TL;DR: In this article, a multi-band compressor has a band splitter that splits the input audio signal into a number of different band signals, each band signal is input to a respective compressor block, which is independently programmable so that its audio frequency response differs from a linear response in at least two non-overlapping windows of its input signal, and (b) differs from the frequency response of another one of the compressor blocks.
Abstract: An uplink or downlink audio processor contains a multi band compressor that receives an input, uplink or downlink, audio signal. The multi-band compressor has a band splitter that splits the input audio signal into a number of different band signals. Each band signal is input to a respective compressor block, which is independently programmable so that its audio frequency response (a) differs from a linear response in at least two non-overlapping windows of its input signal, and (b) differs from the frequency response of another one of the compressor blocks. Other embodiments are also described and claimed.
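A simplified sketch of the multi-band compressor structure described above, under invented parameters: a one-pole lowpass provides a two-band split (high band = input minus low band, so the bands sum back to the input), and each band gets an independently programmable static compression curve.

```python
import math

def one_pole_lowpass(x, alpha=0.2):
    """First-order IIR lowpass as a crude band splitter."""
    y, out = 0.0, []
    for v in x:
        y += alpha * (v - y)
        out.append(y)
    return out

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def compress_band(band, threshold, ratio):
    """Static compression: attenuate the band when its RMS exceeds threshold."""
    level = rms(band)
    if level <= threshold:
        return band
    gain = (threshold / level) ** (1.0 - 1.0 / ratio)
    return [gain * v for v in band]

def two_band_compressor(x, threshold=0.1, ratio=4.0):
    low = one_pole_lowpass(x)
    high = [v - l for v, l in zip(x, low)]   # complementary high band
    return [a + b for a, b in
            zip(compress_band(low, threshold, ratio),
                compress_band(high, threshold, ratio))]
```

Giving each band its own threshold and ratio is what lets the patent's compressor blocks differ from a linear response and from each other.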

168 citations


Patent
10 Mar 2009
TL;DR: In this paper, a secondary audio clip mixing profile is selected based upon the type of audio output device, such as a speaker or a headset, coupled to the electronic device, to substantially optimize audibility and user-perceived comfort.
Abstract: Various techniques for controlling the playback of secondary audio data on an electronic device are provided. In one embodiment, a secondary audio clip mixing profile is selected based upon the type of audio output device, such as a speaker or a headset, coupled to the electronic device. The selected mixing profile may define respective digital gain values to be applied to a secondary audio stream at each digital audio level of the electronic device, and may be customized based upon one or more characteristics of the audio output device to substantially optimize audibility and user-perceived comfort. In this manner, the overall user listening experience may be improved.

166 citations


Journal ArticleDOI
TL;DR: This paper proposes effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie, and extends the approach with a radial basis function neural network for audio classification.
Abstract: In the age of digital information, audio data has become an important part of many modern computer applications. Audio classification has become a focus in the research of audio processing and pattern recognition. Automatic audio classification is very useful for audio indexing, content-based audio retrieval and on-line audio distribution, but it is a challenge to extract the most common and salient themes from unstructured raw audio data. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients and mel-frequency cepstral coefficients, are extracted to characterize the audio content. Support vector machines are applied to classify audio into their respective classes by learning from training data. The proposed method then extends the application of a radial basis function neural network (RBFNN) to the classification of audio. RBFNN enables a nonlinear transformation followed by a linear transformation to achieve a higher dimension in the hidden space. Experiments on different genres of the various categories illustrate that the classification results are significant and effective.
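A toy illustration of the feature-then-classifier pipeline described above. The actual system uses LPC/LPCC/MFCC features with SVM and RBFNN classifiers; as a stand-in, two cheap features (zero-crossing rate and energy) feed a nearest-centroid classifier, and all clips and class names are synthetic.

```python
import math

def features(clip):
    """Zero-crossing rate and mean energy, as simple acoustic descriptors."""
    zcr = sum(1 for a, b in zip(clip, clip[1:]) if a * b < 0) / len(clip)
    energy = sum(v * v for v in clip) / len(clip)
    return (zcr, energy)

def train_centroids(labelled_clips):
    """Average the feature vectors per class."""
    sums = {}
    for label, clip in labelled_clips:
        f = features(clip)
        s = sums.setdefault(label, [0.0, 0.0, 0])
        s[0] += f[0]; s[1] += f[1]; s[2] += 1
    return {lab: (s[0] / s[2], s[1] / s[2]) for lab, s in sums.items()}

def classify(clip, centroids):
    """Assign the class whose centroid is nearest in feature space."""
    f = features(clip)
    return min(centroids,
               key=lambda lab: (f[0] - centroids[lab][0]) ** 2
                             + (f[1] - centroids[lab][1]) ** 2)

def tone(freq, amp, n=256):
    return [amp * math.sin(2 * math.pi * freq * i / n) for i in range(n)]

# synthetic stand-ins: "music" = loud low-frequency, "news" = quiet high-frequency
train = [("music", tone(4, 1.0)), ("music", tone(5, 0.9)),
         ("news", tone(40, 0.2)), ("news", tone(45, 0.25))]
centroids = train_centroids(train)
label = classify(tone(42, 0.22), centroids)
```

Swapping the centroid rule for an SVM or RBFNN, and the two features for cepstral coefficients, recovers the structure of the paper's system.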

160 citations


Journal ArticleDOI
TL;DR: Examples include sampling rate conversion for software radio and between audio formats, biomedical imaging, lens distortion correction and the formation of image mosaics, and super-resolution of image sequences.
Abstract: Digital applications have developed rapidly over the last few decades. Since many sources of information are of analog or continuous-time nature, discrete-time signal processing (DSP) inherently relies on sampling a continuous-time signal to obtain a discrete-time representation. Consequently, sampling theories lie at the heart of signal processing devices and communication systems. Examples include sampling rate conversion for software radio and between audio formats, biomedical imaging, lens distortion correction and the formation of image mosaics, and super-resolution of image sequences.

Patent
17 Jun 2009
TL;DR: In this paper, a method and a control circuit for controlling an output of an audio signal of a battery-powered device are described. And in case of low charge level, the audio filter/gain parameter and/or the audio compression parameter are adjusted to reduce power consumption, thus allowing longer reproducing time by lower sound quality.
Abstract: A method and a control circuit for controlling an output of an audio signal of a battery-powered device are described. The charge condition of the battery is determined, and in case of a low charge level, the audio filter/gain parameter and/or the audio compression parameter are adjusted to reduce power consumption, allowing longer playback time at lower sound quality.
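An illustrative sketch of the control idea in this patent: map battery charge level to audio processing parameters so that low charge trades sound quality for playback time. The specific thresholds and parameter values are invented for illustration.

```python
def audio_params_for_charge(charge_pct):
    """Return (gain_db, lowpass_cutoff_hz, compression_ratio) for a charge level."""
    if charge_pct > 50:          # plenty of charge: full quality
        return (0.0, 20000, 1.0)
    if charge_pct > 20:          # medium charge: mild cuts save amplifier power
        return (-3.0, 12000, 2.0)
    return (-6.0, 8000, 4.0)     # low charge: aggressive power saving
```

Lowering gain and the lowpass cutoff reduces amplifier power draw, while heavier compression keeps the quieter output audible.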

Proceedings ArticleDOI
19 Apr 2009
TL;DR: Novel audio features that combine the high temporal accuracy of onset features with the robustness of chroma features are introduced and it is shown how previous synchronization methods can be extended to make use of these new features.
Abstract: The general goal of music synchronization is to automatically align the multiple information sources such as audio recordings, MIDI files, or digitized sheet music related to a given musical work. In computing such alignments, one typically has to face a delicate tradeoff between robustness and accuracy. In this paper, we introduce novel audio features that combine the high temporal accuracy of onset features with the robustness of chroma features. We show how previous synchronization methods can be extended to make use of these new features. We report on experiments based on polyphonic Western music demonstrating the improvements of our proposed synchronization framework.
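Synchronization methods of the kind extended above typically align two feature sequences (e.g. chroma or onset features) with dynamic time warping (DTW). This is a generic DTW sketch on scalar features, not the paper's system; the toy sequences are invented.

```python
def dtw_cost(a, b):
    """Classic DTW with unit steps; returns the minimal alignment cost."""
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: skip in a, skip in b, advance both
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(a)][len(b)]

ref = [0, 0, 1, 2, 1, 0]
same_tempo = [0, 0, 1, 2, 1, 0]
stretched = [0, 0, 0, 1, 1, 2, 2, 1, 0]   # same contour, played slower
different = [2, 2, 0, 0, 2, 2]
```

A tempo-stretched rendition aligns at zero cost, while an unrelated sequence does not; the accuracy-versus-robustness tradeoff the paper addresses lives in the choice of features fed to this alignment.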

Journal ArticleDOI
TL;DR: A transition from mere association in beginner readers to more automatic, but still not “adult-like,” integration in advanced readers is indicated and evidence for an extended development of letter–speech sound integration is provided.
Abstract: In transparent alphabetic languages, the expected standard for complete acquisition of letter-speech sound associations is within one year of reading instruction. The neural mechanisms underlying the acquisition of letter-speech sound associations have, however, hardly been investigated. The present article describes an ERP study with beginner and advanced readers in which the influence of letters on speech sound processing is investigated by comparing the MMN to speech sounds presented in isolation with the MMN to speech sounds accompanied by letters. Furthermore, SOA between letter and speech sound presentation was manipulated in order to investigate the development of the temporal window of integration for letter-speech sound processing. Beginner readers, despite one year of reading instruction, showed no early letter-speech sound integration, that is, no influence of the letter on the evocation of the MMN to the speech sound. Only later in the difference wave, at 650 msec, was an influence of the letter on speech sound processing revealed. Advanced readers, with 4 years of reading instruction, showed early and automatic letter-speech sound processing as revealed by an enhancement of the MMN amplitude, however, at a different temporal window of integration in comparison with experienced adult readers. The present results indicate a transition from mere association in beginner readers to more automatic, but still not "adult-like," integration in advanced readers. In contrast to general assumptions, the present study provides evidence for an extended development of letter-speech sound integration.

Patent
16 Feb 2009
TL;DR: In this paper, a user preference processor (109) receives user preference feedback for the test audio signals and generates a personalization parameter for the user in response to the user preferences and a noise parameter for each noise component of at least one of the audio signals.
Abstract: An audio device is arranged to present a plurality of test audio signals to a user where each test audio signal comprises a signal component and a noise component. A user preference processor (109) receives user preference feedback for the test audio signals and generates a personalization parameter for the user in response to the user preference feedback and a noise parameter for the noise component of at least one of the test audio signals. An audio processor (113) then processes an audio signal in response to the personalization parameter and the resulting signal is presented to the user. The invention may allow improved characterization of a user thereby resulting in improved adaptation of the processing and thus an improved personalization of the presented signal. The invention may e.g. be beneficial for hearing aids for hearing impaired users.

Patent
Hyen O Oh1, Yang Won Jung1
29 Jul 2009
TL;DR: In this article, an apparatus for processing an audio signal and method thereof is described, which includes receiving, by an audio processing apparatus, a signal including a first data of a first block encoded with rectangular coding scheme and a second data of the second block encoded in non-rectangular coding scheme.
Abstract: An apparatus for processing an audio signal and method thereof are disclosed. The present invention includes receiving, by an audio processing apparatus, an audio signal including a first data of a first block encoded with rectangular coding scheme and a second data of a second block encoded with non-rectangular coding scheme; receiving a compensation signal corresponding to the second block; estimating a prediction of an aliasing part using the first data; and, obtaining a reconstructed signal for the second block based on the second data, the compensation signal and the prediction of aliasing part.

Journal ArticleDOI
TL;DR: A novel approach to applying text-based information retrieval techniques to music collections that represents tracks with a joint vocabulary consisting of both conventional words, drawn from social tags, and audio muswords, representing characteristics of automatically-identified regions of interest within the signal.
Abstract: In this paper we describe a novel approach to applying text-based information retrieval techniques to music collections. We represent tracks with a joint vocabulary consisting of both conventional words, drawn from social tags, and audio muswords, representing characteristics of automatically-identified regions of interest within the signal. We build vector space and latent aspect models indexing words and muswords for a collection of tracks, and show experimentally that retrieval with these models is extremely well-behaved. We find in particular that retrieval performance remains good for tracks by artists unseen by our models in training, and even if tags for their tracks are extremely sparse.
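A minimal vector-space retrieval sketch over a joint vocabulary of tag words and "muswords", mirroring the indexing idea above. The track names, tags and musword ids are all made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

tracks = {
    "track_a": {"rock": 3, "guitar": 2, "mw_17": 5, "mw_03": 1},
    "track_b": {"jazz": 4, "sax": 2, "mw_88": 6},
    # sparsely tagged track: retrieval can still match on muswords alone
    "track_c": {"mw_17": 4, "mw_03": 2},
}

def retrieve(query, tracks):
    """Rank tracks by cosine similarity to the query vector."""
    return sorted(tracks, key=lambda t: cosine(query, tracks[t]),
                  reverse=True)

ranking = retrieve({"rock": 1, "mw_17": 2}, tracks)
```

Because words and muswords share one vector space, the sparsely tagged track still ranks above the unrelated one, which is the behavior the paper reports for tracks with few social tags.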

Patent
06 Jul 2009
TL;DR: In this paper, an apparatus for generating at least one audio output signal representing a superposition of at least two different audio objects comprises a processor for processing an audio input signal to provide an object representation of the input signal, where this object representation can be generated by a parametrically guided approximation of original objects using an object downmix signal.
Abstract: An apparatus for generating at least one audio output signal representing a superposition of at least two different audio objects comprises a processor for processing an audio input signal to provide an object representation of the audio input signal, where this object representation can be generated by a parametrically guided approximation of original objects using an object downmix signal. An object manipulator individually manipulates objects using audio object based metadata referring to the individual audio objects to obtain manipulated audio objects. The manipulated audio objects are mixed using an object mixer for finally obtaining an audio output signal having one or several channel signals depending on a specific rendering setup.

Journal ArticleDOI
C. Joder1, Slim Essid1, Gael Richard1
TL;DR: A number of methods for early and late temporal integration are proposed and an in-depth experimental study on their interest for the task of musical instrument recognition on solo musical phrases is provided.
Abstract: Nowadays, it appears essential to design automatic indexing tools which provide meaningful and efficient means to describe the musical audio content. There is in fact a growing interest for music information retrieval (MIR) applications amongst which the most popular are related to music similarity retrieval, artist identification, musical genre or instrument recognition. Current MIR-related classification systems usually do not take into account the mid-term temporal properties of the signal (over several frames) and lie on the assumption that the observations of the features in different frames are statistically independent. The aim of this paper is to demonstrate the usefulness of the information carried by the evolution of these characteristics over time. To that purpose, we propose a number of methods for early and late temporal integration and provide an in-depth experimental study on their interest for the task of musical instrument recognition on solo musical phrases. In particular, the impact of the time horizon over which the temporal integration is performed will be assessed both for fixed and variable frame length analysis. Also, a number of proposed alignment kernels will be used for late temporal integration. For all experiments, the results are compared to a state of the art musical instrument recognition system.

Book
08 Sep 2009
TL;DR: This invaluable guide will provide audio, R&D and software engineers building systems or computer peripherals for speech enhancement with a comprehensive overview of the technologies, devices and algorithms required for modern computers and communication devices.
Abstract: Provides state-of-the-art algorithms for sound capture, processing and enhancement. Sound Capture and Processing: Practical Approaches covers the digital signal processing algorithms and devices for capturing sounds, mostly human speech. It explores the devices and technologies used to capture, enhance and process sound for the needs of communication and speech recognition in modern computers and communication devices. This book gives a comprehensive introduction to basic acoustics and microphones, with coverage of algorithms for noise reduction, acoustic echo cancellation, dereverberation and microphone arrays, charting the progress of such technologies from their evolution to the present-day standard. Sound Capture and Processing: Practical Approaches:
- Brings together the state-of-the-art algorithms for sound capture, processing and enhancement in one easily accessible volume
- Provides invaluable implementation techniques required to process algorithms for real-life applications and devices
- Covers a number of advanced sound processing techniques, such as multichannel acoustic echo cancellation, dereverberation and source separation
- Is generously illustrated with figures and charts to demonstrate how sound capture and audio processing systems work
- Includes an accompanying website containing Matlab code to illustrate the algorithms
This invaluable guide will provide audio, R&D and software engineers building systems or computer peripherals for speech enhancement with a comprehensive overview of the technologies, devices and algorithms required for modern computers and communication devices. Graduate students studying electrical engineering and computer science, and researchers in multimedia, cell phones and interactive systems, as well as acousticians, will also benefit from this book.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed derivative-based and wavelet-based approaches remarkably improve the detection accuracy.
Abstract: To improve a recently developed mel-cepstrum audio steganalysis method, we present in this paper a method based on Fourier spectrum statistics and mel-cepstrum coefficients, derived from the second-order derivative of the audio signal. Specifically, the statistics of the high-frequency spectrum and the mel-cepstrum coefficients of the second-order derivative are extracted for use in detecting audio steganography. We also design a wavelet-based spectrum and mel-cepstrum audio steganalysis. By applying support vector machines to these features, unadulterated carrier signals (without hidden data) and the steganograms (carrying covert data) are successfully discriminated. Experimental results show that the proposed derivative-based and wavelet-based approaches remarkably improve the detection accuracy. Between the two new methods, the derivative-based approach generally delivers a better performance.
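A sketch of the second-order-derivative idea above: the second difference acts as a high-pass filter (magnitude response 4·sin²(ω/2)), emphasizing the high-frequency region where the paper extracts its spectral statistics. The particular feature used here (mean magnitude over the upper part of a naive DFT) is an assumption for illustration.

```python
import math, cmath

def second_derivative(x):
    """Discrete second difference of a signal."""
    return [x[i + 1] - 2 * x[i] + x[i - 1] for i in range(1, len(x) - 1)]

def high_band_mean_mag(x):
    """Mean DFT magnitude over the upper part of the spectrum."""
    n = len(x)
    mags = [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(x)))
            for k in range(n // 4, n // 2)]
    return sum(mags) / len(mags)

n = 64
low_tone = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]
high_tone = [math.sin(2 * math.pi * 20 * i / n) for i in range(n)]
f_low = high_band_mean_mag(second_derivative(low_tone))
f_high = high_band_mean_mag(second_derivative(high_tone))
```

The second difference strongly attenuates low-frequency content while boosting high frequencies, which is why derivative-domain statistics are sensitive to the noise-like perturbations introduced by data hiding.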

Proceedings ArticleDOI
04 Dec 2009
TL;DR: A new probabilistic model for polyphonic audio termed factorial scaled hidden Markov model (FS-HMM), which generalizes several existing models, notably the Gaussian scaled mixture model and the Itakura-Saito nonnegative matrix factorization (NMF) model is presented.
Abstract: We present a new probabilistic model for polyphonic audio termed Factorial Scaled Hidden Markov Model (FS-HMM), which generalizes several existing models, notably the Gaussian scaled mixture model and the Itakura-Saito Nonnegative Matrix Factorization (NMF) model. We describe two expectation-maximization (EM) algorithms for maximum likelihood estimation, which differ by the choice of complete data set. The second EM algorithm, based on a reduced complete data set and multiplicative updates inspired from NMF methodology, exhibits much faster convergence. We consider the FS-HMM in different configurations for the difficult problem of speech / music separation from a single channel and report satisfying results.

Journal ArticleDOI
TL;DR: The DSP audience is given some insight into the types of problems and challenges that face practitioners in audio forensic laboratories, and several of the frustrations and pitfalls encountered by signal processing experts when dealing with typical forensic material due to the standards and practices of the legal system.
Abstract: The field of audio forensics involves many topics familiar to the general audio digital signal processing (DSP) community, such as speech recognition, talker identification, and signal quality enhancement. There is potentially much to be gained by applying modern DSP theory to problems of interest to the forensics community, and this article is written to give the DSP audience some insight into the types of problems and challenges that face practitioners in audio forensic laboratories. However, this article must also present several of the frustrations and pitfalls encountered by signal processing experts when dealing with typical forensic material due to the standards and practices of the legal system.

MonographDOI
23 Mar 2009
TL;DR: This practically oriented text provides MATLAB examples throughout to illustrate the concepts discussed and to give the reader hands-on experience with important techniques, and is ideal for graduate students and practitioners working with speech or audio systems.
Abstract: Applied Speech and Audio Processing is a MATLAB-based, one-stop resource that blends speech and hearing research in describing the key techniques of speech and audio processing. This practically oriented text provides MATLAB examples throughout to illustrate the concepts discussed and to give the reader hands-on experience with important techniques. Chapters on basic audio processing and the characteristics of speech and hearing lay the foundations of speech signal processing, which are built upon in subsequent sections explaining audio handling, coding, compression, and analysis techniques. The final chapter explores a number of advanced topics that use these techniques, including psychoacoustic modelling, a subject which underpins MP3 and related audio formats. With its hands-on nature and numerous MATLAB examples, this book is ideal for graduate students and practitioners working with speech or audio systems.

PatentDOI
TL;DR: In this paper, two-channel input audio signals are processed to construct output audio signals by decomposing the input signal into a plurality of subband audio signals, and the output signal is synthesized from the generated subband signals.
Abstract: Two-channel input audio signals are processed to construct output audio signals by decomposing the two-channel input audio signals into a plurality of two-channel subband audio signals. Separately, in each of a plurality of subbands, at least three generated subband audio signals are generated by steering the two-channel subband audio signals into at least three generated signal locations. The output audio signals are synthesized from the generated subband audio signals. The steering applies differing construction rules in at least two of the plurality of subbands.

Patent
29 Jul 2009
TL;DR: In this article, a transfer function estimate of the electroacoustic channel is established, responsive to the second audio signal and part of the first audio signal, and filters are obtained with transfer functions based on the estimate.
Abstract: An electroacoustic channel soundfield is altered. An audio signal is applied by an electromechanical transducer to an acoustic space, causing air pressure changes therein. Another audio signal is obtained by a second electromechanical transducer, responsive to air pressure changes in the acoustic space. A transfer function estimate of the electroacoustic channel is established, responsive to the second audio signal and part of the first audio signal. The transfer function estimate is derived to be adaptive to temporal variations in the electroacoustic channel transfer function. Filters are obtained with transfer functions based on the transfer function estimate. Part of the first audio signal is filtered therewith.
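One common way to realize the adaptive transfer-function estimate described in this patent is a normalized LMS (NLMS) adaptive FIR filter: the loudspeaker signal is the reference and the microphone signal the desired response. This is a generic NLMS sketch, not the patent's implementation; the 2-tap echo path and all data are synthetic.

```python
import random

def nlms_identify(x, d, n_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR channel from reference x and observation d via NLMS."""
    w = [0.0] * n_taps
    buf = [0.0] * n_taps
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]                       # newest sample first
        y = sum(wi * bi for wi, bi in zip(w, buf))  # filter output
        err = dn - y
        power = sum(b * b for b in buf) + eps       # input power normalization
        w = [wi + mu * err * bi / power for wi, bi in zip(w, buf)]
    return w

random.seed(1)
x = [random.uniform(-1, 1) for _ in range(2000)]    # loudspeaker signal
true_path = [0.5, -0.3]                             # unknown echo path
d = [true_path[0] * x[n] + (true_path[1] * x[n - 1] if n else 0.0)
     for n in range(len(x))]                        # microphone signal
w = nlms_identify(x, d, n_taps=2)
```

Once the estimate w has converged, filtering part of the first audio signal with it (as the abstract describes) predicts the channel's contribution at the microphone, and the step size mu lets the estimate track temporal variations.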

Proceedings ArticleDOI
28 Jun 2009
TL;DR: It is shown that HHMM can handle audio events with recursive patterns to improve the classification performance, and a model fusion method is proposed to cover large variations often existing in healthcare audio events.
Abstract: Audio is a useful modality complement to video for healthcare monitoring. In this paper, we investigate the use of Hierarchical Hidden Markov Models (HHMMs) for healthcare audio event classification. We show that HHMM can handle audio events with recursive patterns to improve the classification performance. We also propose a model fusion method to cover large variations often existing in healthcare audio events. Experimental results from classifying key eldercare audio events show the effectiveness of the model fusion method for healthcare audio event classification.

Patent
02 Sep 2009
TL;DR: In this paper, an audio system to calibrate an audio signal based on a wirelessly received signal, and a signal calibration method are provided, where a transceiver is connected to an external device to enable wireless communication between a main unit and the external device.
Abstract: An audio system to calibrate an audio signal based on a wirelessly received signal, and a signal calibration method are provided. The audio system includes a sound output unit to output a sound corresponding to a received audio signal. A transceiver is connected to an external device to enable wireless communication between a main unit and the external device. The external device converts the sound output from the sound output unit into an electric signal to generate a calibration audio signal. The main unit performs calibration on an audio signal to be played back through the sound output unit using the calibration audio signal.

Journal ArticleDOI
TL;DR: Speech recognition results in adult cochlear implant recipients revealed small but significant improvements with HR 120 for single syllable words and for 2 of 3 sentence recognition measures in noise, and 7 of 8 subjects preferred HR 120 over HR for listening in everyday life.
Abstract: Objective: HiRes (HR) 120 is a sound processing strategy purported to offer an increase in the precision of frequency-to-place mapping through the use of current steering. This within-subject study was designed to compare speech recognition as well as music and sound quality ratings for HR and HR 120 processing.

PatentDOI
TL;DR: In this paper, an audio plug that connects to a mating audio jack in an electronic device is used to establish a wired communications path between the accessory and the electronic device, and a microphone may be included in an accessory to capture sound for an associated electronic device.
Abstract: Electronic devices and accessories such as headsets for electronic devices are provided. A microphone may be included in an accessory to capture sound for an associated electronic device. Buttons and other user interfaces may be included in the accessories. An accessory may have an audio plug that connects to a mating audio jack in an electronic device, thereby establishing a wired communications path between the accessory and the electronic device. Path configuration circuitry may be used to selectively configure the path between the electronic device and accessory to support different operational modes. Analog audio lines in the wired path may convey left and right channel analog audio channels. When it is desired to convey power over the wired path, one of the analog audio channel lines may be converted to a power line. Audio functionality may be retained by simultaneously converting a unidirectional line into a bidirectional line using hybrids.