Direction of Arrival Estimation in the Spherical Harmonic Domain Using Subspace Pseudointensity Vectors
Citations
The LOCATA Challenge: Acoustic Source Localization and Tracking
The LOCATA Challenge Data Corpus for Acoustic Source Localization and Tracking
Acoustic SLAM
Sound Localization Based on Phase Difference Enhancement Using Deep Neural Networks
Novel application of FO-DPSO for 2-D parameter estimation of electromagnetic plane waves
References
Multiple emitter location and signal parameter estimation
Image method for efficiently simulating small‐room acoustics
Darpa Timit Acoustic-Phonetic Continuous Speech Corpus CD-ROM {TIMIT} | NIST
TIMIT Acoustic-Phonetic Continuous Speech Corpus
Coherent signal-subspace processing for the detection and estimation of angles of arrival of multiple wide-band sources
Related Papers (5)
Localization of multiple speakers under high reverberation using a spherical microphone array and the direct-path dominance test
Frequently Asked Questions (18)
Q2. Why is the SH domain representation of a sound field useful?
The SH domain representation of the plane-wave density, as expressed in (7), is useful because the steering vectors, y(Ψn), are analytic functions which are independent of frequency.
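As a minimal sketch of the frequency independence noted above, the function below evaluates the complex spherical harmonics up to order L = 1 for a given direction. The function name, the (l, m) ordering and the normalization (orthonormal harmonics with the Condon-Shortley phase) are assumptions made for illustration; the paper's own convention, including whether the steering vector uses conjugated harmonics, may differ. The point is only that the function takes a direction and no wavenumber.

```python
import cmath
import math


def steering_vector_order1(inclination, azimuth):
    """Complex spherical harmonics up to order L = 1.

    Entries are ordered (l, m) = (0, 0), (1, -1), (1, 0), (1, 1).
    Angles are in radians. Note that no frequency or wavenumber appears.
    """
    s, c = math.sin(inclination), math.cos(inclination)
    return [
        complex(math.sqrt(1.0 / (4.0 * math.pi))),
        math.sqrt(3.0 / (8.0 * math.pi)) * s * cmath.exp(-1j * azimuth),
        complex(math.sqrt(3.0 / (4.0 * math.pi)) * c),
        -math.sqrt(3.0 / (8.0 * math.pi)) * s * cmath.exp(1j * azimuth),
    ]


# Hypothetical look direction: 60 degrees inclination, 30 degrees azimuth.
y = steering_vector_order1(math.radians(60.0), math.radians(30.0))

# Sanity check: the squared magnitudes sum to (L + 1)^2 / (4 * pi) for any direction.
energy = sum(abs(v) ** 2 for v in y)
```

Because the same vector applies at every frequency, a grid of such vectors could be computed once and reused across all time-frequency bins.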
Q3. What is the SH representation of a sound field?
The SH representation of a sound field [4], [30] around a particular point in space is determined by the complex-valued plane-wave density a(k, θ, φ), which is a function of wavenumber k, inclination θ and azimuth φ.
Q4. What are the parameters specific to the proposed method?
The parameters specific to the proposed method were: σ = 4°; NKθ = 91 and NKφ = 180, which corresponds to 2° resolution in azimuth and inclination; and λ = 0.001/(σ√(2π)), which removes entries >15° from the look direction.
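The quoted numbers can be checked with a short calculation. The sketch below assumes the kernel is a Gaussian in angular distance with standard deviation σ and peak value 1/(σ√(2π)), which the form of λ suggests but which is not stated in the text above. It also assumes the inclination grid includes both poles and the azimuth grid excludes 360°. All variable names are illustrative.

```python
import math

sigma_deg = 4.0        # kernel standard deviation, from the text
resolution_deg = 2.0   # grid resolution, from the text

# Grid sizes under the assumed grid convention (poles included, 360 deg excluded).
n_inclination = int(round(180.0 / resolution_deg)) + 1
n_azimuth = int(round(360.0 / resolution_deg))

# Assumed Gaussian kernel: its peak value and the threshold lambda from the text.
peak = 1.0 / (sigma_deg * math.sqrt(2.0 * math.pi))
lam = 0.001 * peak


def gaussian_kernel(distance_deg):
    """Assumed kernel value at a given angular distance from the look direction."""
    return peak * math.exp(-distance_deg ** 2 / (2.0 * sigma_deg ** 2))


# Distance at which the kernel falls to lambda: sigma * sqrt(2 * ln(peak / lambda)).
cutoff_deg = sigma_deg * math.sqrt(2.0 * math.log(peak / lam))
```

Under these assumptions the grid sizes come out as 91 and 180 and the cutoff is just under 15°, consistent with the values quoted above.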
Q5. What is the computational cost of calculating SSPIV?
Since SSPIV also requires significantly less computation than DPD-MUSIC at dense grid resolutions, it is particularly well suited to DOA estimation in situations involving multiple, moving speakers.
Q6. What is the corresponding spatial cost function for the DOA estimation method?
For all methods (PIV, SSPIV, PWD-SRP and DPD-MUSIC) the corresponding spatial cost functions were computed over a 2D grid with 2° resolution in azimuth and inclination.
Q7. What is the way to estimate the DOA of moving sources?
For moving sources the optimal length of the observation interval is a trade-off between robustness to noise and the ability to follow the true source direction.
Q8. What is the effect of the DPD test on the noise subspace?
It is assumed that the effective rank of R̂_{x̃lm}(ν, ℓ) in those TF-regions which pass the DPD test is unity and so the noise subspace has dimension (L + 1)^2 − 1.
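To illustrate the dimension count, the sketch below builds a synthetic rank-one covariance matrix with a small white-noise floor and splits its eigenvectors into signal and noise subspaces. The order L = 1, the random stand-in steering vector and the noise level are illustrative assumptions, not values from the paper, and the code is not the authors' implementation.

```python
import numpy as np

L = 1                  # SH order (illustrative choice)
n_sh = (L + 1) ** 2    # number of SH channels

# Stand-in steering vector; a real one would hold spherical harmonics.
rng = np.random.default_rng(0)
a = rng.standard_normal(n_sh) + 1j * rng.standard_normal(n_sh)

# Effective rank one: a single dominant plane wave plus a small white-noise floor.
R = np.outer(a, a.conj()) + 1e-3 * np.eye(n_sh)

# eigh returns eigenvalues in ascending order for a Hermitian matrix,
# so the last eigenvector spans the signal subspace.
eigvals, eigvecs = np.linalg.eigh(R)
U_noise = eigvecs[:, : n_sh - 1]   # noise subspace, dimension (L + 1)^2 - 1

# The true steering vector is (numerically) orthogonal to the noise subspace.
leakage = np.linalg.norm(U_noise.conj().T @ a)
```

A MUSIC-style cost function exploits this orthogonality: candidate directions whose steering vectors have small projections onto the noise subspace are the likely source directions.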
Q9. How were the speakers arranged?
These were arranged at approximately 60° intervals and their inclinations alternated to be above or below the horizontal plane of the array, according to whether they were seated or standing.
Q10. Which beamformer maximizes the directivity index?
The PWD beamformer maximizes the directivity index and is equivalent to the MVDR under the assumption of an uncorrelated diffuse noise field.
Q11. What microphone array was used to record speech?
To demonstrate the efficacy of the proposed methods, speech was recorded in a real room with dimensions of approximately 10.3 × 9.2 × 2.6 m and a reverberation time of 0.4 s. Speech signals were recorded using an Eigenmike 32-channel rigid spherical microphone array with radius 4.2 cm located close to the centre of the room.
Q12. For which interferer angle is the error zero?
The error is highly dependent on all the factors but for any interferer angle the error is zero when cos |γ| = −g and increases as |γ| → 0° and |γ| → 180°.
Q13. What is the effect of estimation errors in the spatial covariance matrices?
The effect of estimation errors in the spatial covariance matrices is addressed through numerical simulations and real experiments in Sec. V and VI, respectively.
Q14. What is the effect of DPD-MUSIC on the miss rate?
This is especially apparent for miss rates between 0.25 and 0.5 where DPD-MUSIC averages 0.7-2.3 clutter measurements per time step whereas SSPIV averages less than 0.3.
Q15. How long did computation take at grid resolutions of 10°, 5°, 2° and 1°?
These took {0.0073, 0.0122, 0.0726, 0.2954} s and {0.0040, 0.0181, 0.3051, 2.9103} s, respectively, to compute for grid resolutions {10°, 5°, 2°, 1°}.
Q16. What is the definition of a sparse dictionary?
A sparse dictionary is enforced by setting entries smaller than λ to zero, i.e. K̂_{jθ,jφ}(ϕ) = 0 if K_{jθ,jφ}(ϕ) < λ, and K̂_{jθ,jφ}(ϕ) = K_{jθ,jφ}(ϕ) otherwise.
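The thresholding rule can be sketched in a few lines. The helper name sparsify and the example kernel values are hypothetical; only the rule itself (entries below λ become zero, all others are kept) comes from the text, with λ computed using σ = 4 as quoted elsewhere on this page.

```python
import math


def sparsify(kernel_values, lam):
    """Set entries smaller than lam to zero and keep all others unchanged."""
    return [0.0 if value < lam else value for value in kernel_values]


# Threshold from the text with sigma = 4 (roughly 1e-4).
lam = 0.001 / (4.0 * math.sqrt(2.0 * math.pi))

# Made-up kernel entries at increasing distance from the look direction.
dense_row = [0.09, 0.01, 0.0005, 0.00005, 0.0]
sparse_row = sparsify(dense_row, lam)
```

Storing only the nonzero entries of each row is one plausible way such a dictionary could reduce computation on dense grids, although the text above does not specify the storage scheme.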
Q17. How many DOAs were estimated for each trial?
Using Nd = 1 (and 4) a single (set of) estimated DOA(s) was obtained for each trial by setting the observation interval to the full length of the signal (4 seconds).
Q18. How many sources were recorded in the second scenario?
So as to be relevant to practical scenarios with moving sound sources, in the second scenario, two sources were recorded whilst moving around a radius of 1.5 m.