Single Channel Target Speaker Extraction and Recognition with Speaker Beam
Citations
197 citations
158 citations
Cites background or methods from "Single Channel Target Speaker Extra..."
...We gradually built and refined the SpeakerBeam approach over several studies [28]–[31]....
[...]
...While these studies [28]–[30] focused on a multichannel case, in [31], we investigated the ASR performance in a single-channel setting....
[...]
141 citations
Cites methods from "Single Channel Target Speaker Extra..."
...These approaches include TSASR [21] for target speech recognition, Speaker Beam [22, 23] and Voice Filter [24] for target speech extraction, and Personal VAD [26] for target speech detection....
[...]
...This direction is represented by such approaches as Target-Speaker ASR [21], Speaker Beam [22, 23] and Voice Filter [24] aimed at the target-speaker speech extraction, etc....
[...]
129 citations
Cites background or methods from "Single Channel Target Speaker Extra..."
...To handle this problem two families of algorithm were proposed in recent years, namely the blind speech separation [5, 10, 3, 4] and informed speech extraction [13, 14, 15]....
[...]
...In [13, 14, 18], speaker identity features extracted from an additional enrollment utterance has been shown useful for separation....
[...]
100 citations
Cites methods from "Single Channel Target Speaker Extra..."
...In the SpeakerBeam method [20, 21], the guidance signal in the T-F-domain $A \in \mathbb{C}^{F \times K_a}$ is converted to the sequence-summarized feature $\lambda \in \mathbb{R}^{P}$ using an auxiliary neural network $G : \mathbb{C}^{F \times K_a} \to \mathbb{R}^{P \times K_a}$ as...
[...]
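The excerpt above is cut off before its formula, so the exact summarization step is not shown here. As a rough illustration only, the sketch below assumes the common choice of averaging the auxiliary network output over the K_a enrollment frames, which gives a length-P vector. The function name `sequence_summary`, the single ReLU layer standing in for G, the use of the magnitude of A as input, and the weights `W` and `b` are all hypothetical and not taken from the cited papers.

```python
import numpy as np

def sequence_summary(A, W, b):
    """Time-averaged embedding of an enrollment spectrogram (illustrative only).

    A: complex array of shape (F, Ka), the guidance signal in the T-F domain.
    W: real array of shape (P, F), weights of a hypothetical one-layer G.
    b: real array of shape (P,), bias of the same hypothetical layer.
    Returns a real array of shape (P,).
    """
    magnitude = np.abs(A)                            # (F, Ka), assumed input feature
    hidden = np.maximum(W @ magnitude + b[:, None], 0.0)  # (P, Ka), stand-in for G
    return hidden.mean(axis=1)                       # average over the Ka frames
```

Because the average runs over the frame axis, the output size does not depend on the length of the enrollment utterance.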
References
111,197 citations
"Single Channel Target Speaker Extra..." refers to methods in this paper
...The AM and all other models were trained using the ADAM optimizer [28]....
[...]
1,009 citations
788 citations
714 citations
"Single Channel Target Speaker Extra..." refers to background or methods in this paper
...There have been many studies on adaptation of DNN-based acoustic models exploiting auxiliary features [19, 20, 23, 24]....
[...]
...Conventional approaches simply concatenate the auxiliary feature to the input of a DNN (auxiliary input DNN) [20,23,24]....
[...]
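The "auxiliary input DNN" idea quoted above, concatenating an auxiliary feature to the network input, can be sketched as follows. This is a minimal illustration under assumptions: the function name `concat_auxiliary` and the array shapes are hypothetical, and a single utterance-level auxiliary vector is assumed to be repeated for every frame.

```python
import numpy as np

def concat_auxiliary(frames, aux):
    """Append one auxiliary vector to every input frame (illustrative only).

    frames: array of shape (T, D), one acoustic feature vector per frame.
    aux:    array of shape (E,), e.g. a speaker-dependent feature.
    Returns an array of shape (T, D + E) to feed to the network.
    """
    tiled = np.tile(aux, (frames.shape[0], 1))       # (T, E), same vector per frame
    return np.concatenate([frames, tiled], axis=1)   # (T, D + E)
```

The network itself is unchanged in this scheme; only its input dimension grows from D to D + E.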
604 citations
"Single Channel Target Speaker Extra..." refers to background in this paper
...Recently, deep clustering [9] and deep attractor networks [11] have been proposed to release these limitations....
[...]