A Gentle Summary of Postfilter-related Work

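Recent deep-learning-based postfilter work falls roughly into the following categories:

Designing the training objective (and related choices) so that the decoder of an encoder-decoder structure acts as a postfilter [1];
Performing a pre-separation first, then separating further with an NN-based postfilter [2,3];
In AEC tasks, replacing the conventional postfilter module with an NN-based one [4,5];
Adding an extra network for post-processing [6];
Using a GAN, trained against a discriminator, to post-process the enhanced spectrogram [7,8].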

The papers are described in more detail below.

NN-based Postfilter (Decoder)

Inplace Gated Convolutional Recurrent Neural Network for Dual-Channel Speech Enhancement, Interspeech 2021

Decoder: signal filtering & reconstruction (two encoders perform masking on the magnitude and mapping on the phase, respectively)

End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features, TASLP 2020

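Pre-separation is first performed in the time-frequency domain. The mixture and the pre-separated speech are then both taken as input, their features are fused through 1-D convolution and attention, and the fused features are fed into a TCN-based postfilter.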

Dual-Path Filter Network: Speaker-Aware Modeling for Speech Separation, Interspeech 2021

Pre-separate first, then separate further using speaker information.

$Y^2$-Net FCRN for Acoustic Echo and Noise Suppression, Interspeech 2021

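Uses two fully convolutional recurrent networks (FCRNs): an AEC module first estimates the echo, and a postfilter module then suppresses the residual echo.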
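The two-stage data flow described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: `two_stage_aec`, `echo_estimator`, and `postfilter_mask` are hypothetical names, the two callables stand in for the trained FCRNs, and it is assumed that the echo estimate is subtracted from the microphone spectrum and that the postfilter applies a multiplicative mask.

```python
import numpy as np

def two_stage_aec(mic_spec, far_spec, echo_estimator, postfilter_mask):
    """Toy two-stage pipeline on complex (freq, time) spectra."""
    # Stage 1: the AEC module estimates the echo from mic and far-end spectra.
    echo_hat = echo_estimator(mic_spec, far_spec)
    # Assumption: the echo estimate is subtracted from the microphone signal.
    residual = mic_spec - echo_hat
    # Stage 2: the postfilter predicts a mask that suppresses residual echo.
    mask = postfilter_mask(residual, far_spec)
    return mask * residual, echo_hat

# Toy data: the true echo is 0.5 * far-end; stage 1 only removes 0.4 * far-end.
rng = np.random.default_rng(0)
near = rng.standard_normal((8, 4)) + 1j * rng.standard_normal((8, 4))
far = rng.standard_normal((8, 4)) + 1j * rng.standard_normal((8, 4))
mic = near + 0.5 * far

enhanced, echo_hat = two_stage_aec(
    mic,
    far,
    echo_estimator=lambda y, x: 0.4 * x,               # stand-in for the AEC network
    postfilter_mask=lambda e, x: np.ones(e.shape),     # stand-in: all-pass mask
)
```

With the all-pass stand-in mask, `enhanced` still contains the residual echo `0.1 * far`; that leftover is what a trained postfilter stage would be expected to attenuate.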

Bandwidth-Scalable Fully Mask-Based Deep FCRN Acoustic Echo Cancellation and Postfiltering

Follows up on the $Y^2$-Net work and adds bandwidth extension.

SFSRNet: Super-Resolution for Single-Channel Audio Source Separation, AAAI 2022

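A super-resolution network is attached after the decoder and combined with progressive learning to recover the information lost through downsampling.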

GAN-related

Generative Adversarial Network-Based Postfilter for STFT Spectrograms, Interspeech 2017

Generated spectra typically lack the fine structure of the true data. The paper proposes a GAN-based postfilter that is implicitly optimized, through adversarial learning, to match the true feature distribution.

GANs cannot easily be trained on very high-dimensional data such as STFT spectra, so the paper takes a divide-and-concatenate strategy: first divide the spectrogram into multiple overlapping frequency bands, then reconstruct each band with the GAN-based postfilter trained for that band, and finally connect the bands across their overlaps.
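A minimal NumPy sketch of the divide-and-concatenate idea is given below. The function names, band width, and overlap are hypothetical, the per-band postfilters are replaced by identity functions, and simple averaging in the overlap regions is an assumption made here for illustration; the way the paper actually joins the bands may differ.

```python
import numpy as np

def split_bands(spec, band_width, overlap):
    """Split a (freq, time) spectrogram into overlapping frequency bands."""
    if not 0 <= overlap < band_width:
        raise ValueError("overlap must satisfy 0 <= overlap < band_width")
    hop = band_width - overlap
    n_freq = spec.shape[0]
    starts = list(range(0, max(n_freq - band_width, 0) + 1, hop))
    # Add a final band if the regular grid does not reach the top bin.
    if starts[-1] + band_width < n_freq:
        starts.append(n_freq - band_width)
    return [(s, spec[s:s + band_width]) for s in starts]

def concat_bands(bands, n_freq):
    """Reassemble bands, averaging wherever two bands overlap."""
    n_time = bands[0][1].shape[1]
    total = np.zeros((n_freq, n_time))
    count = np.zeros((n_freq, 1))
    for start, band in bands:
        total[start:start + band.shape[0]] += band
        count[start:start + band.shape[0]] += 1
    return total / count

def postfilter_by_band(spec, band_filters, band_width, overlap):
    """Apply one filter per band, then stitch the filtered bands together."""
    bands = split_bands(spec, band_width, overlap)
    if len(band_filters) != len(bands):
        raise ValueError("need exactly one filter per band")
    filtered = [(s, f(b)) for (s, b), f in zip(bands, band_filters)]
    return concat_bands(filtered, spec.shape[0])

# Toy usage: identity "postfilters" must reproduce the input exactly.
spec = np.random.default_rng(0).random((45, 6))
n_bands = len(split_bands(spec, band_width=16, overlap=4))
restored = postfilter_by_band(spec, [lambda b: b] * n_bands, 16, 4)
```

Using identity filters is a quick sanity check that splitting and stitching alone do not alter the spectrogram, so any change in the output comes from the per-band models.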

WaveCycleGAN2: Time-domain neural post-filter for speech waveform generation

Conventional

Nonlinear Spatial Filtering in Multichannel Speech Enhancement, TASLP 2021

A Synergistic Kalman- and Deep Postfiltering Approach to Acoustic Echo Cancellation, EUSIPCO 2021

Loss

Improving Perceptual Quality by Phone-Fortified Perceptual Loss Using Wasserstein Distance for Speech Enhancement, Interspeech 2021

Uses phoneme-related information to compute the loss between the enhanced speech and the clean speech (PFPL).

Other

Residual Echo and Noise Cancellation with Feature Attention Module and Multi-Domain Loss Function

Multiple inputs (features extracted from far-end reference and the echo estimated by the Linear Adaptive Filter) are weighted by a feature attention module.

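A feature attention module is added to better fuse three inputs (the far-end reference signal, the LAF output $E(k,f)$, and the estimated linear echo $C(k,f)$) instead of simply concatenating them.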
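To make the contrast with plain concatenation concrete, here is a small NumPy sketch of attention-weighted fusion. It is not the paper's module: the scoring rule (a dot product with a hypothetical learned vector per input), the per-frame softmax across inputs, and the names `feature_attention` and `score_weights` are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def feature_attention(features, score_weights):
    """Re-weight several inputs per frame before concatenating them.

    features:      (n_inputs, T, F) stacked feature maps
    score_weights: (n_inputs, F) hypothetical learned scoring vectors
    """
    # One scalar score per input and per frame.
    scores = np.einsum("ntf,nf->nt", features, score_weights)
    # Normalize across the inputs so the weights of each frame sum to 1.
    weights = softmax(scores, axis=0)
    weighted = weights[:, :, None] * features
    # Concatenate the re-weighted inputs along the feature axis: (T, n_inputs * F).
    fused = np.concatenate(list(weighted), axis=-1)
    return fused, weights

# Toy usage with three inputs: far-end reference, E(k,f), and C(k,f) features.
rng = np.random.default_rng(0)
feats = rng.standard_normal((3, 10, 5))
fused, weights = feature_attention(feats, rng.standard_normal((3, 5)))
_, uniform = feature_attention(feats, np.zeros((3, 5)))
```

With all-zero scoring vectors every input receives weight 1/3, which reduces the fusion to a scaled plain concatenation; non-zero scores let the model emphasize whichever input is most informative in a given frame.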

Deep Learning-Based Joint Control of Acoustic Echo Cancellation, Beamforming and Postfiltering

As the title says.

Filterbank design for end-to-end speech separation, Manuel Pariente et al., ICASSP 2020
