UBGAN: Enhancing Coded Speech with Blind and Guided Bandwidth Extension


Modern communication systems often rely on low‑bitrate codecs that transmit only a limited audio bandwidth at a fixed sampling rate. The Unified Bandwidth‑extension GAN (UBGAN) offers a universal solution to enhance the perceptual quality of legacy speech codecs by extending their audio bandwidth while keeping the required bitrate unchanged or almost unchanged.

UBGAN is a modular, lightweight GAN‑based solution that increases the operational flexibility of both conventional and neural speech codecs. It extends standard wideband coded speech (audio bandwidth up to 8 kHz) to super‑wideband quality (audio bandwidth up to 16 kHz) without requiring changes to the underlying transmission.
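To illustrate why a generative model is needed for this step, here is a minimal sketch (not the UBGAN method) showing that plain upsampling raises the sampling rate but leaves the 8 to 16 kHz band empty. The sampling rates of 16 kHz and 32 kHz are assumed from the usual wideband and super‑wideband conventions, the two‑tone test signal is a toy stand‑in for decoded speech, and the function names are hypothetical.

```python
import numpy as np

WB_RATE = 16_000   # assumed wideband sampling rate (audio bandwidth up to 8 kHz)
SWB_RATE = 32_000  # assumed super-wideband sampling rate (audio bandwidth up to 16 kHz)


def upsample_by_two(signal):
    """Double the sampling rate by zero-padding the spectrum (adds no new content)."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    padded = np.zeros(n + 1, dtype=complex)  # rfft length for 2 * n samples
    padded[: len(spectrum)] = spectrum
    # The factor 2 compensates for the longer inverse FFT so amplitudes are preserved.
    return 2.0 * np.fft.irfft(padded, n=2 * n)


def high_band_energy_ratio(signal, sample_rate, cutoff_hz=8_000.0):
    """Fraction of the signal energy located above cutoff_hz."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(power[freqs > cutoff_hz].sum() / power.sum())


# Toy stand-in for decoded WB speech: two tones below 8 kHz, 100 ms long.
t = np.arange(1600) / WB_RATE
wb_signal = np.sin(2 * np.pi * 1_000 * t) + 0.5 * np.sin(2 * np.pi * 6_000 * t)

swb_rate_signal = upsample_by_two(wb_signal)
ratio = high_band_energy_ratio(swb_rate_signal, SWB_RATE)
print(f"samples: {len(swb_rate_signal)}, energy above 8 kHz: {ratio:.2e}")
```

The upsampled signal has the super‑wideband sampling rate, yet essentially none of its energy lies above 8 kHz. That missing band is what a bandwidth extension model has to synthesize.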

UBGAN can work in two modes:

  • Blind Bandwidth Extension: Reconstructing high‑frequency components using only the decoded signal at the receiver.
  • Guided Bandwidth Extension: Incorporating minimal side information to deliver even higher perceptual quality.

 

Subjective listening tests demonstrate that UBGAN consistently enhances transmitted speech quality across a wide range of conventional and neural speech codecs. As shown in the figure below, 27 human listeners rated the quality of speech produced by different codecs, both with and without our solution. In all cases, applying our bandwidth extension methods noticeably improved the wideband (WB) speech decoded by legacy codecs. The guided variant was rated higher still, delivering the largest perceived quality gains.

Figure: ITU‑T P.808 DCR scores from 27 listeners for WB codecs with blind‑UBGAN and guided‑UBGAN

This work was presented at the 2025 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), a leading biennial event in the audio and speech research community.

You can read the full paper on arXiv:
🔗 https://arxiv.org/abs/2505.16404

A listening demo is available here:
🔊 https://fhgspco.github.io/ubgan/

Authors: Kishan Gupta, Srikanth Korse, Andreas Brendel, Nicola Pia, and Guillaume Fuchs from Fraunhofer IIS
