Efficient Neural Network Models for Audio and Signal Processing

Audio and signal processing research is moving toward efficient neural network models that can run on low-cost, low-compute devices. Researchers are reducing model complexity while preserving or improving performance through approaches such as single-quantizer codecs, residual scalar-vector quantization, and hardware-software co-optimization. These advances could enable real-time communication, improve audio quality, and make audio and signal processing systems more efficient overall. Notable papers in this area include:

One Quantizer is Enough: Toward a Lightweight Audio Codec, which presents a lightweight neural audio codec that achieves audio quality comparable to multi-quantizer baselines while reducing resource consumption by an order of magnitude.
A Streamable Neural Audio Codec with Residual Scalar-Vector Quantization for Real-Time Communication, which proposes a streamable neural audio codec that achieves decoded audio quality comparable to advanced non-streamable neural audio codecs with a fixed latency of only 20 ms.
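
To make the single-quantizer versus multi-quantizer contrast concrete, here is a minimal sketch of generic residual quantization, in which each stage quantizes the error left by the previous stage. It is not the method of either paper above: the function names (`nearest`, `rvq_encode`, `rvq_decode`) and the tiny hand-picked codebooks are assumptions made for illustration, whereas real codecs learn their codebooks from data.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(codebooks, vec):
    """Quantize vec stage by stage; each stage codes the previous residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Subtract the chosen entry so the next stage sees only the leftover error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(codebooks, indices):
    """Reconstruct by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def sq_error(a, b):
    """Squared L2 distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical toy codebooks (hand-picked for illustration, not learned).
coarse = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]]
fine = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]]

frame = [1.2, 0.9]  # stand-in for one latent frame from an encoder
single = rvq_decode([coarse], rvq_encode([coarse], frame))
two_stage = rvq_decode([coarse, fine], rvq_encode([coarse, fine], frame))

print("single quantizer:", single, "error:", sq_error(frame, single))
print("two stages:      ", two_stage, "error:", sq_error(frame, two_stage))
```

In this toy setup the second stage lowers the reconstruction error, but it also adds a second index to transmit and a second codebook to search. That trade-off is what a single-quantizer design tries to avoid.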
