Neural audio codec

Definition

A codec that learns compression from data instead of hand-designed transforms — Lyra, EnCodec, SoundStream. Often using residual vector quantization to hit very low bitrates.

A neural audio codec learns how to compress and reconstruct sound from data, using deep neural networks in place of the hand-designed transforms and psychoacoustic models of traditional codecs. The common design is an autoencoder: a neural encoder maps audio to a compact latent representation, a quantizer (often residual vector quantization, RVQ) discretizes it to a target bitrate, and a neural decoder regenerates the waveform. Lyra, EnCodec, and SoundStream are the landmark examples, achieving usable quality at bitrates far below what classical codecs can reach, especially for speech. They also produce the discrete tokens that modern generative audio and speech models operate on. The cost is heavier compute, but neural codecs are reshaping the low-bitrate and generative frontier of audio.

Also known as

learned codec, AI codec