Published: 2026-06-05 · Reading time: 19 min · Author: Nikolay Sapunov, CEO at Fora Soft
Why this matters
If your product touches video (streaming, conferencing, OTT, telemedicine, e-learning), AAC is almost certainly already in your pipeline, because it is the default audio codec for MP4, YouTube, Netflix, Apple devices, and most of the web. But "we use AAC" is an incomplete sentence. The four family members behave so differently that confusing them is the audio equivalent of confusing a sedan with a freight truck. This article is written for a product manager, founder, or operations lead with no audio background; every claim traces back to the ISO/IEC standard, the codec maker's own statement, or the patent pool's published terms, not a secondhand summary. Read it and you will know exactly which AAC to ask your engineers for, and why.
The family, in one sentence each
The word "AAC" stands for Advanced Audio Coding. It first appeared in 1997 as the successor to MP3, and it has grown into a family because engineers kept adding optional layers on top of the original codec rather than replacing it. Each layer is a tool the encoder can switch on; each new family member is the original plus one or two more tools. That layered design is the single most important thing to understand, so hold onto it: HE-AAC v1 and v2 contain AAC-LC at their core and add machinery to squeeze more quality out of fewer bits, and xHE-AAC carries the same idea into a redesigned core.
Here is the whole family in one line each, from oldest and simplest to newest and cleverest:
- AAC-LC (Low Complexity) — the plain, full-quality codec. The default for video everywhere.
- HE-AAC v1 (High Efficiency) — AAC-LC plus a treble-rebuilding trick called SBR. Good down to ~48 kbps stereo.
- HE-AAC v2 — HE-AAC v1 plus a stereo-rebuilding trick called Parametric Stereo. Good down to ~24 kbps stereo.
- xHE-AAC (Extended HE-AAC) — a redesigned core that also codes speech, plus mandatory loudness metadata. Good from ~12 kbps mono up to transparent stereo.
The rest of this article unpacks each one, shows the arithmetic that makes the tricks work, and gives you a decision rule. We will lean on the master comparison in "How audio compression works" for the underlying ideas (psychoacoustic masking, the frequency transform, and entropy coding), so if any of those terms feel shaky, that article is the prerequisite.
Figure 1. One root, three branches. Each AAC member is AAC-LC plus optional tools. The bitrate sweet spot drops as you add tools, because each tool rebuilds something the encoder no longer has to send in full.
AAC-LC: the workhorse you already ship
AAC-LC is the member you are almost certainly using right now without thinking about it. When a phone records a video, when YouTube serves a clip, when an iPhone plays a song from the Music app, the audio is very often AAC-LC. The "LC" stands for Low Complexity, which is a slightly misleading name — it does not mean low quality. It means the codec deliberately leaves out a few computationally expensive tools from the original 1997 AAC design so that cheap hardware can decode it in real time. That trade paid off: AAC-LC decoders are in every phone, browser, smart TV, and set-top box made in the last two decades.
The reason AAC-LC replaced MP3 as the default is simple arithmetic dressed up as engineering. At the same bitrate, AAC-LC sounds cleaner than MP3, because it uses a finer frequency analysis and smarter ways to allocate bits to the parts of the sound your ear actually notices. The rule of thumb the industry settled on: AAC-LC reaches "transparent" quality — meaning most listeners cannot tell it from the original — at around 128 kbps for stereo music. MP3 needs roughly 192 kbps to get to the same place. So for the same audio quality, AAC-LC saves about a third of the bytes:
MP3 transparent ≈ 192 kbps
AAC-LC transparent ≈ 128 kbps
192 ÷ 128 = 1.5× → AAC-LC needs about 33% fewer bits
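The same arithmetic as a minimal Python sketch. The function name is illustrative, and the 192 and 128 kbps inputs are the rule-of-thumb figures quoted above, not measurements.

```python
def bitrate_saving(reference_kbps: float, candidate_kbps: float) -> float:
    """Fraction of bits saved by the candidate codec relative to the reference."""
    if reference_kbps <= 0:
        raise ValueError("reference_kbps must be positive")
    return (reference_kbps - candidate_kbps) / reference_kbps


mp3_kbps = 192      # rule-of-thumb transparent bitrate for MP3 stereo
aac_lc_kbps = 128   # rule-of-thumb transparent bitrate for AAC-LC stereo

ratio = mp3_kbps / aac_lc_kbps
saving = bitrate_saving(mp3_kbps, aac_lc_kbps)
print(f"MP3 needs {ratio:.1f}x the bits; AAC-LC saves {saving:.0%}")
# prints: MP3 needs 1.5x the bits; AAC-LC saves 33%
```

Note the direction of the comparison: MP3 needs 50% more bits than AAC-LC, while AAC-LC needs about 33% fewer bits than MP3. Both describe the same gap.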
That is why a new product in 2026 that wants maximum compatibility — playable on literally everything — picks AAC-LC for its standard stereo audio. It is the safe default. The standard that defines it is ISO/IEC 14496-3 (MPEG-4 Audio, current edition 2019), which itself traces back to the original AAC in the MPEG-2 standard ISO/IEC 13818-7 from 1997.
Where AAC-LC stops being the right answer is the low-bitrate basement. Below about 96 kbps for stereo, AAC-LC starts to sound thin and grainy, because there simply are not enough bits left to describe the full sound. That is the gap the next two family members were built to fill.
HE-AAC v1: rebuilding the treble you never sent
Imagine you have to describe a photograph to someone over a phone line that is too slow to send the whole picture. You could send the bottom two-thirds in detail and then say, "the top third is more sky, same blue, getting lighter toward the top." The listener paints the missing third from your hint. They do not get the exact original sky, but they get something convincing, and you sent almost no data for it.
That is exactly what HE-AAC v1 does with sound. The trick is called Spectral Band Replication, or SBR. High frequencies — the treble, the sparkle in cymbals and consonants — are expensive to encode in full but are often closely related to the lower frequencies underneath them. So HE-AAC v1 encodes the lower frequencies properly with ordinary AAC-LC, then sends a tiny set of instructions — a few kilobits — telling the decoder how to rebuild the missing treble from the bass and mids it already has. The decoder replicates the upper frequency band, hence the name.
The payoff is large. A treble band that might cost 30 kbps to encode in full can be hinted at for 2–3 kbps. That freed-up budget goes into the part of the sound the ear cares about most. The result: HE-AAC v1 delivers acceptable stereo music at bitrates where plain AAC-LC falls apart — roughly the 48–64 kbps range. The 3GPP standards body's own controlled tests rated HE-AAC and HE-AAC v2 as "Good" for music at bitrates as low as 24 kbps.
HE-AAC v1 began life outside MPEG. A company called Coding Technologies built SBR on top of AAC-LC for satellite radio, marketed it as "aacPlus" and "AAC+", and then handed the SBR method to MPEG, which standardized it in 2003 as ISO/IEC 14496-3:2001/Amd 1:2003. If you ever see a stream labelled "aacPlus", "AAC+", or "eAAC", that is HE-AAC v1 under a vendor trade name.
Pitfall — an old decoder hears HE-AAC at half the brightness. Because HE-AAC v1 is AAC-LC plus an SBR hint, a decoder that does not understand SBR will still play the file — it just ignores the treble-rebuild instructions and outputs only the low-frequency AAC-LC core. The audio plays, but it sounds muffled, as if a blanket were thrown over the speakers, and some old players even report the track as twice its real length. If a customer says "your low-bitrate stream sounds dull on my old device", a missing SBR decoder is the prime suspect. Always confirm the target devices support the exact AAC member you ship.
HE-AAC v2: rebuilding the stereo, too
HE-AAC v2 takes the same idea one step further and applies it to the stereo image. Two-channel stereo normally costs nearly twice as much as mono, because you send a left channel and a right channel. But the two channels are usually very similar — most of the sound is in the middle, and the differences between left and right are small and can be described with a handful of numbers rather than a full second channel.
That is Parametric Stereo, or PS. The encoder collapses the two channels down to one mono channel, encodes that mono channel properly, and then sends a tiny side-stream of parameters — numbers that describe how to spread the mono signal back out into a convincing left and right at the decoder. The whole stereo image is rebuilt from one real channel plus a few kilobits of spatial hints.
Stack that on top of SBR and you get HE-AAC v2: AAC-LC for the core, SBR for the treble, Parametric Stereo for the width. Three layers, each rebuilding something the encoder no longer has to send in full. The result is stereo music that stays listenable at around 24–32 kbps — a bitrate at which plain AAC-LC would be unusable noise. Here is the layering as plain arithmetic:
Full stereo, properly coded: ~128 kbps for transparency
+ SBR (rebuild treble): treble hinted for ~2–3 kbps instead of ~30 kbps
+ Parametric Stereo (rebuild width): 2nd channel becomes ~3 kbps of parameters
Net result (HE-AAC v2): listenable stereo at ~24 kbps
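To make the budget concrete, here is a rough sketch of how a 24 kbps HE-AAC v2 stream might split its bits. It assumes the side-stream sizes quoted above apply unchanged at 24 kbps, which is a simplification; real encoders adapt these figures to the content, and all names below are illustrative.

```python
# Round figures taken from the text above; real encoders vary by content.
TOTAL_KBPS = 24.0         # target HE-AAC v2 stereo bitrate
SBR_HINT_KBPS = 2.5       # midpoint of the ~2-3 kbps treble hint
PS_HINT_KBPS = 3.0        # Parametric Stereo side information
FULL_TREBLE_KBPS = 30.0   # cost of coding the treble band in full


def core_budget(total_kbps: float, sbr_kbps: float, ps_kbps: float) -> float:
    """Bits left for the mono AAC-LC core after both side streams are paid for."""
    remaining = total_kbps - sbr_kbps - ps_kbps
    if remaining <= 0:
        raise ValueError("side information exceeds the total budget")
    return remaining


core = core_budget(TOTAL_KBPS, SBR_HINT_KBPS, PS_HINT_KBPS)
treble_saving = FULL_TREBLE_KBPS - SBR_HINT_KBPS
print(f"core: {core} kbps, treble saving: {treble_saving} kbps")
# prints: core: 18.5 kbps, treble saving: 27.5 kbps
```

Under these assumptions, most of the 24 kbps still goes to one properly coded mono channel covering the low and mid frequencies, which is why the result stays listenable.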
HE-AAC v2 was standardized in 2006 as ISO/IEC 14496-3:2005/Amd 2:2006, with the Parametric Stereo tool itself published earlier in 2004. Vendors marketed it as "aacPlus v2", "eAAC+", and "Enhanced AAC+". It became a required audio format in the 3GPP specifications for 3G mobile multimedia, which is why it spread across early mobile streaming and digital radio standards like DAB+ and Digital Radio Mondiale.
There is a quality ceiling, though, and it matters. SBR and Parametric Stereo are reconstructions, not the real thing. They are brilliant when bits are scarce, but as you raise the bitrate they stop helping and can even hold you back, because the reconstructed treble and stereo never quite match a fully coded original. So HE-AAC v2 is a low-bitrate specialist. Above roughly 64 kbps, plain AAC-LC sounds as good or better and is simpler. The encoder, ideally, switches member based on the target bitrate — and that idea of one stream flexing across bitrates is exactly what the newest member was built to deliver.
xHE-AAC: one codec for speech and music, at any bitrate
For forty years, audio engineering kept speech and music in separate boxes. Speech codecs — the ones in your phone calls — model the human voice box and sound superb on voice at tiny bitrates but mangle music. Music codecs like AAC do the opposite: great on music, wasteful on speech. A single product that carries both — say a video with dialogue, then a music sting, then dialogue again — had to either compromise or switch codecs mid-stream.
xHE-AAC ends that split. The "x" stands for Extended, and the codec is built on a technology called USAC — Unified Speech and Audio Coding — standardized as ISO/IEC 23003-3 in 2012. USAC is not AAC-LC with more tools bolted on; it is a redesigned core that can decide, frame by frame, whether a moment of sound is more speech-like or more music-like, and code it with the right tool. The result is a codec that holds up across an enormous bitrate range: roughly 12 kbps for mono speech, climbing smoothly to transparent stereo music at higher rates, all in one continuous family of settings.
Two features make xHE-AAC matter for video products specifically.
The first is bitrate flexibility for adaptive streaming. Modern streaming serves different quality levels to different network conditions, the technique covered in "Audio adaptive bitrate ladders". xHE-AAC is designed to span that whole ladder with one codec, switching bitrates seamlessly as the network changes, so a phone on a weak connection drops to 16 kbps without an audible break and climbs back to full quality when the signal returns. Netflix reported that when it moved Android mobile streaming to xHE-AAC's adaptive audio, viewers switched from speakers to headphones 16% less often on high-dynamic-range content, a measurable comfort win.
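The rung-selection idea can be sketched in a few lines. This is a simplified illustration, not how any particular player works: the ladder values, the `headroom` safety factor, and the function name are all hypothetical, and `available_kbps` is assumed to be the throughput left over for audio.

```python
from typing import Sequence

# Hypothetical audio ladder in kbps; real ladders are a product decision.
LADDER_KBPS = (16, 32, 64, 128)


def pick_rung(available_kbps: float,
              ladder: Sequence[int] = LADDER_KBPS,
              headroom: float = 0.8) -> int:
    """Return the highest rung that fits within a share of measured throughput.

    Falls back to the lowest rung when nothing fits, so playback continues.
    """
    budget = available_kbps * headroom
    fitting = [rung for rung in sorted(ladder) if rung <= budget]
    return fitting[-1] if fitting else min(ladder)


print(pick_rung(25))    # weak connection -> 16
print(pick_rung(500))   # strong connection -> 128
```

With a single codec across the ladder, moving between rungs changes only the bitrate, not the decoder.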
The second is mandatory loudness and dynamic-range metadata, built on a standard called MPEG-D DRC. Every xHE-AAC stream must carry an objective loudness measurement, computed with the ITU-R BS.1770 algorithm explained in "Loudness, peak, RMS, LUFS", plus instructions the decoder can use to compress the dynamic range on the fly. The practical effect: the listener on a noisy bus can ask their device to flatten the quiet-to-loud range so dialogue stays audible, and the device does it at playback using metadata the encoder already embedded. Loudness control is not an afterthought in xHE-AAC; it is required by the standard.
xHE-AAC is supported in Android since version 9 (Pie, 2018), in iOS since version 13 (2019), and in Windows 11 and Xbox since October 2022. As of Amazon's September 2025 hardware refresh, new Echo and Fire TV devices decode it natively. Fraunhofer reports that services including Netflix, Audible, Facebook, and Instagram now deliver billions of hours of xHE-AAC content each month. The compatibility gap that once made xHE-AAC a risky choice has largely closed for mobile and modern smart-TV audiences.
Picking the right member: a decision you can make in one minute
The choice between AAC members comes down to two questions: how many bits can you spend, and which devices must play it back. The table below is the short version.
| AAC member | Core tools | Bitrate sweet spot (stereo) | Best for | Watch out for |
|---|---|---|---|---|
| AAC-LC | AAC-LC only | 96–256 kbps | Default video audio, max compatibility | Sounds thin below ~96 kbps |
| HE-AAC v1 | + SBR | 48–80 kbps | Low-bandwidth streaming, radio | Old decoders lose treble |
| HE-AAC v2 | + SBR + PS | 24–48 kbps | Very low bitrate, mobile, DAB+ | No quality gain above ~64 kbps |
| xHE-AAC | USAC core + DRC | 12 kbps–transparent | Adaptive streaming, speech + music, loudness control | Needs a modern (2018+) decoder |
Table 1. The four AAC members at a glance. "Sweet spot" is the bitrate range where each member sounds best for its cost; outside that range, a different member usually wins.
A worked example shows how the layers pay off. Suppose you stream a 90-minute talk-and-music show to phones, and you want a fallback rendition for users on a poor cellular connection at 24 kbps stereo. AAC-LC at 24 kbps would be unusable. HE-AAC v2 at 24 kbps is listenable. xHE-AAC at 24 kbps is comfortably good and carries loudness metadata for the noisy-environment listener. If every target phone was sold after 2019, xHE-AAC is the answer. If you must support a fleet of older or cheaper devices, HE-AAC v2 is the safe low-bitrate floor, with AAC-LC for the higher rungs.
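The decision rule can be written down as a small function. This is one rough reading of Table 1, using the article's round thresholds; the function name and the `modern_decoders_only` flag are illustrative, and a real choice should be confirmed against the actual device fleet.

```python
def pick_aac_member(stereo_kbps: float, modern_decoders_only: bool) -> str:
    """Suggest an AAC member for a stereo rendition, following Table 1 loosely."""
    if stereo_kbps >= 96:
        # Plain AAC-LC is the compatible default once bits are plentiful.
        return "AAC-LC"
    if modern_decoders_only:
        # Every target device can decode xHE-AAC, so use it below 96 kbps.
        return "xHE-AAC"
    if stereo_kbps >= 48:
        return "HE-AAC v1"
    return "HE-AAC v2"


# The 24 kbps fallback rendition from the worked example:
print(pick_aac_member(24, modern_decoders_only=True))    # xHE-AAC
print(pick_aac_member(24, modern_decoders_only=False))   # HE-AAC v2
print(pick_aac_member(128, modern_decoders_only=False))  # AAC-LC
```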
Pitfall — "we use AAC" tells your engineers almost nothing. The four members differ by an order of magnitude in their low-bitrate behaviour. A ticket that says "encode the audio in AAC at 32 kbps" without naming the member will get you AAC-LC by default on many tools — and AAC-LC at 32 kbps stereo sounds broken. Always specify the member (AAC-LC, HE-AAC v2, xHE-AAC) and the profile, not just "AAC". The same care applies when you receive files: read the actual profile in the container, because "it's an .m4a" does not tell you which AAC is inside.
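One way to read the actual profile of a received file is to ask ffprobe, the inspection tool that ships with FFmpeg. The sketch below assumes ffprobe is installed and on the PATH; the helper names are illustrative, and the exact profile labels ffprobe reports depend on the FFmpeg version, so the code returns the label as-is instead of interpreting it.

```python
import json
import subprocess


def aac_profile_from_ffprobe_json(ffprobe_output: str) -> str:
    """Extract codec and profile of the first audio stream from ffprobe's JSON."""
    streams = json.loads(ffprobe_output).get("streams", [])
    if not streams:
        raise ValueError("no audio stream found")
    stream = streams[0]
    codec = stream.get("codec_name", "unknown")
    profile = stream.get("profile", "unknown")
    return f"{codec} / {profile}"


def probe_file(path: str) -> str:
    """Run ffprobe on a file and report the first audio stream's codec and profile."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "a:0",
         "-show_entries", "stream=codec_name,profile", "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return aac_profile_from_ffprobe_json(result.stdout)


# Hand-written sample of the JSON shape, so the parser can be tried offline.
sample = '{"streams": [{"codec_name": "aac", "profile": "LC"}]}'
print(aac_profile_from_ffprobe_json(sample))  # aac / LC
```

Calling `probe_file("episode.m4a")` (a hypothetical file name) would run the same check against a real file.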
How AAC sits inside your files and streams
AAC is a codec, not a file. The encoded AAC bitstream rides inside a container: usually MP4 or its fragmented variant fMP4 for streaming, sometimes a raw ADTS stream with the .aac extension, sometimes Matroska. The container records which AAC profile the bitstream uses, so a correct player reads that field and loads the matching decoder. This is the same codec-versus-container split covered in "Audio in containers": the container is the box, the AAC profile is what is inside it.
For streaming over HLS and DASH, the profile is also declared in the manifest and in the codecs parameter, with strings like mp4a.40.2 for AAC-LC, mp4a.40.5 for HE-AAC v1, mp4a.40.29 for HE-AAC v2, and mp4a.40.42 for xHE-AAC. Players use those strings to decide, before downloading a single audio segment, whether they can play the track. Get the codec string wrong and a capable device may refuse a track it could actually decode, or try to load a track it cannot. How a player picks among audio tracks is covered in depth in "Audio in HLS, DASH, CMAF".
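A minimal sketch of how the four codec strings above map back to family members. The function names are illustrative, and the lookup covers only the four values named in this article; any other string is reported as unknown, not guessed.

```python
from typing import List, Optional

# The number after "mp4a.40." identifies the AAC member (values from the text above).
AAC_MEMBERS = {2: "AAC-LC", 5: "HE-AAC v1", 29: "HE-AAC v2", 42: "xHE-AAC"}


def aac_member_from_codec_string(codec: str) -> Optional[str]:
    """Return the AAC member for a string like 'mp4a.40.2', or None if unknown."""
    parts = codec.strip().lower().split(".")
    if len(parts) != 3 or parts[0] != "mp4a" or parts[1] != "40":
        return None
    try:
        return AAC_MEMBERS.get(int(parts[2]))
    except ValueError:
        return None


def aac_members_in_codecs_attribute(codecs: str) -> List[str]:
    """Find AAC members in a comma-separated codecs value from a manifest."""
    found = (aac_member_from_codec_string(item) for item in codecs.split(","))
    return [member for member in found if member is not None]


print(aac_member_from_codec_string("mp4a.40.42"))                 # xHE-AAC
print(aac_members_in_codecs_attribute("avc1.64001f,mp4a.40.5"))   # ['HE-AAC v1']
```

A check like this can run in a packaging pipeline to confirm that the declared string matches the member the encoder was asked to produce.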
Licensing in 2026: who pays, and for what
Audio codec licensing confuses people because two separate questions get tangled: do I pay to use the codec software, and do I pay to distribute content encoded with it. For the AAC family the answers, as of 2026, are clearer than the rumours suggest.
The AAC family (AAC-LC, HE-AAC v1, HE-AAC v2, and xHE-AAC, together with the MPEG-D DRC loudness tools) is covered by a single patent pool. That pool was run for years by Via Licensing and, after Via Licensing merged with MPEG LA in 2023 to form Via Licensing Alliance, is now administered under the Via LA brand. A product company that ships an AAC encoder or decoder, in hardware or in software, needs a license and pays a per-unit royalty. Crucially, xHE-AAC and its DRC tools are included in that same AAC program at no additional cost over baseline AAC, so adopting the newest member does not add a new licensing line item.
The part that trips people up: unlike the old MP3 program before its 2017 expiry, content owners are not required to pay license fees merely to distribute AAC-encoded content. If you stream AAC audio to your users, you are distributing content, not shipping a codec, and the pool does not charge you for that. The royalty attaches to the encoder and decoder products, which is why the cost usually lives with the operating system, browser, or device maker rather than with you. For most product teams using AAC through the platform's built-in encoder and decoder, the licensing is already handled upstream. If you embed your own commercial AAC encoder, confirm its license terms — but you will not owe a content-distribution fee for the format itself.
This contrasts sharply with Opus, the royalty-free open codec that dominates WebRTC, where there is no pool to license at all. The AAC-versus-Opus choice often comes down to exactly this: AAC's universal device support and broadcast pedigree against Opus's zero licensing and real-time strengths.
Where Fora Soft fits in
We have built audio into video products across streaming, OTT and Internet TV, video conferencing, telemedicine, e-learning, and AR/VR since 2005. AAC is the audio codec we reach for most in streaming and on-demand work, because it plays on every device a client's audience owns. We pick AAC-LC for the standard rungs of a streaming ladder, layer in HE-AAC or xHE-AAC for the low-bitrate fallback that keeps mobile viewers connected, and lean on xHE-AAC's loudness metadata when a client's content swings between quiet dialogue and loud action. The recurring lesson from production is the one this article opened with: the failure is almost never "AAC is the wrong codec" — it is shipping the wrong AAC member for the bitrate, or never naming the member at all.
What to read next
- How audio compression works: the four ideas behind every modern codec
- Opus: the open codec that ate WebRTC
- Loudness, peak, RMS, LUFS — measuring how loud audio actually is
Call to action
- Talk to an audio engineer: book a 30-minute scoping call to talk through your AAC plan.
- See our case studies — 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
- Download the AAC Family cheat sheet: a one-page reference covering the four AAC members, their bitrate sweet spots, HLS/DASH codec strings, fast facts on SBR / Parametric Stereo / xHE-AAC, 2026 licensing, and the one rule that prevents the most common AAC mistake.
References
- ISO/IEC 14496-3:2019, Information technology — Coding of audio-visual objects — Part 3: Audio — the current edition of MPEG-4 Audio, the controlling standard for AAC-LC, HE-AAC v1, and HE-AAC v2 profiles. https://www.iso.org/standard/76383.html
- ISO/IEC 23003-3:2012, Information technology — MPEG audio technologies — Part 3: Unified speech and audio coding (USAC) — the controlling standard for xHE-AAC. https://www.iso.org/standard/57464.html
- ISO/IEC 14496-3:2001/Amd 1:2003 — the amendment that first standardized the HE-AAC v1 (SBR) profile. https://www.iso.org/standard/38148.html
- ISO/IEC 14496-3:2005/Amd 2:2006 — the amendment that first standardized the HE-AAC v2 (SBR + Parametric Stereo) profile. https://www.iso.org/standard/43026.html
- Fraunhofer IIS, xHE-AAC product and technology pages — the codec's maker on USAC, MPEG-D DRC loudness, and platform support. https://www.iis.fraunhofer.de/en/ff/amm/broadcast-streaming/xheaac.html
- Fraunhofer Audio Blog, Netflix Now Streaming xHE-AAC Audio on Android Mobile (2021) — primary report of Netflix's adaptive-audio adoption and the 16% headphone-switching figure. https://www.audioblog.iis.fraunhofer.com/netflix-xheaac-android
- Fraunhofer Audio Blog, xHE-AAC Audio Codec supported in Amazon's new Line of Products (2025) — Amazon Echo / Fire TV native xHE-AAC support. https://www.audioblog.iis.fraunhofer.com/xhe-aac-amazon-vega
- High-Efficiency Advanced Audio Coding — overview of HE-AAC v1/v2 standardization history, trade names, and the Via Licensing patent pool; used for orientation, with every standards claim confirmed against the ISO/IEC amendment listings in references 3–4. https://en.wikipedia.org/wiki/High-Efficiency_Advanced_Audio_Coding
- EBU Technical Review, Moser, MPEG-4 HE-AAC v2 — audio coding for today's media world (2006) — the EBU's explanation of SBR and Parametric Stereo. https://tech.ebu.ch/docs/techreview/trev_305-moser.pdf
- Via LA Licensing, AAC patent pool program (administrator of the AAC family license following the 2023 Via Licensing / MPEG LA merger). https://www.via-la.com/
Where the popular "AAC is licensed for content distribution" framing (common on SEO blogs, reference 8's secondary citations) conflicts with the patent pool's own terms, this article follows the pool's published position (reference 10): content distribution carries no fee; the royalty attaches to encoders and decoders.


