An audio frame is the fixed block of samples that a codec encodes as a single unit: for example, 1024 samples per channel for AAC-LC, or typically 20 ms of audio for Opus (which supports frame durations from 2.5 ms to 60 ms). It is the atom of compressed audio. The encoder analyses and packs one frame at a time, so the frame sets the granularity of latency (the encoder cannot emit anything until it has buffered a full frame), of seeking (a stream can only be cut cleanly on frame boundaries), and of packet loss (a lost packet takes one or more whole frames with it). Frame size is a deliberate tradeoff: shorter frames give lower latency but worse compression efficiency and proportionally more per-frame and per-packet overhead, which is why real-time codecs and file-oriented codecs tend to choose differently.
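
The arithmetic behind these tradeoffs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the 32 kbit/s bitrate is an arbitrary example, and the 40-byte header figure assumes plain IPv4 + UDP + RTP (20 + 8 + 12 bytes) with no extensions or header compression.

```python
def frame_duration_ms(samples_per_frame: int, sample_rate_hz: int) -> float:
    """Duration of one frame, i.e. the minimum buffering delay it imposes."""
    return 1000.0 * samples_per_frame / sample_rate_hz


def samples_per_frame(duration_ms: float, sample_rate_hz: int) -> int:
    """Samples per channel in a frame defined by its duration."""
    return round(duration_ms * sample_rate_hz / 1000)


def snap_to_frame(sample_index: int, frame_size: int) -> int:
    """Start of the frame containing sample_index (nearest earlier cut point)."""
    return (sample_index // frame_size) * frame_size


def header_overhead(frame_ms: float, bitrate_bps: int, header_bytes: int) -> float:
    """Fraction of each packet spent on headers, assuming one frame per packet."""
    payload_bytes = bitrate_bps * frame_ms / 1000 / 8
    return header_bytes / (header_bytes + payload_bytes)


# A 1024-sample frame lasts longer at lower sample rates.
print(frame_duration_ms(1024, 48000))
print(frame_duration_ms(1024, 44100))

# A 20 ms frame at 48 kHz holds 960 samples per channel.
print(samples_per_frame(20, 48000))  # 960

# Seeking to sample 2500 with 1024-sample frames lands on the frame at 2048.
print(snap_to_frame(2500, 1024))  # 2048

# Assumed 40-byte IPv4/UDP/RTP header at an example 32 kbit/s:
# shortening the frame from 20 ms to 5 ms doubles the overhead fraction.
print(header_overhead(20, 32000, 40))
print(header_overhead(5, 32000, 40))
```

With these assumed numbers, a 20 ms frame carries 80 bytes of payload against 40 bytes of headers, while a 5 ms frame carries only 20 bytes of payload against the same 40 bytes, which is the overhead cost of short frames in concrete terms.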