An MCU (Multipoint Control Unit) is the older conferencing architecture in which a central server decodes every participant's audio and video, mixes them into a single combined stream, and re-encodes that stream for each recipient. Its advantage is on the client side: each device sends one stream and receives one stream, so client bandwidth and CPU use stay minimal, which is valuable for weak devices, telephony bridges, and recording. The cost falls on the server, which must run a full decode-mix-encode pipeline for every call, and on latency and quality, since transcoding adds delay and a generation of compression loss. For audio specifically, server-side mixing is still common even when video uses an SFU, because it is how a recording or a phone dial-in gets one clean blended track.
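The mixing step of the pipeline can be sketched in a few lines. This is a minimal illustration, not a real MCU: it assumes audio has already been decoded to equal-length frames of 16-bit PCM samples, and it leaves out decoding, re-encoding, resampling, and gain control. The function names `mix_frames` and `mix_for_each_recipient` are hypothetical. The second function also assumes a "mix-minus" policy, in which each recipient's mix excludes their own voice; that is a common choice for conference audio but is not stated in the text above.

```python
from typing import Dict, List

INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames: List[List[int]]) -> List[int]:
    """Sum equal-length 16-bit PCM frames sample by sample, clamping to int16."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same length")
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        # Clamp rather than wrap around, so loud overlaps clip instead of distorting badly.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


def mix_for_each_recipient(decoded: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Build one mixed frame per recipient, leaving out that recipient's own audio.

    Assumption: mix-minus policy (hypothetical here, not taken from the text).
    """
    return {
        recipient: mix_frames(
            [frame for sender, frame in decoded.items() if sender != recipient]
        )
        for recipient in decoded
    }


# One decoded frame (three samples) per participant; values are made up.
decoded = {
    "alice": [1000, -2000, 30000],
    "bob": [500, 500, 10000],
    "carol": [-100, 0, 5000],
}

full_mix = mix_frames(list(decoded.values()))    # single blended track, e.g. for a recording
per_recipient = mix_for_each_recipient(decoded)  # one outgoing frame per participant
```

Here `full_mix` corresponds to the single blended track a recording or dial-in would receive, while `per_recipient` shows why the server's work grows with the number of participants: under this policy every recipient gets a different mix, and each one would then need its own encode pass.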