An MCU (Multipoint Control Unit) is a media server that decodes every incoming audio and video stream, mixes them into a single composite picture and audio track, and sends that one combined stream back to each participant. From the client's point of view a group call looks like a single incoming video, which makes client load predictable and keeps interoperability with older or constrained endpoints simple. The cost of that simplicity is borne by the server: mixing is CPU-heavy because every stream is fully decoded and the composite is re-encoded, and that extra decode, mix, and re-encode step adds processing delay to the conversation.
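The trade-off described above can be made concrete with a rough count of the work each design puts on the server and on each client. This is a minimal sketch under simplifying assumptions that are not from the text: every participant publishes exactly one stream, the MCU encodes a single shared composite for everyone, and the SFU forwards streams without simulcast or transcoding. The names `Load`, `mcu_load`, and `sfu_load` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    """Rough per-call work counts (illustrative, not a benchmark)."""
    server_decodes: int      # streams the server must fully decode
    server_encodes: int      # streams the server must encode
    streams_per_client: int  # streams each client must receive and decode


def mcu_load(participants: int) -> Load:
    # Assumption: the MCU decodes every incoming stream and encodes
    # one shared composite that every participant receives.
    return Load(
        server_decodes=participants,
        server_encodes=1,
        streams_per_client=1,
    )


def sfu_load(participants: int) -> Load:
    # Assumption: the SFU only forwards packets, so it does no
    # decoding or encoding; each client receives everyone else's stream.
    return Load(
        server_decodes=0,
        server_encodes=0,
        streams_per_client=max(participants - 1, 0),
    )
```

For a five-person consult, `mcu_load(5)` reports five server decodes, one server encode, and one stream per client, while `sfu_load(5)` reports no server transcoding and four streams per client. Real deployments that encode a separate layout per participant would push the MCU's encode count higher than this sketch shows.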
For telemedicine, the MCU's strengths line up with two specific situations. The first is producing a single composite recording of a multi-party consult, where you genuinely want one merged file rather than separate per-participant streams. The second is supporting low-power or legacy endpoints (older hardware, kiosks, or telephony bridges) that cannot decode several independent streams at once.
In practice, most modern telemedicine architectures choose an SFU (which forwards streams without mixing) for live calls, and reserve MCU-style mixing for the narrower jobs of recording composition and legacy interop. Like an SFU that does not use end-to-end media encryption, an MCU terminates transport encryption and handles PHI-bearing media in the clear; unlike an SFU, it also decodes and re-encodes that media. Either way, the server must sit inside the HIPAA compliance boundary and be covered by a Business Associate Agreement (BAA) when a vendor runs it. The pitfall is defaulting to an MCU for everything and then paying for the latency and server cost on calls that never needed mixing.
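One way to avoid that pitfall is to make mixing an explicit, per-call decision rather than the default. The sketch below illustrates the idea only: the `CallRequirements` fields and the `mixing_jobs` function are hypothetical names, and it assumes live media always goes through the SFU, with a mixer attached only for the two jobs named above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CallRequirements:
    """Hypothetical per-call flags; not taken from any real product."""
    needs_composite_recording: bool = False
    has_legacy_endpoints: bool = False


def mixing_jobs(req: CallRequirements) -> List[str]:
    """Return the MCU-style mixing jobs this call actually needs.

    An empty list means the call can run on the SFU alone, with no
    mixing cost or added processing delay.
    """
    jobs: List[str] = []
    if req.needs_composite_recording:
        # One merged file is wanted instead of per-participant streams.
        jobs.append("composite_recording")
    if req.has_legacy_endpoints:
        # Some endpoint cannot decode several independent streams.
        jobs.append("legacy_interop")
    return jobs
```

A plain two-party video visit, `mixing_jobs(CallRequirements())`, yields an empty list, so no mixer is provisioned. A recorded consult that also includes a telephony bridge yields both jobs.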

