Consent in telemedicine is not a single agreement but a stack of distinct ones, each with its own legal basis. There is consent to treatment (the patient agreeing to be cared for), consent to telehealth as a modality (many US states require this specifically, sometimes in writing, because a video visit carries different risks and limitations than an in-person one), consent to recording when a session is captured, and HIPAA authorization for uses or disclosures of the patient's data that go beyond treatment, payment, and health care operations and are not otherwise permitted by the Privacy Rule. Each of these has its own moment of capture, its own required wording, and its own retention obligation, and an auditor can ask you to produce the specific artifact for a specific encounter.
For a video product team this matters because consent is evidence, and evidence has to be retrievable on demand. When a regulator, a malpractice attorney, or a hospital compliance officer asks whether a given patient consented to telehealth and to recording on a given date, "the box was checked during onboarding" is not a defensible answer. You need to know which version of which consent text the patient saw, when, and how they agreed.
The engineering implication is to model consent as a first-class, versioned event that is recorded once and can be queried later, rather than as transient UI state buried in a signup flow. Store the consent type, the exact text version, the timestamp, the actor (the patient, or a representative agreeing on their behalf), and the context (for example the encounter and how agreement was given), and index the records by patient and encounter. The common pitfall is collapsing all the layers into one generic checkbox, which fails the moment any single layer is challenged independently.
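As an illustration of consent modeled as recorded, versioned, queryable events, here is a minimal Python sketch. Everything named in it is hypothetical and chosen for the example: `ConsentType`, `ConsentEvent`, `ConsentLog`, `hash_text`, and the `method` and `text_sha256` fields are assumptions, not part of any standard or library, and the in-memory list stands in for whatever durable, append-only store a real system would use. It is a sketch of the idea, not a compliance-ready implementation.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class ConsentType(Enum):
    # One member per consent layer, so each can be proven independently.
    TREATMENT = "treatment"
    TELEHEALTH = "telehealth"
    RECORDING = "recording"
    HIPAA_AUTHORIZATION = "hipaa_authorization"


def hash_text(text: str) -> str:
    """SHA-256 of the exact wording shown, so the version can be verified later."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)  # frozen: an event is never edited after capture
class ConsentEvent:
    patient_id: str
    encounter_id: Optional[str]  # None if captured outside a specific encounter
    consent_type: ConsentType
    text_version: str            # identifier of the consent text that was shown
    text_sha256: str             # hash of that text (assumed field)
    granted: bool                # False records a refusal or a withdrawal
    actor: str                   # who agreed: the patient or a representative
    method: str                  # how they agreed, e.g. "checkbox" (assumed field)
    captured_at: datetime        # timezone-aware timestamp


class ConsentLog:
    """Append-only log; an in-memory stand-in for a durable store."""

    def __init__(self) -> None:
        self._events: List[ConsentEvent] = []

    def record(self, event: ConsentEvent) -> None:
        self._events.append(event)

    def latest(
        self,
        patient_id: str,
        consent_type: ConsentType,
        as_of: datetime,
        encounter_id: Optional[str] = None,
    ) -> Optional[ConsentEvent]:
        """Most recent event of this type at or before `as_of`, or None."""
        matches = [
            e for e in self._events
            if e.patient_id == patient_id
            and e.consent_type is consent_type
            and e.captured_at <= as_of
            and (encounter_id is None or e.encounter_id == encounter_id)
        ]
        return max(matches, key=lambda e: e.captured_at, default=None)


# Usage: record a recording consent, then answer "did this patient
# consent to recording as of this date, and to which text?"
log = ConsentLog()
shown_text = "I agree to this visit being recorded."
log.record(ConsentEvent(
    patient_id="patient-123",
    encounter_id="enc-456",
    consent_type=ConsentType.RECORDING,
    text_version="recording-v2",
    text_sha256=hash_text(shown_text),
    granted=True,
    actor="patient",
    method="checkbox",
    captured_at=datetime(2024, 3, 1, 15, 0, tzinfo=timezone.utc),
))
found = log.latest(
    "patient-123",
    ConsentType.RECORDING,
    as_of=datetime(2024, 3, 2, tzinfo=timezone.utc),
)
```

Keeping one event per layer, and never updating events in place, means a withdrawal is simply a later event with `granted=False`, and the question "what was the state on a given date" reduces to a lookup by patient, type, and time.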

