This is engineering guidance, not legal advice. Confirm specifics with qualified counsel.
Why this matters
If you are deciding whether to build a telemedicine platform or buy one, recording is the feature that looks trivial in a sales demo and turns into your largest liability in production. A "record this visit" button is a few lines of code; a recording that survives a breach, a subpoena, a wrong-patient mix-up, or a deletion request you cannot honor is a board-level event. This article is written for the founder, product manager, or clinical IT lead who has to decide whether their product records clinical video, and then has to defend that decision to a compliance officer, an auditor, and a patient. The questions you need to be able to answer — Why do we record? Who can play it back? How long do we keep it? What happens when a patient asks for their copy, or asks us to delete it? — are product decisions long before they are engineering ones. Getting them wrong is not a bug; in healthcare it is a reportable incident.
The first decision: record, or do not record
Most teams treat recording as a default-on convenience and reason backward from there. Reverse that instinct. In clinical video, not recording is the safer default, and every recording you keep should earn its place by serving a specific, defensible purpose.
The reason is asymmetry. A live consult is Protected Health Information — PHI, meaning any health data tied to an identifiable person — but it is PHI in motion: it exists for twenty minutes and is gone. A recording is the same PHI frozen into a file that can be copied, emailed, mis-filed, subpoenaed, leaked, and retained for years. You have converted a transient risk into a permanent one. That trade can absolutely be worth it — but only for a reason, not by default.
Good reasons to record exist. A recorded session can be part of the medical record that supports a diagnosis or a later review. It can be required for supervision in training settings, where a senior clinician reviews a trainee's encounters. It can feed quality assurance, dispute resolution, or, increasingly, an AI scribe that drafts the clinical note from the conversation. Each is legitimate. None of them is "we record everything because the platform can."
Figure 1. The record-or-not decision. Start from "do not record," and only branch into recording when a specific purpose justifies it — then the purpose drives every downstream choice about capture, retention, and access.
The purpose you choose is not paperwork; it drives the entire build. A recording kept for the medical record follows your state's medical-record retention clock. A recording made only to generate an AI note can often be deleted minutes after the note is signed. A recording for supervision may need tight, time-limited access for exactly one reviewer. Decide the why first, because it sets the how long, the who, and the where for everything that follows.
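One way to make the purpose drive the build is to encode it as data that the rest of the system reads. The sketch below is illustrative only: the purpose names, field names, and the 30-day figure are assumptions, not values taken from any regulation, and the real numbers should come from counsel and your states' retention law.

```javascript
// Hypothetical purpose-to-policy table. Every value is a placeholder.
const RECORDING_POLICIES = Object.freeze({
  medical_record: { retention: "state_medical_record_clock", maxReviewers: null, deleteAfterNoteSigned: false },
  ai_scribe: { retention: "until_note_signed", maxReviewers: 0, deleteAfterNoteSigned: true },
  supervision: { retention: "fixed_days", retentionDays: 30, maxReviewers: 1, deleteAfterNoteSigned: false },
});

// Returns the policy for a purpose. A missing or unknown purpose never records,
// which keeps "do not record" as the default.
function policyFor(purpose) {
  if (!Object.hasOwn(RECORDING_POLICIES, purpose)) {
    return { record: false, reason: "no defensible purpose stated" };
  }
  return { record: true, purpose, ...RECORDING_POLICIES[purpose] };
}
```

Calling `policyFor("ai_scribe")` yields a policy that deletes after the note is signed, while `policyFor(undefined)` refuses to record at all.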
A recording is PHI the moment it exists
Once you decide to record, treat the file as PHI from the first frame — because it is. Any audio or video that can identify a patient is electronic PHI (ePHI) and falls under the full weight of the HIPAA Privacy and Security Rules, even if you believe nothing sensitive was said. The camera caught a face; that alone is enough.
Two consequences follow immediately, and teams routinely forget both.
First, a clinical recording is almost always part of the designated record set — HIPAA's term (45 CFR §164.501) for the records a provider uses to make decisions about a patient. That matters because of the patient's right of access under 45 CFR §164.524: a patient can request a copy of records in the designated record set, generally within 30 days. If your architecture cannot export a single patient's recording on request, you have built a compliance gap, not a feature.
Second, the same right cuts the other way: a recording you keep is a recording you must be able to produce and account for. "We record everything to an opaque bucket and never look at it" is not a safe posture — it is an un-auditable pile of PHI waiting for a breach. Every recording needs an owner, an access policy, a retention clock, and a deletion path. We build those below.
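The four things every recording needs (an owner, an access policy, a retention clock, and a deletion path) can be enforced at write time. This is a minimal sketch under assumed field names; none of them come from a real schema.

```javascript
// Field names are illustrative; adapt them to your own data model.
const REQUIRED_CONTROLS = ["ownerId", "accessPolicyId", "retainUntil", "deletionJobId"];

// Returns the names of the controls a recording record is missing.
function missingControls(recording) {
  return REQUIRED_CONTROLS.filter((field) => {
    const value = recording[field];
    return value === undefined || value === null || value === "";
  });
}

// Refuse to persist a recording that nobody owns or that can never be deleted.
function assertStorable(recording) {
  const missing = missingControls(recording);
  if (missing.length > 0) {
    throw new Error(`recording ${recording.id} is missing: ${missing.join(", ")}`);
  }
  return true;
}
```

Running the check before the first byte is written keeps an un-owned file from ever reaching the bucket.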
Consent: the two-layer rule everyone gets half-right
Here is the mistake we see most often: a team adds a checkbox that says "I consent to this visit being recorded," ships it, and believes they are covered. They are covered for one of the two permissions recording actually requires, and often not the one they think.
Layer one is the HIPAA layer. HIPAA governs whether you may use and disclose the patient's health information. For treatment, payment, and ordinary health-care operations, HIPAA generally does not require separate written authorization to create a recording that becomes part of the record. But the moment the recording is used for something outside that core — marketing, research, training videos shown outside the care team, anything the patient would not expect — you need a specific HIPAA authorization, the patient's written, purpose-bound permission. Spell out the purpose; a vague consent does not stretch to cover a new use.
Layer two is the wiretapping layer, and it is the one teams miss. Recording a conversation is also governed by state surveillance law, which is completely separate from HIPAA. The United States splits into two camps. In one-party consent states, one participant (the provider) consenting is enough. In all-party consent states — often called "two-party consent" — every participant must agree before the recording is legal, and recording without that agreement can be a criminal offense, entirely apart from any HIPAA question.
Eleven states are all-party consent for these purposes: California, Delaware, Florida, Illinois, Maryland, Massachusetts, Montana, Nevada, New Hampshire, Pennsylvania, and Washington. (State laws shift and some have nuances by call type — this is exactly the kind of list to confirm with counsel for your states.)
| | One-party consent | All-party consent |
|---|---|---|
| Who must agree | One participant (the provider) | Every participant on the call |
| Example states | Most US states | CA, FL, IL, MD, MA, WA, PA, NV, NH, DE, MT |
| Practical rule for telehealth | Notice plus provider consent | Capture explicit patient agreement before recording starts |
| Risk if skipped | Lower, but HIPAA still applies | Criminal/civil liability on top of HIPAA |
Table 1. The two consent regimes for recording a conversation. Telehealth crosses state lines, so the patient's location, not the provider's, usually decides which rule applies — design for the stricter case.
The telehealth twist makes this sharper. Your patient may be in a different state from your provider, and the stricter state's law generally controls when the conversation reaches into it. A platform serving patients nationwide cannot assume one-party rules. The safe, simple engineering answer is to design for all-party consent everywhere: capture an explicit, logged patient agreement before any recording begins, show a visible "recording" indicator for the entire session, and store the consent event next to the recording so you can prove it later. A clear spoken or on-screen agreement at the start of the recording ("this visit will be recorded for your medical record; do you agree?") covers both layers for that stated purpose and costs almost nothing to build. Any use beyond that purpose still needs the written HIPAA authorization described above.
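A consent gate along these lines could look like the sketch below. The state set is the eleven-state list given earlier in this article and should be confirmed with counsel. The shape of `consentEvents` is a hypothetical log format, assumed to be in chronological order.

```javascript
// The eleven states listed earlier; confirm with counsel before relying on it.
const ALL_PARTY_STATES = new Set(["CA", "DE", "FL", "IL", "MD", "MA", "MT", "NV", "NH", "PA", "WA"]);

// Informational only: which regime the patient's location falls under.
function consentRegime(patientState) {
  return ALL_PARTY_STATES.has(patientState) ? "all-party" : "one-party";
}

// The gate ignores the regime and always demands agreement from everyone.
// consentEvents: hypothetical log of { participantId, agreed } in time order.
function canStartRecording(participantIds, consentEvents) {
  const latest = new Map();
  for (const event of consentEvents) {
    latest.set(event.participantId, event.agreed === true); // later events win
  }
  return participantIds.length > 0 && participantIds.every((id) => latest.get(id) === true);
}
```

Keeping `consentRegime` separate from the gate means the regime can be reported in the audit trail without ever weakening the check.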
One more rule that saves real grief: make consent revocable and make the indicator honest. If the patient withdraws consent mid-visit, recording must actually stop, and a recording light that stays on after recording ends will destroy patient trust faster than any feature can rebuild it.
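One way to keep the indicator honest is to derive it from the recorder state instead of setting it separately. The class below is a sketch, not a prescribed design: `RecordingSession` and the `stopCapture` callback are hypothetical names standing in for whatever actually halts capture in your stack.

```javascript
// Hypothetical session model: the indicator is a getter over the recorder
// state, so the light cannot stay on after recording has stopped.
class RecordingSession {
  #recording = false;

  constructor(stopCapture) {
    this.stopCapture = stopCapture; // assumed callback that really halts capture
    this.events = []; // consent and recording events, kept for the audit trail
  }

  start() {
    this.#recording = true;
    this.#log("recording_started");
  }

  // Withdrawing consent stops capture first, then records the revocation.
  revokeConsent(participantId) {
    if (this.#recording) {
      this.stopCapture();
      this.#recording = false;
    }
    this.#log("consent_revoked", participantId);
  }

  get indicatorOn() {
    return this.#recording;
  }

  #log(type, participantId = null) {
    this.events.push({ type, participantId, at: new Date().toISOString() });
  }
}
```

Because the UI reads `indicatorOn` and nothing else, there is no second flag that can drift out of sync with the recorder.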
Where the recording is made: server-side versus client-side
Now the architecture. There are two fundamental places to capture a recording, and the choice ripples through cost, quality, reliability, and — most importantly — encryption.
Server-side recording captures the media at the media server. Almost every multi-party clinical platform routes video through a Selective Forwarding Unit, or SFU: the media server that receives each participant's stream and forwards it to the others, and the component you need once a call has more than two people or needs server-side recording. Because every stream already passes through the SFU, recording there is natural and cheap: the server writes the streams to storage as they flow. Server-side recording is reliable (it does not depend on the patient's laptop staying awake), it captures every participant uniformly, and it is the default for any platform at scale. Its one hard requirement is the catch we return to below: the server must be able to read the media, which means the media is not end-to-end encrypted from the server's point of view.
Client-side recording captures the media in the browser or app, using the device's own recorder (in the browser, the MediaRecorder API). Its great advantage is that it can record media that the server never sees in the clear — which is the only way to record a truly end-to-end-encrypted call. Its disadvantages are practical and serious: it consumes the patient's CPU and battery, it depends on a device that may sleep or run out of space, and it captures only what that one device sees. For a clinical record you can rely on, client-side recording alone is fragile.
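// Client-side capture with MediaRecorder: records what THIS device sees.
// Note: fragile for a clinical record (depends on the device staying awake),
// and you must still gate it on consent and protect the resulting blob as PHI.
// localAndRemoteMixedStream and uploadToComplianceBoundary are placeholders for your own code.
const preferredType = "video/webm;codecs=vp9,opus";
// Fall back to the browser's default container when VP9/Opus WebM is unsupported.
const options = MediaRecorder.isTypeSupported(preferredType) ? { mimeType: preferredType } : {};
const recorder = new MediaRecorder(localAndRemoteMixedStream, options);
const chunks = [];
recorder.ondataavailable = (e) => {
  if (e.data.size > 0) chunks.push(e.data); // skip empty chunks
};
recorder.onstop = () =>
  uploadToComplianceBoundary(new Blob(chunks, { type: recorder.mimeType })); // encrypted, access-controlled
recorder.start(1000); // emit a chunk about once per second; chunks stay in memory here, so
                      // upload them incrementally if a crash must not lose the whole session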
Most production telemedicine platforms record server-side, accept that the media is transport-encrypted rather than end-to-end encrypted, and keep the whole recording pipeline inside a tightly controlled compliance boundary. That is a perfectly compliant choice — as long as you have a signed Business Associate Agreement (BAA), the contract that legally lets a vendor handle PHI on your behalf, with whoever operates the SFU and the storage. The protocol mechanics of bridging a live WebRTC call to a recording are covered in the Video Streaming section; see WebRTC recording and the HLS bridge. Here we care about the compliance consequences, not the codec plumbing.
The end-to-end-encryption conflict — and its three resolutions
This is the tension in the article's title, and it is worth slowing down for, because it is where good intentions collide with physics.
Start with two definitions, in plain language. Transport encryption scrambles the media while it travels between each device and the server, so an eavesdropper on the network sees nothing — but the server itself can read the media. WebRTC does this by default with a mechanism called DTLS-SRTP. End-to-end encryption (E2EE) goes further: only the patient's and clinician's devices hold the keys, so even the server forwarding the media cannot read it. E2EE is the gold standard for the most sensitive consultations — behavioral health, anything with heightened legal exposure.
Now the collision. Server-side recording requires the server to read the media. True E2EE forbids exactly that. You cannot have a server quietly record a call that is genuinely end-to-end encrypted, because to the server the media is meaningless noise. This is not a bug you can engineer around; it is the whole point of E2EE working as designed.
Figure 2. The E2EE-versus-recording conflict and its three resolutions. Each keeps the recording possible while being honest with the patient about who can see the media.
Products resolve this in one of three ways, and a mature platform picks deliberately rather than stumbling into one.
Resolution one — record on the client. Keep the call genuinely end-to-end encrypted and capture the recording on an endpoint that legitimately holds the keys (usually the clinician's app, sometimes a dedicated, authorized recording client that joins the call as a participant inside the encryption). The media is never readable by the server. The cost is the fragility of client-side capture and the need to securely upload the resulting file into your compliance boundary.
Resolution two — the declared compliance recorder. Add a single trusted participant to the encrypted session whose only job is to record, and tell the patient it is there. Modern WebRTC supports this through a browser capability called Insertable Streams (with a media-encryption layer often built on a standard called SFrame), which lets endpoints encrypt media above the transport layer. A designated recorder that holds a key can decrypt for recording while the forwarding server still cannot. The non-negotiable condition is transparency: the patient must know the session is recorded, because "end-to-end encrypted" and "silently recorded" cannot both be true honestly.
Resolution three — drop E2EE for recorded sessions, on purpose. For most clinical visits, transport encryption plus a BAA-covered, access-controlled recording store is the correct and fully compliant design, and true E2EE is more than the visit needs. The honest move is to decide per session: offer genuine E2EE (with client-side recording or no recording) for the high-sensitivity cases that warrant it, and use transport-encrypted server-side recording for the rest. What you must never do is market a call as "end-to-end encrypted" while a server records it. That is not a compliance nuance; it is a false security claim.
The product principle underneath all three: recording and E2EE are honesty constraints before they are engineering constraints. Whatever you record, the patient should be able to understand who can see it.
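The per-session decision in resolution three can be sketched as a small pure function plus a guard for product copy. The input flags and the returned shape are assumptions made for illustration.

```javascript
// Sensitivity flag and return shape are illustrative assumptions.
function chooseSessionMode({ highSensitivity, recordingPurpose }) {
  const wantsRecording = Boolean(recordingPurpose);
  if (highSensitivity) {
    // Keep genuine E2EE: record only on an endpoint that holds the keys, or not at all.
    return { encryption: "e2ee", recorder: wantsRecording ? "client" : "none" };
  }
  // Ordinary visits: transport encryption, server-side recording inside the BAA boundary.
  return { encryption: "transport", recorder: wantsRecording ? "server" : "none" };
}

// Guard for marketing and UI text: a server-recorded session is never labelled E2EE.
function mayClaimE2EE(mode) {
  return mode.encryption === "e2ee" && mode.recorder !== "server";
}
```

Routing every "end-to-end encrypted" badge through `mayClaimE2EE` turns the honesty constraint into something a test suite can check.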
Composited versus per-track recording
A smaller architectural choice, but one that shapes how useful the recording is later. When you record a multi-party session, you can store it two ways.
A composited recording mixes every participant into one video file — the familiar gallery or side-by-side layout, exactly what the participants saw. It is one file, easy to play back, and good for a human reviewer who wants to watch the visit as it happened. A per-track recording keeps each participant's audio and video as separate streams. It is larger and needs a player that can re-assemble it, but it is far more useful downstream: an AI scribe transcribes each speaker cleanly when their audio is isolated, a reviewer can mute or redact one participant, and you can delete one person's track without destroying the whole record.
The rule of thumb: record composited when a person will watch the visit, per-track when a machine will process it or when selective redaction and deletion matter. Behavioral-health and AI-scribe products usually want per-track for exactly the redaction and clean-transcription reasons. Either way, every track is PHI and lives inside the same boundary.
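As a rough illustration of that rule of thumb, and of why per-track storage makes selective deletion cheap, consider the sketch below. The flag names and the `tracks` array shape are hypothetical.

```javascript
// Picks a layout from who (or what) will consume the recording.
function chooseRecordingLayout({ humanReview, machineProcessing, needsSelectiveRedaction }) {
  if (machineProcessing || needsSelectiveRedaction) return "per-track";
  if (humanReview) return "composited";
  return "none"; // no consumer named, so nothing justifies a recording
}

// With per-track storage, removing one participant leaves the rest intact.
// recording.tracks: hypothetical list of { participantId, kind, key }.
function removeParticipantTracks(recording, participantId) {
  return {
    ...recording,
    tracks: recording.tracks.filter((track) => track.participantId !== participantId),
  };
}
```

A composited file offers no equivalent of `removeParticipantTracks`: removing one person means re-encoding or deleting the whole recording.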
Storage, access, retention, and deletion — the unglamorous core
A recording's whole life happens after the call ends, and this is where most of the real compliance work lives. Four controls, each tied to a rule.
Figure 3. The whole pipeline lives inside the BAA-covered HIPAA boundary: a consent gate, capture, encryption at rest, access-controlled signed-URL playback, a retention clock, and deletion — with an audit log wrapping every stage.
Encrypt the stored recording. The HIPAA Security Rule treats encryption of stored ePHI as an addressable implementation specification (45 CFR §164.312(a)(2)(iv)). Addressable does not mean optional: you must either implement it or document a defensible reason why an equivalent control is used instead. In practice, for a bucket full of clinical video, the only defensible answer is to encrypt it at rest. The proposed HIPAA Security Rule update (HHS NPRM, RIN 0945-AA22, published January 2025 and still proposed as of June 2026) would make encryption mandatory rather than addressable for most ePHI, so encrypting recordings is both today's best practice and tomorrow's likely requirement.
Control and log access. Recordings need role-based access — only the people with a treatment, operations, or supervisory reason should be able to play one back — and every access must be recorded. The Security Rule's audit-controls standard (45 CFR §164.312(b)) requires mechanisms that record and examine activity in systems containing ePHI. For a recording store, that means: who played, downloaded, exported, or deleted which recording, and when. Serve playback through short-lived, signed links rather than permanent public URLs, so a leaked link does not become a permanent door.
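Short-lived signed links and an access log can be sketched with an HMAC over the recording, the user, and an expiry time. This is a generic illustration using Node's built-in `crypto` module, not any cloud provider's signed-URL format. The URL layout, `SIGNING_KEY` handling, and the in-memory `auditLog` are assumptions.

```javascript
const crypto = require("node:crypto");

const SIGNING_KEY = crypto.randomBytes(32); // in production, load from a secrets manager
const auditLog = []; // stand-in for an append-only audit store

function hmac(payload) {
  return crypto.createHmac("sha256", SIGNING_KEY).update(payload).digest("hex");
}

// Issues a link that stops working after ttlSeconds and logs who asked for it.
function signPlaybackUrl(recordingId, userId, ttlSeconds, now = Date.now()) {
  const expires = Math.floor(now / 1000) + ttlSeconds;
  const sig = hmac(`${recordingId}:${userId}:${expires}`);
  auditLog.push({ action: "link_issued", recordingId, userId, at: new Date(now).toISOString() });
  return `/playback/${encodeURIComponent(recordingId)}?user=${encodeURIComponent(userId)}&expires=${expires}&sig=${sig}`;
}

// Checks signature and expiry, and logs the outcome either way.
function verifyPlaybackUrl(url, now = Date.now()) {
  const parsed = new URL(url, "https://example.invalid");
  const recordingId = decodeURIComponent(parsed.pathname.split("/").pop());
  const userId = parsed.searchParams.get("user");
  const expires = Number(parsed.searchParams.get("expires"));
  const given = Buffer.from(parsed.searchParams.get("sig") ?? "");
  const expected = Buffer.from(hmac(`${recordingId}:${userId}:${expires}`));
  const sigOk = given.length === expected.length && crypto.timingSafeEqual(given, expected);
  const allowed = sigOk && Math.floor(now / 1000) <= expires;
  auditLog.push({
    action: allowed ? "playback_allowed" : "playback_denied",
    recordingId,
    userId,
    at: new Date(now).toISOString(),
  });
  return allowed;
}
```

A role check (does this user have a treatment, operations, or supervisory reason?) would sit in front of `signPlaybackUrl`; it is omitted here to keep the sketch short.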
Retain for the right clock — and know which clock it is. Here is a precise distinction that catches even careful teams. HIPAA's six-year retention rule (45 CFR §164.530(j)) applies to your policies and compliance documentation, not to medical records. How long you must keep the recording itself is set by state medical-record retention law, which commonly runs five to ten years for adults from the last encounter, and for minors typically until the age of majority plus several years. Telehealth crosses state lines, so a multi-state product inherits the strictest applicable clock. Pick a retention period deliberately, document it, and enforce it automatically.
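The retention clock described here can be computed mechanically once counsel supplies the per-state figures. In the sketch below the state keys and every number are placeholders, NOT legal values, and the minor rule (age of majority plus a fixed number of years) is one common pattern, not a universal one.

```javascript
// Placeholder rule table: the numbers are illustrative, NOT legal values.
const RETENTION_RULES = {
  EXAMPLE_STATE_A: { adultYears: 7, ageOfMajority: 18, minorExtraYears: 3 },
  EXAMPLE_STATE_B: { adultYears: 10, ageOfMajority: 18, minorExtraYears: 5 },
};

function addYears(date, years) {
  const copy = new Date(date.getTime());
  copy.setUTCFullYear(copy.getUTCFullYear() + years);
  return copy;
}

// Earliest date a recording may be deleted under one state's rule.
function retainUntil(rule, lastEncounter, dateOfBirth) {
  const adultClock = addYears(lastEncounter, rule.adultYears);
  const minorClock = addYears(dateOfBirth, rule.ageOfMajority + rule.minorExtraYears);
  const wasMinor = lastEncounter < addYears(dateOfBirth, rule.ageOfMajority);
  return wasMinor && minorClock > adultClock ? minorClock : adultClock;
}

// Multi-state product: keep until the latest date any applicable state requires.
function strictestRetainUntil(stateKeys, lastEncounter, dateOfBirth) {
  const times = stateKeys.map((key) =>
    retainUntil(RETENTION_RULES[key], lastEncounter, dateOfBirth).getTime()
  );
  return new Date(Math.max(...times));
}
```

Storing the computed date on the recording at write time lets an automated job enforce the clock without re-deriving it.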
Delete on schedule and on request. A recording you no longer need is pure liability. Build automatic deletion when the retention clock expires, and build a path to honor a valid deletion or access request. "Delete" must mean the media is actually gone — including backups and any derived copies (thumbnails, transcripts, AI training caches) — not merely hidden behind a flag. If a recording existed only to generate an AI note, deleting it minutes after the note is signed is often the most defensible choice you can make.
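A deletion routine that reaches every derivative, and then checks that nothing was missed, might look like the sketch below. The `Map` stands in for object storage, and the key layout (`<kind>/<recordingId>/<file>`) and prefix names are hypothetical.

```javascript
// Known places a recording or its derivatives may live (illustrative layout).
const DERIVATIVE_PREFIXES = ["media/", "backups/", "thumbnails/", "transcripts/", "ai-cache/"];

// Deletes every known copy, then scans for copies the prefix list did not cover.
// store: a Map used here as an in-memory stand-in for object storage.
function deleteRecordingEverywhere(store, recordingId) {
  const deleted = [];
  for (const key of [...store.keys()]) {
    if (DERIVATIVE_PREFIXES.some((prefix) => key.startsWith(`${prefix}${recordingId}/`))) {
      store.delete(key);
      deleted.push(key);
    }
  }
  const leftovers = [...store.keys()].filter((key) => key.includes(`/${recordingId}/`));
  return { deleted, leftovers, complete: leftovers.length === 0 };
}
```

Treating a non-empty `leftovers` list as a failed deletion surfaces the forgotten export or cache before an auditor does.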
The arithmetic is worth saying out loud, because retention is where storage cost and risk both compound. Suppose you run 1,000 recorded visits a day, each producing a 500-megabyte (0.5 GB) recording. That is 1,000 × 0.5 GB = 500 GB per day, or about 500 × 365 = 182,500 GB ≈ 182 TB per year. At a representative cloud storage price of $0.023 per GB per month, holding one full year of recordings for twelve months costs roughly 182,500 GB × $0.023 × 12 ≈ $50,000, and that bill recurs every year you keep that cohort. A seven-year retention clock means you are storing seven cohorts at once, or roughly $350,000 a year at steady state. The point is not the exact figure (re-check current prices and your codec's real file sizes); it is that "record everything forever" is a five-or-six-figure recurring line item and a growing breach surface. Recording less, and deleting on schedule, is both cheaper and safer.
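The same arithmetic as a reusable function, so the figures can be re-run with your own volumes and prices. The parameter names are illustrative, and the model is deliberately simple: it ignores storage tiers, egress, and backups.

```javascript
// Storage cost model for retained recordings (flat per-GB-month price assumed).
function storageCost({ visitsPerDay, gbPerVisit, pricePerGbMonth, retentionYears }) {
  const gbPerYear = visitsPerDay * gbPerVisit * 365; // one year's cohort of recordings
  const costPerCohortYear = gbPerYear * pricePerGbMonth * 12; // holding that cohort for 12 months
  return {
    gbPerYear,
    costPerCohortYear,
    // Once retentionYears cohorts are stored at the same time:
    steadyStateCostPerYear: costPerCohortYear * retentionYears,
  };
}

const example = storageCost({
  visitsPerDay: 1000,
  gbPerVisit: 0.5,
  pricePerGbMonth: 0.023,
  retentionYears: 7,
});
console.log(example);
```

With the article's inputs this gives 182,500 GB per year, roughly $50,000 per cohort-year, and about $350,000 a year once seven cohorts are on disk.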
Figure 4. The recording's lifecycle. Two consent layers gate the start; the end is governed by the state medical-record retention clock — distinct from HIPAA's six-year documentation rule — followed by deletion of the file and every derivative.
Behavioral health and substance-use records: an extra layer
One category needs special care. Records about substance-use-disorder treatment from federally assisted programs are protected by a separate federal rule, 42 CFR Part 2 (administered by SAMHSA), which is stricter than HIPAA about consent for disclosure. A 2024 final rule (effective April 16, 2024, with a compliance date of February 16, 2026) brought Part 2 closer to HIPAA, for example by allowing a single patient consent covering future treatment, payment, and operations, but it remains its own regime with its own redisclosure limits, especially for legal proceedings.
For a recording, the practical implication is sharp: a recorded substance-use-treatment session is among the most sensitive artifacts your platform can hold, and the consent, access, and redisclosure rules around it are tighter than ordinary HIPAA. If your product serves behavioral health, addiction, or mental-health verticals, treat recording as opt-in, narrowly scoped, and reviewed by counsel — and lean toward the E2EE-with-client-recording or simply-do-not-record end of the spectrum. We go deeper on the vertical in mental and behavioral health telemedicine.
A common-mistakes callout
Recording is a field of well-worn traps. The ones we see ruin builds:
- Recording by default with no stated purpose. A "record" toggle that is on for every visit, justified by nothing, creates maximum PHI and maximum liability for no defined benefit. Start from off; record for a reason.
- A consent checkbox that covers only HIPAA, not wiretapping. Teams ship the HIPAA-use consent and forget the all-party-consent states entirely, exposing themselves to criminal surveillance liability that HIPAA paperwork does not touch.
- Claiming "end-to-end encrypted" while a server records. A marketing line that cannot be true at the same time as server-side recording. Pick one, and describe it honestly.
- Recordings in an un-BAA'd or unencrypted bucket. The classic breach: clinical video written to general-purpose object storage that no Business Associate Agreement covers, or that is not encrypted at rest, often with permanent public-style URLs. Every recording belongs inside the BAA-covered, encrypted, access-logged boundary.
- No deletion path. Building create-and-store but never deletion, so recordings accumulate past their retention clock and a patient's deletion request cannot be honored. Deletion is a feature, not an afterthought.
- Forgetting the derivatives. Deleting the master recording but leaving the transcript, thumbnail, or AI-training copy behind. "Delete" must reach every copy.
Where Fora Soft fits in
The requirement comes first: a clinical recording is PHI that must be consented, encrypted, access-logged, retained on the right clock, and deletable — and, where the visit is sensitive, recorded without breaking an end-to-end-encryption promise. Fora Soft has built real-time video on WebRTC since the technology was new, across video conferencing, e-learning, streaming, and telemedicine, including the recording and storage pipelines that sit behind clinical and high-compliance products. We treat the record-or-not decision, two-layer consent capture, the E2EE-versus-recording trade-off, per-track versus composited capture, and the encrypted, access-controlled, retention-governed store as one connected design problem — because a recording feature that gets any one of them wrong is not a feature, it is an incident waiting to be reported.
What to read next
- Encryption for telemedicine: in transit, at rest, end-to-end
- Patient consent, recording, and data retention
- The compliance architecture pattern: how to wrap a video stack in HIPAA
References
- 45 CFR §164.501 — HIPAA Privacy Rule, definitions (designated record set). eCFR / HHS. Current as of 2026-06-14. Tier 1. https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-C/part-164/subpart-E/section-164.501
- 45 CFR §164.524 — HIPAA Privacy Rule, right of access to PHI. eCFR / HHS. Current as of 2026-06-14. Tier 1. https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-C/part-164/subpart-E/section-164.524
- 45 CFR §164.312 — HIPAA Security Rule, technical safeguards (encryption §164.312(a)(2)(iv); audit controls §164.312(b)). eCFR / HHS. Current as of 2026-06-14. Tier 1. https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-C/part-164/subpart-C/section-164.312
- 45 CFR §164.530(j) — HIPAA Privacy Rule, six-year retention of policies and documentation (distinct from medical-record retention). eCFR / HHS. Current as of 2026-06-14. Tier 1. https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-C/part-164/subpart-E/section-164.530
- 42 CFR Part 2 — Confidentiality of Substance Use Disorder Patient Records (2024 Final Rule; enforcement aligned Feb 2026). eCFR / SAMHSA / HHS. Checked 2026-06-14. Tier 1. https://www.ecfr.gov/current/title-42/chapter-I/subchapter-A/part-2
- HIPAA Security Rule NPRM (90 FR 898, RIN 0945-AA22) — proposal to make encryption of ePHI mandatory rather than addressable. HHS, published 2025-01-06; still proposed as of 2026-06-14. Tier 1. https://www.federalregister.gov/documents/2025/01/06/2024-30983/hipaa-security-rule-to-strengthen-the-cybersecurity-of-electronic-protected-health-information
- Understanding the Confidentiality of Substance Use Disorder (SUD) Patient Records — HHS.gov. HHS, checked 2026-06-14. Tier 2. https://www.hhs.gov/hipaa/for-professionals/special-topics/hipaa-part-2/index.html
- MediaRecorder API — MDN Web Docs (client-side recording of a MediaStream). Mozilla, checked 2026-06-14. Tier 6 (orientation). https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder
- True End-to-End Encryption with WebRTC Insertable Streams — webrtcHacks. Checked 2026-06-14. Tier 3 (first-party engineering). https://webrtchacks.com/true-end-to-end-encryption-with-webrtc-insertable-streams/
- Recording Telehealth Appointments — Little Health Law (state all-party consent and telehealth recording). Checked 2026-06-14. Tier 5 (practitioner/legal commentary). https://www.littlehealthlaw.com/blog/recording-telehealth-appointments/


