Published 2026-06-04 · 41 min read · By Nikolay Sapunov, CEO at Fora Soft

Why This Matters

This article is for the founder, product lead, or trust-and-safety lead at any platform where strangers upload video — a social app, a dating app, a creator marketplace, a live-shopping service, a community forum with clips, a video-sharing product of any kind. The moment your platform lets users post video to other users, you have inherited a problem that does not go away and that the law no longer treats as optional: some fraction of what gets uploaded will be illegal, some will violate your own rules, and you are now responsible for catching it at a speed and a scale no team of humans can match by watching. This capstone shows you what that pipeline actually costs, how the volume is funnelled so the bill stays sane, which parts you buy versus build, where the law draws hard lines you cannot cross, and how to treat the people who do the final human review as a first-class part of the system rather than an afterthought. It is equally for the engineer who has read the individual computer-vision and multimodal lessons and wants them assembled into one deployable system with named technologies, real prices, and an honest account of where automated moderation fails. By the end you will be able to draw the pipeline on a whiteboard, name the exact 2026 technology in every stage, defend the cost per video to a finance team, sequence the build so a first version protects users in weeks, and tell a design that an Ofcom or European Commission audit will pass from one that will end up in a headline.

What You Are Building, Stated Precisely

Fix the product before any technology. You are building a service with one job: every time a user uploads a video, the service decides — quickly, and on its own for the overwhelming majority of cases — whether that video is allowed to be published, must be blocked, or must be escalated to a human because the machine is not sure. Around that decision sit two things that matter as much as the decision itself: a record of why each choice was made, kept in a form you can hand to a regulator, and a reporting path that, for the narrow class of content that is illegal rather than merely against the rules, sends the required report to the right authority instead of quietly deleting the evidence.

Two terms in that description carry weight. UGC stands for "user-generated content" — anything the users themselves upload, as opposed to content the platform produces. It is the defining feature of every social, dating, community, and creator product, and it is exactly the content you cannot vet in advance because you did not make it and there is far too much of it. Moderation, here, means the whole process of deciding what is allowed to stay up: not a single model call, but the pipeline of automated checks and human judgement that stands between the upload button and the rest of the platform's users.

The technique underneath has a shape you have met in every other capstone: a funnel. The plain version of the idea is the way an airport screens bags — not by opening every suitcase, but by running all of them through a cheap X-ray, pulling aside only the few that look wrong, and hand-searching only the tiny number the second look could not clear. Cheap, fast, automatic screening for everything; expensive, slow, human attention for almost nothing. A moderation pipeline is that airport, and the entire art of building one affordably is deciding what each layer of screening costs, what it catches, and what it is allowed to wave through on its own.

The scope line that matters most is what the pipeline is for. It exists to enforce two different things that people constantly confuse: the law, which is the same for everyone and which you do not get to opt out of, and your platform's own rules, which are yours to write and which can be stricter than the law. A video can be perfectly legal and still break your rules: spam, a competitor's advert, nudity on a platform that bans it. A video can also be illegal regardless of your rules, and that narrow, serious class (child sexual abuse material above all) triggers obligations that have nothing to do with your terms of service and everything to do with statutes. The architecture below is shaped to keep those two jobs separate by construction, because the costliest mistake in this whole field is to treat a legal obligation as if it were a content-policy preference.

The moderation pipeline drawn as a four-stage funnel, widest at the top where every uploaded video enters and narrowest at the bottom where a human reviewer sits, with the volume reaching each stage and the cost per item printed on the side. Stage zero is hash matching against databases of known illegal content, run on every video at almost no cost per item, which removes and reports the small slice of known-bad material. Stage one is a bank of cheap automated classifiers run on a handful of sampled frames plus the audio transcript, which clears the large majority of videos as plainly safe at a fraction of a cent per frame. Stage two is a vision-language model that reads the ambiguous middle, a few percent of videos, for context a frame score cannot judge, at a couple of cents each. Stage three is the human review queue, reaching about one percent of uploads, where a trained person makes the final call on the hardest cases at a cost of cents to dollars per review. Running underneath all four stages is a governance floor reading mandatory reporting, the protection of minors, an audit log of every decision and its reason, and an appeals path. A footer line reads cheapest check first; every decision logged, every illegal item reported.

Figure 1. The pipeline is a funnel. Every video enters at the top; each stage either decides or passes the video up to a more expensive stage; the human at the bottom sees a tiny fraction of the input. The cost of the whole system is set by how aggressively the cheap stages clear volume on their own.

The Spine: Two Rules That Outlast The Models

Two ideas carry the entire build. Get them right and everything else is detail — and, as with every capstone in this course, both rules exist because the technology underneath is moving faster than any single article can track. The classifier you pick this quarter will be beaten next quarter; the law, and the shape of the pipeline, will not move nearly as fast.

The first rule is cheapest check first — each stage decides or escalates, and the expensive stages never see the bulk of the traffic. You order the checks from nearly-free to expensive, and you arrange them so that the great majority of videos are settled by the cheap stages and never reach the costly ones. This is not a preference; it is forced by arithmetic the cost section makes concrete. A platform taking a hundred thousand videos a day cannot afford to send each one, in full, to the most capable model available — the bill runs to millions a year and the most capable model is also the slowest, so you would fail on both cost and speed at once. So you build a funnel: a video that a one-cent check can clear is cleared by the one-cent check and goes no further; only the small residue that the cheap checks cannot settle climbs to the vision-language model, and only the residue of that reaches a human. Keeping the stages ordered by cost, and being honest about what each one can decide alone, is the single most useful mental model for budgeting the whole pipeline.

The second rule is every decision is logged with a reason, and illegal content is reported, not just removed. Every choice the pipeline makes — allow, block, escalate — must be recorded together with what triggered it, in a form you can produce months later for a regulator, a court, or a user who appeals. And for the narrow class of content that is illegal, removal is not the end of your obligation: the law in most of the markets you operate in requires you to report it to a named authority and to preserve the evidence, not to silently delete it. This is the difference between a pipeline that protects the platform and one that quietly destroys the very records that a child-protection investigation, or your own defence in an enforcement action, will later depend on. The record is not bureaucracy bolted on at the end. It is a constraint that runs backwards through the whole pipeline: every stage must emit not just a verdict but the evidence for it, or the audit trail has holes exactly where it matters most.

Hold the two rules together and the pipeline has a clean shape. The first rule decides where the cost per video lives: almost entirely in the human stage and the vision-language stage, which is precisely why you work so hard to keep volume away from them. The second rule decides what makes the pipeline defensible: not the cleverness of any single model, which will be obsolete soon enough, but the unbroken chain from upload to decision to reason to, where required, a report. Everything in the rest of this article fills in the stages between those two rules, decides what you build versus buy, and prices it.

The Throughput The Words "100,000 A Day" Actually Describe

Before any architecture, feel the scale, because the number is the whole reason the funnel exists. A hundred thousand videos a day is not a tidy trickle. Spread evenly it is 100,000 ÷ 86,400 seconds, which is about 1.16 videos every second — but traffic is never even. Uploads cluster in the evening and around events, so a realistic design target is a peak three to four times the average, call it four videos a second that the front of the pipeline must accept without falling behind. Miss that target and the queue grows without bound; a backlog in moderation is not a cosmetic delay, it is unreviewed content sitting in front of users.
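The arithmetic in that paragraph can be checked in a few lines. This is a minimal sketch: the function names are illustrative, and the peak factors of three and four are the assumed design margins from the text, not measurements of any real platform.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(videos_per_day: int) -> float:
    """Uploads per second if traffic were spread perfectly evenly."""
    return videos_per_day / SECONDS_PER_DAY


def peak_rate(videos_per_day: int, peak_factor: float) -> float:
    """Design target: the average rate scaled by an assumed peak-to-average factor."""
    return average_rate(videos_per_day) * peak_factor


if __name__ == "__main__":
    print(f"average: {average_rate(100_000):.2f} uploads/s")
    for factor in (3.0, 4.0):  # assumed evening/event peaks
        print(f"peak x{factor:.0f}: {peak_rate(100_000, factor):.2f} uploads/s")
```

Sizing the front door for the upper end of that range, rather than the average, is what keeps the queue from growing during the evening peak.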

Now turn videos into the thing you actually process. Say the average upload runs two minutes — short-form UGC skews shorter, but two minutes keeps the arithmetic honest. A hundred thousand videos at two minutes each is 100,000 × 2, which is 200,000 video-minutes a day flowing in. Hold that number; it is the one that decides whether you can afford to moderate by the minute or must moderate by the sampled frame.

Here is the pitfall that sinks naive designs, stated as plainly as it deserves.

Common mistake: moderating video by the minute. The obvious first instinct is to send each whole video to a video-moderation API. At a representative 2026 list price of about ten cents per minute of video analysed, your 200,000 video-minutes × $0.10 comes to $20,000 a day — call it $7.3 million a year — to look at every second of every upload, most of which is a person talking to a camera about nothing in particular. It is also slow, because analysing a video end to end takes time you do not have at four uploads a second. You do not moderate by the minute. You moderate by a handful of sampled frames plus the audio transcript, and you only ever spend real money on the small slice of videos the cheap pass could not clear.

That single decision — sample, do not watch — is the hinge the entire cost model turns on, and it is the same frame-thinning idea the computer-vision phase built: a two-minute clip at sixty frames a second holds 7,200 frames, nearly all of them near-duplicates of their neighbours, and you reduce that to a few dozen meaningful keyframes before any model sees it. The mechanics of that reduction are in the computer-vision applications lesson; here it is the move that makes a hundred thousand videos a day affordable to screen.
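As a rough illustration of the thinning step, the sketch below counts the frames in a clip and picks evenly spaced sample positions. Uniform sampling is only the simplest stand-in for real keyframe selection (which would also drop near-duplicates and react to scene changes), and the function names and the choice of twelve samples are assumptions made for this example.

```python
from typing import List


def total_frames(duration_s: float, fps: float) -> int:
    """Number of frames in a clip of the given length and frame rate."""
    return int(duration_s * fps)


def uniform_sample_indices(n_frames: int, n_samples: int) -> List[int]:
    """Pick one frame index from the middle of each of n_samples equal segments."""
    if n_frames <= 0 or n_samples <= 0:
        return []
    n_samples = min(n_samples, n_frames)  # never ask for more frames than exist
    segment = n_frames / n_samples
    return [int(segment * i + segment / 2) for i in range(n_samples)]


if __name__ == "__main__":
    frames = total_frames(duration_s=120, fps=60)  # the two-minute clip from the text
    picks = uniform_sample_indices(frames, n_samples=12)
    print(frames, "frames reduced to", len(picks), "samples:", picks)
```

Sampling from the middle of each segment avoids always landing on the first frame, which is often a title card or a black frame.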

The Production Architecture, Stage By Stage

A real deployment is more than a call to a moderation model. Five stages plus a governance floor show up in every UGC moderation pipeline we have scoped, and naming them precisely is the first hour of any project. They are ordered by the first spine rule: cheapest first.

The pipeline opens, before any judgement, with ingest and preparation — the unglamorous front door. An upload arrives, gets accepted to temporary storage, is transcoded into a form the downstream models can read, and is reduced to the two cheap representations the rest of the pipeline actually works on: a small set of sampled keyframes (the few dozen still images that capture what the video shows) and an audio transcript produced by automatic speech recognition — the same caption-making technology from the WhisperX lesson, here used to turn speech into searchable, scoreable text. From this point on, "moderating a video" mostly means moderating those keyframes and that transcript, which is why it is cheap.

Stage 0 is hash matching against databases of known-bad content, and it is both the cheapest stage and the only one the law effectively mandates. A hash is a short digital fingerprint of a file — a few hundred bytes that stand in for the whole image or video. A perceptual hash, specifically, is a fingerprint built so that two visually similar files produce similar fingerprints, so a match survives re-encoding, cropping, or a watermark, which an exact-copy fingerprint would not. The pipeline computes the perceptual hash of each incoming video and its keyframes and compares them against curated databases of fingerprints of already-known illegal material — most importantly known child sexual abuse material (CSAM) catalogued by child-protection bodies. The canonical image system is PhotoDNA, created by Microsoft and licensed to platforms; the canonical open video-and-image system is Meta's PDQ (for photos) and TMK+PDQF (for video), open-sourced in 2019 and built to run at platform scale. A match here is not a guess — it means this exact known-bad thing is being uploaded again — so it can act automatically: block the upload, file the legally required report, and preserve the evidence. The crucial limit, which the accuracy section returns to, is that hash matching only ever catches content that is already known; it is blind to anything new, and the explosion of AI-generated abuse material — which by its nature has no prior fingerprint — is exactly why the later stages exist.
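The comparison step can be sketched without any real hash library. The example below assumes fingerprints are already computed and represented as 256-bit integers, and that "similar" means a small Hamming distance (few differing bits). The width, the threshold of 31 bits, and all names are illustrative assumptions; computing real PhotoDNA or PDQ fingerprints and obtaining hash sets from the authorities is outside this sketch.

```python
from typing import Iterable, Optional

HASH_BITS = 256    # assumed fingerprint width for this sketch
MAX_DISTANCE = 31  # assumed match threshold; follow your hash provider's guidance


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_match(candidate: int, known_bad: Iterable[int],
               max_distance: int = MAX_DISTANCE) -> Optional[int]:
    """Return the closest known fingerprint within max_distance, or None."""
    best, best_dist = None, max_distance + 1
    for known in known_bad:
        dist = hamming_distance(candidate, known)
        if dist < best_dist:
            best, best_dist = known, dist
    return best


if __name__ == "__main__":
    known = [int("f0" * 32, 16)]    # made-up 256-bit fingerprint
    near_copy = known[0] ^ 0b111    # same fingerprint with three bits flipped
    unrelated = int("0f" * 32, 16)  # differs in every bit
    print("near copy matches:", find_match(near_copy, known) is not None)
    print("unrelated matches:", find_match(unrelated, known) is not None)
```

A linear scan like this is only for illustration; at platform scale the lookup would sit behind an index built for nearest-neighbour search over bit strings.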

Stage 1 is a bank of cheap automated classifiers, run on the sampled keyframes and the transcript, and it is where the bulk of the volume is settled. A classifier is a model that takes a piece of content and outputs scores for a fixed set of categories — here, the moderation categories: explicit sexual content, graphic violence, hate symbols, weapons, self-harm, and so on. You run an image classifier over the keyframes and a text classifier over the transcript, and most videos come back with every score comfortably low — a person cooking, a product demo, a dog — and are allowed on the spot, never touching a more expensive stage. The honest engineering reality is that this stage is bought far more often than built: the mature options are managed APIs such as Amazon Rekognition content moderation, Hive, Microsoft Azure AI Content Safety, Google Cloud SafeSearch, and Sightengine, while the self-hosted path uses open models such as NudeNet (which localises exposed body regions rather than just flagging a whole image) or a CLIP-based zero-shot classifier that scores a frame against text prompts like "this image contains a weapon." The single most important design choice here is the threshold: set the bar for "clearly safe" high and you pass too much through to the costly stages; set it low and illegal content slips past. Thresholds are set per category, not globally, for reasons the accuracy section makes concrete.
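One way to express per-category thresholds as code is a small routing function with two bars per category: below the first the category is clear, at or above the second the video is blocked, and anything in between is escalated. The category names, the numeric thresholds, and the fail-safe handling of unknown categories are all hypothetical values chosen for this sketch.

```python
from enum import Enum
from typing import Dict, Tuple


class Verdict(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"
    BLOCK = "block"


# Hypothetical thresholds per category: (clear_below, block_at_or_above)
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "explicit": (0.20, 0.95),
    "violence": (0.30, 0.97),
    "weapons": (0.40, 0.99),
}


def route(scores: Dict[str, float]) -> Verdict:
    """Combine per-category scores into one verdict; the strictest category wins."""
    verdict = Verdict.ALLOW
    for category, score in scores.items():
        bounds = THRESHOLDS.get(category)
        if bounds is None:
            verdict = Verdict.ESCALATE  # unknown category: fail safe, do not auto-allow
            continue
        clear_below, block_at = bounds
        if score >= block_at:
            return Verdict.BLOCK
        if score >= clear_below:
            verdict = Verdict.ESCALATE
    return verdict


if __name__ == "__main__":
    print(route({"explicit": 0.01, "violence": 0.02, "weapons": 0.10}))
    print(route({"explicit": 0.50, "violence": 0.02}))
    print(route({"violence": 0.98}))
```

Keeping the thresholds in a data table rather than in the control flow means a policy change is a reviewed configuration edit, not a code change.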

Stage 2 is a vision-language model reading the ambiguous middle, and it exists because a frame score cannot judge context. A vision-language model, or VLM, is a model that takes images and a written question together and answers in words — it can look at a clip and tell you not just that it contains a knife but that it is a chef boning a fish, not a threat; not just that it contains blood but that it is a wound-care tutorial, not gore for its own sake. Most of moderation's genuinely hard cases are exactly this kind of context call, and a VLM is the first tool in the pipeline that can make them without a human. You route only the small fraction of videos that the cheap classifiers flagged as uncertain — high enough to worry, not high enough to auto-block — to a VLM with a careful prompt that asks the specific policy question, and you keep its written explanation as part of the record. The 2026 field spans managed frontier models (Google Gemini, Anthropic Claude, OpenAI's multimodal models) and open weights you can self-host (LLaVA, Qwen-VL), the subject of the open-frontier VLM lesson. The decision of when a VLM should replace a custom classifier entirely is its own topic, covered in the "just use a VLM" lesson.

Stage 3 is the human review queue, and it is both the most expensive stage and the one you can never fully remove. Some decisions are too consequential, too contextual, or too legally fraught to leave to a model — a borderline case that could be art or could be abuse, an appeal from a user whose video was removed, a class of content your own policy says a human must sign off. Those land in a queue where a trained reviewer makes the final call. This stage is where the second half of the governance section lives, because the people doing this work are exposed, by the nature of the job, to the worst of what the pipeline catches, and treating their wellbeing as an engineering requirement rather than a cost line is both the right thing and, increasingly, a legal and contractual one.

Running underneath all four stages is the governance and audit floor — the decision log, the mandatory-reporting connectors, the minor-protection controls, the data-retention rules, and the appeals path. For a system that sits in judgement over what a hundred thousand strangers a day are allowed to say, that floor is not optional infrastructure; it is what makes the pipeline lawful to operate at all.
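A decision log of this kind can be sketched as an append-only list in which each entry carries the hash of the one before it, so that a later edit to any record is detectable. The hash chaining, the field names, and the in-memory storage are assumptions made for illustration; a production store would be durable and access-controlled, and its retention rules would come from legal counsel.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class Decision:
    upload_id: str
    stage: str      # e.g. "stage0", "stage1"
    verdict: str    # e.g. "allow", "block", "escalate"
    reason: str     # what triggered the verdict
    evidence: dict  # scores, matched hash distance, model explanation, ...
    timestamp: str  # ISO 8601 string supplied by the caller


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log; each entry is chained to the previous entry's hash."""

    def __init__(self) -> None:
        self._entries: List[dict] = []

    def append(self, decision: Decision) -> str:
        prev = self._entries[-1]["entry_hash"] if self._entries else "0" * 64
        body = {"prev_hash": prev, **asdict(decision)}
        entry_hash = _digest(body)
        self._entries.append({**body, "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; False if any entry was altered or reordered."""
        prev = "0" * 64
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev_hash"] != prev or _digest(body) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True
```

The point of the sketch is the shape of the record: every stage writes a verdict together with its reason and evidence, so the trail has no holes when a regulator or an appealing user asks why.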

The production architecture drawn left to right as a horizontal pipeline with a governance band running underneath the whole length. On the left an upload enters an ingest-and-prepare block that produces two cheap representations, sampled keyframes and an audio transcript. These flow into stage zero, hash matching against known-bad databases, which forks off a block-and-report action for any match. What is not matched flows into stage one, a bank of cheap classifiers scoring frames and transcript, which forks off an allow action for the large majority that score clearly safe. The uncertain remainder flows into stage two, a vision-language model that judges context, which can allow, block, or pass on. The residue flows into stage three, a human review queue, which makes the final allow-or-block call. Underneath all stages runs a governance band reading audit log of every decision and reason, mandatory reporting and evidence preservation, protection of minors, data retention, and appeals. A footer line reads cheapest check first; every decision logged, every illegal item reported.

Figure 2. The pipeline stage by stage. Each stage either resolves the video (allow, block, or report) or passes the uncertain remainder to the next, more expensive stage; the governance floor records every decision and carries the legal obligations that removal alone does not satisfy.

Build Versus Buy: The 2026 Verdict, Component By Component

A capable team does not write all of this from scratch, and does not buy all of it either. The rule of thumb mirrors the other capstones in this course: adopt the mature infrastructure, rent or self-host the fast-moving models, and build only the parts that are your actual product — here, the routing logic that decides what goes to which stage and at what threshold, the decision store and audit trail, the reporting and governance layer, and the review tooling your human moderators use. Those are the parts that make the pipeline accurate, lawful, and humane; everything else is bought or borrowed.

| Component | Build or buy | Concrete 2026 choice | Why |
|---|---|---|---|
| CSAM / known-bad hash matching | BUY / PARTNER | PhotoDNA; Meta PDQ + TMK+PDQF; NCMEC & IWF hash sets | You must not build or hold this yourself; you connect to the authorities' systems |
| Cheap classifiers (NSFW/violence) | BUY or HOST | Rekognition / Hive / Azure Content Safety / open NudeNet / CLIP | Mature, commodity; swap freely on price and accuracy |
| Audio transcription for moderation | BUY / HOST | Whisper-class ASR | Solved; reused from the audio phase |
| VLM contextual review | BUY / HOST | Gemini / Claude / GPT, or open LLaVA / Qwen-VL | Fastest-moving part; rent and keep swappable |
| Frame sampling / keyframe selection | BUILD | Downsample + keyframe dedup | Standard CV; small, owned, cheap |
| Routing + thresholds + escalation | BUILD | Your policy expressed as code | This is your moderation product |
| Decision store + audit trail | BUILD | Append-only log keyed to every upload | Your legal evidence; cannot delegate |
| Reporting + retention connectors | BUILD / PARTNER | NCMEC CyberTipline; preservation store | A statutory duty; build to the spec, do not improvise |
| Human review tooling + wellbeing | BUILD or PARTNER | In-house console or a T&S vendor | The hardest part to do humanely; never an afterthought |

The pattern is the same one that runs through the whole course: rent the models, buy the solved infrastructure, and own the parts that encode your policy, your evidence, and your duty of care. The classifiers and VLMs will be replaced; the routing logic, the audit trail, and the governance layer are what you keep.

The 2026 Tool Field — And Why You Abstract It

Two stages of the pipeline lean on outside tools that move fast and must be treated as swappable parts rather than fixed choices: the cheap classifier bank and the VLM reviewer. Wire each behind a thin internal interface — a single classify(frames, transcript) call and a single review(clip, question) call whose innards you can change without touching the rest of the pipeline — and replacing a vendor becomes a configuration change instead of a rewrite. This matters because the field genuinely churns: prices move, accuracy leaders change, and a model you depend on can be deprecated out from under you, exactly as embedding models were in the previous capstone.
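The thin interface can be sketched with two structural types and a pipeline that depends only on them. Everything here is hypothetical: the stub classes stand in for real vendor adapters, the verdict strings and the `uncertain_at` cut-off are invented for the example, and no real vendor SDK is called.

```python
from typing import Dict, Protocol, Sequence, Tuple


class Classifier(Protocol):
    def classify(self, frames: Sequence[bytes], transcript: str) -> Dict[str, float]:
        """Return a score per moderation category."""
        ...


class Reviewer(Protocol):
    def review(self, clip: Sequence[bytes], question: str) -> Tuple[str, str]:
        """Return (verdict, written explanation)."""
        ...


class KeywordStubClassifier:
    """Stand-in for a vendor adapter: scores the transcript on one keyword."""

    def classify(self, frames, transcript):
        return {"weapons": 0.9 if "knife" in transcript.lower() else 0.0}


class EscalatingStubReviewer:
    """Stand-in for a VLM adapter: always hands the case to a human."""

    def review(self, clip, question):
        return ("escalate", "stub reviewer: no model configured")


class Pipeline:
    def __init__(self, classifier: Classifier, reviewer: Reviewer,
                 uncertain_at: float = 0.5) -> None:
        self.classifier = classifier
        self.reviewer = reviewer
        self.uncertain_at = uncertain_at  # assumed cut-off for "needs a second look"

    def screen(self, frames: Sequence[bytes], transcript: str) -> Tuple[str, str]:
        scores = self.classifier.classify(frames, transcript)
        if max(scores.values(), default=0.0) < self.uncertain_at:
            return ("allow", "all classifier scores below uncertain_at")
        return self.reviewer.review(frames, "Does this clip break the weapons policy?")
```

Because `Pipeline` only knows the two method signatures, replacing either stub with a real adapter changes the constructor arguments and nothing else.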

On the classifier side, the managed APIs cluster around a familiar price band in 2026. Azure AI Content Safety lists images at roughly $1.50 per 1,000 analysed (about $0.0015 an image) and text at about $1.00 per 1,000 records. Amazon Rekognition content moderation lists at about $0.10 a minute for video analysed whole — which is exactly why you feed it sampled frames through its image API rather than whole videos. Google Cloud SafeSearch is free for the first thousand calls a month, then about $1.50 per 1,000, dropping to $0.60 per 1,000 above five million. Hive is custom-quoted, in the region of $3 per 1,000 images at a million images a month. Sightengine sells operation bundles from about $29 a month. The open self-hosted route — NudeNet, a CLIP-based scorer, a small fine-tuned vision model — trades a per-call fee for the fixed cost of running your own GPUs, which wins above a volume the cost section locates.
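To compare a flat price with a tiered one at this article's volume, a small calculator helps. The sketch below reads the quoted SafeSearch tiers as "first 1,000 free, $1.50 per 1,000 up to five million, $0.60 per 1,000 beyond", which is an assumed interpretation of the tier boundaries; the 30-day month and twelve frames per video are also assumptions. Check current vendor price sheets before relying on any figure.

```python
from typing import List, Optional, Tuple

Tier = Tuple[Optional[int], float]  # (upper bound in units or None, price per 1,000)


def tiered_monthly_cost(units: int, tiers: List[Tier]) -> float:
    """Price `units` calls against ascending tiers; None means 'no upper bound'."""
    cost, lower = 0.0, 0
    for upper, price_per_1000 in tiers:
        if units <= lower:
            break
        top = units if upper is None else min(units, upper)
        cost += (top - lower) / 1000 * price_per_1000
        lower = top
    return cost


SAFESEARCH_TIERS: List[Tier] = [(1_000, 0.0), (5_000_000, 1.50), (None, 0.60)]
FLAT_150: List[Tier] = [(None, 1.50)]  # a flat $1.50 per 1,000 images

if __name__ == "__main__":
    frames_per_month = 100_000 * 12 * 30  # videos/day x frames/video x days
    print(f"tiered: ${tiered_monthly_cost(frames_per_month, SAFESEARCH_TIERS):,.2f}")
    print(f"flat:   ${tiered_monthly_cost(frames_per_month, FLAT_150):,.2f}")
```

At tens of millions of frames a month, most of the volume lands in the cheapest tier, which is why the headline per-1,000 price alone is a poor basis for comparing vendors.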

On the VLM side, the two-cents-a-call frontier models and the self-hostable open models sit at very different points on the cost-and-control curve, and the right answer is usually both: a cheap open VLM for the routine context calls, a frontier model reserved for the genuinely hard or high-stakes ones. The whole point of the interface is that this routing is a tuning decision, not an architectural one.

There is one tool category you do not shop for on price, and the diagram makes the line explicit: the known-bad hash systems. You do not build your own CSAM detector, you do not assemble your own library of the material, and you do not treat this as a vendor bake-off. You connect to the established systems — PhotoDNA, the hash sets maintained by the National Center for Missing & Exploited Children (NCMEC) in the United States and the Internet Watch Foundation (IWF) in the United Kingdom, and the cross-industry sharing programs such as the Tech Coalition's Lantern and Video Hash Interoperability Project — and you route matches to the legal reporting channel. This is the one stage where "build" is the wrong answer for reasons that are legal and ethical, not technical.

The 2026 tool field drawn as two cards side by side above a separated, fenced-off third panel. The left card is the cheap classifier bank, labelled the commodity swap, listing managed options Azure Content Safety at about a tenth of a cent per image, Amazon Rekognition, Hive, Google SafeSearch, and Sightengine, and open options NudeNet and a CLIP-based scorer, with a note that these swap freely on price and accuracy. The right card is the VLM reviewer, labelled the fast-moving swap, listing managed Gemini, Claude, and GPT and open LLaVA and Qwen-VL, with a note to route routine context to a cheap open model and reserve a frontier model for the hard cases. Below both, set apart by a heavier border and a different colour, is a third panel labelled known-bad hash systems, do not shop on price, listing PhotoDNA, NCMEC and IWF hash sets, Meta PDQ and TMK, and the Tech Coalition Lantern and Video Hash Interoperability Project, with the note you connect to these and report matches; you never build or hold this material yourself. A footer line reads abstract the classifiers and the VLM behind one interface each; the hash systems are a partnership, not a purchase.

Figure 3. Two of the three tool categories are commodities you abstract behind a thin interface and swap on price and accuracy. The third — the known-bad hash systems — is a partnership with child-protection authorities, fenced off because it is governed by law and ethics, not procurement.

The Cost Model: Why The Funnel Beats The Obvious Design By Roughly Ten To One

The pipeline's whole economic argument fits in one comparison, and it is worth doing the arithmetic out loud because the result decides whether the product is viable. All figures are illustrative 2026 list prices; your real numbers will move with your contracts and your content mix, but the shape is robust.

Start with the naive design from the pitfall above — send every video, in full, to a video-moderation API:

200,000 video-minutes/day  ×  $0.10 / minute  =  $20,000 / day  ≈  $7.3M / year

Now price the funnel, stage by stage, against the same hundred thousand videos a day. Stage 0, hash matching, is a hash computation plus a database lookup — self-hosted, it is a fraction of a cent per video; budget a generous $0.0005 × 100,000 = $50/day. Stage 1, the cheap classifiers, runs not on minutes but on the sampled keyframes — say a dozen frames per video — plus the short transcript. At Azure's $0.0015 an image, twelve frames is $0.018 a video, so screening the whole day is $0.018 × 100,000 = $1,800/day. This stage clears the large majority of uploads on its own. Stage 2, the VLM, sees only the uncertain remainder — suppose a deliberately conservative seven percent of videos, or 7,000 a day — at roughly two cents each: $0.02 × 7,000 = $140/day. Stage 3, human review, reaches only the hardest residue — say one percent, or 1,000 videos a day — at a loaded cost of roughly a quarter-dollar per review (the review-economics section unpacks that figure): $0.25 × 1,000 = $250/day.

Stage 0  hash match      $50/day
Stage 1  classifiers  $1,800/day
Stage 2  VLM            $140/day
Stage 3  humans         $250/day
-------------------------------------
Funnel total        ≈ $2,240/day  ≈  $818K / year

The funnel costs roughly $0.8 million a year against the naive design's $7.3 million — about a ninefold saving — and it is more accurate, not less, because the expensive stages spend their attention only on the content that actually needs it instead of being diluted across millions of minutes of someone's lunch. The lever that produces the saving is the percentage of traffic each stage passes upward: push Stage 1's clear-rate up by tuning its thresholds and the whole bill falls; let it leak and Stages 2 and 3 swell. That is why threshold tuning, not model choice, is where a moderation team actually spends its time. The wider menu of levers — batching, caching, cheaper model tiers, reserved capacity — is the subject of the cost-optimization lesson.
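The same arithmetic, written as a function, makes the sensitivity to escalation rates easy to explore. The baseline numbers are the illustrative figures used in this section; the "leaky" scenario that doubles both escalation shares is a made-up what-if, and the parameter names are assumptions of this sketch.

```python
DAYS_PER_YEAR = 365


def funnel_daily_costs(videos_per_day, hash_cost, frames_per_video, price_per_frame,
                       vlm_share, vlm_cost, human_share, human_cost):
    """Daily cost of each stage, in dollars, keyed by stage name."""
    return {
        "hash match": videos_per_day * hash_cost,
        "classifiers": videos_per_day * frames_per_video * price_per_frame,
        "VLM": videos_per_day * vlm_share * vlm_cost,
        "humans": videos_per_day * human_share * human_cost,
    }


BASELINE = dict(videos_per_day=100_000, hash_cost=0.0005, frames_per_video=12,
                price_per_frame=0.0015, vlm_share=0.07, vlm_cost=0.02,
                human_share=0.01, human_cost=0.25)

if __name__ == "__main__":
    naive_daily = 200_000 * 0.10  # video-minutes per day x price per minute
    base_daily = sum(funnel_daily_costs(**BASELINE).values())
    leaky = {**BASELINE, "vlm_share": 0.14, "human_share": 0.02}  # hypothetical what-if
    leaky_daily = sum(funnel_daily_costs(**leaky).values())
    for name, daily in (("naive", naive_daily), ("funnel", base_daily), ("leaky", leaky_daily)):
        print(f"{name:>6}: ${daily:,.0f}/day  ${daily * DAYS_PER_YEAR:,.0f}/year")
```

Changing `frames_per_video` or the two escalation shares shows quickly which lever dominates the bill under a given content mix.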

The cost model drawn as two horizontal bars stacked above a stage-by-stage breakdown. The top bar, long and orange, is labelled naive design, every video to a video API, and reads about seven point three million dollars a year. The second bar, short and green, is labelled the funnel and reads about zero point eight million dollars a year, visibly about a ninth the length. Below the bars a small stacked breakdown shows the funnel's four stages with their daily costs: hash matching fifty dollars, classifiers one thousand eight hundred dollars, the vision-language model one hundred forty dollars, and human review two hundred fifty dollars, summing to about two thousand two hundred forty dollars a day. A caption reads the saving comes from the share of traffic each cheap stage clears on its own; the costly stages only ever see what truly needs them. A footer line reads moderate by sampled frame and transcript, not by the minute.

Figure 4. The funnel costs roughly a ninth of the obvious design and is more accurate. The saving is produced almost entirely by how much volume the cheap stages clear before anything expensive runs.

The Accuracy Strategy: Two Errors That Are Not Equal

Every moderation decision can be wrong in two directions, and the central insight of the whole field is that the two are not equally bad. A false negative is letting through something that should have been caught. A false positive is removing something that should have been allowed. A naive system tunes one knob to balance them as if they were symmetric. A serious system refuses the premise, because the cost of the two errors depends entirely on what was missed or wrongly removed.

Consider the extremes. Missing a piece of illegal child sexual abuse material is a catastrophe — for the child, for the platform, and for the executives who can be held personally liable; there is essentially no acceptable rate of it, which is exactly why that class is handled by deterministic hash matching plus mandatory human confirmation rather than by a probabilistic classifier with a tunable threshold. Wrongly removing a legal cooking video, on the other hand, is an annoyance you fix with an appeal. The two errors are separated by many orders of magnitude in cost, and your thresholds must reflect that. So you do not set one global sensitivity; you set a threshold per harm category, tuned to that category's error costs — extremely aggressive (escalate on the faintest signal) for the gravest, legally-defined harms, and relaxed for low-stakes policy categories where a false positive merely irritates and an appeal sets it right.
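For the classifier-scored categories, asymmetric error costs can be turned into a threshold by picking the cut-off that minimises total cost on a labelled validation set. The sketch below is one simple way to do that; the validation scores, the candidate thresholds, and the cost weights are all invented for illustration, and it does not apply to the hash-matched class, which the text handles deterministically.

```python
from typing import List, Sequence, Tuple


def best_threshold(scored: Sequence[Tuple[float, bool]], fn_cost: float,
                   fp_cost: float, candidates: List[float]) -> float:
    """Pick the flagging threshold with the lowest total error cost.

    scored: (classifier score, is_actually_harmful) pairs from a labelled set.
    An item is flagged when its score is at or above the threshold.
    """
    def total_cost(threshold: float) -> float:
        false_negatives = sum(1 for s, bad in scored if bad and s < threshold)
        false_positives = sum(1 for s, bad in scored if not bad and s >= threshold)
        return false_negatives * fn_cost + false_positives * fp_cost

    return min(candidates, key=total_cost)


# Made-up validation data: (score, harmful?)
VALIDATION = [(0.1, False), (0.2, False), (0.35, True), (0.4, False),
              (0.6, False), (0.7, True), (0.9, True)]
CANDIDATES = [0.1, 0.3, 0.5, 0.7, 0.9]

if __name__ == "__main__":
    grave = best_threshold(VALIDATION, fn_cost=1000, fp_cost=1, candidates=CANDIDATES)
    mild = best_threshold(VALIDATION, fn_cost=1, fp_cost=1, candidates=CANDIDATES)
    print("grave category threshold:", grave)
    print("low-stakes category threshold:", mild)
```

With a miss weighted a thousand times heavier than a wrongful flag, the chosen threshold drops, so the grave category flags on much fainter signals than the low-stakes one.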

The honest map of where this pipeline fails has three regions, and naming them is how you design around them.

The first is the known-versus-novel gap. Hash matching is highly reliable on content it has seen and blind on content it has not. It will never catch a brand-new video that no database has fingerprinted, and the surge of AI-generated abuse material, which by construction has no prior hash, lives entirely in this gap. The fix is not to abandon hashing but to layer the classifiers and the VLM behind it so that novel harmful content has a second and third chance to be caught on its content rather than its fingerprint, and to feed confirmed novel cases back to the authorities who maintain the hash sets so that tomorrow they are known.

The second is the context trap. A frame-level classifier sees a knife, or skin, or blood, and cannot tell a crime from a cooking show, a swimming lesson, or a surgical tutorial. Lean on classifiers alone and you either over-block (removing the swimming lesson) or under-block (passing the genuine harm that happens to look mild frame by frame). The fix is the VLM stage, which judges the scene rather than the still, and the human stage behind it for the cases even the VLM cannot settle.

The third is adversarial evasion. People who want to get harmful content through actively try to defeat the pipeline. They re-encode the file to break exact hashes, which perceptual hashing is designed to survive; they overlay noise; and they hide the payload in a few seconds of an otherwise innocuous clip, which is countered by sampling frames densely enough and by analysing the audio. You will never close this region completely. You treat moderation as an adversarial, evolving system, measure it continuously against a held-out set of known-answer cases, and accept that it is a permanent contest rather than a problem you solve once.
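
To see why perceptual hashing survives re-encoding where exact hashing does not, consider matching by Hamming distance instead of equality. This is a toy sketch: the 64-bit values and the distance limit of 8 are made up for illustration and are not the parameters of PDQ, TMK+PDQF, or PhotoDNA.

```python
from typing import Iterable


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two equal-length hashes differ."""
    return bin(a ^ b).count("1")


def matches_known(candidate: int, known_hashes: Iterable[int], max_distance: int) -> bool:
    """True if the candidate is within max_distance bits of any known hash."""
    return any(hamming_distance(candidate, k) <= max_distance for k in known_hashes)


KNOWN = [0xF0F0F0F0F0F0F0F0]      # toy 64-bit perceptual hash of a known clip
reencoded = KNOWN[0] ^ 0b101      # the same clip after re-encoding: two bits flipped
unrelated = 0x0F0F0F0F0F0F0F0F    # a different clip: every bit differs
```

An exact comparison misses `reencoded` because the value changed, while `matches_known(reencoded, KNOWN, max_distance=8)` still matches and `unrelated` does not. Choosing the distance limit is the real engineering decision: too tight and re-encodes slip through, too loose and unrelated clips collide.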

The accuracy strategy drawn as two panels. The left panel is a two-by-two grid of the two error types against two example harm classes. It shows that a false negative on illegal abuse material is catastrophic while a false negative on a policy nuisance is minor, that a false positive on a legal video is fixed by an appeal, and that thresholds must therefore be set per category rather than globally; the gravest category is marked as handled by deterministic hash plus human confirmation, not a tunable score. The right panel lists the three failure regions as stacked cards: the known-versus-novel gap, where hash matching is blind to anything new including AI-generated material, fixed by layering classifiers and a VLM behind the hash and feeding novel cases back to the authorities; the context trap, where a frame score cannot tell a crime from a cooking show, fixed by the VLM and human stages judging the scene; and adversarial evasion, where bad actors re-encode and hide payloads, fixed by perceptual hashing, dense sampling, audio analysis, and continuous measurement. A footer line reads: "tune thresholds per harm class; the two errors are not equal."

Figure 5. The two errors are not symmetric, so thresholds are set per harm category, and the gravest harms are handled deterministically rather than by a tunable score. Three failure regions — novel content, context, and adversarial evasion — are designed around, not wished away.

Governance: The Law, The Minors, And The Humans In The Loop

A UGC moderation pipeline is one of the most heavily regulated systems an engineer can build, and the governance floor under it has three load-bearing pillars. None of what follows is legal advice — it is the engineering context a competent team must understand before involving the lawyers who give the actual advice.

The first pillar is mandatory reporting and preservation of illegal content. In the United States, federal law (18 U.S.C. § 2258A) requires electronic service providers to report apparent child sexual abuse material to the NCMEC CyberTipline once they become aware of it, and to preserve the associated content and records rather than delete them; the preservation window was recently extended from 90 days to a full year. The scale this operates at is not abstract: the CyberTipline received 20.5 million reports in 2024, referencing nearly 63 million files, and the number of reports involving generative-AI imagery rose by 1,325 percent in a single year. The engineering consequence is concrete and easy to get fatally wrong: when Stage 0 matches known CSAM, the correct action is block, report, and preserve to the legally mandated store. It is never a quiet delete, which destroys the evidence a child-protection investigation depends on and can itself be an offence. You build this path to the published specification rather than improvising it, and the decision of what is reportable is one you make with counsel, not in code review.
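
The shape of the block, report, and preserve path can be sketched as follows. This is not a reporting integration: the `report_queue` and `evidence_store` arguments are in-memory stand-ins for systems you would build to the published specification, and the 365-day constant is an assumption taken from the one-year window described above, to be confirmed with counsel.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

# Assumption: one year, matching the window described above. Confirm with counsel.
PRESERVATION_WINDOW = timedelta(days=365)


@dataclass
class Stage0Outcome:
    upload_id: str
    published: bool
    preserve_until: datetime
    audit_trail: List[str] = field(default_factory=list)


def handle_known_match(
    upload_id: str,
    report_queue: List[str],
    evidence_store: Dict[str, datetime],
    now: Optional[datetime] = None,
) -> Stage0Outcome:
    """Block, report, preserve. There is deliberately no delete step."""
    now = now or datetime.now(timezone.utc)
    stamp = now.isoformat()

    # Block: the upload never becomes visible to other users.
    trail = [f"{stamp} blocked {upload_id}"]

    # Report: hand off to the reporting workflow (human confirmation, then filing).
    report_queue.append(upload_id)
    trail.append(f"{stamp} queued {upload_id} for report")

    # Preserve: record the retention deadline in the restricted evidence store.
    preserve_until = now + PRESERVATION_WINDOW
    evidence_store[upload_id] = preserve_until
    trail.append(f"{stamp} preserving {upload_id} until {preserve_until.isoformat()}")

    return Stage0Outcome(upload_id, False, preserve_until, trail)
```

The design choice worth copying is the absence of any code path that deletes the content: removal from public view and removal from storage are separate operations, and only the first one happens here.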

The second pillar is transparency and the protection of minors under the new platform laws. The European Union's Digital Services Act (DSA) requires that platforms give users a clear statement of reasons for moderation decisions and feed those statements into a public transparency database, and it backs this with penalties of up to six percent of global annual turnover for the largest platforms — those above 45 million monthly EU users. Its Article 28 imposes specific duties to protect minors, with Commission guidelines published in 2025. The United Kingdom's Online Safety Act, whose illegal-harms duties became enforceable in March 2025 and under which the regulator Ofcom had by 2026 opened more than twenty investigations, requires in-scope services to assess the risk of illegal content and act against it. And the EU AI Act's Article 50, applying from August 2026, requires AI-generated and manipulated media to be marked as such — which intersects this pipeline directly, because a moderation system increasingly has to reason about whether what it is looking at is real, a question taken up in the C2PA and AI-disclosure lesson. The engineering consequence is that the audit trail and the statement-of-reasons generator are not optional features; they are how you stay on the right side of laws with turnover-scale penalties.
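
One way to make the statement of reasons a by-product of the decision rather than a separate chore is to capture it as a structured record at decision time. The field names below are hypothetical and are not the schema of the EU transparency database; treat this as a sketch of the idea, with the actual required fields settled with counsel.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StatementOfReasons:
    upload_id: str
    decision: str              # e.g. "removed" or "restricted"
    harm_category: str
    ground: str                # "illegal_content" or "terms_of_service"
    automated_detection: bool  # was the content found by automated means?
    automated_decision: bool   # was the decision itself automated?
    facts: str                 # what was observed, in plain language
    redress: str               # how the user can contest the decision


def render_notice(s: StatementOfReasons) -> str:
    """Plain-language notice shown to the uploader."""
    decided_by = "automated tools" if s.automated_decision else "a human reviewer"
    return (
        f"Your upload {s.upload_id} was {s.decision} under our "
        f"{s.harm_category} policy. Reason: {s.facts} "
        f"The decision was made by {decided_by}. You can {s.redress}."
    )


def to_audit_record(s: StatementOfReasons) -> str:
    """Stable JSON form for the audit log."""
    return json.dumps(asdict(s), sort_keys=True)
```

Because the user-facing notice and the audit record are rendered from the same object, they cannot drift apart, which is what an auditor will check first.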

The third pillar is the one teams forget until it becomes a scandal: the wellbeing of the human reviewers. The people in Stage 3 are exposed, as a condition of the job, to the worst content the pipeline surfaces, and the industry's record here is poor — outsourced moderators have been documented earning as little as a few dollars an hour, and surveys find that a large majority feel their employers do too little for their mental health. By 2026 this is no longer only an ethical matter: global labour bodies have proposed binding protocols — limits on daily exposure to traumatic material, the elimination of unrealistic speed quotas, sustained mental-health support, and living wages — and platforms are increasingly held to them by contract and reputation. The engineering consequences are real design requirements: blur and grey-scale previews by default so a reviewer is not ambushed by full-resolution horror, hard caps on how much of the gravest material any one person sees per shift, rotation away from the worst queues, and routing that uses the automated stages to keep as much of this content away from human eyes as possible. A pipeline that treats its moderators as a disposable cost line is both wrong and, increasingly, a legal and commercial liability.
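
Exposure caps and rotation are easier to honour when the queue enforces them than when a policy document merely asks for them. A minimal sketch, assuming a hypothetical `Reviewer` record and a per-shift cap whose value would come from your wellbeing policy rather than from this code:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Reviewer:
    name: str
    severe_seen_this_shift: int = 0


def assign_severe_item(reviewers: List[Reviewer], cap_per_shift: int) -> Optional[Reviewer]:
    """Give a severe item to the least-exposed reviewer still under the cap."""
    eligible = [r for r in reviewers if r.severe_seen_this_shift < cap_per_shift]
    if not eligible:
        return None  # everyone is at the cap: hold the item rather than exceed it
    chosen = min(eligible, key=lambda r: r.severe_seen_this_shift)
    chosen.severe_seen_this_shift += 1
    return chosen


def preview_settings(opted_in_to_full_view: bool = False) -> Dict[str, bool]:
    """Previews start blurred and greyscale; full view is an explicit choice."""
    return {"blur": not opted_in_to_full_view, "greyscale": not opted_in_to_full_view}
```

Picking the least-exposed eligible reviewer gives rotation for free, and returning `None` when everyone is at the cap forces the system to queue or reroute the item instead of silently breaking the limit.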

The three governance pillars drawn as three vertical cards of equal weight under a single heading. The first card, mandatory reporting and preservation, notes that United States law requires reporting apparent child sexual abuse material to the NCMEC CyberTipline and preserving it for a year, that the line received twenty point five million reports in 2024, and that the engineering rule is block, report, and preserve, never a quiet delete. The second card, transparency and minors, notes the European Union Digital Services Act requirement for a statement of reasons and a public transparency database backed by fines up to six percent of global turnover, the Article 28 duties to protect minors, the United Kingdom Online Safety Act illegal-harms duties enforceable since March 2025, and the EU AI Act Article 50 marking of AI-generated media from August 2026. The third card, reviewer wellbeing, notes documented low pay and poor mental-health support, proposed binding protocols on exposure limits and living wages, and the design requirements that follow: blurred previews by default, hard exposure caps per shift, rotation, and using automation to keep content away from human eyes. A footer line reads none of this is legal advice; it is the floor a competent team builds before the lawyers advise.

Figure 6. Three governance pillars carry the pipeline: the legal duty to report and preserve illegal content, the transparency-and-minors obligations of the new platform laws, and the wellbeing of the human reviewers — each a concrete engineering requirement, not a policy footnote.

The Build Order: Each Milestone Protects Users

Sequence the build so that every milestone ships something that protects users, rather than disappearing for a quarter to build the whole pipeline before anything runs. Five milestones, in order.

Milestone one is hash matching and mandatory reporting — Stage 0 and the legal floor under it, first, because it is both the cheapest stage and the one the law requires. Wire up the known-bad hash systems, the block-report-preserve action, and the audit log. This alone makes the platform meaningfully safer and legally defensible on day one.

Milestone two is the cheap classifier pass with conservative thresholds — Stage 1, set to escalate generously rather than to clear aggressively. At first you would rather send too much to review than too little; you tighten the thresholds as you gather evidence about where the classifiers are reliable. This is the stage that turns the pipeline from "catches the known" to "screens the new."

Milestone three is the human review console and queue — Stage 3 before Stage 2, deliberately, because you need a place for escalations to land, and a humane, well-instrumented review tool, before you start trusting a VLM to reduce the queue. Build the wellbeing features — blurred previews, exposure caps, rotation — into this console from its first version, not as a later retrofit.

Milestone four is the VLM context stage — Stage 2, inserted between the classifiers and the humans to shrink the human queue by resolving the context calls automatically. By now you have labelled data from the human stage to evaluate the VLM against, so you can prove it is helping rather than hoping.

Milestone five is scale, measurement, and the adversarial loop — the throughput target met at peak, a held-out evaluation set of known-answer cases run continuously, threshold tuning driven by measured error costs, and the feedback path that sends confirmed novel harms back to the authorities. This is the milestone that never quite finishes, because moderation is an evolving contest, not a shipped feature.
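
The held-out evaluation in milestone five can start as something very small: a list of known-answer cases and a function that reports, per category, how much harm was caught and how much benign content was wrongly flagged. The tuple layout and function name below are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Case = Tuple[str, bool, bool]  # (category, truly_harmful, pipeline_flagged)


def rates_by_category(cases: Iterable[Case]) -> Dict[str, Dict[str, Optional[float]]]:
    """Recall and false-positive rate per harm category on known-answer cases."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for category, harmful, flagged in cases:
        if harmful:
            key = "tp" if flagged else "fn"
        else:
            key = "fp" if flagged else "tn"
        counts[category][key] += 1

    report: Dict[str, Dict[str, Optional[float]]] = {}
    for category, c in counts.items():
        harmful_total = c["tp"] + c["fn"]
        benign_total = c["fp"] + c["tn"]
        report[category] = {
            # None means the set holds no cases of that kind for this category.
            "recall": c["tp"] / harmful_total if harmful_total else None,
            "false_positive_rate": c["fp"] / benign_total if benign_total else None,
        }
    return report
```

Run on every pipeline change and on a schedule, a report like this is what lets threshold tuning follow measured error costs per category instead of a single global accuracy number.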

Where Fora Soft Fits In

We build the products that generate exactly this problem. Across two decades of work on video conferencing, live streaming, OTT platforms, dating and social apps, e-learning, and video surveillance, the pattern has been the same: the moment a product lets users post video to other users, a moderation pipeline stops being optional. The teams that ship those products are usually meeting hash databases, classifier APIs, VLM review, mandatory-reporting law, and reviewer-wellbeing requirements for the first time, all at once, under a launch deadline. The engineering that recurs is the funnel itself: ordering the stages by cost, tuning the thresholds per harm class so the bill and the accuracy both hold, and wiring the audit-and-reporting floor that turns a content filter into a system a regulator will accept. The patterns in this capstone are the ones that come up when a live-shopping, dating, social, or community-video product has to be safe to launch in the EU, the UK, and the US at the same time.

References

  1. 18 U.S.C. § 2258A — Reporting requirements of providers (Legal Information Institute / U.S. Code). Standard / statute, tier 1. The federal obligation to report apparent CSAM to the NCMEC CyberTipline and preserve associated records. https://www.law.cornell.edu/uscode/text/18/2258A — supports the mandatory-reporting and preservation pillar.
  2. NCMEC — CyberTipline 2024 data (National Center for Missing & Exploited Children). Primary data, tier 2. 20.5 million reports and ~63 million files in 2024; the 1,325% rise in generative-AI-related reports. https://www.missingkids.org/cybertiplinedata — supports the scale figures and the AI-CSAM gap.
  3. Regulation (EU) 2022/2065 — Digital Services Act (EUR-Lex, official consolidated text). Standard / regulation, tier 1. Statement-of-reasons duty, transparency database, Article 28 protection of minors, penalties up to 6% of global turnover, VLOP 45M-user threshold. https://eur-lex.europa.eu/eli/reg/2022/2065/oj — supports the transparency-and-minors pillar.
  4. EU AI Act, Article 50 — Transparency obligations for certain AI systems (Regulation (EU) 2024/1689). Standard / regulation, tier 1. Marking of AI-generated and manipulated media, applying from 2 August 2026. https://artificialintelligenceact.eu/article/50/ — supports the AI-generated-content governance point.
  5. UK Online Safety Act 2023 and Ofcom Illegal Harms Codes of Practice (Ofcom). Standard / regulator guidance, tier 2. Illegal-harms duties enforceable from 17 March 2025; risk-assessment and safety-measure obligations. https://www.ofcom.org.uk/online-safety/illegal-and-harmful-content/statement-protecting-people-from-illegal-harms-online — supports the UK regulatory point.
  6. Hill, Dalins et al. — "PDQ & TMK + PDQF: A Test Drive of Facebook's Perceptual Hashing Algorithms" (arXiv 1912.07745). Academic, tier 5. The open photo (PDQ) and video (TMK+PDQF) perceptual-hash algorithms and their scale design. https://arxiv.org/abs/1912.07745 — supports the hash-matching stage.
  7. Meta — "Open-Sourcing Photo- and Video-Matching Technology to Make the Internet Safer" (Meta Newsroom, 2019). First-party engineering, tier 3. Release of PDQ and TMK+PDQF and the Hasher-Matcher-Actioner pattern. https://about.fb.com/news/2019/08/open-source-photo-video-matching/ — supports the open hash-matching tooling.
  8. Microsoft PhotoDNA (Microsoft). First-party / tool documentation, tier 3. The licensed perceptual-hash system for known CSAM used across major platforms. https://www.microsoft.com/en-us/photodna — supports Stage 0.
  9. The Tech Coalition — Lantern and the Video Hash Interoperability Project (VHIP) (technologycoalition.org). Industry program, tier 2. Cross-platform child-safety signal sharing; 784,000+ videos hashed via VHIP. https://www.technologycoalition.org/programs/lantern/ — supports the cross-industry hash-sharing point.
  10. Amazon Rekognition pricing (Amazon Web Services, 2026). Vendor pricing, tier 4. Video content moderation at ~$0.10/minute; image moderation. https://aws.amazon.com/rekognition/pricing/ — supports the cost model.
  11. Azure AI Content Safety pricing (Microsoft Azure, 2026). Vendor pricing, tier 4. ~$1.50 per 1,000 images and ~$1.00 per 1,000 text records. https://azure.microsoft.com/en-us/pricing/details/cognitive-services/content-safety/ — supports the classifier cost figures.
  12. TIME — "Global Safety Rules Aim to Protect AI's Most Traumatized Workers" (2025). Journalism, tier 5. Content-moderator wages, wellbeing, and the proposed global protocols. https://time.com/7295662/ai-workers-safety-rules/ — supports the reviewer-wellbeing pillar.
  13. Google Cloud Vision — SafeSearch detection and pricing (Google Cloud, 2026). Vendor documentation, tier 4. Tiered per-1,000-call pricing for explicit-content detection. https://cloud.google.com/vision/pricing — supports the classifier tool field.