Published 2026-06-02 · 23 min read · By Nikolay Sapunov, CEO at Fora Soft

Why This Matters

Almost every video product accumulates an archive it cannot afford to review by hand: a streaming service with a decade of back-catalog, a social app with a moderation queue that never empties, an e-learning library no one has tagged, a surveillance system with months of stored footage. The work of going through it — labelling, moderating, captioning, checking it against a new policy — is exactly the slow, repetitive job that an agent does well and that people do badly at scale. If you build streaming, OTT, conferencing, e-learning, telemedicine, or surveillance software, this lesson shows you what an async review agent is, why "no clock" changes the whole design, and where the cost and the durability traps hide. You do not have to build the pipeline yourself, but you do need to know enough to tell a sound design from a wasteful one — to ask whether the agent reads every frame or filters first, whether the job can survive a crash, and whether a human sees the cases that matter. This is the third and last of three applied agent lessons; it reuses the same skeleton as the video investigator agent and the meeting copilot agent, pointed this time at a standing archive rather than a live moment.

What An Async Review Agent Is — And Is Not

The fastest way to understand this agent is to put it next to its two siblings from the last two lessons, because all three run the same loop and the difference is entirely about when the work happens.

The meeting copilot from the previous lesson lives under a clock. It sits in a live call and has under a second to be useful before it talks over a human. The video investigator answers one question on demand — you ask "did anyone enter the loading bay after 22:00?" and it investigates and reports, reacting to a prompt. The async review agent is the third shape: it works through a standing pile of video on its own schedule, with no one waiting and no single question. Nobody types a prompt; the job is simply "review everything in this archive against this rubric, and tell me the verdict for each one." The word that matters is async, short for asynchronous — meaning the request and the result are separated in time. You hand the agent a million videos in the morning; it hands you a million verdicts that evening, or the next day.

It also helps to fix the boundary against a thing this is not: a plain batch script. A batch script is a dumb loop — "run this one fixed operation on every file" — and it has been around for decades. An async review agent is different because each item gets the full agent treatment: the agent plans how to look at this particular video, chooses which tools to call based on what it finds, and reasons about a verdict rather than applying one rule. A batch script transcodes every video to a smaller size; an agent watches each video, decides whether it breaks the new safety policy, and writes down why. The loop, the planning, and the tool use are what make it an agent — the same three primitives from the agent primitives lesson — and we keep coming back to them because they are the spine of every agent in this section.

What does it actually produce? For each video in the archive, a structured verdict: a tag set, a moderation decision, a chapter list, a compliance flag, a caption track — whatever the job asked for — together with the evidence that backs it (the timestamp, the frame, the transcript line) and a confidence score. The archive goes in raw; it comes out labelled, decided, and searchable.

A labeled diagram contrasting three applied agents that share one loop — a real-time meeting copilot that works under a one-second clock, a video investigator that answers one question on demand, and an async review agent that works through a standing archive on its own schedule with no clock — with the async review agent highlighted as the subject of this lesson Figure 1. Three applied agents, one loop. The copilot works under a clock; the investigator answers one question on demand; the async review agent works through a whole archive on its own schedule. This lesson is about the third.

The Review Loop, Run As A Fleet

Every agent runs a loop — perceive, reason, act, observe — and we drew that loop in the agent-loop lesson. The async review agent runs exactly that loop, with one twist that defines the whole pattern: the loop runs per video, and many copies of it run at the same time, pulling work from a shared list.

Picture the archive as a long queue — a waiting line of videos, each one a job ticket. A queue here just means a list of work to be done, in order, that many workers can draw from. A pool of workers — each one a copy of the agent, running on its own machine or process — pulls the next ticket off the queue, runs the full loop on that one video, writes the verdict, and reaches for the next ticket. Ten workers chew through the line ten times faster than one; a thousand workers, a thousand times faster. This is the fan-out pattern: spread one big job across many parallel workers. It is how you review a million videos overnight instead of over a year.

Walk one ticket slowly and the loop becomes concrete. A worker pulls video #438,201. It perceives by sampling the video cheaply — pulling a transcript and a handful of still frames rather than every frame (we will see why this matters for cost in a moment). It reasons about what it sees against the job's rubric: does this clip contain the thing we are looking for? It plans the next step — "the audio mentions a product claim; pull the three frames around that timestamp and read them closely." It acts by calling a tool, the vision-language reader, on those frames. The result lands back as an observation. The agent reasons once more, reaches a verdict with a confidence score, and writes it to the catalog. Ticket done; next ticket.

Two pieces make the fleet trustworthy rather than chaotic. The first is a checkpoint store — a durable record of which tickets are done, which are in flight, and which failed — so the fleet always knows where it is. The second is an escalation gate at the end: most verdicts the agent is confident about are written straight to the catalog, but the uncertain or severe ones are routed to a human review queue instead of being auto-decided. The agent proposes; for the cases that matter, a person disposes. That gate is the same human-in-the-loop principle that ended every diagram in the last two lessons, and it is non-negotiable here too.

A diagram of the async review loop run as a fleet, showing an archive feeding a work queue, a pool of parallel worker agents each running the perceive-reason-act-observe loop on one video, a durable checkpoint store tracking which items are done in-flight or failed, results flowing to a catalog, and an escalation gate routing low-confidence or severe verdicts to a human review queue Figure 2. The review loop, run as a fleet. The archive fills a queue; many worker agents each run the loop on one video; a checkpoint store remembers progress; confident verdicts go to the catalog and the rest escalate to people.

The Tool Belt — And The Funnel That Pays For It

An agent is only as capable as the tools you give it, and the single most important design choice in this whole lesson is the order you arrange those tools in. The order is a funnel — cheap and fast at the wide top, expensive and smart at the narrow bottom — and getting it right is the difference between an archive job that costs a few hundred dollars and one that costs a hundred thousand.

Here is the mistake that funnel prevents. The obvious way to review a video with AI is to send every frame to a vision-language model — a model that can look at an image and answer questions about it in words, which from here we will call the reader. The reader is the smartest tool on the belt and by far the most expensive. A single minute of video at thirty frames per second is 1,800 frames; an hour is 108,000. Running the reader on every frame of a large archive is how you turn a sensible idea into a budget catastrophe. We made the same point about per-frame cost in the video investigator lesson; in the archive pattern it is even sharper, because there is no single question to narrow the search — the agent has to consider everything.

So the tool belt is arranged as a funnel, widest and cheapest first. The first tool is a cheap filter: shot-change detection and a fast transcript that split the video into a handful of meaningful segments and throw away the dead air, the static title cards, the empty corridors. The second tool is a keyframe sampler: instead of every frame in a segment, it picks the few frames that actually represent it — one per shot, or one per second — because adjacent frames are nearly identical and the reader learns almost nothing from the ninetieth near-copy. We cover the sampling-versus-streaming trade-off in detail in the video VLMs lesson. Only after those two tools have done their narrowing does the third tool, the reader, run — and now it runs on a few dozen frames per video instead of a hundred thousand. The fourth tool is the policy or rubric judge — the step that turns the reader's description into the structured decision the job actually wants ("does this match category 4 of the moderation policy? yes, confidence 0.91"). The fifth tool is the catalog writer — a durable write into the database, search index, or content system that will hold the verdicts. And the sixth is the escalation router — the tool that, instead of writing a verdict, files the item into the human review queue when confidence is low or the stakes are high.

Tool What it does Cost Where in the funnel
Cheap filter Shot-change + transcript; drop dead air Very low Top — runs on everything
Keyframe sampler Pick the few frames that represent each shot Low Narrows what the reader sees
Reader (VLM) Describe the sampled frames in words High Bottom — runs on a handful
Policy / rubric judge Turn the description into a structured verdict Medium After the reader
Catalog writer Write the verdict + evidence durably Low On a confident verdict
Escalation router Send uncertain / severe items to people Low The human-in-the-loop exit

The funnel is the whole game. A common and expensive mistake is to skip it — to point the reader at raw video "because the model can handle video now." It can; your invoice cannot. Build the cheap filter and the sampler first, prove they cut the volume reaching the reader by ten or a hundred times, and only then turn the expensive tool on.

A funnel diagram of the async review tool belt showing video frames entering at a wide top, passing through a cheap filter that drops dead air, a keyframe sampler that picks representative frames, a costly vision-language reader at the narrow middle, a policy judge that produces a structured verdict, and exits to a catalog writer or an escalation router, with annotations showing the volume shrinking from roughly one hundred thousand frames per hour at the top to a few dozen frames reaching the reader Figure 3. The tool belt as a funnel. Cheap, fast tools at the top drop most of the video; only a few dozen representative frames per video reach the expensive reader at the bottom. Skip the funnel and the reader's bill explodes.

The No-Clock Advantage — Why Async Saves Money

The previous lesson lived and died by a one-second budget. This lesson has the opposite freedom, and that freedom is worth real money. Because no one is waiting for the answer, the agent can use the cheapest, slowest, most patient way to call a model — and the model vendors charge dramatically less for patience.

The lever is the batch API — a way to submit a large pile of requests and get the results back later instead of instantly. All three major model providers offer one, and in 2026 they all carry the same headline: a 50% discount in exchange for accepting that results arrive within a day. OpenAI's Batch API takes a file of up to 50,000 requests and returns results within 24 hours at half the standard price. Anthropic's Message Batches API takes up to 100,000 requests or 256 MB per batch, returns within 24 hours (most finish in under an hour), keeps the results available for 29 days, and is also half price. Google's Gemini Batch Mode takes JSONL files up to 2 GB and is, again, 50% cheaper than the synchronous API. A real-time agent cannot touch any of these prices, because it cannot wait a day. An async agent can — so it gets the same intelligence for half the cost, simply by not being in a hurry.

Put the funnel and the batch discount together with a worked example, because the arithmetic is the point. Suppose you must review an archive of 100,000 videos, averaging ten minutes each. The naive design sends every frame to the reader: ten minutes at 30 fps is 18,000 frames, so the whole archive is 1.8 billion frames, and at any realistic reader price per frame that is an invoice in the millions. Now apply the funnel. The cheap filter and sampler cut each video down to, say, 40 representative frames — 18,000 down to 40 is a 450-fold reduction. That is 100,000 × 40 = 4 million frames reaching the reader instead of 1.8 billion. Then apply the batch discount: the half-price rate cuts that remaining bill in two. The funnel did the heavy lifting (roughly 450× less work), and the batch API halved what was left — together turning an impossible number into a planned line item. The exact dollar figure depends on the model and the year, which is why we keep the real cost of AI in video products as the living reference; the structure of the saving, though, does not change.

Real-time (last lesson) Async / batch (this lesson)
What's waiting A person, mid-conversation Nothing — results read later
Time budget Under ~1 second per turn Hours to a day for the whole job
Optimize for Latency Cost and completeness
Model pricing Full synchronous price Batch API — 50% off
Scale One call at a time 50,000–100,000 requests per batch
Main failure mode Lag, talking over people A crash losing a half-finished job

There is a third saving the no-clock setting unlocks, beyond batch pricing and the funnel: you can run the heavy work when compute is cheap. Cloud machines rented from the spare capacity that providers discount — called spot or preemptible instances — can be a fraction of the on-demand price, and a job with no deadline can happily use them, accepting that a worker might be reclaimed mid-task. That last part — "reclaimed mid-task" — is exactly why the next section matters, because cheap, interruptible compute is only safe if your job can survive being interrupted.

The Hard Part — Durability At Archive Scale

Here is the failure that defines this whole pattern. Your fleet is 80,000 videos into a 100,000-video job, twelve hours in, and a machine crashes — a spot instance is reclaimed, a network blips, a bad video crashes the reader. In a naive design, the job dies and you start over from video #1, losing twelve hours and twelve hours of money. The single most important engineering property of an async review agent is that this cannot happen. The job must resume at item 80,001, not item 1. The property has a name — durability, meaning the work survives failure — and it is built from four ideas worth knowing by name.

The first is checkpointing: writing down progress as you go, so the system always knows what is finished. Every time a worker completes a video, it records that verdict and marks the ticket done in a durable store before reaching for the next ticket. If the fleet dies, a fresh fleet reads the store, sees that 80,000 tickets are done, and picks up the remaining 20,000. Nothing already paid for is paid for twice.

The second is the idempotency key — a piece of jargon worth unpacking slowly. Idempotent means an operation gives the same result no matter how many times you run it; pressing a floor button in an elevator is idempotent — pressing it five times still sends you to that one floor. An idempotency key is a stable label for a unit of work so the system can recognize "I have already done this exact thing." Here the key is built from the video's ID plus the version of the model and policy being applied — for example video-438201 · reader-v3 · policy-2026-06. Before a worker processes a ticket, it checks whether that key already has a verdict; if it does, it skips it. This is what makes a re-run safe: without it, restarting the job would re-review and re-bill every video it had already done, turning a crash into a double charge.

The third is the retry with a dead-letter queue. Some videos will fail — a corrupt file, a format the reader chokes on, a transient timeout. The fleet retries a failure a few times, because most failures are temporary. But a video that fails every time — a poison item — must not wedge the whole line forever. So after a few attempts it is moved aside into a dead-letter queue: a separate holding pen for items that exhausted their retries, set aside for a human to look at later, while the fleet moves on. One bad video out of a million should cost you one manual review, not the other 999,999 verdicts.

The fourth is orchestration — the conductor that ties the other three together and runs the workflow as a whole. In 2026 the standard tool for this is a durable-execution engine, the best known of which is Temporal: it records every step of a workflow as an event log, so if the process dies at step 47 of 100 it replays the log and resumes at step 48, not step 1. (Temporal's role in production AI became mainstream enough in 2026 that it raised a $300M round in February at a $5B valuation, and a supported integration with the OpenAI Agents SDK shipped in March — a fair signal that durable execution is now table stakes for serious agent work.) You do not have to use Temporal specifically — a job queue plus a status table plus careful idempotency gets you most of the way — but you must have something playing its role. We return to running, watching, and costing fleets like this in the AgentOps lesson.

The pitfall to burn into memory: a fleet without idempotency is a duplication engine. The first crash, the first retry, the first re-run, and it quietly processes and bills for the same videos twice. Idempotency is not a nice-to-have you add later; it is the property you design the work units around from the first line of code.

A diagram of the durability spine for an archive job, showing a workflow that crashes at item 80,000 of 100,000 and resumes at item 80,001 rather than restarting, with four labeled mechanisms — a checkpoint store recording done items, an idempotency key built from video ID plus model and policy version, a retry path with a dead-letter queue holding poison items for human review, and a durable orchestration engine replaying its event log to resume mid-job Figure 4. The durability spine. Checkpointing remembers what is done; an idempotency key stops double-processing; retries with a dead-letter queue isolate poison items; a durable orchestrator resumes a crashed job mid-stream instead of from the start.

Re-Processing — Why The Archive Is Never Done

A live agent answers and forgets. An archive agent's verdicts are stored, and stored verdicts go stale — which creates a job that the real-time agents never face: re-processing. When a better reader model ships, or your moderation policy changes, or a regulator asks you to re-check everything against a new rule, you have to run the archive again. The question is whether you re-run all of it or only part.

This is exactly where the idempotency key earns its second keep. Because every verdict was stamped with the model and policy version that produced it (reader-v3 · policy-2026-06), you can re-run the job under a new version (reader-v4 · policy-2026-09) and the fleet will see that none of the existing verdicts match the new key — so it knows precisely what is stale. Better still, you can be selective: if only the policy changed and not the perception, you may only need to re-run the cheap judge step on the stored frame descriptions, never touching the expensive reader again. Versioning your verdicts turns "re-review the entire archive" from a panic into a planned, partial, affordable run. An archive agent that does not version its output condemns you to re-running everything from scratch every time anything changes — which, on a million-item archive, is the difference between an afternoon and a fortune.

Human-In-The-Loop, By Exception

A real-time copilot puts a human on every consequential action, because there are only a handful per meeting. An archive agent cannot — a million videos means a million decisions, and no team reviews a million of anything. So the human-in-the-loop principle changes shape: instead of approving everything, people review by exception, and the agent's job is to decide which exceptions.

Two signals route an item to a person. The first is confidence: the agent attaches a confidence score to each verdict, and anything below a threshold — the cases where the model is unsure — goes to the human queue instead of the catalog. The second is severity: some categories are too consequential to auto-decide even at high confidence — a suspected child-safety violation, a legal-removal flag, a medical finding — and those always go to a person regardless of the score. Everything else, the confident and the low-stakes, the agent resolves on its own. The result is leverage with a safety valve: the agent handles the 95% that is clear, and a small human team handles the 5% that is genuinely hard or genuinely dangerous. A second safety practice is sampling — having humans spot-check a small random slice of the agent's confident verdicts too, to catch the failure mode where the agent is confidently wrong. The escalation queue catches what the agent knows it does not know; sampling catches what it does not know it does not know.

A Worked Archive Job, End To End

Tie it together with one job: a social-video platform with a 100,000-video moderation backlog built up after a policy change, and a mandate to clear it in a day. The archive fills a queue; a fleet of workers scales up to drain it in parallel. Each worker pulls a video, runs the cheap filter (shot detection plus transcript) to find the few segments worth looking at, samples about 40 keyframes, and only then calls the reader on those frames — the funnel keeping the expensive step to 4 million frames across the whole archive instead of 1.8 billion. The reader's descriptions go to the policy judge, which returns a category and a confidence. Confident, low-stakes verdicts are written straight to the catalog; anything under the confidence threshold, plus every suspected child-safety case regardless of confidence, is escalated to the human review team. The reader calls go through a batch API at half price because nothing is waiting. Progress is checkpointed every video; each verdict carries the key video-N · reader-v3 · policy-2026-06; a dozen corrupt files land in the dead-letter queue for manual inspection; and when a spot instance is reclaimed eight hours in, a replacement worker resumes from the checkpoint without re-billing a single completed video. By evening the catalog holds 95,000 machine verdicts, the human queue holds 5,000 hard cases, and the whole run cost a planned, predictable amount — because the funnel cut the volume, the batch API cut the price, and durability meant the job was paid for exactly once.

The Law — Moderation Transparency And AI Disclosure

An archive agent that moderates or labels content at scale touches two bodies of law worth flagging, even though the deep treatment lives in other lessons. The first is moderation transparency: in the EU, platforms that remove or restrict user content owe the affected user a clear statement of reasons and, depending on size, public reporting on their moderation — which means an agent's verdicts cannot be a black box. The practical consequence is that the agent must store why it decided what it decided — the evidence frame, the policy clause, the confidence — not just the verdict, so a reason can be given and an appeal answered. This is one more argument for the audit trail the durability spine already gives you.

The second is AI disclosure and provenance. Where the archive job generates or alters content rather than just labelling it, the EU AI Act's transparency rules expect AI-produced or manipulated media to be disclosed as such, and the content-provenance standard called C2PA is the emerging way to attach that "made or edited by AI" label to a file durably. We cover that disclosure-and-provenance engineering in the quality, cost, C2PA and EU AI Act lesson, and the biometric limits that bound any face-related archive work in the face detection under the EU AI Act lesson. The design rule that satisfies both bodies of law is the same one the rest of this lesson already argued for: keep the evidence, version the decisions, and put a human on the cases that carry real consequences.

Build, Buy, Or Wrap

You have the same three honest options as the other applied agents, and the right one depends on how specific your archive job is. You can buy a managed video-understanding or moderation service that ingests an archive and returns tags or moderation labels — fastest, and right when your job is a standard one (generic moderation, generic metadata) that an off-the-shelf model already does well. You can wrap: take a batch API and a durable-execution engine and assemble the funnel-plus-fleet yourself around your own rubric, which is right when the policy is yours but the plumbing is standard — most teams should start here. Or you can build the whole pipeline from the primitives in this section — the funnel, the checkpoint store, the idempotency scheme, the escalation logic, an agent framework from the framework lesson — which is right when the review is your product, as it is for a specialized compliance, trust-and-safety, or archive-intelligence offering. Whichever you pick, the cost funnel and the durability spine are not optional extras you bolt on later; they are the load-bearing walls, and a vendor or framework that does not give you both is not ready for an archive at scale.

Where Fora Soft Fits In

We build video products across streaming and OTT, video conferencing, e-learning, telemedicine, surveillance, and AR/VR, and the async review agent is the pattern behind the unglamorous archive work those products eventually need — backfilling captions and chapters across an OTT back-catalog, clearing a user-generated-content moderation backlog, re-checking a media library against a new policy, or enriching a years-deep archive with searchable metadata. Our design discipline is the one in this lesson: arrange the tools as a funnel so the expensive reader sees only what survives the cheap filters, run the heavy calls through batch APIs because nothing is waiting, and build the job on a durable spine so a crash resumes instead of restarting. We version every verdict so re-processing is partial rather than total, and we route the uncertain and the severe to human reviewers rather than auto-deciding them. The same skeleton serves a streaming archive, a surveillance back-catalog, or an e-learning library without rebuilding the agent for each.

What To Read Next

Talk To Us · See Our Work · Download

  • Talk to a video engineer — scope an archive-review, moderation-backlog, or metadata-enrichment job for your video library: /services/ai-software-development
  • See our case studies — streaming, OTT, surveillance, and AI work: /portfolio
  • Download the Async Archive-Review scoping & durability checklist — the fleet loop, the cost funnel, the batch-API economics, and the checkpoint / idempotency / dead-letter / escalation spine on one page: Download the checklist

References

  1. OpenAI — "Batch API" developer guide (OpenAI Platform documentation, accessed June 2026) — https://developers.openai.com/api/docs/guides/batch — tier 4 (model-provider documentation). Source for the Batch API design: submit a file of up to 50,000 requests, results returned within a 24-hour window, flat 50% discount on input and output versus the synchronous API, for non-real-time workloads.
  2. Anthropic — "Message Batches API" (Anthropic documentation, accessed June 2026) — https://docs.anthropic.com/en/docs/build-with-claude/batch-processing — tier 4 (model-provider documentation). Source for up to 100,000 requests or 256 MB per batch, asynchronous results within 24 hours (most under an hour), results retained 29 days, and the automatic 50% discount on all Claude models for batch requests.
  3. Google — "Batch Mode in the Gemini API: Process more for less" (Google Developers Blog, 2025/2026) — https://developers.googleblog.com/scale-your-ai-workloads-batch-mode-gemini-api/ — tier 4 (model-provider engineering blog). Source for Gemini Batch Mode as an asynchronous high-throughput endpoint, JSONL inputs up to 2 GB, 24-hour target turnaround, and a 50% price reduction versus the synchronous API.
  4. Google — "Batch API" (Gemini API documentation, ai.google.dev, accessed June 2026) — https://ai.google.dev/gemini-api/docs/batch-api — tier 4 (model-provider documentation). Confirming reference for the batch endpoint mechanics, the 50% discount on per-token pricing, and context-caching compatibility.
  5. Temporal — "Durable Execution meets AI: Why Temporal is ideal for AI agents & Generative AI Apps" (Temporal blog, 2026) — https://temporal.io/blog/durable-execution-meets-ai-why-temporal-is-the-perfect-foundation-for-ai — tier 4 (vendor engineering). Source for durable execution: recording each workflow step as an immutable event history, automatic per-step checkpointing, resuming a crashed workflow at the next step rather than the start, and workflows spanning days or weeks.
  6. WorkOS — "Maxim Fateev on why durable execution matters for AI agents" (WorkOS blog, 2026) — https://workos.com/blog/maxim-fateev-temporal-durable-execution-ai-agents — tier 4 (industry interview). Source for the 2026 context around durable execution becoming standard infrastructure for production agents, Temporal's February 2026 $300M Series D at a ~$5B valuation, and the March 2026 OpenAI Agents SDK + Temporal integration.
  7. MightyBot — "Designing Fault-Tolerant AI Agent Pipelines: Idempotency, Retries, and State Management" (2026) — https://mightybot.ai/blog/fault-tolerant-ai-agent-pipelines/ — tier 6 (engineering reference). Source for the fault-tolerance toolkit used in the durability section: idempotency keys derived from stable identifiers plus version metadata, bounded retries, dead-letter queues for poison items, and state machines / circuit breakers.
  8. arXiv / ACM SIGKDD — Tang et al., "VLM as Policy: Common-Law Content Moderation Framework for Short Video Platform" (KuaiMod), arXiv:2504.14904; Proc. 31st ACM SIGKDD (2025) — https://arxiv.org/abs/2504.14904 — tier 5 (peer-reviewed / preprint). Source for vision-language models acting as policy judges over short-video archives: chain-of-thought reasoning, offline curriculum adaptation, and online deployment with policy refinement — the academic grounding for the "policy / rubric judge" tool.
  9. European Union — Regulation (EU) 2022/2065 (Digital Services Act), Articles 17 and 24 (statement of reasons; transparency reporting) — https://eur-lex.europa.eu/eli/reg/2022/2065/oj — tier 1 (primary legislation). Source for the moderation-transparency obligation: a clear statement of reasons to users whose content is restricted, and periodic transparency reporting — the legal basis for storing why each verdict was reached, not just the verdict.
  10. European Union — Regulation (EU) 2024/1689 (Artificial Intelligence Act), Article 50 (Transparency Obligations) — https://artificialintelligenceact.eu/article/50/ — tier 1 (primary legislation). Source for the obligation to disclose AI-generated or AI-manipulated content, relevant when an archive job alters rather than only labels content; transparency obligations apply from 2 August 2026.
  11. C2PA — "Coalition for Content Provenance and Authenticity, Technical Specification" (accessed June 2026) — https://c2pa.org/specifications/ — tier 2 (multi-stakeholder technical standard). Source for the content-provenance manifest used to attach durable "created or edited with AI" metadata to media files, the practical mechanism behind the Article 50 disclosure for generated/altered archive content.
  12. AWS — "What is a dead-letter queue?" (Amazon SQS Developer Guide, accessed June 2026) — https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html — tier 4 (platform documentation). Source for the dead-letter-queue mechanism: a separate queue that receives messages a consumer fails to process after a configured number of attempts, isolating poison items so they do not block the main queue.