Proctoring & Assessment Reference Design

This is engineering guidance, not legal advice. Confirm specifics with qualified counsel.

Why This Matters

If you are scoping an assessment product — a certification platform, a proctored exam for a professional body, a university testing system, or a corporate-compliance quiz that has to stand up in an audit — you are building the one part of e-learning where a wrong answer has legal consequences. A video that buffers is an annoyance; a credential awarded to the wrong person, or a cheating flag raised by a biased algorithm with no human review, is a lawsuit. This article gives an L&D director, EdTech founder, or product manager the complete architectural map so you can brief engineers precisely, judge a proctoring vendor against a real reference design, and know which layers you can buy and which — almost always the integrity spine and the human-review loop — you must build and own. It is written for the non-engineer making the build-versus-buy call, but it stays accurate enough for the video engineer, the assessment specialist, and the privacy counsel who will sign off on it.

First, the One Idea That Organizes Everything

A proctored assessment looks like a quiz with a webcam, and that resemblance is the most expensive misunderstanding in the field. A quiz has one job: collect answers and score them. A proctored, credential-bearing assessment has a harder job — it has to prove, later and to a skeptic, that the right person took the test, under controlled conditions, was scored fairly, and earned the result on record. Every one of those four claims can be challenged by a student, an employer, an accreditor, a regulator, or a court. The architecture exists to make each claim defensible.

So the reference design is best understood not as one system but as a pipeline of six trust gates, sitting on a shared integrity spine and under a shared compliance umbrella. Each gate answers one question, and each is backed by a named standard or a named law:

The identity gate answers "is this the right person?" The delivery gate answers "was the assessment delivered in a controlled environment?" — it serves the test, usually in a locked-down client, using the standard that makes assessment content portable, called QTI. The proctoring gate answers "was the behaviour legitimate?" The scoring gate answers "what did they earn?" The grade-passback gate answers "where does the result go?" — it returns the score to the learning system over the standards built for exactly that. And the credential gate answers "what can the learner prove afterward?" — it issues a tamper-evident credential.

Underneath all six runs the integrity spine: a tamper-evident audit log that records every event, flag, and human decision so the whole chain can be reconstructed in a hearing years later. And over all six sits the compliance umbrella: the EU AI Act, the EU's data-protection law (GDPR), US biometric-privacy laws like Illinois BIPA, the US student-records law FERPA, and the accessibility standard WCAG. Keep those three layers separate in your head — six gates, one spine, one umbrella — and the rest of this article, and the rest of your build, falls into place.

Figure 1. The full picture. Six trust gates (identity, locked-down delivery, proctoring, scoring, grade passback, and credentialing) run left to right on a tamper-evident integrity spine and under a compliance umbrella. The gates produce the result; the spine makes it defensible; the umbrella keeps it lawful.

Gate 1: Identity — "Is This the Right Person?"

Every defensible assessment starts by proving who is at the keyboard, because a perfect score means nothing if a paid stand-in earned it. The identity gate establishes the candidate's identity at the start and, for higher stakes, keeps checking it during the session.

There are three layers, used in combination depending on stakes. The first is an identity-document check: the candidate photographs a government ID, and software reads it and confirms it is genuine. The second is a face match: the system compares a live selfie to the photo on the ID, producing a similarity score. The third is continuous identity — periodic or constant re-checks during the exam, so a candidate cannot swap seats with an expert after the opening photo. The full treatment of these methods, their accuracy, and their accessibility trade-offs lives in identity verification for assessments; here, the architectural point is that identity is its own service with its own data, and the data it handles is the most sensitive in the whole system.

That sensitivity is the catch. A face match is biometric data, and biometric data used to identify a person is a "special category" under the EU's data-protection law — the General Data Protection Regulation, or GDPR, at Article 9 — which means you generally need explicit consent and a documented lawful basis before you collect it. In the United States, Illinois' Biometric Information Privacy Act (BIPA) goes further: it requires informed written consent before collection and lets individuals sue directly, with statutory damages set at 1,000 dollars per negligent violation and 5,000 dollars per intentional or reckless one. Those numbers multiply fast across a cohort, which is why the identity gate is where privacy review starts, not where it is bolted on. Collect consent at registration, in plain language, naming exactly what biometric data is processed and why — not on the exam-start screen where a nervous candidate will click anything to begin.
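The consent-before-collection rule can be expressed as a guard that the identity service calls before any capture. What follows is a minimal sketch, not a compliance implementation: the `BiometricConsent` record, its field names, and the `may_capture` function are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class BiometricConsent:
    """Hypothetical consent record captured at registration."""
    candidate_id: str
    granted_at: datetime            # when written consent was given
    data_types: FrozenSet[str]      # exactly what is processed, e.g. {"face_image"}
    purpose: str                    # plain-language purpose shown to the candidate
    withdrawn_at: Optional[datetime] = None


def may_capture(consent: Optional[BiometricConsent],
                data_type: str,
                capture_time: datetime) -> bool:
    """Allow capture only if consent exists, predates the capture,
    names this data type, and has not been withdrawn."""
    if consent is None:
        return False
    if consent.granted_at >= capture_time:
        return False                # consent must come before collection
    if data_type not in consent.data_types:
        return False                # consent must name the data being captured
    if consent.withdrawn_at is not None and consent.withdrawn_at <= capture_time:
        return False
    return True


# Illustrative values only.
registered = datetime(2026, 3, 1, tzinfo=timezone.utc)
exam_start = datetime(2026, 3, 15, tzinfo=timezone.utc)
consent = BiometricConsent(
    candidate_id="cand-001",
    granted_at=registered,
    data_types=frozenset({"face_image"}),
    purpose="Confirm that the person taking the exam is the registered candidate.",
)
print(may_capture(consent, "face_image", exam_start))
print(may_capture(consent, "voice_sample", exam_start))
```

The design choice is that the guard fails closed: a missing record, an unnamed data type, or a consent timestamp that does not precede the capture all block collection.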

Gate 2: Locked-Down Delivery — "Was the Environment Controlled?"

Once you know who is testing, you have to deliver the test in a controlled environment and present the questions in a form that survives moving between systems. This gate has two halves: the lockdown client and the assessment content.

The lockdown client is software that takes over the device for the duration of the exam — disabling alt-tab and app-switching, blocking screenshots and screen sharing, preventing virtual machines and common bypass tools, and stopping a second browser window. Think of it as sealing the room before the exam begins. In practice teams either adopt an open-source locked-down browser (Safe Exam Browser is the common baseline) or build a custom desktop shell, usually on Electron, when they need deeper operating-system hooks such as USB-device blocking or stronger virtual-machine detection. A lockdown client is strong against local cheating — copy-paste, screen-recording, alt-tab to notes — and useless against the dominant 2026 failure mode, a second phone in the candidate's lap running a chatbot. That gap is why lockdown is one gate, not the whole system; the deterrence work is shared with assessment design, covered in anti-cheating: detection, deterrence, and assessment design.

The assessment content — the questions, the correct answers, the scoring rules, the timing — should be expressed in the standard built for exactly this, called QTI (Question and Test Interoperability), maintained by the standards body 1EdTech. QTI is the shipping container for a test: it describes an AssessmentItem (a single question) and an AssessmentTest (a structured set of them) in a portable format, so the same exam can move between an authoring tool, an item bank, and a delivery engine without being rebuilt. The current version, QTI 3.0, was finalized in May 2022; it adds a single authoring-to-delivery format that renders consistently across systems, native support for computer-adaptive testing (tests that adjust difficulty to the candidate), and accessibility features aligned with W3C web standards. Notably, the QTI specification even defines a "Proctor" as a formal actor — the standard was written with supervised delivery in mind. Using QTI means your item bank is not trapped inside one vendor; using a proprietary quiz format means it is.

Gate 3: Proctoring — "Was the Behaviour Legitimate?"

This is the gate people picture when they hear "proctoring," and it is the one most shaped by law. Its job is to observe the session and surface behaviour that needs a human judgment. There are four shapes it can take, and choosing among them is the central design decision of the whole subsystem.

The first is the live human proctor: a trained person watches one to a dozen candidates in real time over video, audio, and shared screen, and can intervene. It is the strongest deterrent and produces a clear audit trail, but it costs roughly 10–30 dollars per exam-hour and does not scale past about a thousand concurrent sessions without a dedicated workforce. The second is AI-only automated proctoring: the session is recorded, algorithms flag faces, gaze, voices, and devices, and no human reviews the result. It is cheap and scales with compute rather than headcount, but false-positive rates of 15–30 percent are common, bias against darker-skinned candidates, neurodivergent candidates, and candidates who wear head coverings is documented, and, critically, it is the approach the law now disfavours. The third is browser-lockdown only: no camera at all, the lowest privacy cost, and no defence against a second device. The fourth, and the one that fits most 2026 builds, is hybrid AI plus human review: the session is recorded, AI ranks the most suspicious few percent of moments, and a trained reviewer confirms or dismisses each one. Only a human ever triggers an adverse action. The full comparison of these approaches, with their costs and catch rates, is in online proctoring: approaches, trade-offs, and privacy.

The reason hybrid wins is not engineering taste; it is regulation. Under the EU AI Act — the European Union's regulation on artificial intelligence, Regulation (EU) 2024/1689 — AI systems "intended to be used for monitoring and detecting prohibited behaviour of students during tests" are listed in Annex III as high-risk. High-risk status pulls in a stack of obligations: a risk-management system (Article 9), data governance (Article 10), technical documentation (Article 11), automatic event logging (Article 12), transparency to users (Article 13), and — the load-bearing one for this gate — human oversight (Article 14), meaning a competent person must be able to monitor, override, and reverse the system's outputs. An AI-only pipeline that auto-fails a candidate has no oversight loop, so it cannot meet Article 14. That single requirement is why the architecture routes every AI flag through a human and never lets the algorithm issue a verdict.
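The routing rule described here (AI ranks, a human decides, nothing automated acts) can be sketched in a few lines. This is an illustration under stated assumptions: the `Flag` fields, the 5 percent default for "the most suspicious few percent," and the function names are all hypothetical, and the reviewer is modelled as a plain callable standing in for a person using a review console.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Decision(Enum):
    CONFIRMED = "confirmed"
    DISMISSED = "dismissed"


@dataclass(frozen=True)
class Flag:
    session_id: str
    timestamp_s: float   # offset into the recording, in seconds
    kind: str            # e.g. "second_face", "voice"
    score: float         # AI suspicion score in [0, 1]


def select_for_review(flags: List[Flag], top_fraction: float = 0.05) -> List[Flag]:
    """Rank flags by AI score and keep only the top fraction (at least one, if any exist)."""
    if not flags:
        return []
    ranked = sorted(flags, key=lambda f: f.score, reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return ranked[:keep]


def confirmed_flags(flags: List[Flag],
                    reviewer: Callable[[Flag], Decision],
                    top_fraction: float = 0.05) -> List[Flag]:
    """Return only the flags a human reviewer confirmed.

    The AI score decides what a reviewer looks at, never the outcome:
    there is no code path from a score to an adverse action.
    """
    return [f for f in select_for_review(flags, top_fraction)
            if reviewer(f) is Decision.CONFIRMED]
```

Only the list returned by `confirmed_flags` would be handed to the institutional integrity process; a reviewer who dismisses everything produces an empty list and therefore no action.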

One timing note, because it is moving. The high-risk obligations for Annex III systems were originally set to apply from 2 August 2026. In May 2026 the EU institutions reached a provisional political agreement on a "Digital Omnibus" package that defers the application date for stand-alone Annex III systems to 2 December 2027 — but the substance of the obligations is unchanged, the deferral only takes legal effect once formally adopted and published, and deployer duties for oversight and transparency remain. Treat the design requirement (human in the loop) as permanent and the exact compliance date as a thing to confirm with counsel at build time.

Figure 2. The human-in-the-loop required by EU AI Act Article 14. AI is a search engine over the recording that surfaces the top few percent of moments; a trained reviewer confirms or dismisses each; only a confirmed flag enters the institutional integrity process. No automated adverse action ever fires.

Gate 4: Scoring — "What Did They Earn?"

With a legitimate session captured, the system scores it. Scoring splits cleanly into two kinds, and most real exams use both.

Automatic scoring handles anything with a defined right answer — multiple choice, numeric entry, drag-and-drop, code that compiles and passes tests. When the assessment is authored in QTI, the scoring logic travels with the question: QTI calls this response processing, a set of rules inside the item that turns a candidate's response into an outcome such as a score. That is the deep value of the QTI standard at this gate — the test is not just portable as content, it is portable as behaviour, so the same item scores identically in any conformant engine.
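To make "the scoring logic travels with the question" concrete, here is a small sketch that reads the correct response out of a minimal QTI-3-style item and applies match-correct scoring. The element names, namespace, and template URL reflect my understanding of QTI 3.0 and should be checked against the specification; the item itself and the `score_match_correct` function are hypothetical, and a conformant engine does far more than this.

```python
import xml.etree.ElementTree as ET

# Namespace as I understand it for QTI 3.0 items; verify against the spec.
NS = {"q": "http://www.imsglobal.org/xsd/imsqtiasi_v3p0"}

ITEM_XML = """<qti-assessment-item
    xmlns="http://www.imsglobal.org/xsd/imsqtiasi_v3p0"
    identifier="item-001" title="Sample item" adaptive="false" time-dependent="false">
  <qti-response-declaration identifier="RESPONSE" cardinality="single" base-type="identifier">
    <qti-correct-response><qti-value>B</qti-value></qti-correct-response>
  </qti-response-declaration>
  <qti-outcome-declaration identifier="SCORE" cardinality="single" base-type="float"/>
  <qti-item-body>
    <qti-choice-interaction response-identifier="RESPONSE" max-choices="1">
      <qti-prompt>Which option is correct?</qti-prompt>
      <qti-simple-choice identifier="A">Option A</qti-simple-choice>
      <qti-simple-choice identifier="B">Option B</qti-simple-choice>
    </qti-choice-interaction>
  </qti-item-body>
  <qti-response-processing
      template="https://purl.imsglobal.org/spec/qti/v3p0/rptemplates/match_correct.xml"/>
</qti-assessment-item>"""


def score_match_correct(item_xml: str, response: str) -> dict:
    """Simplified match-correct scoring: SCORE is 1.0 on an exact match, else 0.0.

    Returns a small trace so the evidence of how the score was reached
    can be stored alongside the number.
    """
    root = ET.fromstring(item_xml)
    value = root.find(
        "q:qti-response-declaration/q:qti-correct-response/q:qti-value", NS)
    if value is None or value.text is None:
        raise ValueError("item declares no correct response")
    correct = value.text.strip()
    return {
        "item": root.get("identifier"),
        "response": response,
        "correct": correct,
        "SCORE": 1.0 if response == correct else 0.0,
    }


print(score_match_correct(ITEM_XML, "B"))
```

The returned dictionary doubles as a response-processing trace, which is the kind of record the scoring gate needs to keep for auto-scored items.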

Human scoring handles everything judgment-based — essays, open responses, oral defenses, portfolio work. Here the tool is a rubric: a structured grid of criteria and performance levels that turns a grader's judgment into a repeatable number, and that gives a challenged candidate a concrete basis for the mark. The design of gradebooks, auto-grading versus rubric grading, and where this data lives is the subject of grading, rubrics, and grade passback. The architectural point is that the scoring gate must record not only the number but how it was reached — the response-processing trace for auto-scored items, the rubric and reviewer identity for human-scored ones — because the integrity spine needs that evidence if the result is ever disputed.

A pitfall hides here, and it is the most common measurement error in all of e-learning: confusing "finished the video" or "answered every question" with "passed." Watching 100 percent of a lecture is an engagement signal, not a competence verdict. Completion, score, and mastery are three different things with three different sources, and the scoring gate must keep them distinct.

Gate 5: Grade Passback — "Where Does the Result Go?"

A score trapped inside the proctoring tool is useless; it has to land in the system of record — the learning-management system (LMS) gradebook, the certification body's case system, or a corporate HR platform. This gate uses named standards so the result arrives where the institution expects it, automatically and verifiably.

The dominant mechanism is LTI Advantage, a set of services layered on the standard that lets an external tool be trusted inside any learning system, called LTI 1.3 (Learning Tools Interoperability), also from 1EdTech. The relevant service is Assignment and Grade Services (AGS), which lets your assessment tool push a score, and an optional comment, back into the LMS gradebook column it belongs in. The candidate took the exam in your tool; the grade appears in the university's gradebook as if it had always lived there. The mechanics of LTI's signed-token launch — it uses an OpenID Connect flow carrying a signed token, not a password login — are covered in LTI explained: launching tools inside any LMS; the point at this gate is that AGS is the standards-correct way to return a grade, and you should design your "what is a score, and which column does it map to" model before you build it, not after.
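As a sketch of what "push a score, and an optional comment" looks like on the wire, the snippet below builds an AGS score body and the scores endpoint for a line item. The field names and media type follow my reading of the Assignment and Grade Services specification; the helper names, URL, and user ID are hypothetical, and the OAuth 2.0 access token and the HTTP POST itself are deliberately left out.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Media type for score publishing, per my reading of the AGS spec.
AGS_SCORE_MEDIA_TYPE = "application/vnd.ims.lis.v1.score+json"


def scores_url(lineitem_url: str) -> str:
    """Append /scores to the line item path, preserving any query string."""
    parts = urlsplit(lineitem_url)
    return urlunsplit(parts._replace(path=parts.path.rstrip("/") + "/scores"))


def build_ags_score(user_id: str,
                    score_given: float,
                    score_maximum: float,
                    comment: str = "",
                    when: Optional[datetime] = None) -> dict:
    """Build the JSON body for a final, fully graded score."""
    if score_maximum <= 0 or score_given < 0:
        raise ValueError("scoreMaximum must be > 0 and scoreGiven must be >= 0")
    when = when or datetime.now(timezone.utc)
    return {
        "userId": user_id,
        "scoreGiven": score_given,
        "scoreMaximum": score_maximum,
        "comment": comment,
        "timestamp": when.isoformat(),
        "activityProgress": "Completed",
        "gradingProgress": "FullyGraded",
    }


# Illustrative values only.
body = build_ags_score("lms-user-42", 83, 100, "Proctored attempt, reviewed",
                       when=datetime(2026, 1, 1, tzinfo=timezone.utc))
print(scores_url("https://lms.example/api/lineitems/7?type=exam"))
print(body)
```

Deciding up front which line item a score maps to, and what `scoreMaximum` means for your exam, is the data-model question the paragraph above asks you to settle before building.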

There is a second path for richer reporting. The standard that records granular learning events, called xAPI (Experience API) from ADL, writes statements such as "Maria submitted the exam" or "Maria's attempt was flagged and cleared" into a store called a Learning Record Store (LRS). Its sibling, cmi5, wraps xAPI with the launch-and-completion rules an LMS needs, so it can both pass a pass/fail outcome back and stream the detailed event trail to analytics. Use AGS for the official grade in the gradebook; use xAPI or cmi5 for the detailed record and the analytics. A serious subsystem does both. Tracking video-specific events — if the assessment includes watching a clip and answering — follows the xAPI Video Profile.
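An xAPI statement is an actor-verb-object record, so the two example events above can be sketched as plain dictionaries ready to send to an LRS. The `completed` verb IRI comes from the ADL vocabulary; the "flag cleared" verb IRI, the activity ID, and the email address are hypothetical placeholders, and the `build_statement` helper is illustrative rather than part of any library.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_statement(actor_name: str,
                    actor_email: str,
                    verb_id: str,
                    verb_display: str,
                    activity_id: str,
                    when: datetime,
                    result: Optional[dict] = None) -> dict:
    """Assemble a minimal xAPI statement as a JSON-serialisable dict."""
    statement = {
        "id": str(uuid.uuid4()),
        "actor": {
            "objectType": "Agent",
            "name": actor_name,
            "mbox": f"mailto:{actor_email}",
        },
        "verb": {"id": verb_id, "display": {"en-US": verb_display}},
        "object": {"objectType": "Activity", "id": activity_id},
        "timestamp": when.isoformat(),
    }
    if result is not None:
        statement["result"] = result
    return statement


EXAM = "https://example.org/exams/cert-101"          # hypothetical activity ID
NOW = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)

completed = build_statement(
    "Maria", "maria@example.org",
    "http://adlnet.gov/expapi/verbs/completed", "completed",
    EXAM, NOW,
    result={"score": {"scaled": 0.83, "raw": 83, "min": 0, "max": 100},
            "success": True, "completion": True},
)

flag_cleared = build_statement(
    "Maria", "maria@example.org",
    "https://example.org/xapi/verbs/flag-cleared",   # hypothetical custom verb
    "had a flag cleared on",
    EXAM, NOW,
)
```

Note that completion, success, and score sit in separate fields of `result`, which helps keep those signals distinct in downstream reporting.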

Figure 3. From score to system of record to credential. The official grade returns to the LMS gradebook over LTI Advantage AGS; the detailed trail streams to a Learning Record Store over xAPI or cmi5; and the achievement is issued as an Open Badges 3.0 credential (a W3C Verifiable Credential) that the learner holds and any verifier can check.

Gate 6: Credentialing — "What Can the Learner Prove Afterward?"

The last gate turns a passing result into something the learner can carry into the world and a third party can trust without phoning the issuer. A PDF certificate fails this test — it is trivially forged and cannot be checked. The modern answer is a verifiable credential.

The standard is Open Badges 3.0, the 1EdTech digital-credential format, finalized in 2024, which is now built directly on the W3C Verifiable Credentials Data Model 2.0 — a World Wide Web Consortium standard that became an official Recommendation on 15 May 2025. The model has three roles, and the analogy is a passport: the issuer (your assessment platform) cryptographically signs a credential the way a government signs a passport; the holder (the learner) keeps it in a wallet; and any verifier (an employer, an accreditor) can confirm it is authentic and unaltered without contacting the issuer, the way a border officer checks a passport without calling the issuing country. Because the credential is signed, tampering is detectable and the claim is machine-verifiable. Multiple credentials can be gathered into a Comprehensive Learner Record (CLR), itself verifiable, for a full transcript. The full treatment is in certificates, badges, and verifiable credentials.
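For orientation, the shape of an Open Badges 3.0 credential before signing might look roughly like the dictionary below. This is a sketch from my reading of the specifications: the second context URL (including its version suffix), the property names, and every identifier are assumptions to verify against the current Open Badges 3.0 and Verifiable Credentials 2.0 documents. The cryptographic proof, which is what makes the credential verifiable, is intentionally omitted and would be added by a proper signing library.

```python
import json
from datetime import datetime, timezone


def build_unsigned_badge(issuer_id: str,
                         issuer_name: str,
                         learner_id: str,
                         achievement_id: str,
                         achievement_name: str,
                         criteria: str,
                         valid_from: datetime) -> dict:
    """Assemble an unsigned Open-Badges-style credential (no proof attached)."""
    return {
        "@context": [
            "https://www.w3.org/ns/credentials/v2",
            # Assumed OB 3.0 context URL; confirm the current version in the spec.
            "https://purl.imsglobal.org/spec/ob/v3p0/context-3.0.3.json",
        ],
        "id": f"{achievement_id}/assertions/{learner_id.rsplit(':', 1)[-1]}",
        "type": ["VerifiableCredential", "OpenBadgeCredential"],
        "issuer": {"id": issuer_id, "type": ["Profile"], "name": issuer_name},
        "validFrom": valid_from.isoformat(),
        "credentialSubject": {
            "id": learner_id,
            "type": ["AchievementSubject"],
            "achievement": {
                "id": achievement_id,
                "type": ["Achievement"],
                "name": achievement_name,
                "criteria": {"narrative": criteria},
            },
        },
    }


# Illustrative values only.
badge = build_unsigned_badge(
    issuer_id="https://assess.example.org/issuers/1",
    issuer_name="Example Certification Body",
    learner_id="did:example:learner123",
    achievement_id="https://assess.example.org/achievements/cert-101",
    achievement_name="Certified Example Practitioner",
    criteria="Passed the proctored exam with human-reviewed integrity checks.",
    valid_from=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
print(json.dumps(badge, indent=2))
```

Until an issuer signature is attached, this is only structured data; the verifier's ability to check authenticity without contacting the issuer comes entirely from the proof.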

The build decision here is about portability and trust, not graphics. Issuing a signed Open Badges 3.0 credential costs little more than rendering a PDF, and it makes the credential actually useful: it survives the learner leaving your platform, and it lets an employer verify it in seconds. The pitfall is shipping a decorative certificate and calling it a credential — it looks the same to the learner on day one and is worthless the first time someone tries to verify it.

The Integrity Spine and the Compliance Umbrella

Two structures run underneath and over all six gates, and they are where a serious build separates itself from a demo.

The integrity spine is a tamper-evident audit log. Every meaningful event — identity check, lockdown start, each AI flag, each reviewer decision, the score, the grade-passback call, the credential issuance — is written to an append-only log whose entries are chained with cryptographic hashes, so that altering any past entry breaks the chain and is detectable. The recording itself should be produced server-side, not in the candidate's browser, because a candidate-controlled recording is tampering-vulnerable and will not stand up in a hearing. This spine is what lets you answer, two years later, "show me exactly what happened in this exam and who decided what" — the question every integrity dispute, accreditation audit, and lawsuit eventually asks. Retention is typically five to ten years depending on the credential, and it interacts with privacy law's right to erasure, so document the legal basis for keeping the data.
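The hash chaining described above can be shown in a few lines. This is a minimal in-memory sketch, assuming SHA-256 and JSON-serialisable events; the `AuditLog` class and the event fields are hypothetical, and a production spine would also need durable storage and access control.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the event."""
    payload = json.dumps({"prev": prev_hash, "event": event},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log whose entries are chained by hash."""

    def __init__(self) -> None:
        self._entries: List[dict] = []

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        digest = _entry_hash(prev, event)
        self._entries.append({"prev": prev, "event": event, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; any altered past entry breaks the chain."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or entry["hash"] != _entry_hash(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True


log = AuditLog()
log.append({"type": "identity_check", "result": "match"})
log.append({"type": "ai_flag", "kind": "second_face", "score": 0.91})
log.append({"type": "review_decision", "reviewer": "rev-7", "decision": "dismissed"})
print(log.verify())
```

A chain on its own only proves internal consistency: someone who can rewrite the whole log can also recompute every hash. Periodically recording the latest hash somewhere the log operator cannot alter is the usual complement.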

The compliance umbrella is the set of laws and standards that govern the whole subsystem, and it is market access, not a nice-to-have. Beyond the EU AI Act and GDPR already discussed, three more apply. BIPA and similar US state biometric laws govern face and voice data, as covered at the identity gate. FERPA — the US Family Educational Rights and Privacy Act — applies when a proctoring vendor handles US student records under contract: the institution is the data controller and the vendor acts as a "school official," barred from using the data for other purposes. And WCAG 2.1 Level AA — the Web Content Accessibility Guidelines — is the accessibility floor across the lockdown client, the candidate experience, and the reviewer console; education is a high-litigation target for accessibility, and an accommodation flow must never force a candidate to disclose a disability to the AI flagger. Accessibility is treated in depth in WCAG 2.1 AA for educational video.

Putting It Together: How One Exam Actually Runs

Walk a single high-stakes exam through all six gates, because the choreography is the architecture. A candidate registers and, at registration, gives explicit consent for biometric processing in plain language. On exam day the identity gate reads their ID and matches a live selfie. The delivery gate launches the lockdown client, which seals the device, and serves the exam — authored in QTI — from the delivery engine. Throughout the session the proctoring gate records server-side under lockdown while AI ranks suspicious moments; nothing is decided yet. At submission the scoring gate auto-scores the objective items via QTI response processing and routes the essay to a human grader with a rubric. Two suspicious moments surfaced by the AI go to a trained reviewer, who dismisses both — and because no flag was confirmed, no adverse action fires. The grade-passback gate returns the official score to the university gradebook over LTI AGS and streams the full event trail to the LRS over cmi5. The credential gate issues a signed Open Badges 3.0 credential to the candidate's wallet. Every step of this — consent, identity result, lockdown, each AI flag, each reviewer decision, the score, the passback, the credential — is written to the integrity spine. If, a year later, the candidate disputes the result, you can reconstruct the entire session and show that a human, not an algorithm, made every consequential call.

That single walkthrough is the whole product. Every feature you will ever add lands on one of those six gates, the spine, or the umbrella.

Sizing the Build: The Four Stakes Tiers

"Proctored assessment" spans wildly different builds depending on what is riding on the result, and the architecture flexes accordingly. Four tiers cover almost every product, and naming yours up front prevents both over- and under-engineering.

Tier 1 — low-stakes (formative quizzes, training checks). The result is non-binding. Browser lockdown plus QTI auto-scoring is enough; no biometrics, no human review, minimal privacy surface. Cheap to build and run, and the least legally exposed.

Tier 2 — medium-stakes (graded coursework, internal certification). Now the grade counts. Add hybrid AI-plus-human proctoring, identity verification, and grade passback over LTI AGS. The human-review loop and the audit spine become mandatory. This is the tier most institutional builds target.

Tier 3 — high-stakes (professional certification, licensure). A credential with market value rides on the result. Everything in Tier 2 plus continuous identity, full verifiable-credential issuance, and a five-to-ten-year tamper-evident audit trail. The integrity spine is now the most important component in the system.

Tier 4 — regulated, EU-exposed high-stakes. Any candidate in the EU pulls in the AI Act high-risk regime in full: documented risk management, a fundamental-rights impact assessment, conformity assessment, post-market monitoring, and human oversight evidenced in the design. Budget a documentation and conformity workstream as a fixed line item, not an afterthought.
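One way to keep the tier decision explicit in a codebase is a small capability map in which each tier adds to the one below it. The capability names here are hypothetical labels for the items listed above, and treating Tier 4 as cumulative over Tier 3 is my reading of the tier descriptions rather than something they state outright.

```python
from typing import Dict, FrozenSet

# What each tier adds on top of the previous one (labels are illustrative).
TIER_ADDS: Dict[int, FrozenSet[str]] = {
    1: frozenset({"browser_lockdown", "qti_auto_scoring"}),
    2: frozenset({"hybrid_proctoring", "identity_verification",
                  "lti_ags_passback", "human_review_loop", "audit_spine"}),
    3: frozenset({"continuous_identity", "verifiable_credentials",
                  "long_term_audit_retention"}),
    4: frozenset({"risk_management_docs", "fundamental_rights_impact_assessment",
                  "conformity_assessment", "post_market_monitoring",
                  "evidenced_human_oversight"}),
}


def capabilities(tier: int) -> FrozenSet[str]:
    """Return every capability required at the given tier, including lower tiers."""
    if tier not in TIER_ADDS:
        raise ValueError(f"unknown tier: {tier}")
    required: FrozenSet[str] = frozenset()
    for level in range(1, tier + 1):
        required |= TIER_ADDS[level]
    return required


for t in sorted(TIER_ADDS):
    print(t, sorted(capabilities(t)))
```

Naming the tier once and deriving the feature set from it makes over- and under-engineering visible in review, since any capability outside `capabilities(tier)` needs a stated reason.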

Figure 4. Four tiers, one architecture. The six gates stay structurally the same; what changes with stakes is how much of each gate you turn on, from lockdown-and-auto-score at Tier 1 to full identity, hybrid proctoring, verifiable credentials, and EU AI Act conformity at Tier 4.

The Cost and Per-Exam Arithmetic, Shown Out Loud

Two cost lines matter: the one-time build and the per-exam running cost. Make both concrete.

The build. A custom hybrid proctoring-and-assessment subsystem (lockdown client, server-side capture, AI flagging pipeline, reviewer console, integrity spine, grade passback, and credentialing) is roughly an 18–22 week first release for a team of about five, landing in the range of 260,000–420,000 dollars depending on geography and seniority. The compliance documentation for an EU Tier-4 build is a real slice of that, typically a four-to-six-week workstream that teams routinely underbudget.

The per-exam running cost. Take a Tier-2/3 hybrid exam and walk the arithmetic for one exam-hour:

AI flagging compute (GPU, per exam-hour)      ≈ $0.40–$0.80
server-side recording storage (amortised)     ≈ $0.20–$0.50
human review (8% of moments surfaced,
   60–90 reviews/hour at ~$25/hr loaded)       ≈ $0.40–$1.20
WebRTC capture + bandwidth                     ≈ $0.10–$0.30
------------------------------------------------------------
total per proctored exam-hour                  ≈ $1.50–$4.00

Now scale it. A certification body running 30,000 proctored exams a year, each averaging one hour, at the midpoint of about 2.75 dollars per exam-hour:

30,000 exams × 1 hour × $2.75 = $82,500 per year to operate
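The same arithmetic can be kept as a small script so the inputs are easy to swap for your own. The figures are the ones quoted above; the function names are hypothetical. One thing the script makes visible: the four itemised ranges sum to about $1.10–$2.80, while the quoted total is $1.50–$4.00, so the total presumably carries costs that are not itemised. That reading is an assumption, and the scaling step uses the quoted total, as the text does.

```python
from typing import Dict, Tuple

Range = Tuple[float, float]

# Per exam-hour line items as quoted in the breakdown (low, high), in dollars.
LINE_ITEMS: Dict[str, Range] = {
    "ai_flagging_compute": (0.40, 0.80),
    "recording_storage": (0.20, 0.50),
    "human_review": (0.40, 1.20),
    "webrtc_capture_bandwidth": (0.10, 0.30),
}

STATED_TOTAL: Range = (1.50, 4.00)  # total per proctored exam-hour as quoted


def itemised_range(items: Dict[str, Range]) -> Range:
    """Sum the low ends and the high ends of every line item."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return (round(low, 2), round(high, 2))


def midpoint(r: Range) -> float:
    return (r[0] + r[1]) / 2


def annual_cost(exams_per_year: int, hours_per_exam: float, rate_per_hour: float) -> float:
    return exams_per_year * hours_per_exam * rate_per_hour


rate = midpoint(STATED_TOTAL)
print("itemised range:", itemised_range(LINE_ITEMS))
print("midpoint of stated total:", rate)
print("annual cost:", annual_cost(30_000, 1, rate))
```

Swapping in your own volume, exam length, and review rate gives the figure to hold against a vendor quote.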

That is the number to compare against vendor pricing. AI-only proctoring runs cheaper per hour (often well under a dollar), but it cannot meet the EU AI Act human-oversight requirement for a high-stakes EU exam, so the comparison is not apples-to-apples — you are buying a non-compliant capability. Live human proctoring runs far higher (10–30 dollars per exam-hour) and stops scaling around a thousand concurrent sessions. The hybrid number is what a defensible high-volume build costs, and it is why most certification bodies that operate their own reviewer pool clear comfortable margins on a 100–300 dollar exam fee.

The Pitfalls That Define a Bad Build

Automated adverse action from an AI flag. The single fastest route to a regulator complaint and a class action is auto-failing or auto-terminating a candidate on an algorithm's say-so. The EU AI Act's human-oversight requirement effectively forbids it for education; ethics and litigation history forbid it everywhere. Every adverse action passes through a human and an institutional process. No exceptions.

Trusting AI-text detectors. Detectors that claim to spot AI-written essays do not work reliably enough to base an integrity finding on — independent benchmarks put them well below usable accuracy, with double-digit false-positive rates that fall hardest on non-native English writers, and one major vendor walked back its own claims while another shut its detector down. Restructure the assessment instead — authorial attestation with draft history, oral defense, personalised problem variants — rather than betting a student's record on a coin-flip classifier.

Browser-side recording. If the candidate's machine produces the recording, the candidate can tamper with it. Produce the canonical recording server-side, or the integrity spine is built on sand.

"Watched it / answered it" equals "passed it." Completion, score, and mastery are distinct signals from distinct sources. Conflating them at the scoring gate corrupts every downstream report and credential.

Collecting biometrics without consent-before-collection. Under BIPA, consent must come before you capture a face or voice, in writing, naming the data and purpose — and the statutory damages are per-violation. A consent checkbox on the exam-start screen is too late and too coerced. Put it at registration.

Room scans in a public-sector context. A US federal court ruled in 2022 that scanning a student's private room before a remote test was an unreasonable search under the Fourth Amendment. For public institutions especially, design the environment check to be the least intrusive option that works, and document the justification.

Retrofitting grade passback and credentialing. Both shape your data model — what a "score" is, which gradebook column it maps to, what an achievement claim contains. Bolt them on at the end and you rework the scoring gate and the spine. Decide them before you build the activities that feed them.

Comparing the Build-vs-Buy Options Per Gate

The build-versus-buy decision is not one decision; it is one per gate. The table makes the realistic options explicit, including the standard each gate must speak.

Gate	Buy / off-the-shelf	Self-host open source	Custom build	Standards / laws it must speak
Identity	ID + face-match vendor (KYC-style)	Open face-match models + your flow	Rare — only if identity is the product	GDPR Art. 9, BIPA; NIST identity-assurance guidance
Locked-down delivery	Commercial lockdown browser	Safe Exam Browser + QTI engine	Custom Electron shell for deep OS hooks	QTI 3.0; WCAG 2.1 AA
Proctoring	Proctoring SaaS (verify EU AI Act fit)	mediasoup/LiveKit capture + your AI	Hybrid stack — common differentiator	EU AI Act Annex III + Art. 14; GDPR
Scoring	Built into delivery engine	QTI response processing + rubric tool	Custom for novel item types	QTI 3.0 response processing
Grade passback	Bundled in some LMS plugins	LTI tool library + cmi5/xAPI	Usually custom — your integration edge	LTI 1.3 Advantage (AGS); xAPI / cmi5
Credentialing	Badging SaaS (Open Badges 3.0)	Open-source VC issuer	Custom issuer for wallet control	Open Badges 3.0; W3C VC 2.0

The pattern most successful builds follow: buy or self-host the commoditized gates — identity, lockdown delivery, badging — and build the integrity spine, the human-review loop, and the grade-passback integration, because those are where your product is trusted by an institution and rarely available as a drop-in. The proctoring gate is the swing decision: buy it if you accept a vendor's compliance posture, build it if EU exposure or high stakes mean you must own the human-oversight design.

Where Fora Soft Fits In

Fora Soft has built assessment and proctored-exam systems alongside video conferencing, streaming, and e-learning products since 2005, and a proctored-assessment subsystem sits at the intersection of those skills — real-time and recorded video capture, a tamper-evident data spine, and standards-based integration into an LMS. The build-versus-buy trade-off we help teams make is per-gate and concrete: a commercial identity vendor and an open-source lockdown browser get a Tier-1 or Tier-2 product to market quickly, while a custom integrity spine, a human-review loop designed to evidence EU AI Act oversight, and a standards-correct grade-passback and credentialing path are what make a high-stakes certification platform defensible and institution-ready. We work across e-learning, video conferencing, streaming, surveillance, and telemedicine, so we are usually brought in when an assessment has to be both video-grade in capture and rigorous in integrity and compliance. No hype: for the lockdown and identity gates the honest answer is often "adopt the mature open-source or vendor option," and we will say so — the engineering you own should be the spine and the integrations that differentiate and protect your product.

References

European Union. Regulation (EU) 2024/1689 (the AI Act), Annex III, point 3 — Education and vocational training; and point 3(d): AI systems to monitor and detect prohibited behaviour of students during tests. Official Journal version 13 June 2024. https://artificialintelligenceact.eu/annex/3/ (Tier 1, primary law). Accessed 2026-06-21.
European Union. Regulation (EU) 2024/1689 (the AI Act), Article 14 — Human Oversight (and Articles 9–13, 15, 26 on risk management, data governance, documentation, logging, transparency, accuracy, and deployer obligations). https://artificialintelligenceact.eu/article/14/ (Tier 1, primary law). Accessed 2026-06-21.
European Commission / AI Act Service Desk. Digital Omnibus — provisional agreement (May 2026) deferring stand-alone Annex III high-risk application to 2 December 2027. https://ai-act-service-desk.ec.europa.eu/en/ai-act/timeline/timeline-implementation-eu-ai-act (Tier 2, official guidance — developing; confirm at build time). Accessed 2026-06-21.
1EdTech (IMS Global). Question & Test Interoperability (QTI) 3.0 — Overview and Information Model (Final Release, 1 May 2022): AssessmentItem, AssessmentTest, response processing, results reporting; defines the Proctor actor. https://www.imsglobal.org/spec/qti/v3p0/oview (Tier 1, primary standard). Accessed 2026-06-21.
1EdTech (IMS Global). Learning Tools Interoperability (LTI) Advantage Implementation Guide, v1.3 — Assignment and Grade Services (AGS) for grade passback. https://www.imsglobal.org/spec/lti/v1p3/impl (Tier 1, primary standard). Accessed 2026-06-21.
W3C. Verifiable Credentials Data Model v2.0 — W3C Recommendation, 15 May 2025; the issuer / holder / verifier model underpinning verifiable credentials. https://www.w3.org/TR/vc-data-model-2.0/ (Tier 1, primary standard). Accessed 2026-06-21.
1EdTech (IMS Global). Open Badges 3.0 — digital-credential format built on W3C VCDM 2.0; OpenBadgeCredential signed and machine-verifiable; Comprehensive Learner Record. https://www.imsglobal.org/spec/ob/v3p0/ (Tier 1, primary standard). Accessed 2026-06-21.
European Union. Regulation (EU) 2016/679 (GDPR), Article 9 — special categories (biometric data); Article 22 — automated individual decision-making; Article 35 — DPIA. https://eur-lex.europa.eu/eli/reg/2016/679/oj (Tier 1, primary law). Accessed 2026-06-21.
State of Illinois. Biometric Information Privacy Act (BIPA), 740 ILCS 14 — informed written consent before collection; private right of action; statutory damages of $1,000 (negligent) / $5,000 (intentional or reckless) per violation. https://www.ilga.gov/legislation/ilcs/ilcs3.asp?ActID=3004 (Tier 1, primary law). Accessed 2026-06-21.
US Dept. of Education. Family Educational Rights and Privacy Act (FERPA), 20 U.S.C. § 1232g; 34 CFR Part 99 — the "school official" exception governing third-party vendors handling student records. https://studentprivacy.ed.gov/ (Tier 1, primary law). Accessed 2026-06-21.
W3C. Web Content Accessibility Guidelines (WCAG) 2.1, W3C Recommendation — Level AA for the lockdown client, candidate experience, and reviewer console. https://www.w3.org/TR/WCAG21/ (Tier 1, primary standard). Accessed 2026-06-21.
ADL. Experience API (xAPI) Specification, v1.0.3 and cmi5 — statements to a Learning Record Store for the detailed assessment-event trail. https://github.com/adlnet/xAPI-Spec (Tier 1, primary standard). Accessed 2026-06-21.
United States District Court, N.D. Ohio. Ogletree v. Cleveland State University (22 Aug 2022) — remote-test room scans held unreasonable under the Fourth Amendment. https://caselaw.findlaw.com/court/us-dis-crt-n-d-ohi-eas-div/2109381.html (Tier 5, case law). Accessed 2026-06-21.
Fora Soft. Online Proctoring and Anti-Cheating in 2026: Architecture, AI Detection, Privacy — six-layer hybrid reference architecture, cost model, and detector-accuracy findings. https://www.forasoft.com/blog/article/online-proctoring-anti-cheating-2026 (Tier 3, first-party engineering). Accessed 2026-06-21.

Where sources disagreed, the official standard or law won. Vendor framings of AI-only proctoring as compliant were overridden by the EU AI Act Annex III + Article 14 human-oversight requirement, which an AI-only pipeline cannot meet. The Annex III application date is in flux: originally 2 August 2026, it was deferred to 2 December 2027 by the May 2026 Digital Omnibus provisional agreement, which takes effect only on formal adoption, so the date is cited as developing and should be confirmed with counsel. The BIPA per-violation damages reflect current law and the per-exam-hour cost figures reflect 2026 market rates; both are illustrative, so confirm the former with counsel and the latter against your own load tests.

Related glossary terms

Grade passback

cmi5

E-learning

WCAG

Rubric

Biometric data

Online proctoring

Anti-cheating