Cohort analysis is an analytical technique that groups learners by a shared starting condition (enrolment date, hire date, department, assigned learning path, or platform version) and then compares how each group's behaviour and outcomes evolve over the same relative time period. In a learning context the most common use is comparing completion rates, pass rates, and watch-time patterns across successive course intakes: if the cohort that enrolled in January completes at 70 % but the March cohort completes at 45 %, that gap deserves investigation. Without cohort analysis, the decline might be mistakenly attributed to learner motivation rather than to a content or platform change that occurred between the two intakes. The data for cohort analysis typically comes from an LRS (Learning Record Store) or an LMS database. In an LRS, each xAPI statement carries an actor identifier and a timestamp, which together make cohort assignment possible; an ETL (Extract, Transform, Load) pipeline typically joins enrolment metadata with activity records to build the analysis table. The key distinction from simple aggregate metrics is timing: an aggregate completion rate tells you the overall state today, while cohort analysis reveals whether the programme is improving, degrading, or stable across successive runs. A common trap is defining cohorts too narrowly: in a cohort of three learners, a single completion moves the rate by 33 percentage points, so differences between cohorts are mostly noise. Defining cohorts too broadly is equally problematic, because it masks important subgroup differences such as the gap between technical and non-technical staff taking the same course.
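
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the two dictionaries stand in for the table an ETL job might produce by joining enrolment metadata with completion records, and every name in it (`ENROLMENTS`, `COMPLETIONS`, `cohort_key`, `completion_by_cohort`), the monthly cohort definition, the 60-day window, and the minimum cohort size are hypothetical choices made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical enrolment metadata: learner id -> enrolment date.
ENROLMENTS = {
    "a": date(2024, 1, 3), "b": date(2024, 1, 10),
    "c": date(2024, 1, 15), "d": date(2024, 1, 20),
    "e": date(2024, 3, 2), "f": date(2024, 3, 5),
    "g": date(2024, 3, 9), "h": date(2024, 3, 12),
    "i": date(2024, 5, 1),
}

# Hypothetical activity records: learner id -> completion date.
# Learners missing from this mapping never completed the course.
COMPLETIONS = {
    "a": date(2024, 1, 30), "b": date(2024, 2, 20),
    "c": date(2024, 3, 1), "e": date(2024, 3, 30),
    "f": date(2024, 6, 20), "i": date(2024, 5, 10),
}


def cohort_key(enrolled):
    """Assign a learner to a cohort by enrolment month (one possible definition)."""
    return enrolled.strftime("%Y-%m")


def completion_by_cohort(enrolments, completions, window_days=60, min_size=30):
    """Return {cohort: (size, rate)} using the same relative window per cohort.

    A completion counts only if it happened within `window_days` of that
    learner's own enrolment, so early and late intakes are compared fairly.
    Cohorts smaller than `min_size` get a rate of None instead of a noisy number.
    """
    sizes = defaultdict(int)
    completed = defaultdict(int)
    for learner, enrolled in enrolments.items():
        key = cohort_key(enrolled)
        sizes[key] += 1
        finished = completions.get(learner)
        if finished is not None and 0 <= (finished - enrolled).days <= window_days:
            completed[key] += 1
    return {
        key: (n, completed[key] / n if n >= min_size else None)
        for key, n in sorted(sizes.items())
    }


# min_size=4 only suits this toy data; a real threshold would be much larger.
for cohort, (size, rate) in completion_by_cohort(
    ENROLMENTS, COMPLETIONS, window_days=60, min_size=4
).items():
    print(cohort, size, "too small to report" if rate is None else f"{rate:.0%}")
```

With this toy data the January cohort completes at 75 % and the March cohort at 25 % within the window, while the single-learner May cohort is withheld rather than reported as 100 %. Learner `f` did eventually complete, but outside the 60-day window, which is why a fixed relative window matters: without it, older cohorts would look better simply because they have had more time.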

