Throughput counts how many items a pipeline can process each second across all users. Serving stacks raise it by batching many requests together, which lifts total volume but can add delay to any single request — the classic latency-versus-throughput trade.