A Large Language Model (LLM) is a neural network with billions of parameters trained on large-scale text data to predict the next token in a sequence; through this training it acquires the ability to generate coherent prose, answer questions, summarize text, and follow instructions. In the e-learning context, LLMs are the engine behind AI tutors, automatic quiz generation, lecture summarization, and conversational feedback on assignments. The model does not retrieve facts from a database — it encodes statistical patterns from training data into its weights, which is why outputs can be confident but wrong (hallucination). Grounding an LLM with a RAG (Retrieval-Augmented Generation) layer is the standard mitigation: relevant course passages are fetched at inference time and injected into the prompt, giving the model accurate source material to work from. LLMs are sensitive to prompt design — how a question is phrased substantially affects the quality of the response — so well-engineered system prompts and few-shot examples are part of any production AI-tutor deployment. Cost and latency are real constraints: calling a hosted LLM API for every learner message adds per-query cost and round-trip latency that affects the user experience in real time. Smaller, fine-tuned models deployed on-premises are an alternative for organizations with strict data-residency requirements.

