Transformer

Definition

The neural network architecture behind most modern AI systems, built on the attention mechanism. It powers language models, vision transformers, and most multimodal systems.

The transformer processes a sequence by letting every element attend to every other element, so it captures context regardless of how far apart two elements sit in the sequence. Introduced in 2017 for language tasks, it now underlies large language models (LLMs), vision transformers (ViTs), and generative video models, which is why one architecture name appears across the whole field.
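
To make "every element attends to every other" concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions, not a full transformer layer: the same input vectors serve as queries, keys, and values, whereas real models apply learned projections, use multiple heads, and add positional information. The names softmax, self_attention, and tokens are hypothetical and chosen only for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Simplified self-attention over a list of equal-length vectors.

    Assumption: queries, keys, and values are all the raw inputs
    (no learned projection matrices, single head).
    Returns (outputs, weights), where weights[i][j] is how strongly
    element i attends to element j.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in x:
        # Score this element against every element, near or far.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        w = softmax(scores)
        # Output is the attention-weighted average of all elements.
        out = [sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Three toy 2-dimensional "token" vectors (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of weights sums to 1 and has one entry per element in the sequence. The scores depend only on the content of the vectors, not on their positions, which is the sense in which context is captured regardless of distance.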

Also known as

transformer architecture