GGUF

Definition

A single-file format for distributing quantized language and multimodal models, popular for running them efficiently on CPUs and consumer GPUs.

GGUF packs a model's weights, metadata, and tokenizer into one file built for the llama.cpp ecosystem. It is a common way to ship compact, quantized models that run on laptops and edge devices without a heavyweight ML framework.
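
As a rough illustration of the single-file layout, the sketch below reads the fixed-size header at the start of a GGUF stream: the 4-byte magic `GGUF`, a 32-bit version, then 64-bit tensor and metadata key-value counts. It assumes a little-endian file of version 2 or later. The function name `read_gguf_header` and the sample values are hypothetical, chosen for illustration, and the sketch does not parse the metadata or tensor data that follow the header.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file


def read_gguf_header(stream):
    """Read the fixed-size GGUF header from a binary stream.

    Assumes little-endian byte order and format version >= 2,
    where both counts are stored as unsigned 64-bit integers.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")

    (version,) = struct.unpack("<I", stream.read(4))
    if version < 2:
        # Version 1 stored the counts as 32-bit values; not handled here.
        raise ValueError(f"unsupported GGUF version: {version}")

    tensor_count, metadata_kv_count = struct.unpack("<QQ", stream.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }


# Synthetic 20-byte header built in memory (hypothetical values),
# so the example runs without a real model file.
sample = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
header = read_gguf_header(io.BytesIO(sample))
print(header)
```

To inspect a real model, the same function could be called on a file opened in binary mode, for example `open(path, "rb")`, where `path` points to a local `.gguf` file.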

Also known as

GGML successor

Related terms

Model artifact

Quantization

Open-weights model