Cloud inference

Definition

Cloud inference means running models in a remote data centre reached over the network. It scales easily and can use large GPUs, but it adds round-trip delay and a per-call cost.

Cloud inference sends video or audio to a provider's servers, runs the model on powerful shared hardware, and returns the result. It removes the burden of owning GPUs but introduces network latency and a recurring bill that grows with usage.
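The trade-off described above can be put into rough numbers. The sketch below is illustrative only: the class name `CloudCall`, its fields, and every figure (80 ms round trip, 20 ms model time, $0.001 per call) are assumptions made up for the example, not measurements or any provider's actual pricing. It assumes a flat price per request and treats latency as network round trip plus model run time.

```python
from dataclasses import dataclass


@dataclass
class CloudCall:
    """Hypothetical per-request figures for a cloud inference endpoint."""
    network_rtt_ms: float      # assumed round trip to the provider and back
    inference_ms: float        # assumed model run time on the provider's GPU
    price_per_call_usd: float  # assumed flat price per request

    def latency_ms(self) -> float:
        # What the caller waits for: network delay plus model time.
        return self.network_rtt_ms + self.inference_ms

    def monthly_cost_usd(self, calls_per_month: int) -> float:
        # The recurring bill grows linearly with usage under flat pricing.
        return self.price_per_call_usd * calls_per_month


# Illustrative numbers only.
call = CloudCall(network_rtt_ms=80.0, inference_ms=20.0, price_per_call_usd=0.001)
for calls in (10_000, 100_000, 1_000_000):
    print(f"{calls:>9} calls/month: {call.latency_ms():.0f} ms per call, "
          f"${call.monthly_cost_usd(calls):,.2f} per month")
```

In this sketch the per-call latency stays the same at any volume, while the monthly bill scales with the number of calls. Real pricing is often tiered or billed by compute time, so the linear model is a simplification.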

Also known as

server-side inference

Related terms

Edge AI

GPU

Latency