Published 2026-06-01 · 24 min read · By Nikolay Sapunov, CEO at Fora Soft

Why This Matters

You have already picked a model, or you are about to, and now someone needs to answer two questions that sound simple and are not: "what will this cost us per month?" and "how long will the integration take?" Both questions have traps. The cost trap is that every vendor prices in a different made-up unit — Sora bills per second, Runway and Kling and Pika and Luma bill in credits that each convert to dollars differently, and a "700-credit plan" tells you nothing until you know what a clip costs in credits. The integration trap is the opposite: the work looks larger than it is, because all five APIs share one pattern, so a team that builds the pattern once can support all five with a thin adapter layer. This lesson is written for the engineering lead, product manager, or founder who has to scope that work and defend that budget. It builds directly on the generative-video landscape lesson, which explains what each model is and how to choose between renting and self-hosting; if you have not read it, start there, then come back here to integrate and cost the model you chose.

What "Closed-API Integration" Means In Practice

Start with the plain-language version. A closed video model lives on the vendor's servers; you never see its weights, the giant table of learned numbers that holds what the model knows. You reach it the same way you reach any web service: your code sends a request over the internet, the vendor's machines do the work, and you get a result back. Renting the result this way is called using a closed API — short for application programming interface, which is just the agreed set of messages a program sends to ask a service to do something.

The everyday analogy is ordering from a kitchen you cannot enter. You hand a written order through a window, the cooks you never see prepare the dish, and a finished plate comes back out. You do not need to know the recipe, own an oven, or hire a chef. You pay per plate. The price of that convenience is that you are at the mercy of the kitchen's menu, its prices, and its hours — and, as we will see, video kitchens are slow, so they do not hand the plate straight back. They give you a numbered ticket first.

One more term to define before we lean on it. Throughout this lesson, asynchronous — often shortened to "async" — describes any process where you ask for something and the answer arrives later, not in the same breath. A text message is asynchronous; a phone call is not. Every video API is asynchronous, and understanding why is the single most useful engineering fact in this lesson.

The One Pattern All Five APIs Share

Here is the fact that makes integrating five different vendors far less work than it sounds. You cannot call a video API the way you call a normal web service, where you send a request and get the answer back immediately. Generating a clip takes anywhere from ten seconds to several minutes — far longer than a normal web request is allowed to wait before it times out. So every video API, without exception, makes you use the same three-step asynchronous dance.

First, you submit the job: you send the prompt and the settings (resolution, length, model tier), and the API immediately hands back a job identifier — a ticket number — without the clip. Second, you wait and check: your code either keeps asking "is job 12345 done yet?" at sensible intervals — this is called polling — or it registers a web address the vendor calls the moment the clip is ready — this is called a webhook. Third, you collect the result: once the job's status reads "completed", you fetch the finished video file from a download link.

The dry-cleaner analogy from the landscape lesson holds exactly. You hand over the coat, take a numbered ticket, and either keep walking back to ask "ready yet?" (polling) or leave a phone number so they call you when it is done (webhook). Polling is simpler to build and fine for low volume; webhooks are more efficient and the right choice at scale, because they do not waste requests asking about a job that has not finished.
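The three steps can be sketched in a few lines of Python. This is a minimal sketch, not any vendor's SDK: `submit`, `get_status`, and `fetch` are hypothetical callables standing in for a vendor's three endpoints, the status strings `completed` and `failed` are placeholders that differ per vendor, and the 5-second interval and 10-minute timeout are assumed defaults.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], str],
    job_id: str,
    interval_s: float = 5.0,      # assumed default, tune per vendor
    timeout_s: float = 600.0,     # assumed ceiling on total waiting
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Step 2: poll until the job reaches a terminal status or times out."""
    waited = 0.0
    while True:
        status = get_status(job_id)
        if status in ("completed", "failed"):   # placeholder terminal states
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


def generate(submit, get_status, fetch, prompt: str, sleep=time.sleep) -> bytes:
    """Run submit, wait, collect against any vendor exposing the three calls."""
    job_id = submit(prompt)                                   # step 1: get a ticket
    status = wait_for_job(get_status, job_id, sleep=sleep)    # step 2: wait and check
    if status != "completed":
        raise RuntimeError(f"job {job_id} ended as {status!r}")
    return fetch(job_id)                                      # step 3: collect the file
```

A webhook version would replace `wait_for_job` with an HTTP handler that the vendor calls when the clip is ready; steps 1 and 3 stay the same.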

[Image: Diagram comparing the asynchronous API call sequence across five generative-video vendors. The center shows one shared three-step pattern as a vertical spine: step one 'Submit job — POST a prompt + settings, receive a job ID instantly'; step two 'Wait — poll the status endpoint every 5 to 20 seconds, or register a webhook to be called when done'; step three 'Collect — once status is completed, fetch the finished MP4 from a download URL'. To the left and right, five labelled columns map each vendor's real endpoints onto the same three steps: OpenAI Sora shows 'POST /v1/videos', 'GET /v1/videos/{id}', 'GET /v1/videos/{id}/content'; Runway shows 'POST task', 'GET /v1/tasks/{id} (poll 5s+ with jitter)', 'download outputs'; Kling, Pika, and Luma show their queue-submit, queue-status, and result-fetch steps, most reached through the fal.ai or Replicate aggregator. A caption beneath reads: build the three-step pattern once behind a thin adapter; swapping vendors becomes a configuration change, not a rewrite.] Figure 1. The one asynchronous pattern every video API forces on you, with each vendor's real endpoints mapped onto the same three steps. The shape never changes; only the URLs do.

The practical consequence is the whole reason this lesson groups five vendors together: because the shape is identical, you build it once. Put a thin internal layer — an adapter — in front of the vendors, exposing your own "make a video" function to the rest of your product. Behind that function, each vendor is a small module that knows its own submit, poll, and fetch URLs. Swapping Sora for Kling, or failing over from a model that is down to one that is up, becomes a configuration change rather than a rewrite. This is not gold-plating; it is the difference between a feature that survives the next model release and one that breaks on it. The Sora API alone is scheduled to shut down on 24 September 2026, which means any team that hard-wired Sora's endpoints into their product is already facing a forced migration. A team that put Sora behind an adapter changes one config value.
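One way the adapter could look, as a hedged sketch: the `VideoVendor` interface, the `FakeVendor` stand-in, the `VENDORS` registry, and the `active_vendor` setting are all hypothetical names invented for illustration. A real vendor module would make HTTP calls where the fake returns canned values, and would poll or use a webhook instead of checking status once.

```python
from typing import Protocol


class VideoVendor(Protocol):
    """What every vendor module must expose to the adapter."""
    def submit(self, prompt: str, seconds: int) -> str: ...
    def status(self, job_id: str) -> str: ...
    def fetch(self, job_id: str) -> bytes: ...


class FakeVendor:
    """Stand-in for a real vendor module; a real one would make HTTP calls."""
    def __init__(self, name: str) -> None:
        self.name = name

    def submit(self, prompt: str, seconds: int) -> str:
        return f"{self.name}-job-1"

    def status(self, job_id: str) -> str:
        return "completed"

    def fetch(self, job_id: str) -> bytes:
        return f"clip from {self.name}".encode()


VENDORS: dict[str, VideoVendor] = {
    "sora": FakeVendor("sora"),
    "kling": FakeVendor("kling"),
}


def generate_video(prompt: str, seconds: int, active_vendor: str) -> bytes:
    """The only function the rest of the product calls."""
    vendor = VENDORS[active_vendor]           # the config value picks the module
    job_id = vendor.submit(prompt, seconds)
    if vendor.status(job_id) != "completed":  # real code would poll or use a webhook here
        raise RuntimeError(f"{active_vendor} job {job_id} did not complete")
    return vendor.fetch(job_id)
```

With this shape, moving from Sora to Kling means changing the `active_vendor` value; nothing that calls `generate_video` needs to know.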

Two gates belong in this pattern and are not optional in 2026. Before you submit, moderate the prompt — check the user's request against a safety filter, because once you submit the job you have paid for it whether the result is usable or not. After you collect, preserve the provenance label — the "made by AI" marking the vendor attaches — because re-encoding or editing the clip can strip it, and from August 2026 that label is legally required for European users. We cover both in the landscape lesson and in depth in the quality, cost, and disclosure lesson; here, just know the gates clip onto the pattern at submit-time and collect-time.
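The placement of the two gates can be sketched as a wrapper around the generate call. Everything here is an assumption for illustration: `is_prompt_allowed` stands in for whatever safety filter you use, `has_provenance_label` for whatever check confirms the AI marking survived, and `generate` for the submit-poll-collect pipeline.

```python
from typing import Callable


def generate_with_gates(
    prompt: str,
    is_prompt_allowed: Callable[[str], bool],
    generate: Callable[[str], bytes],
    has_provenance_label: Callable[[bytes], bool],
) -> bytes:
    # Gate 1, submit-time: reject before the vendor bills for the job.
    if not is_prompt_allowed(prompt):
        raise ValueError("prompt rejected by moderation; no job was submitted")
    clip = generate(prompt)
    # Gate 2, collect-time: refuse to pass on a clip that lost its AI label.
    if not has_provenance_label(clip):
        raise RuntimeError("provenance label missing from collected clip")
    return clip
```

The ordering is the point: a rejected prompt never reaches the vendor, so it never costs anything.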

The Pricing Problem: Every Vendor Speaks A Different Currency

Now the hard part, and the reason most "AI video pricing" articles leave you more confused than before you read them. Each vendor invented its own pricing unit, and the units do not convert to each other on their face. Sora bills you per second of generated video, directly in dollars. Runway, Kling, Pika, and Luma all bill in credits — an internal token you buy in bulk — but a credit is worth a different amount of video at each vendor, and even within one vendor a credit buys less video at higher resolution, longer duration, or with audio turned on. A "700-credit plan" is meaningless until you know how many credits one clip costs.

The only honest way to compare is to convert everything to one unit. We use dollars per second of finished 1080p video, because it normalizes away the different clip lengths and the different credit values, and because 1080p is the resolution most products actually ship. Do the conversion once per vendor and the picture snaps into focus. Below we walk each of the five, give the real mid-2026 numbers, and show the arithmetic out loud so you can re-do it when the prices change — and they will, monthly.

A note on the credit math before we start. When a vendor sells credits, find two numbers: the dollar value of one credit (how much you pay per credit, usually visible in the per-month plan price divided by the credits it includes) and the credits per second of 1080p output (from the vendor's cost table). Multiply them and you have dollars per second. That is the entire trick, and it works for all four credit-based vendors.
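Written as a function, the conversion is two lines. The function name and the example plan (20 dollars for 1,000 credits, 10 credits per second) are made up for illustration and do not describe any real vendor.

```python
def dollars_per_second(plan_price_usd: float, plan_credits: int,
                       credits_per_second: float) -> float:
    """Convert a credit-priced vendor to dollars per second of output."""
    dollars_per_credit = plan_price_usd / plan_credits
    return dollars_per_credit * credits_per_second


# Hypothetical plan: $20 buys 1,000 credits, and 1080p costs 10 credits per second.
rate = dollars_per_second(20.0, 1000, 10)
print(f"${rate:.2f} per second")
```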

Sora (OpenAI) — Priced Per Second, So The Math Is Easy

Sora is the simplest to cost because OpenAI bills in plain dollars per second, no credits to decode. As of mid-2026, the standard Sora 2 model costs about ten cents per second of 720p video. The higher-quality Sora 2 Pro model costs about thirty cents per second at 720p, rising with resolution — roughly fifty cents per second at 1024p and seventy cents per second at true 1080p. There is also a Batch tier at roughly half the standard price, in exchange for up to a 24-hour wait, useful for non-urgent bulk generation like overnight B-roll rendering.

Let us do one second-to-clip conversion out loud, because it is the pattern you will repeat for every vendor:

Sora 2 Pro, 1080p, one 10-second clip:
  cost = 10 seconds × $0.70 per second
       = $7.00 per clip

Seven dollars for a single ten-second premium clip is the number that should anchor your expectations — Sora 2 Pro at 1080p is the most expensive option in this lesson, and you would only reach for it when top-tier coherence and dialogue matter more than cost. The standard Sora 2 at ten cents a second is far gentler: a 720p eight-second clip is eighty cents.
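The same lookup-and-multiply can be kept in code so it is easy to update when prices move. This sketch uses only the mid-2026 rates quoted above, stored in whole cents to avoid floating-point drift; the dictionary keys and the function name are my own labels, not OpenAI identifiers.

```python
# Cents per second, as quoted in this lesson for mid-2026. Re-verify before use.
SORA_CENTS_PER_SECOND = {
    ("sora-2", "720p"): 10,
    ("sora-2-pro", "720p"): 30,
    ("sora-2-pro", "1024p"): 50,
    ("sora-2-pro", "1080p"): 70,
}


def sora_clip_cost_usd(model: str, resolution: str, seconds: int) -> float:
    """Cost of one clip: seconds times the per-second rate."""
    cents = SORA_CENTS_PER_SECOND[(model, resolution)] * seconds
    return cents / 100


print(sora_clip_cost_usd("sora-2-pro", "1080p", 10))  # the ten-second premium clip
print(sora_clip_cost_usd("sora-2", "720p", 8))        # the eight-second standard clip
```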

The integration is the canonical async pattern. You POST /v1/videos with your prompt, size, and seconds, and receive a job object with an id and a status of queued. You then either poll GET /v1/videos/{id} until the status reads completed, or register a webhook to be notified, and finally fetch the MP4 from GET /v1/videos/{id}/content. The states you will see are queued, in_progress, completed, and failed — design your code to handle all four.
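Handling all four states explicitly might look like the sketch below. The paths and state names are the ones given above; the base URL and the helper names are assumptions, and no request is actually sent here.

```python
SORA_BASE = "https://api.openai.com"  # assumed base URL; the paths come from this lesson


def status_url(video_id: str) -> str:
    return f"{SORA_BASE}/v1/videos/{video_id}"


def content_url(video_id: str) -> str:
    return f"{SORA_BASE}/v1/videos/{video_id}/content"


def next_step(status: str) -> str:
    """Map each of the four job states to what the caller should do next."""
    if status in ("queued", "in_progress"):
        return "wait"    # keep polling, or wait for the webhook
    if status == "completed":
        return "fetch"   # download the MP4 from content_url()
    if status == "failed":
        return "abort"   # surface the error; do not retry blindly
    raise ValueError(f"unexpected status {status!r}")
```

Raising on an unknown status is a deliberate choice in this sketch: if the vendor adds a state, the code fails loudly instead of polling forever.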

The one thing you must build into a Sora integration from day one is the model swap, because of that 24 September 2026 sunset date. Do not treat Sora as permanent infrastructure; treat it as one interchangeable module behind your adapter. When OpenAI's replacement endpoint ships, you change a config value, not your product.

Runway — Credits At One Cent Each, Best For Editing Footage

Runway prices in credits, and helpfully, in the developer API a credit is worth exactly one cent. That makes Runway's API the easiest credit-based vendor to reason about: multiply credits by $0.01 and you have dollars. From the published cost table, Gen-4 video generation costs about 12 credits per second, and the faster, lighter Gen-4 Turbo costs about 5 credits per second. Convert:

Runway Gen-4, one second of video (API):
  cost = 12 credits × $0.01 per credit
       = $0.12 per second

Runway Gen-4 Turbo, one second:
  cost = 5 credits × $0.01 = $0.05 per second

So a ten-second Gen-4 clip is about $1.20 through the API, and a Turbo clip about fifty cents. Those are mid-range prices — more than budget vendors, well below Sora 2 Pro.

Runway also sells consumer subscriptions, and the distinction matters when you scope a project. The subscription plans — Standard at about fifteen dollars a month for 625 credits, Pro at about thirty-five dollars for 2,250 credits, Unlimited at about ninety-five dollars — are billed per seat for people using Runway's web app, and their credits do not power the developer API. The API has its own organization-level billing where you buy credits at a cent each and move up usage tiers automatically as your 30-day spend grows. Build on the API billing, not the seat plans, for anything programmatic.

Runway's distinguishing strength, covered in the landscape lesson, is editing existing footage rather than generating from scratch — relighting a scene, removing an object, restyling a clip. If that is your use case, Runway is usually the first API to try, and its transparent one-cent credit makes the cost trivial to forecast. The integration is, again, the async pattern: you start a task, poll GET /v1/tasks/{id} — Runway recommends a polling interval of five seconds or more, with jitter and exponential backoff on errors — until the status reads SUCCEEDED, FAILED, or CANCELED, then download the outputs.
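The polling guidance (five seconds or more, jitter, exponential backoff on errors) can be sketched as two delay functions. The one-second jitter window and the 60-second cap are assumed values, and the function names are hypothetical; this is not taken from Runway's SDK.

```python
import random
from typing import Callable


def poll_delay(base_s: float = 5.0, jitter_s: float = 1.0,
               rng: Callable[[], float] = random.random) -> float:
    """Delay before a routine status check: the base interval plus random jitter."""
    return base_s + rng() * jitter_s


def error_delay(consecutive_errors: int, base_s: float = 5.0, cap_s: float = 60.0,
                jitter_s: float = 1.0,
                rng: Callable[[], float] = random.random) -> float:
    """Delay after failed checks: doubles per consecutive error, capped, plus jitter."""
    backoff = min(cap_s, base_s * 2 ** consecutive_errors)
    return backoff + rng() * jitter_s
```

Jitter matters once many jobs are in flight: without it, every poller wakes on the same five-second boundary and hits the status endpoint at once.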

Kling (Kuaishou) — Credits That Change Value With Audio And Resolution

Kling is where the credit trap bites hardest, because its credit cost per second changes with both resolution and whether you turn on audio. From the Kling 3.0 cost table, generation ranges from about 6 credits per second at 720p with no audio to about 12 credits per second at 1080p with native audio. Through API aggregators, the real-money price lands around eight to fourteen cents per second depending on the tier. For example, one aggregator lists Kling at roughly $0.084 per second on its standard tier and $0.112 per second on its pro tier, both without audio.

Work a realistic example, because Kling's own marketing quotes credits and you need dollars:

Kling 3.0, 1080p + native audio, one 10-second clip, 3 iterations:
  per-clip credits ≈ 12 credits/sec × 10 sec = 120 credits
  with 3 iterations to get a keeper ≈ 360 credits

That 360-credit figure matters because Kling's consumer plans are sized in credits: Standard at about seven dollars a month gives 660 credits, Pro at about twenty-six dollars gives 3,000, and the top Ultra plan at about a hundred and eighty dollars gives 26,000. A single polished ten-second clip eating 360 credits means the seven-dollar plan yields fewer than two finished clips a month — the kind of arithmetic that surprises a product owner who saw "660 credits" and pictured hundreds of videos.
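The plan arithmetic generalizes to all three tiers. This sketch uses only the credit figures quoted above; the function names are my own, and three iterations per keeper is the working assumption from the example.

```python
KLING_PLAN_CREDITS = {"Standard": 660, "Pro": 3000, "Ultra": 26000}  # credits per month


def credits_per_keeper(credits_per_second: int, seconds: int, iterations: int) -> int:
    """Credits spent to end up with one clip worth keeping."""
    return credits_per_second * seconds * iterations


def keepers_per_month(plan_credits: int, keeper_credits: int) -> int:
    """Whole finished clips a plan's monthly credits can pay for."""
    return plan_credits // keeper_credits


keeper = credits_per_keeper(12, 10, 3)  # 1080p + audio, 10 seconds, 3 tries
for plan, credits in KLING_PLAN_CREDITS.items():
    print(plan, keepers_per_month(credits, keeper))
```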

Kling's quality is genuinely top-tier on blind-preference leaderboards, and it has features no one else matches, including Motion Brush motion control and multi-character lip-sync. The integration trade-off, as the landscape lesson notes, is that Kling is a China-based vendor usually reached through a third-party aggregator rather than a polished first-party developer portal, which raises data-residency diligence. The async pattern is identical; only the submit and status URLs differ.

Pika — The Budget, High-Volume Option

Pika is the value play for products that generate many drafts and keep few. Its consumer plans run from a free Basic tier (80 credits a month, 480p, watermarked, no commercial use) up through Standard at about ten dollars a month for 700 credits with all resolutions and commercial rights, Pro at about thirty-five dollars for 2,300 credits, and Fancy at about ninety-five dollars for 6,000. A single ten-second 1080p clip costs roughly 80 credits. Convert using the Standard plan's credit value:

Pika credit value (Standard plan):
  $10 per month ÷ 700 credits = $0.0143 per credit

One 10-second 1080p clip:
  80 credits × $0.0143 ≈ $1.14 worth of plan credits

For programmatic use, Pika exposes an API through aggregators such as fal.ai; published figures put a batch of 100 clips at 1080p at roughly forty-five dollars, which lands near forty-five cents per clip — cheaper per clip than the consumer-plan credit math above, because aggregator pricing is usage-based rather than plan-bundled. Pika is not chasing the quality crown; it is chasing speed and price for use cases — social content, ad-variant testing, concept proofs — where you generate fifty candidates to publish three. When that is your workflow, paying tens of cents per draft instead of several dollars changes the economics of the whole feature.
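The "generate fifty, publish three" economics are easy to check. This sketch compares the roughly 45-cent aggregator draft quoted above with the seven-dollar Sora 2 Pro clip from earlier in the lesson; the function name is hypothetical, and the 50-to-3 ratio is the illustrative workflow from the paragraph above, not a measured figure.

```python
def cost_per_published_usd(cents_per_draft: int, drafts: int, published: int) -> float:
    """Total draft spend divided by the clips that actually ship."""
    return cents_per_draft * drafts / published / 100


budget_draft = cost_per_published_usd(45, 50, 3)     # ~45-cent aggregator draft
premium_draft = cost_per_published_usd(700, 50, 3)   # $7.00 premium draft
print(f"${budget_draft:.2f} vs ${premium_draft:.2f} per published clip")
```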

Luma — Credits With A Painful HDR Premium

Luma's Dream Machine, running the Ray3 and newer Ray3.14 models, prices in credits whose value depends sharply on resolution and, above all, on high-dynamic-range output. High dynamic range, or HDR, means a much wider span between the darkest and brightest parts of the picture, closer to what a film camera captures — and Luma is the model to reach for when AI clips must sit next to professionally shot footage in a color-grading pipeline. But that capability carries a steep credit premium, and you must see it to scope a budget.

From Luma's cost table, Ray3 at 1080p standard dynamic range costs 330 credits for a 5-second clip; the same model at 540p with HDR plus EXR export costs 1,120 credits for 5 seconds. That is over three times as much for a lower resolution, because the HDR pipeline is the expensive part, not the pixel count. The newer Ray3.14 lists 400 credits for a 5-second 1080p SDR clip, which is in the same range as Ray3's SDR price and far below the HDR/EXR path. The arithmetic to internalize:

Luma Ray3, 540p HDR + EXR, one 5-second clip:
  = 1,120 credits  (the premium pro-pipeline path)

Luma Ray3.14, 1080p SDR, one 5-second clip:
  = 400 credits    (the everyday path — far cheaper)

The lesson in those two lines: at Luma, the HDR/EXR professional path costs roughly three times the standard path, so only pay for it when a colorist genuinely needs it. For a plain social clip, HDR is wasted credits. Luma also runs a separate Dream Machine API with its own billing — consumer credits do not transfer to the API — starting around thirty-two cents per generation. As with the others, the integration is the async submit-poll-collect pattern.
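To see where "roughly three times" comes from, divide the premium path by each standard path. The sketch uses only the three credit figures quoted above; the dictionary keys and the function name are labels chosen for illustration.

```python
# Credits per 5-second clip, as quoted in this lesson.
LUMA_CREDITS_PER_5S = {
    ("ray3", "1080p", "sdr"): 330,
    ("ray3", "540p", "hdr+exr"): 1120,
    ("ray3.14", "1080p", "sdr"): 400,
}


def premium_ratio(premium_key: tuple, baseline_key: tuple) -> float:
    """How many times more credits the premium path costs than the baseline."""
    return LUMA_CREDITS_PER_5S[premium_key] / LUMA_CREDITS_PER_5S[baseline_key]


hdr = ("ray3", "540p", "hdr+exr")
print(round(premium_ratio(hdr, ("ray3", "1080p", "sdr")), 2))
print(round(premium_ratio(hdr, ("ray3.14", "1080p", "sdr")), 2))
```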

The Five At A Glance — Converted To One Honest Unit

Put all five vendors in one table, every price converted to the same unit, and the comparison finally becomes fair. The figures are mid-2026 snapshots and move often; re-verify before you commit budget. The right-hand "Best fit" column is the one to act on.

| Vendor | Pricing unit | ≈ $/sec of 1080p video | API integration | Best fit |
| --- | --- | --- | --- | --- |
| Sora 2 (OpenAI) | per second (USD) | ~$0.10 (720p std) | POST /v1/videos → poll/webhook → fetch content | Coherent scenes + dialogue; brand recognition |
| Sora 2 Pro (OpenAI) | per second (USD) | ~$0.70 (1080p) | same as Sora 2 | Top-tier quality when cost is secondary |
| Runway Gen-4 | credits @ $0.01 | ~$0.12 ($0.05 Turbo) | POST task → poll GET /v1/tasks/{id} → download | Editing existing footage (V2V); transparent cost |
| Kling 3.0 | credits (resolution/audio-scaled) | ~$0.08–$0.14 | aggregator submit → poll → fetch | Top blind-vote quality; motion control; lip-sync |
| Pika 2.5 | credits (~$0.014 each) | ~$0.05–$0.11 (aggregator) | aggregator (fal.ai) submit → poll → fetch | High-volume drafts at the lowest price |
| Luma Ray3 / Ray3.14 | credits (HDR premium) | SDR moderate; HDR/EXR ~3× | Dream Machine API submit → poll → fetch | Pro color pipelines needing HDR/EXR |

Table 1. Five closed video APIs, every price converted to dollars per second of 1080p output so they compare honestly. The integration column is nearly identical across vendors — that is the point. Match the "best fit" column to your use case; the per-second column tells you what it costs once you have.

A Common Mistake: Comparing Subscription Plans Instead Of API Costs

Here is the pitfall that wrecks more cost estimates than any other in this category. A product team reads the vendors' pricing pages, sees "$15/month" next to "$95/month", and builds a budget around the cheap-looking subscription. Then they ship, and the bill is nothing like the estimate — because the consumer subscription and the developer API are two different products with two different prices, and a programmatic integration uses the API.

The subscription plans are sized for individual humans clicking buttons in a web app: a fixed monthly fee, a fixed credit allowance, and a watermark or commercial-use restriction on the cheaper tiers. They are irrelevant to a product that calls the API thousands of times a month on behalf of its users. The API is billed differently — usually usage-based, per second or per credit consumed, with no monthly seat fee but no ceiling either. Runway makes this explicit: its API credits, bought at a cent each under organization billing, are a separate pool from the seat-plan credits. Luma states outright that Dream Machine consumer credits do not transfer to the Dream Machine API.

The fix is a discipline: when you scope cost, ignore the subscription page entirely and find the API pricing page. Convert to dollars per second of the resolution you will actually ship. Then multiply by your real volume — including the regenerations users need, which the landscape lesson shows typically triples the naive estimate. A budget built on the subscription sticker price is a budget built on the wrong product.
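The budgeting discipline reduces to one multiplication. In this sketch the volume (1,000 clips a month at 10 seconds each) is a hypothetical workload, the 12-cent rate is the Runway Gen-4 figure from earlier, and `regenerations` counts total attempts per finished clip.

```python
def monthly_cost_usd(clips: int, seconds_per_clip: int,
                     cents_per_second: float, regenerations: int) -> float:
    """Monthly spend = clips x seconds x rate x attempts per finished clip."""
    return clips * seconds_per_clip * cents_per_second * regenerations / 100


naive = monthly_cost_usd(1000, 10, 12, 1)      # assumes every first clip is kept
realistic = monthly_cost_usd(1000, 10, 12, 3)  # three attempts per keeper
print(f"naive ${naive:,.0f}, realistic ${realistic:,.0f}")
```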

Putting It Together — One Adapter, Five Vendors, A Predictable Bill

Pull the two halves of this lesson together and the engineering plan writes itself. The integration is one async pattern behind one adapter: a "generate video" function your product calls, with five small vendor modules behind it, each knowing its own submit, poll, and fetch URLs and its own credit-to-dollar conversion. The cost is one calculation you repeat per vendor (dollars per credit times credits per second, or just dollars per second for Sora), converted to a common unit and multiplied by real volume.

[Image: Decision and cost flow for routing a generative-video request across vendors behind one adapter. On the left, a box labelled 'Your product calls generateVideo()'. An arrow leads to a central diamond decision node 'Route by hard constraint': branches read 'editing existing footage -> Runway', 'needs HDR/EXR for color grade -> Luma', 'top quality + motion control -> Kling', 'high-volume cheap drafts -> Pika', 'coherent dialogue + brand -> Sora'. All branches converge into a single box 'Adapter: submit -> poll or webhook -> fetch', which sits above a cost ledger strip listing each vendor's converted dollars-per-second-of-1080p rate. A final arrow leads to 'Monthly cost = clips x seconds x $/sec x regenerations, capped by a hard spend ceiling'. A caption beneath reads: the routing decision is the only vendor-specific logic; the submit-poll-fetch pipeline and the cost formula are shared.] Figure 2. One adapter routes each request to the right vendor by hard constraint, then runs the same submit-poll-fetch pipeline. The cost formula is identical across vendors once each rate is converted to dollars per second.

Two design choices make this durable. First, set a hard spend ceiling — a monthly dollar cap and a maximum-regenerations-per-job limit — so a runaway feature or an abusive user cannot blow the launch budget. Video generation is iterative and users rarely keep the first clip; without a cap, the bill grows faster than anyone expects. Second, keep the routing logic — "editing footage goes to Runway, HDR goes to Luma" — as the only vendor-specific code in your product. Everything downstream of the routing decision, the submit-poll-fetch pipeline and the cost accounting, is shared. That separation is what lets you add a sixth vendor, or drop the one that just sunset its API, in an afternoon. We work the full cost model, including how to set and enforce the ceiling, in the real-cost-of-AI lesson.
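Both design choices fit in a short sketch. The `ROUTES` keys, the `SpendGuard` class, and its field names are hypothetical; the routing pairs follow the hard constraints named in this lesson, and the guard counts every attempt on a job (the first try included) against the per-job limit.

```python
from dataclasses import dataclass, field

# The only vendor-specific logic: hard constraint -> vendor.
ROUTES = {
    "edit_footage": "runway",
    "hdr_exr": "luma",
    "motion_control": "kling",
    "cheap_drafts": "pika",
    "dialogue": "sora",
}


def route(constraint: str) -> str:
    return ROUTES[constraint]


@dataclass
class SpendGuard:
    monthly_cap_cents: int
    max_attempts_per_job: int
    spent_cents: int = 0
    attempts: dict = field(default_factory=dict)

    def authorize(self, job_key: str, cost_cents: int) -> bool:
        """Approve and record a generation only if both limits still hold."""
        used = self.attempts.get(job_key, 0)
        if used >= self.max_attempts_per_job:
            return False   # per-job regeneration limit reached
        if self.spent_cents + cost_cents > self.monthly_cap_cents:
            return False   # would exceed the monthly ceiling
        self.attempts[job_key] = used + 1
        self.spent_cents += cost_cents
        return True
```

Calling `authorize` before every submit keeps the check in the shared pipeline, so a newly added vendor inherits the ceiling without extra code.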

Where Fora Soft Fits In

We build video products across OTT and Internet TV, video conferencing, e-learning, telemedicine, surveillance, and AR/VR, and in most of them the question is not "which model" but "how do we wire one in without the bill or the integration surprising us." The work that separates a demo from a product is the adapter layer, the per-vendor cost conversion, the spend ceiling, and the moderation and provenance gates — not the model call itself, which is a few lines. We treat the vendor as the swappable part and the plumbing around it as the engineering, because that is what survives the next price change or API sunset. A team that builds it this way spends an afternoon migrating when a vendor retires an endpoint; a team that hard-wired one vendor spends a sprint.

Talk To Us / See Our Work / Download

  • Talk to a video engineer — bring your chosen model and your expected volume, and we will scope the adapter and tell you the monthly cost before you write code.
  • See our case studies — real video products we have shipped across OTT, conferencing, e-learning, telemedicine, and surveillance.
  • Download the Closed-API Pricing & Integration Cheat Sheet — a one-page reference with every vendor's converted dollars-per-second rate, the shared async pattern, the credit-to-dollar formula, and the integration checklist.

References

  1. OpenAI. "Video generation with Sora" and "Sora 2 Model" — Videos API endpoints (POST /v1/videos, GET /v1/videos/{id}, /content), job states, per-second pricing (Sora 2 ~$0.10/s; Sora 2 Pro ~$0.30–$0.70/s by resolution), Batch tier, and the 24 September 2026 sunset of the Videos API and sora-2 models. developers.openai.com/api/docs/guides/video-generation. Accessed 2026-06-01. (First-party vendor.)
  2. OpenAI. "Pricing — OpenAI API" — Sora 2 / Sora 2 Pro per-second rates and Batch tier discount. developers.openai.com/api/docs/pricing. Accessed 2026-06-01. (First-party vendor.)
  3. Runway. "API Pricing & Costs" — credits priced at $0.01 each in organization billing; Gen-4 ~12 credits/s, Gen-4 Turbo ~5 credits/s; subscription seat plans (Standard ~$15, Pro ~$35, Unlimited ~$95) separate from API billing. docs.dev.runwayml.com/guides/pricing. Accessed 2026-06-01. (First-party vendor.)
  4. Runway. "API Getting Started Guide" and "API Reference" — asynchronous task model: start task, poll GET /v1/tasks/{id} (interval ≥5 s, jitter, exponential backoff), SUCCEEDED / FAILED / CANCELED states, download outputs. docs.dev.runwayml.com. Accessed 2026-06-01. (First-party vendor.)
  5. Kuaishou / Kling AI. "Kling AI Pricing" and Kling 3.0 release notes — credit plans (Standard ~$6.99 / 660 cr, Pro ~$25.99 / 3,000 cr, Premier ~$64.99 / 8,000 cr, Ultra ~$180 / 26,000 cr); Kling 3.0 ~6 credits/s (720p, no audio) to ~12 credits/s (1080p + audio); Motion Brush motion control, multi-character lip-sync. klingapi.com/pricing. Accessed 2026-06-01. (First-party vendor.)
  6. fal.ai. "GenAI API Pricing" and "Asynchronous Inference" — queue-backed submit/poll/webhook model; per-second model output pricing (e.g., Kling standard ~$0.084/s, pro ~$0.112/s without audio; Wan 2.5 ~$0.05/s; Veo 3 ~$0.40/s); GPU-second billing (H100 ~$1.89/h). fal.ai/pricing; fal.ai/docs/documentation/model-apis/inference/queue. Accessed 2026-06-01. (Aggregator / first-party.)
  7. Pika Labs. "Subscription Pricing" — Basic (free, 80 cr, 480p, watermark), Standard ~$10/mo (700 cr, all resolutions, commercial), Pro ~$35/mo (2,300 cr), Fancy ~$95/mo (6,000 cr); ~80 credits per 10-second 1080p clip; API via aggregators (~$45 per 100 1080p clips). pika.art/pricing. Accessed 2026-06-01. (First-party vendor.)
  8. Luma AI. "Dream Machine Plans: Pricing and Credits" and "Dream Machine API prices" — Ray3 credit table (1080p SDR 330 cr / 5 s; 540p HDR+EXR 1,120 cr / 5 s); Ray3.14 cheaper SDR (1080p 400 cr / 5 s); separate Dream Machine API billing from ~$0.32 per generation; consumer credits do not transfer to API. lumalabs.ai/learning-hub; lumaai-help.freshdesk.com. Accessed 2026-06-01. (First-party vendor.)
  9. Replicate. Model API documentation — versioned per-model endpoints, predictable per-second GPU billing, prediction create/get polling model. replicate.com/docs. Accessed 2026-06-01. (Aggregator / first-party.)
  10. C2PA. "Content Credentials" technical specification — signed, tamper-evident provenance manifest model, referenced as the marking method for AI-generated output your pipeline must preserve. c2pa.org/specifications. Accessed 2026-06-01. (Official standard.)
  11. European Union. Regulation (EU) 2024/1689 (AI Act), Article 50 — transparency obligations for AI-generated and manipulated content, in force 2 August 2026; relevant to the provenance-preservation gate in the integration pipeline. eur-lex.europa.eu/eli/reg/2024/1689/oj. Accessed 2026-06-01. (Official regulation.)