LTX Models (LTX-2, LTX-2.3) vs Competitors

Compare LTX Models to other top video generation models including pricing, features, and workflows.

Wan 2.2
HunyuanVideo 1.5
Sora 2
Veo 3.1
Kling 3.0
Seedance 2.0
Runway
Luma Ray 3
CogVideoX 1.5

LTX-2.3 vs Wan

LTX delivers native 4K at up to 50 fps, audio-to-video, and on-prem deployment, with API pricing from $0.04/sec (Fast, 1080p), and is built for production pipelines in 2026. Wan 2.2 is an open-source research model that tops out at 720p and is designed for experimentation, not enterprise output.

LTX-2.3
Wan 2.2

Developer

Lightricks
Alibaba (Wan-AI)

Parameters

22B
27B MoE (14B active)

Open Source

Yes
Yes

On-Prem

Yes (self-host)
Yes (self-host)

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (720p max)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
~5 sec (81 frames @ 16fps)

Frame Rate (fps)

Up to 50 fps
16–24 fps

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
~1–2 min (cloud API)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
~$0.08/sec (fal.ai, 720p A14B)

Free Access

Yes – open-source + free Desktop app
Yes – self-host open weights

Subscription Plans
(non-API access)

Free (self-host & Desktop)
Free (self-host)

CAPABILITIES

Text-to-Video

Yes
Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
No

HDR Output

Yes
No

Extend

Yes
No

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
Yes – via Speech-to-Video 14B variant

Multi-modal Inputs
(text + image + audio + video)

All four
Text + Image + Audio (via S2V variant)

Motion Control

Yes – full control
Limited

Character Consistency

Yes – via LoRA fine-tuning
Yes – via LoRA

Content Moderation / Limits

No limits (open source)
No limits (open source)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
Yes – LoRA

Fully Customizable

Yes
Yes

Runs on Consumer-Grade GPUs

Yes
Yes

ComfyUI / Diffusers Support

Yes
Yes

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Teams self-hosting open-source models on their own infrastructure
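
The frame-count and per-second figures in the table above can be turned into a clip length and a per-clip API cost with simple arithmetic. The sketch below is illustrative only: the helper names are made up for this example, the rates are the approximate ones quoted in the table, and the two prices apply to different resolutions (720p for Wan 2.2 via fal.ai, 1080p for LTX Fast), so this is not a like-for-like benchmark.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Duration of a clip in seconds, given its frame count and frame rate."""
    return frames / fps


def api_cost(seconds: float, usd_per_second: float) -> float:
    """Cost of a clip when billing is per second of generated video."""
    return round(seconds * usd_per_second, 4)


# Wan 2.2 row: 81 frames at 16 fps (figures taken from the table above).
wan_seconds = clip_seconds(81, 16)

# Rates quoted in the table; the Wan figure is approximate (fal.ai, 720p A14B).
wan_cost = api_cost(wan_seconds, 0.08)
ltx_cost = api_cost(wan_seconds, 0.04)  # LTX Fast, 1080p

print(f"Clip length: {wan_seconds:.2f} s")
print(f"Wan 2.2 via fal.ai (720p): ${wan_cost:.3f}")
print(f"LTX Fast (1080p):          ${ltx_cost:.3f}")
```

The same two helpers apply to any row that states a frame count and frame rate, such as the 161 frames at 16 fps listed for CogVideoX 1.5.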

LTX-2.3 vs HunyuanVideo

LTX is the enterprise video standard for 2026: native 4K, audio-to-video, on-prem deployment, and a production API from $0.04/sec (Fast, 1080p). HunyuanVideo 1.5 tops out at 720p native (1080p only via upscaling) and lacks the speed and infrastructure readiness production teams require.

LTX-2.3
HunyuanVideo 1.5

Developer

Lightricks
Tencent

Parameters

22B
8.3B

Open Source

Yes
Yes

On-Prem

Yes (self-host)
Yes (self-host)

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (720p native; 1080p upscaled)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
~5 sec (85–129 frames)

Frame Rate (fps)

Up to 50 fps
24 fps

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
~1–2 min (H100 optimized)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
~$0.075/sec (fal.ai, 720p)

Free Access

Yes – open-source + free Desktop app
Yes – self-host open weights

Subscription Plans
(non-API access)

Free (self-host & Desktop)
Free (self-host)

CAPABILITIES

Text-to-Video

Yes
Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
No

HDR Output

Yes
No

Extend

Yes
No

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
No

Multi-modal Inputs
(text + image + audio + video)

All four
Text + Image

Motion Control

Yes – full control
Limited

Character Consistency

Yes – via LoRA fine-tuning
Limited

Content Moderation / Limits

No limits (open source)
No limits (open source)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
Yes – LoRA

Fully Customizable

Yes
Yes

Runs on Consumer-Grade GPUs

Yes
Yes

ComfyUI / Diffusers Support

Yes
Yes

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Developers running open-source video generation on consumer GPUs

LTX-2.3 vs Sora

LTX gives enterprise teams full model ownership, on-prem deployment, LoRA fine-tuning, and native 4K, with API pricing from $0.04/sec (Fast, 1080p) and no lock-in. In 2026, the teams that own their models own their competitive advantage.

LTX-2.3
Sora 2

Developer

Lightricks
OpenAI

Parameters

22B
Undisclosed

Open Source

Yes
No

On-Prem

Yes (self-host)
No

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (720p Std; 1024p Pro)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
15 sec (Plus) / 25 sec (Pro)

Frame Rate (fps)

Up to 50 fps
24 fps

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
Not disclosed (cloud only)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
$0.10/sec (Std 720p) $0.30/sec (Pro 720p) $0.50/sec (Pro 1080p)

Free Access

Yes – open-source + free Desktop app
No – free access removed Jan 2026; ChatGPT Plus ($20/mo) minimum required

Subscription Plans
(non-API access)

Free (self-host & Desktop)
ChatGPT Plus $20/mo (basic Sora only); ChatGPT Pro $200/mo (Sora 2 full access)

CAPABILITIES

Text-to-Video

Yes
Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
No

HDR Output

Yes
No

Extend

Yes
No

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
Yes – audio-synced generation

Multi-modal Inputs
(text + image + audio + video)

All four
Text + Image + Audio

Motion Control

Yes – full control
Yes – camera controls

Character Consistency

Yes – via LoRA fine-tuning
Limited

Content Moderation / Limits

No limits (open source)
Strict (NSFW, real people & IP blocked; 3-stage pre/mid/post filter; C2PA metadata)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
No

Fully Customizable

Yes
No

Runs on Consumer-Grade GPUs

Yes
No – cloud only

ComfyUI / Diffusers Support

Yes
No

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Consumers and ChatGPT subscribers generating short-form video

LTX-2.3 vs Veo 3.1

LTX delivers native 4K, on-prem deployment, and full LoRA customization without Google Cloud dependency. For 2026 enterprise pipelines, model ownership and data sovereignty are non-negotiable.

LTX-2.3
Veo 3.1

Developer

Lightricks
Google DeepMind

Parameters

22B
Undisclosed

Open Source

Yes
No

On-Prem

Yes (self-host)
No

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (1080p native; 4K upscale available)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
4–8 sec per generation (extendable to ~148 sec via Extend)

Frame Rate (fps)

Up to 50 fps
24 fps (default) / up to 60 fps

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
~3 min (Standard); ~1 min (Fast)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
$0.15/sec (Fast) $0.40/sec (Standard) $0.75/sec (Full / Veo 3.0) +~50% for audio

Free Access

Yes – open-source + free Desktop app
Limited – via Gemini app (requires Google AI Pro $19.99/mo); Flow tool has limited free credits

Subscription Plans
(non-API access)

Free (self-host & Desktop)
Google AI Pro $19.99/mo; Google AI Ultra (higher limits)

CAPABILITIES

Text-to-Video

Yes
Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
No

HDR Output

Yes
No

Extend

Yes
Yes (extendable to ~148 sec via Extend)

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
Yes – native audio (dialogue, effects, music)

Multi-modal Inputs
(text + image + audio + video)

All four
Text + Image + Audio

Motion Control

Yes – full control
Yes – camera controls

Character Consistency

Yes – via LoRA fine-tuning
Yes – reference images (1–3 images)

Content Moderation / Limits

No limits (open source)
Strict (NSFW & violence blocked; SynthID invisible watermark on all outputs; visible watermark on most tiers)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
No

Fully Customizable

Yes
No

Runs on Consumer-Grade GPUs

Yes
No – cloud only

ComfyUI / Diffusers Support

Yes
No

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Organizations already in the Google Cloud / Vertex AI ecosystem
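
The Veo 3.1 pricing row lists a base rate plus roughly 50% extra for audio, which is easy to misread when comparing per-clip costs. The sketch below works through that surcharge for an 8-second clip; the function name and the default surcharge of 0.5 are assumptions made for illustration, based only on the approximate "+~50% for audio" figure in the table.

```python
def cost_with_surcharge(seconds: float, base_rate: float,
                        with_audio: bool = False,
                        audio_surcharge: float = 0.5) -> float:
    """Per-clip cost when audio adds a percentage on top of the base rate."""
    rate = base_rate * (1 + audio_surcharge) if with_audio else base_rate
    return round(seconds * rate, 2)


SECONDS = 8

# Veo 3.1 Fast at $0.15/sec, without and with the approximate audio surcharge.
veo_silent = cost_with_surcharge(SECONDS, 0.15)
veo_audio = cost_with_surcharge(SECONDS, 0.15, with_audio=True)

# LTX Fast 1080p at $0.04/sec; the table lists no audio surcharge for LTX.
ltx_fast = cost_with_surcharge(SECONDS, 0.04)

print(f"Veo 3.1 Fast, no audio:   ${veo_silent:.2f}")
print(f"Veo 3.1 Fast, with audio: ${veo_audio:.2f}")
print(f"LTX Fast 1080p:           ${ltx_fast:.2f}")
```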

LTX-2.3 vs Kling

LTX is the enterprise infrastructure choice for 2026: native 4K, on-prem deployment, and a first-party API from $0.04/sec (Fast, 1080p) with zero vendor dependency. Kling 3.0 is a closed cloud platform where your data and outputs remain outside your control.

LTX-2.3
Kling 3.0

Developer

Lightricks
Kuaishou

Parameters

22B
Undisclosed

Open Source

Yes
No

On-Prem

Yes (self-host)
No

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (1080p Std; claimed 4K Pro)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
3–15 sec

Frame Rate (fps)

Up to 50 fps
24 fps (Std) / up to 60 fps (Pro)

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
30–120 sec (cloud)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
$0.084/sec (Std) $0.112/sec (Pro) $0.126/sec (with audio)

Free Access

Yes – open-source + free Desktop app
Limited – 66 free credits per day on free plan

Subscription Plans
(non-API access)

Free (self-host & Desktop)
Free (66 cr/day); Std $5.99/mo; Pro $29.99/mo; Premier $54.99/mo

CAPABILITIES

Text-to-Video

Yes
Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
No

HDR Output

Yes
No

Extend

Yes
No

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
Yes – native (Omni)

Multi-modal Inputs
(text + image + audio + video)

All four
Text + Image + Audio (Omni)

Motion Control

Yes – full control
Yes – camera, motion brush

Character Consistency

Yes – via LoRA fine-tuning
Yes – Elements system

Content Moderation / Limits

No limits (open source)
Strict (no NSFW; no toggle; humans allowed; political/government content filtered; IP restricted)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
No

Fully Customizable

Yes
No

Runs on Consumer-Grade GPUs

Yes
No – cloud only

ComfyUI / Diffusers Support

Yes
No

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Creators producing cinematic content with native audio

LTX-2.3 vs Seedance

LTX offers open weights, on-prem deployment, and native 4K, with API pricing from $0.04/sec (Fast, 1080p) and no third-party platform dependency. In 2026, running production on someone else's infrastructure is a risk most enterprises will not accept.

LTX-2.3
Seedance 2.0

Developer

Lightricks
ByteDance

Parameters

22B
Undisclosed

Open Source

Yes
No

On-Prem

Yes (self-host)
No

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (2K / 2048p max)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
4–15 sec

Frame Rate (fps)

Up to 50 fps
24–30 fps

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
1–5 min (cloud API)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
$0.24/sec (Fast 720p fal.ai) $0.30/sec (Std 720p fal.ai) ~$0.14/sec (official Volcengine)

Free Access

Yes – open-source + free Desktop app
Limited – free daily credits on Dreamina (~120 cr/day after signup)

Subscription Plans
(non-API access)

Free (self-host & Desktop)
Free (daily credits); Dreamina $18/mo (intl); Jimeng ~$9.60/mo (China)

CAPABILITIES

Text-to-Video

Yes
Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
Yes

HDR Output

Yes
No

Extend

Yes
No

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
Yes – native (Dual-Branch Diffusion)

Multi-modal Inputs
(text + image + audio + video)

All four
All four (up to 12 files)

Motion Control

Yes – full control
Yes – dolly zoom, rack focus, tracking

Character Consistency

Yes – via LoRA fine-tuning
Yes – multi-shot storytelling

Content Moderation / Limits

No limits (open source)
Moderate (real faces restricted; IP/brand content filtered; NSFW blocked; political content blocked)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
No

Fully Customizable

Yes
No

Runs on Consumer-Grade GPUs

Yes
No – cloud only

ComfyUI / Diffusers Support

Yes
No

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Teams producing multi-shot narratives with native audio-video sync

LTX-2.3 vs Runway

LTX delivers native 4K, on-prem deployment, and LoRA fine-tuning, with API pricing from $0.04/sec (Fast, 1080p) and no subscription lock-in. Runway is a cloud-only creative app whose API costs up to $0.25/sec (Gen-4.5), not enterprise infrastructure.

LTX-2.3
Runway

Developer

Lightricks
Runway

Parameters

22B
Undisclosed

Open Source

Yes
No

On-Prem

Yes (self-host)
No

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (720p native; 4K upscale +$0.02/sec)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
5–10 sec

Frame Rate (fps)

Up to 50 fps
24 fps

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
~30 sec (Gen-4 Turbo, i2v only); ~2–4 min (Gen-4 Std); ~2 min (Gen-4.5)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
$0.05/sec (Gen-4 Turbo) $0.12/sec (Gen-4) $0.25/sec (Gen-4.5)

Free Access

Yes – open-source + free Desktop app
Limited – Basic plan (625 cr/mo, watermarked, 720p export only)

Subscription Plans
(non-API access)

Free (self-host & Desktop)
Basic (free, 625 cr/mo); Standard $12/mo; Pro $28/mo; Unlimited $76/mo

CAPABILITIES

Text-to-Video

Yes
Limited – Gen-4 Turbo: No (image required); Gen-4.5: Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
No

HDR Output

Yes
No

Extend

Yes
No

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
No

Multi-modal Inputs
(text + image + audio + video)

All four
Text + Image

Motion Control

Yes – full control
Yes – camera controls, Act One

Character Consistency

Yes – via LoRA fine-tuning
Yes – Act One

Content Moderation / Limits

No limits (open source)
Strict (CSAM strictly blocked; NSFW blocked; impersonation blocked; humans allowed for legitimate use)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
No

Fully Customizable

Yes
No

Runs on Consumer-Grade GPUs

Yes
No – cloud only

ComfyUI / Diffusers Support

Yes
No

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Filmmakers and VFX artists using cloud-based generation tools

LTX-2.3 vs Luma

LTX delivers native 4K, on-prem deployment, and full model ownership, with API pricing from $0.04/sec (Fast, 1080p): the foundation enterprise teams are building on in 2026. Luma Ray 3 is cloud only, at ~$0.38/sec on its official API (1080p), with no customization path.

LTX-2.3
Luma Ray 3

Developer

Lightricks
Luma AI

Parameters

22B
Undisclosed

Open Source

Yes
No

On-Prem

Yes (self-host)
No

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (1080p native; 4K HDR upscale)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
5–18 sec (extendable via Luma Extend)

Frame Rate (fps)

Up to 50 fps
24 fps

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
~30–60 sec (cloud)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
~$0.10/sec (fal.ai 720p) ~$0.38/sec (official API 1080p)

Free Access

Yes – open-source + free Desktop app
Limited – free tier (30 gen/mo, watermarked, no commercial use)

Subscription Plans
(non-API access)

Free (self-host & Desktop)
Free (30 gen/mo watermarked); Plus $30/mo; Pro $90/mo; Ultra $300/mo

CAPABILITIES

Text-to-Video

Yes
Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
No

HDR Output

Yes
Yes (via 4K HDR upscale)

Extend

Yes
Yes (via Luma Extend)

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
No

Multi-modal Inputs
(text + image + audio + video)

All four
Text + Image

Motion Control

Yes – full control
Yes – keyframes, char reference

Character Consistency

Yes – via LoRA fine-tuning
Yes – character reference

Content Moderation / Limits

No limits (open source)
Strict (NSFW & deepfakes blocked; humans allowed for legitimate use; enterprise can request custom policy)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
No

Fully Customizable

Yes
No

Runs on Consumer-Grade GPUs

Yes
No – cloud only

ComfyUI / Diffusers Support

Yes
No

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Post-production teams needing natural motion and HDR output

LTX-2.3 vs CogVideoX

LTX is the open-weights enterprise model for 2026: 22B parameters, native 4K, audio-to-video, and LoRA fine-tuning, with API pricing from $0.04/sec (Fast, 1080p). CogVideoX 1.5 is a 5B research model capped at 768p: a starting point, not a production foundation.

LTX-2.3
CogVideoX 1.5

Developer

Lightricks
Zhipu AI (ZAI)

Parameters

22B
5B

Open Source

Yes
Yes

On-Prem

Yes (self-host)
Yes (self-host)

OUTPUT QUALITY

Native 4K Rendering

Yes (3840×2160)
No (768p / 1360×768)

Max Video Length

20 sec (Fast) / 10 sec (Pro)
~10 sec (161 frames @ 16fps)

Frame Rate (fps)

Up to 50 fps
16 fps

SPEED & COST

8 sec FHD Generation Time

~15 sec (H100 cloud)
~1 min (H100)

API Pricing
(per second of video)

$0.04/sec (Fast 1080p) $0.06/sec (Pro 1080p) $0.16/sec (Fast 4K) $0.24/sec (Pro 4K)
~$0.02/sec ($0.20/10-sec via fal.ai)

Free Access

Yes – open-source + free Desktop app
Yes – self-host open weights

Subscription Plans
(non-API access)

Free (self-host & Desktop)
Free (self-host)

CAPABILITIES

Text-to-Video

Yes
Yes

Image-to-Video

Yes
Yes

Retake

Yes (LTX Retake)
No

HDR Output

Yes
No

Extend

Yes
No

LipDub

Yes
No

Audio-to-Video

Yes – native multimodal
No

Multi-modal Inputs
(text + image + audio + video)

All four
Text + Image

Motion Control

Yes – full control
Limited

Character Consistency

Yes – via LoRA fine-tuning
Limited

Content Moderation / Limits

No limits (open source)
No limits (open source)

DEVELOPER & ENTERPRISE

LoRA / Fine-tuning

Yes – LoRA + IC-LoRA
Yes – CogKit

Fully Customizable

Yes
Yes

Runs on Consumer-Grade GPUs

Yes
Yes

ComfyUI / Diffusers Support

Yes
Yes

SUMMARY

Best For

Enterprise teams needing on-prem deployment, full model customization, IP protection, and zero marginal cost at scale
Developers fine-tuning lightweight open-source models on modest hardware

Video Generation Capabilities

Use LTX models across multiple video generation and editing workflows.

Text to Video

Generate cinematic video directly from text prompts. Control motion, composition, and visual flow using natural language.

TEXT INPUT
Woman in a fluffy pink coat standing in a field of pink and yellow flowers, soft overcast sky, calm confident pose

Image to Video

Animate still images into coherent video. Preserve visual identity while adding motion, transitions, and cinematic depth.

TEXT INPUT
Young man riding a bicycle on a rural road, leaning forward with intense focus, green fields and mountains in the background.
IMAGE INPUT

Video to Video

Edit and transform videos with precise control: refine scenes, enhance quality, and adjust motion while preserving continuity and character consistency.

Video Input
Open Pose

Audio to Video

Generate video directly from audio, where sound drives motion, timing, and scene structure. Ideal for music, voice, and audio-led storytelling.

IMAGE INPUT
rap-song.mp3

LTX API Pricing

Usage-based pricing by endpoint and output quality.
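
Because billing is per second of video, a job's cost is the rate for the chosen tier and resolution multiplied by the clip length. The sketch below is a rough estimator, not an official calculator: the rates are the ones quoted in the comparison tables on this page, the names `RATES` and `estimate_cost` are hypothetical, and 2560×1440 is left out because no rate for it appears here.

```python
# USD per second of generated video, as quoted in the comparison tables
# on this page. Treat these as a snapshot that may change.
RATES = {
    ("fast", "1920x1080"): 0.04,
    ("pro", "1920x1080"): 0.06,
    ("fast", "3840x2160"): 0.16,
    ("pro", "3840x2160"): 0.24,
}


def estimate_cost(tier: str, resolution: str, seconds: float) -> float:
    """Estimated cost in USD for one generation, billed per second."""
    key = (tier.lower(), resolution)
    if key not in RATES:
        raise KeyError(f"No rate known for tier={tier!r}, resolution={resolution!r}")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return round(RATES[key] * seconds, 2)


# Preview at Fast 1080p, then a final render at Pro 4K.
preview = estimate_cost("fast", "1920x1080", 8)
final = estimate_cost("pro", "3840x2160", 10)
print(f"8 s preview (Fast 1080p): ${preview:.2f}")
print(f"10 s final (Pro 4K):      ${final:.2f}")
```

Keeping the rates in a single lookup table makes it straightforward to update them when the price list changes.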

Text-to-Video

LTX-2
Fast

Designed for quick iteration, previews, and fast creative exploration.

URL path:
/v1/text-to-video
Pricing:
  • 1920×1080 – /sec
  • 2560×1440 – /sec
  • 3840×2160 – /sec
Notes:
  • Same pricing applies for text input and pure prompt-based generation.

Text-to-Video

LTX-2
Pro

Optimized for higher fidelity and increased temporal stability. Best for production-ready output and final renders.

URL path:
/v1/text-to-video
Pricing:
  • 1920×1080 – /sec
  • 2560×1440 – /sec
  • 3840×2160 – /sec
Notes:
  • Ideal for client-facing content or polished deliverables.
  • Higher compute level → higher visual quality.

Text-to-Video

LTX-2.3
Pro

Optimized for higher fidelity and increased temporal stability. Best for production-ready output and final renders.

URL path:
/v1/text-to-video
Pricing:
  • 1920×1080 – /sec
  • 2560×1440 – /sec
  • 3840×2160 – /sec
Notes:
  • Ideal for client-facing content or polished deliverables.
  • Higher compute level → higher visual quality.

Text-to-Video

LTX-2.3
Fast

Designed for quick iteration, previews, and fast creative exploration.

URL path:
/v1/text-to-video
Pricing:
  • 1920×1080 – /sec
  • 2560×1440 – /sec
  • 3840×2160 – /sec
Notes:
  • Same pricing applies for text input and pure prompt-based generation.
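
A call to the `/v1/text-to-video` path might be assembled along the following lines. This is a sketch under stated assumptions, not the documented request format: only the URL path comes from this page, while the host (`api.example.com`), the bearer-token header, the JSON field names (`prompt`, `model`, `resolution`, `duration`), and the model identifier are all hypothetical placeholders. The request is built but deliberately not sent.

```python
import json
from urllib.request import Request

# Placeholder host; the real base URL is not given on this page.
BASE_URL = "https://api.example.com"


def build_text_to_video_request(prompt: str, api_key: str,
                                model: str = "ltx-2.3-fast",
                                resolution: str = "1920x1080",
                                duration: int = 8) -> Request:
    """Build (but do not send) a POST request for text-to-video generation.

    All body field names and the auth scheme are assumptions for illustration.
    """
    body = {
        "prompt": prompt,
        "model": model,            # hypothetical identifier
        "resolution": resolution,  # hypothetical field
        "duration": duration,      # hypothetical field, in seconds
    }
    return Request(
        url=BASE_URL + "/v1/text-to-video",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_text_to_video_request(
    "Woman in a fluffy pink coat standing in a field of pink and yellow flowers",
    api_key="YOUR_API_KEY",
)
print(req.get_method(), req.full_url)
```

Separating request construction from sending makes the payload easy to inspect and test before any billable generation is triggered.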

Image-to-Video

LTX-2
Fast

Designed for quick iteration, previews, and fast creative exploration.

URL path:
/v1/image-to-video
Pricing:
  • 1920×1080 – /sec
  • 2560×1440 – /sec
  • 3840×2160 – /sec
Notes:
  • Same compute cost as Text-to-Video Fast.
  • Resolution and duration determine total cost.

Image-to-Video

LTX-2
Pro

For detailed, stable motion derived from a still image. Best for high-quality sequences, storytelling, and production use.

URL path:
/v1/image-to-video
Pricing:
  • 1920×1080 – /sec
  • 2560×1440 – /sec
  • 3840×2160 – /sec
Notes:
  • Uses the Pro rendering path for maximum fidelity.
  • Ideal when visual consistency is critical.

Image-to-Video

LTX-2.3
Pro

For detailed, stable motion derived from a still image. Best for high-quality sequences, storytelling, and production use.

URL path:
/v1/image-to-video
Pricing:
  • 1920×1080 – /sec
  • 2560×1440 – /sec
  • 3840×2160 – /sec
Notes:
  • Uses the Pro rendering path for maximum fidelity.
  • Ideal when visual consistency is critical.

Image-to-Video

LTX-2.3
Fast

Designed for quick iteration, previews, and fast creative exploration.

URL path:
/v1/image-to-video
Pricing:
  • 1920×1080 – /sec
  • 2560×1440 – /sec
  • 3840×2160 – /sec
Notes:
  • Same compute cost as Text-to-Video Fast.
  • Resolution and duration determine total cost.

Retake - Video Editing

LTX-2
Pro

Refine only the parts that need adjustment, with no need to regenerate the whole video. Perfect for fixing scenes, adjusting elements, or improving localized areas.

URL path:
/v1/retake
Pricing:
  • 1920×1080 – /sec
Notes:
  • Currently available in 1080p only.
  • Billed per second of input video.

Retake - Video Editing

LTX-2.3
Pro

Refine only the parts that need adjustment, with no need to regenerate the whole video. Perfect for fixing scenes, adjusting elements, or improving localized areas.

URL path:
/v1/retake
Pricing:
  • 1920×1080 – /sec
Notes:
  • Currently available in 1080p only.
  • Billed per second of input video.

Audio to Video (A2V)

LTX-2
Pro

Generate video directly from audio, where voice, music, and sound define structure, pacing, and motion.

URL path:
/v1/audio-to-video
Pricing:
  • 1920×1080 – /sec
Supported inputs:
  • Audio: WAV, MP3, M4A, OGG
  • Image (optional): PNG, JPEG, WEBP
Notes:
  • Billed per second of input audio.
  • Generates up to ~20 seconds per request.
  • Full-length videos can be created by chaining multiple requests.
  • Currently available in 1080p only.

Audio to Video (A2V)

LTX-2.3
Pro

Generate video directly from audio, where voice, music, and sound define structure, pacing, and motion.

URL path:
/v1/audio-to-video
Pricing:
  • 1920×1080 – /sec
Supported inputs:
  • Audio: WAV, MP3, M4A, OGG
  • Image (optional): PNG, JPEG, WEBP
Notes:
  • Billed per second of input audio.
  • Generates up to ~20 seconds per request.
  • Full-length videos can be created by chaining multiple requests.
  • Currently available in 1080p only.
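
Chaining requests for a longer track amounts to cutting the audio timeline into windows no longer than the per-request limit. The sketch below plans those windows; it assumes a hard 20-second cap (the notes above say "up to ~20 seconds", so the real limit may differ slightly), and the function name `plan_segments` is hypothetical. It only computes time ranges and does not cut audio or call the API.

```python
def plan_segments(total_seconds: float, max_segment: float = 20.0):
    """Split an audio timeline into consecutive (start, end) windows.

    Each window is at most `max_segment` seconds long; the last one
    holds whatever remains.
    """
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if max_segment <= 0:
        raise ValueError("max_segment must be positive")

    segments = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_segment, total_seconds)
        segments.append((start, end))
        start = end
    return segments


# A 65-second track becomes three full windows plus a 5-second remainder.
for index, (start, end) in enumerate(plan_segments(65), start=1):
    print(f"request {index}: {start:.0f}s to {end:.0f}s")
```

Since A2V is billed per second of input audio, splitting a track this way should not change the total billed duration, only the number of requests.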

Beta - HDR Video Generation

LTX-2.3
Pro

Convert SDR video to 16-bit HDR for greater dynamic range and post-production flexibility, built for professional grading and finishing workflows.

URL path:
/v2/video-to-video-hdr
Pricing:
  • Up to 1920×1080 – $0.20/sec
    (~7s max per request)
  • Up to 2560×1440 – $0.40/sec
    (~4s max per request)
Notes:
  • Video-to-video only (SDR → HDR)
  • Output delivered as per-frame 16-bit EXR (ZIP)
  • Billed per second of input video
  • Max duration depends on resolution tier (up to ~7s at 1080p)
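
Since the per-request duration cap depends on the resolution tier, converting a longer shot means planning both the number of requests and the total cost. The sketch below does that arithmetic with the figures listed above; the caps are stated as approximate ("~7s", "~4s"), so treating them as exact limits is an assumption, and `HDR_TIERS` and `plan_hdr_job` are hypothetical names.

```python
import math

# (USD per second of input video, approximate max seconds per request),
# taken from the pricing list above.
HDR_TIERS = {
    "1920x1080": (0.20, 7.0),
    "2560x1440": (0.40, 4.0),
}


def plan_hdr_job(resolution: str, seconds: float):
    """Return (number_of_requests, total_cost_usd) for an SDR-to-HDR job."""
    if resolution not in HDR_TIERS:
        raise KeyError(f"No HDR tier listed for {resolution!r}")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    rate, max_per_request = HDR_TIERS[resolution]
    requests_needed = math.ceil(seconds / max_per_request)
    total_cost = round(rate * seconds, 2)
    return requests_needed, total_cost


# A 20-second shot at each tier.
for resolution in HDR_TIERS:
    count, cost = plan_hdr_job(resolution, 20)
    print(f"{resolution}: {count} requests, ${cost:.2f}")
```

Each request returns a ZIP of per-frame EXR files, so a chunked job also needs the frame sequences concatenated in order afterwards.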