LTX-2 is a professional-grade video generation model built for long-form, high-fidelity output with precise creative control. It is designed to support real production workflows rather than short experimental clips.

What makes LTX-2 different from the LTXV model?

While the LTXV model focuses on creator-friendly generation and editing, LTX-2 is optimized for developers and production teams that require predictable performance, deeper control, and measurable specs.

Does LTX-2 offer a downloadable model for local use?

Yes. LTX-2 is available for download in supported configurations for local and private deployment.

Is LTX-2 available as an open-source model?

Yes. The open-source version is available on GitHub and Hugging Face, enabling customization, fine-tuning, and local execution.

How much does LTX-2’s API cost?

Pricing for the LTX-2 API depends on usage, resolution, and frame rate. See the LTX-2 pricing page for current rates.

What video generation models are supported by LTX-2?

Supported variants include LTX-2-fast, optimized for speed, and LTX-2-pro, optimized for maximum visual quality.

Does LTX-2 offer a free trial?

Yes. Developers can experiment with the model via the LTX-2 Playground to explore capabilities and test prompts before integrating via API.

Does LTX-2 integrate with ComfyUI or Fal?

Yes. LTX-2 supports integration with tools such as ComfyUI and Fal for custom pipelines and experimentation.

LTX-2.3: Introducing LTX's Latest AI Video Model

FAQs

What is LTX-2.3?

LTX-2.3 is the latest release of the LTX-2 model family, a diffusion transformer (DiT) foundation model that generates high-fidelity video and synchronized audio from a single model. It supports text-to-video, image-to-video, and audio-to-video generation at up to 1080p, including native portrait (9:16) video. LTX-2.3 is available both as an open-source model and through the LTX API.

What is the difference between LTX-2 and LTX-2.3?

LTX-2.3 brings four major improvements over LTX-2.

A redesigned VAE produces sharper fine details, more realistic textures, and cleaner edges.

A new gated attention text connector means prompts are followed more closely — descriptions of timing, motion, and expression translate more faithfully into the output.

Native portrait video support lets you generate vertical (1080×1920) content without cropping from landscape.

And audio quality is significantly cleaner, with silence gaps and noise artifacts filtered from the training set.
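As a rough illustration of the native portrait point above, the sketch below maps a resolution tier and an orientation to pixel dimensions. Only the 1080×1920 portrait size comes from this page; the other sizes are the conventional 16:9 values, and the function name `frame_size` is hypothetical, not part of any LTX SDK.

```python
# Sketch only: maps a resolution tier and orientation to pixel dimensions.
# The 1080p portrait size (1080x1920) is stated on this page; the remaining
# sizes are conventional 16:9 values and are assumptions here.
LANDSCAPE_SIZES = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
}


def frame_size(resolution: str, orientation: str = "landscape") -> tuple[int, int]:
    """Return (width, height) for a resolution tier and orientation."""
    if resolution not in LANDSCAPE_SIZES:
        raise ValueError(f"unknown resolution tier: {resolution!r}")
    if orientation not in ("landscape", "portrait"):
        raise ValueError(f"unknown orientation: {orientation!r}")
    width, height = LANDSCAPE_SIZES[resolution]
    # Portrait swaps the two axes instead of cropping a landscape frame.
    return (height, width) if orientation == "portrait" else (width, height)


print(frame_size("1080p", "portrait"))  # (1080, 1920)
```

A helper like this can be useful for validating requested dimensions before submitting a generation job.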

Is LTX-2.3 available as an open-source model?

Yes. LTX-2.3 model weights are freely available on Hugging Face under an open license. The release includes the base dev checkpoint, a quantized fp8 variant, and the distilled model for faster inference. Training code, ComfyUI custom nodes, and reference workflows are all available on the LTX-Video GitHub repository.

How much does LTX-2.3's API cost?

LTX-2.3 API usage is billed per second of generated video. Two model variants are available — ltx-2-3-fast for rapid iteration and ltx-2-3-pro for production-quality output — at both 720p and 1080p resolutions. Portrait and landscape generations are priced identically at the same resolution tier.

See the pricing page for current rates.
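Because billing is per second of generated video, a cost estimate reduces to a single multiplication once the rate for a variant and resolution is known. The sketch below shows that arithmetic. The rates in it are placeholders in arbitrary units, not real prices, and `estimate_cost` is a hypothetical helper rather than part of the LTX API. Orientation is deliberately not part of the rate key, since portrait and landscape are priced the same at a given resolution tier.

```python
VARIANTS = ("ltx-2-3-fast", "ltx-2-3-pro")
RESOLUTIONS = ("720p", "1080p")

# PLACEHOLDER rates per second of generated video, in arbitrary units.
# These are NOT real prices; substitute values from the pricing page.
PLACEHOLDER_RATES = {
    ("ltx-2-3-fast", "720p"): 1.0,
    ("ltx-2-3-fast", "1080p"): 2.0,
    ("ltx-2-3-pro", "720p"): 3.0,
    ("ltx-2-3-pro", "1080p"): 4.0,
}


def estimate_cost(seconds: float, variant: str, resolution: str, rates: dict) -> float:
    """Estimate the cost of one generation billed per second of output video."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    # Orientation does not appear here: it does not affect the rate.
    return seconds * rates[(variant, resolution)]


# An 8-second clip on the pro variant at 1080p, using the placeholder table.
print(estimate_cost(8, "ltx-2-3-pro", "1080p", PLACEHOLDER_RATES))  # 32.0
```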

Does LTX-2.3 integrate with ComfyUI or Fal?

Yes, ComfyUI and Fal are fully supported. LTX-2.3 ships with updated ComfyUI custom nodes and reference workflows for text-to-video, image-to-video, and multi-stage generation with latent upscaling — available via the ComfyUI-LTXVideo repository.

Where can I find documentation for LTX-2.3?

Full documentation is available at docs.ltx.video, covering the API, open-source model setup, ComfyUI workflows, PyTorch integration, and usage guides for text-to-video, image-to-video, and portrait video. A migration guide for users upgrading from LTX-2 is also available there.

Does LTX-2.3 offer a downloadable model for local use?

Yes. Multiple checkpoints are available for local use on Hugging Face, including the full dev model (bf16), a quantized fp8 variant for lower-VRAM setups, the distilled model for faster inference, and spatial and temporal latent upscalers.

Should I upgrade from LTX-2 to LTX-2.3?

Yes — LTX-2.3 delivers sharper output, better prompt adherence, cleaner audio, and significantly improved image-to-video across the board. The one exception: if your workflow relies on custom LoRAs, those will need to be retrained for the 2.3 latent space before you migrate. See the Migration Guide for details.

LTX-2.3 Video Engine

Sharper Fine Detail

Tighter Prompt Adherence

Stronger Image-to-Video

Cleaner Audio

Now in Portrait. Native.

All LTX-2 Capabilities. Upgraded.

Audio to Video

20 sec Clip

50 FPS Performance

Native 4K 50 FPS

Generation Flows

Fast

Pro

Designed to be built on

LTX Desktop

Image to Video

Video to Video

From Local to Enterprise

Open Source

LTX API

Licensing Program

LTX MCP

LTX CLI