The Road Ahead for LTX-2

A look at the direction behind LTX-2 and the principles guiding its evolution, from stronger control to deeper alignment with real creative workflows.

The January drop is progress, not a finish line. It tackles immediate workflow friction, but deeper improvements are already in motion. Below is where we’re heading with the next LTX-2 version, planned for later this quarter.

Note: This post is about what’s coming. If you’re looking for what’s already available, the End-of-January LTX-2 drop post covers what shipped, why we built it, and how to use it.

What's coming

A new VAE for better fine-detail preservation

Fast generation in LTX depends on compressing video into a compact token space. That tradeoff works, but aggressive compression can cost fine detail.
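To make the tradeoff concrete, here is a rough sketch of how downsampling factors translate into the number of latent positions a generator has to process. The clip size and the compression factors below are hypothetical values chosen for illustration; they are not LTX-2's actual settings, and the function name is ours.

```python
import math


def latent_token_count(frames, height, width, temporal_factor, spatial_factor):
    """Count latent positions after downsampling each axis (rounding up)."""
    t = math.ceil(frames / temporal_factor)
    h = math.ceil(height / spatial_factor)
    w = math.ceil(width / spatial_factor)
    return t * h * w


# Hypothetical clip: 120 frames at 512x768 pixels.
frames, height, width = 120, 512, 768

# Two illustrative (assumed) compression settings, not real LTX-2 values.
mild = latent_token_count(frames, height, width, temporal_factor=4, spatial_factor=8)
aggressive = latent_token_count(frames, height, width, temporal_factor=8, spatial_factor=32)

# Pixels summarized by one latent position under each setting.
pixels_per_token_mild = 4 * 8 * 8
pixels_per_token_aggressive = 8 * 32 * 32

print(mild, aggressive, mild // aggressive)
```

Under these assumed numbers, the aggressive setting leaves the generator with far fewer positions to process, but each position has to summarize many more pixels, which is where fine detail can get lost.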

The next LTX-2 version will introduce a new VAE that we’re training to preserve more of the original signal while keeping generation efficient. The goal is sharper textures, more stable fine structure, and less detail loss across longer sequences.

Improved consistency and fidelity to conditioning inputs

We’re working on improving how tightly the model adheres to its conditioning inputs, especially in image-to-video and retake workflows.

This work focuses on stronger alignment to reference frames, more consistent outputs across runs, and less unexpected drift away from the source material.

Cleaner, more reliable audio

We’re working on reducing silent outputs and improving the overall stability of audio generation.

This includes lowering noise, reducing instability during generation, and making audio behavior more predictable in real workflows. The aim is to make audio something you can rely on, not something that requires repeated retries.

Meaningfully improved image-to-video behavior

We’re working on several image-to-video improvements driven directly by real-world usage:

  • Reducing frozen clips
  • Reducing low-motion or static outputs
  • Improving handling of scene changes and transitions

The focus is on motion that feels intentional and continuous, rather than output that is technically valid but visually inert.

Better prompt understanding

We’re working on improving prompt understanding through updates to the text encoding connector and its integration with the rest of the pipeline.

This work is aimed at stronger prompt interpretation and adherence.

Stay tuned

We’ll keep building LTX the same way we shipped the January drop: in the open, grounded in real usage, and guided by feedback from people running the model in production and local setups.

The goal isn’t flashy demos. It’s a model that keeps delivering as you push it further, stress it harder, and build more ambitious workflows on top. That’s the direction we’re heading this quarter.
