LTX Release Notes
Stay up-to-date with the latest features, updates, and fixes—designed to keep your creative workflow seamless.
Video to Video
Three control modes. Infinite creative directions. Video to Video lets you lock in the structure that works — pose, depth, or edges — and transform the style completely. Upload a reference, choose your mode, and generate with precision.
ChatGPT Image 2.0
ChatGPT Image 2.0 is now live in LTX Studio. More detail, more accuracy, more creative control, all without leaving your workflow. Create exactly what you imagine — down to the last detail.
Async API expanded to all video generation endpoints (Beta)
New async endpoints: [.url-highlight]/v2/text-to-video[.url-highlight], [.url-highlight]/v2/image-to-video[.url-highlight], [.url-highlight]/v2/audio-to-video[.url-highlight], [.url-highlight]/v2/retake[.url-highlight], [.url-highlight]/v2/extend[.url-highlight]. Submit a job, poll for status, download from result.video_url when complete.
See Async Jobs for more details.
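The submit, poll, download flow can be sketched in Python as below. This is a minimal illustration, not the official client: the status strings `"completed"` and `"failed"`, the `error` key, and the `fetch_status` callable are assumptions. Only `result.video_url` comes from the notes above. The HTTP call is injected so the sketch runs without network access.

```python
import time
from typing import Any, Callable, Dict

# Assumed status names, for illustration only; check the Async Jobs docs.
DONE, FAILED = "completed", "failed"


def wait_for_video_url(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it finishes, then return result.video_url."""
    waited = 0.0
    while True:
        job = fetch_status()  # e.g. a GET on the job's status URL
        status = job.get("status")
        if status == DONE:
            return job["result"]["video_url"]
        if status == FAILED:
            raise RuntimeError(f"job failed: {job.get('error')}")
        if waited >= timeout_s:
            raise TimeoutError(f"job still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in responses instead of real HTTP calls.
fake = iter([
    {"status": "pending"},
    {"status": "completed", "result": {"video_url": "https://example.com/out.mp4"}},
])
print(wait_for_video_url(lambda: next(fake), sleep=lambda s: None))
```

In real use, `fetch_status` would wrap an authenticated request to the job's status endpoint, and the returned URL would then be downloaded.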
Canvas
Canvas introduces a collaborative infinite workspace in LTX Studio, where teams can generate, iterate, and align on concepts in real time or async, without switching tools. Combine image and video moodboards, visual ideation, and team collaboration in one place, and move seamlessly from concept to production.
Seedance 2.0
Seedance 2.0 is now available in LTX Studio. One of the most capable video generation models available, it brings sharper motion, stronger prompt adherence, and more cinematic output, all inside your existing LTX Studio workflow. Generate with precision and take your productions further without switching platforms.
HDR Video Conversion (Beta)
New [.url-highlight]/v2/video-to-video-hdr[.url-highlight] endpoint for converting SDR videos to HDR. Returns per-frame EXR images suitable for professional color grading and HDR rendering pipelines.
Introducing Async API support: submit a job, poll for status, download when complete. Currently only the HDR endpoint is supported; more endpoints will follow soon. See Async Jobs.
SDR to HDR
SDR to HDR turns any existing video into true High Dynamic Range - instantly. Richer colors, brighter highlights, and deeper shadows are added automatically, with no reshooting or manual grading required. Professional-quality output, ready for modern screens and production workflows, in seconds.
Video Upscale
Upscale takes any video, generated or uploaded, to 4K or 8K resolution directly in LTX Studio. With Topaz-powered models and full control over resolution and frame rate, you go from quick draft to broadcast-ready master without switching tools or starting over.
Editing Space
Editing Space brings all image editing tools into one unified environment inside LTX Studio. Brush any area and edit it directly by prompt, upscale with full slider controls across quality and speed, shift camera perspective across Rotation, Tilt, and Distance without prompting, and fine-tune light, color, and effects in real time - all in one continuous flow, without switching tabs or searching for what you need next.
Veo 3.1 Lite
Veo 3.1 Lite is now available in LTX Studio - Google's most cost-effective generation model, built for teams that need to move fast and scale without the cost overhead.
Veo 3.1 Fast Price Update
Veo 3.1 Fast just got more accessible. We've reduced the price, so teams can generate high-quality video at speed without the cost holding them back. More generation, same fast results.
HDR Video Generation (IC-LoRA)
[.model-v-highlight]LTX-2.3[.model-v-highlight]
New IC-LoRA trained on LTX-2.3-22b enabling 16-bit High Dynamic Range generation — supports both T2V/I2V generation natively in HDR, and SDR-to-HDR video conversion.
Key details:
- Full ACES color standard support; output is float16 EXR, scene-linear, ready for professional post-production workflows including re-exposure and complex color timing
- ComfyUI integration out of the box, and available via API as a video2video endpoint
- Backed by a research paper: LumiVid: HDR Video Generation via Latent Alignment with Logarithmic Encoding
Auto Top-Up
You can now configure auto top-up to automatically purchase credits when your balance falls below a threshold. Set a threshold and top-up amount, and the system handles the rest — no more manual credit purchases to keep your API running.
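The rule described above can be modeled in a few lines. This is a sketch under assumptions: balances are treated as whole credits, and a single purchase is made per check. The `AutoTopUp` class and its field names are hypothetical and not part of the console or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoTopUp:
    threshold: int  # top up when the balance falls below this many credits
    amount: int     # credits purchased per top-up

    def apply(self, balance: int) -> int:
        """Return the balance after at most one automatic purchase."""
        if balance < self.threshold:
            return balance + self.amount
        return balance


rule = AutoTopUp(threshold=100, amount=500)
print(rule.apply(80))   # 580: below the threshold, so one top-up is added
print(rule.apply(100))  # 100: not below the threshold, so nothing changes
```

Choosing a top-up amount larger than the threshold means a single purchase lifts the balance back above it.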
Video Editor Upgrades
LTX Studio's editor just got a major upgrade. Eight native controls, including transitions, blending modes, speed, reverse, and layer opacity, bring production-ready editing into the same platform where you generate and build. No tab-switching. No third-party tools. Just one complete creative workflow.
22B Model · Native Audio · Portrait Video · Desktop Editor
[.model-v-highlight]LTX-2.3[.model-v-highlight]
Model
- New 22-billion parameter architecture (upgraded from 19B)
- Generates synchronized audio and video in a single forward pass
- Up to 4K resolution at 50 FPS, up to 20 seconds per clip
- Apache 2.0 license — fully open weights, commercial use allowed
Visual Quality
- Redesigned high-fidelity VAE — sharper edges, hair, fabric grain, chrome highlights
- Reduced texture drift across longer sequences
- Improved image-to-video quality and motion coherence
- Fixed Ken Burns over-application in image-to-video mode
Audio
- Upgraded HiFi-GAN vocoder for cleaner, higher-quality output
- Stereo audio at 24 kHz
- Improved audio reliability and coherence with visual content
New Features
- Native portrait (9:16) vertical video — no more cropping workarounds
- 24 and 48 FPS output options
- Last-frame interpolation support
- Spatial and temporal upscalers for enhanced detail and smoothness
- Desktop video editor for local generation on consumer hardware
Prompt Understanding
- 4× larger text connector for significantly better prompt adherence
- Improved multi-element scene handling
- Reduced prompt drift in later frames
LoRA Adapters Released
- LTX-2.3-22b-IC-LoRA-Motion-Track-Control
- LTX-2.3-22b-IC-LoRA-Union-Control (depth, pose, canny)
- LTX-2-19b-IC-LoRA-Detailer (also compatible with 2.3)
Model Checkpoints
[.url-highlight]ltx-2.3-22b-dev.safetensors[.url-highlight] [.url-highlight]ltx-2.3-22b-distilled.safetensors[.url-highlight]
LTX-2.3
Introducing LTX-2.3 — a major leap in quality and speed for AI video generation. LTX-2.3 is now the default model across the API, available in two variants:
- LTX-2.3 Pro ([.url-highlight]ltx-2-3-pro[.url-highlight]) — best-in-class quality, supports all endpoints including audio-to-video, retake, and extend.
- LTX-2.3 Fast ([.url-highlight]ltx-2-3-fast[.url-highlight]) — blazing fast generation, supports text-to-video and image-to-video.
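Because the two variants cover different endpoints, a client may want to choose the model ID per request. The sketch below encodes only the endpoints named in these notes, so the Pro set may be incomplete. The `pick_model` helper and its `prefer_fast` flag are hypothetical.

```python
# Endpoint names per model ID, limited to those named in these notes.
FAST_ENDPOINTS = {"text-to-video", "image-to-video"}
PRO_ENDPOINTS = FAST_ENDPOINTS | {"audio-to-video", "retake", "extend"}

SUPPORTED = {
    "ltx-2-3-fast": FAST_ENDPOINTS,
    "ltx-2-3-pro": PRO_ENDPOINTS,
}


def pick_model(endpoint: str, prefer_fast: bool = True) -> str:
    """Return a model ID that supports the endpoint, favoring Fast if asked."""
    if prefer_fast and endpoint in SUPPORTED["ltx-2-3-fast"]:
        return "ltx-2-3-fast"
    if endpoint in SUPPORTED["ltx-2-3-pro"]:
        return "ltx-2-3-pro"
    raise ValueError(f"no known model for endpoint {endpoint!r}")


print(pick_model("image-to-video"))  # ltx-2-3-fast
print(pick_model("retake"))          # ltx-2-3-pro
```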
What’s new
- Sharper fine details — new latent space with an updated VAE delivers noticeably crisper output.
- Cleaner audio — improved data filtering reduces background noise and artifacts.
- Stronger image-to-video — more natural motion, fewer static clips, and better visual consistency.
- Better prompt understanding — improved text connector architecture for closer adherence to complex prompts.
- Last-frame interpolation — provide a first and last frame to the image-to-video endpoint, and the model generates the video in between.
- Portrait video support — native 9:16 vertical video generation across all resolutions.
- 24/48 FPS — new frame rate options alongside the existing 25/50 FPS.
Dubbing and Captions
Dubbing and Captions introduce built-in localization tools to LTX Studio, enabling teams to adapt a single video across multiple languages and markets within the same workflow. Generate once and localize seamlessly with synchronized dubbing and accurate captions, no external tools or vendors required. Available for Enterprise users only.
Motion Control
Motion Control introduces reference-based motion transfer directly inside the Gen Space. Upload a reference video and apply full-body movement and gestures to your character with high accuracy and consistency.
Nano Banana 2
Nano Banana 2 (Gemini 3.1 Flash Image) combines pro-level intelligence with lightning-fast image generation. Quickly turn concepts into polished visuals, maintain consistency across multiple iterations, generate precise and legible text for any asset, and create diagrams, storyboards, or data visualizations seamlessly.
Kling 3.0 Pro
Kling 3.0 Pro, Kuaishou’s most advanced AI video model to date, is now available in LTX Studio. It brings cinematic-quality text-to-video and image-to-video generation, with longer videos up to 15 seconds, smoother motion, and stronger visual consistency. Designed to support richer storytelling and more expressive creative output, Kling 3.0 Pro gives creators even more flexibility to turn ideas into high-impact video — directly within LTX Studio.
Extend Video Endpoint
New [.url-highlight]v1/extend[.url-highlight] endpoint for extending video duration by generating additional frames at the beginning or end.
Brand Kit
Brand Kit is now available for Enterprise customers, giving teams a centralized way to manage brand identities across LTX Studio. Creative Admins can define brand guidelines (logos, styles, fonts, products, and more) as Elements and publish them for consistent use across projects, with project-level permissions to keep teams on-brand at scale.
Custom Styles as an Element
Define your visual style once and use it across your entire project. Save Custom Styles as Elements to turn real visual references into reusable aesthetics that guide every generation—keeping results consistent and on-brand, without complex text prompts.
Logos & Fonts as Elements
Brand consistency starts with saved assets. Upload logos and fonts as Elements, then apply them instantly across all generated images in your project. Maintain exact brand specifications, eliminate manual uploads, and ensure visual consistency from initial concepts through final delivery.
Brush
Meet Brush - a more intuitive way to edit images. By marking a specific area and guiding the change with a simple text prompt, you can make precise, focused edits exactly where they matter. Remove, add, or refine elements with greater control and speed, across all image models.
Z-Image
Alibaba's Tongyi Lab brings Z-Image, a speed-optimized text-to-image model, directly into LTX Studio's Gen Space. Generate photorealistic visuals with precise prompt control, maintain consistent results across iterations, and deliver high-quality images faster. Available on all tiers.
Audio-to-Video
Start with sound, end with video. Upload voice recordings, music, or sound effects and generate visuals that automatically sync to your audio's rhythm, tone, and pacing. Maintain consistent voice performance across scenes while LTX matches visuals to your audio timeline.
Upload Endpoint
New [.url-highlight]v1/upload[.url-highlight] endpoint for uploading assets via signed URLs.
Open-Source Release — Full Weights & Codebase
[.model-v-highlight]LTX-2[.model-v-highlight]
Release
- Full open-weights release on Hugging Face and GitHub
- First production-ready audio-video generation model with truly open weights
- Complete transparency in model architecture and training methodology
- Technical report published: LTX-2: Efficient Joint Audio-Visual Foundation Model (arXiv 2601.03233)
Model Variants
- Fast — optimized for speed, from $0.04/sec
- Pro — balance of speed and quality, from $0.08/sec
- Ultra — maximum cinematic quality, from $0.16/sec
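As a worked example of the per-second pricing, the snippet below computes the lowest possible cost of a clip for each variant. The listed prices are "from" rates, so these figures are lower bounds. Actual cost may depend on resolution and other settings not covered here. The function name is illustrative.

```python
from decimal import Decimal

# Starting ("from") prices in USD per second of generated video.
FROM_PRICE_PER_SEC = {
    "fast": Decimal("0.04"),
    "pro": Decimal("0.08"),
    "ultra": Decimal("0.16"),
}


def min_cost_usd(variant: str, seconds: int) -> Decimal:
    """Lower-bound cost of a clip: starting rate times duration."""
    return FROM_PRICE_PER_SEC[variant] * seconds


for variant in FROM_PRICE_PER_SEC:
    print(variant, min_cost_usd(variant, 10))
# fast 0.40
# pro 0.80
# ultra 1.60
```

`Decimal` is used instead of `float` so that the cent amounts stay exact.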
Capabilities
- Text-to-video, image-to-video, video-to-video, audio-to-video, and more
- Multi-input conditioning: text, image, video, audio, depth maps, reference video
- 3D camera logic for sophisticated motion sequences
- Keyframe interpolation pipeline
- LoRA fine-tuning — motion, style, and identity training in under 1 hour
Resolutions & Output
- HD (720p) · FHD (1080p) · QHD (1440p) · UHD (2160p / 4K)
- Up to 10 seconds at launch (extended in later updates)
- Up to 50% lower compute cost vs. competing models
- Runs on consumer-grade GPUs
Model Checkpoints
[.url-highlight]ltx-2-19b-dev[.url-highlight] [.url-highlight]ltx-2-19b-dev-fp8[.url-highlight] [.url-highlight]ltx-2-19b-dev-fp4[.url-highlight] [.url-highlight]ltx-2-19b-distilled[.url-highlight] [.url-highlight]ltx-2-19b-distilled-lora-384[.url-highlight] [.url-highlight]ltx-2-spatial-upscaler-x2[.url-highlight] [.url-highlight]ltx-2-temporal-upscaler-x2[.url-highlight]
Storyboard Builder
The new Storyboard Builder converts briefs or scripts into shot-level storyboards, automatically creating reusable Elements for characters and objects. Model selection and aspect ratio are set at initiation to ensure consistency across all frames, from pitch through production.
Camera Motion Effects
Camera motion support for image-to-video and text-to-video endpoints, giving direct control over camera movement in generated videos. Options: [.url-highlight]dolly_in[.url-highlight], [.url-highlight]dolly_out[.url-highlight], [.url-highlight]dolly_left[.url-highlight], [.url-highlight]dolly_right[.url-highlight], [.url-highlight]jib_up[.url-highlight], [.url-highlight]jib_down[.url-highlight], [.url-highlight]static[.url-highlight], and [.url-highlight]focus_shift[.url-highlight].
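A client can validate the option before sending a request. In this sketch the option values are the ones listed above, while the `camera_motion` field name and the payload shape are assumptions for illustration. Consult the API reference for the actual request schema.

```python
# Option values as listed in the release note.
CAMERA_MOTIONS = frozenset({
    "dolly_in", "dolly_out", "dolly_left", "dolly_right",
    "jib_up", "jib_down", "static", "focus_shift",
})


def with_camera_motion(payload: dict, motion: str) -> dict:
    """Return a copy of the payload with a validated camera motion added."""
    if motion not in CAMERA_MOTIONS:
        raise ValueError(
            f"unknown camera motion {motion!r}; "
            f"expected one of {sorted(CAMERA_MOTIONS)}"
        )
    # "camera_motion" is a hypothetical field name.
    return {**payload, "camera_motion": motion}


request = with_camera_motion({"prompt": "a lighthouse at dusk"}, "dolly_in")
print(request)
```

Rejecting unknown values locally gives a clearer error than waiting for the API to refuse the request.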
Developer Console Public Launch
The developer console is now open to the public with self-service signup, automatic organization creation, and prepaid billing with credit top-up.
Color Picker
Perfect color matching for brand-consistent visuals across every generation. Color Picker lets you select any exact shade while prompting, ensuring your final images match your chosen palette without guesswork or iteration. Maintain precise brand colors across all outputs—available for Nano Banana Pro, Nano Banana 2, and FLUX.2.
FLUX.2 Max
FLUX.2 Max brings advanced precision to image generation and editing for professional workflows. Generate publish-ready product imagery with stronger prompt adherence, upscale low-resolution photos into high-detail visuals, create motion-picture-quality keyframes, and produce accurate color variations or new 3D scene views. Fast performance and unmatched consistency across complex, multi-reference edits.
Audio-to-Video API
New audio-to-video endpoint for generating videos driven by audio input, with dedicated prompt enhancement.
Retake
Directorial control doesn't end once a scene is rendered. Retake lets you select any segment within a video and regenerate just that moment while keeping everything else consistent.
Rephrase dialogue without rerecording, redirect emotion or pacing, and reimagine scene endings while preserving the performance and surrounding footage—all without breaking continuity or starting from scratch.
Retake (Video Editing) Endpoint
New retake endpoint for editing specific sections of existing videos using text prompts.
FLUX.2
Black Forest Labs—the team behind Stable Diffusion and FLUX.1—has launched FLUX.2, a completely reimagined text-to-image model trained for visual intelligence, not just pixel generation. FLUX.2 delivers production-ready images with photorealistic detail, accurate lighting, and real-world spatial logic, all integrated seamlessly into LTX's Gen Space.
Nano Banana Pro
Google's latest image model, Nano Banana Pro, brings advanced editing, improved reasoning, and superior text rendering directly into LTX's Gen Space. Enable multimodal editing with natural language and image references, generate sharper typography for clean, legible text, and produce higher-quality visuals with improved resolution and better consistency across iterations.
Elements
A new way to create, save, and tag the core parts of your scene — from characters and props to environments and objects. Elements make it easy to organize, tag, and reuse assets across your project, keeping every shot consistent and connected. Ideal for branded content, recurring characters, and maintaining continuity across AI video production.
Available now under 'Elements' in each Project in LTX, for Standard, Pro, and Enterprise users.
LTX-2 Announced — First Unified Audio-Video Foundation Model
[.model-v-highlight]LTX-2[.model-v-highlight]
Highlights
- First DiT-based audio-video foundation model with synchronized generation
- Audio and video generated in one coherent process — motion, dialogue, ambience, and music
- Native 4K fidelity at up to 50 FPS
- Up to 10 seconds of synchronized generation per clip
- Multi-keyframe conditioning for advanced creative control
- Up to 50% lower compute cost versus competing models
Veo 3.1
Veo 3.1, Google's latest video model, brings dual keyframe control, sharper visuals, and more grounded realism to your AI video workflow. Get better image quality, more cinematic precision, and smoother motion, all at the same pricing.
60-Second Generation · IC-LoRA Detailer · Distilled 2B Model
[.model-v-highlight]v0.9.8[.model-v-highlight]
New Features
- Extended video generation — up to 60 seconds (up from ~32 seconds)
- New 2B distilled model for lightweight deployment
- FP8 quantized weight options for memory-efficient inference
- Optimized for real-time generation on H100 GPU
LoRA Released
- LTX-Video-ICLoRA-detailer-13b-0.9.8 — enhances fine visual details, edges, and textures without altering composition
- Full fine-tuning and LoRA training support for both 2B and 13B variants
- Control LoRA support: depth, pose, canny
Quality Improvements
- Improved prompt understanding and adherence
- Enhanced detail generation across all model variants
- Better overall motion quality
Distilled 13B — HD Video in 10 Seconds
[.model-v-highlight]v0.9.7[.model-v-highlight]
Performance
- HD video generation in ~10 seconds on H100 GPU
- Low-res preview available after just 3 seconds
- No classifier-free guidance required — faster inference
- Only 1GB VRAM required for LoRA variant
Model Checkpoints
[.url-highlight]ltxv-13b-0.9.7-distilled[.url-highlight] [.url-highlight]ltxv-13b-0.9.7-distilled-fp8[.url-highlight]
Resolution Expansion · 15× Faster Distilled Variant
[.model-v-highlight]v0.9.6[.model-v-highlight]
Improvements
- Expanded resolution support — default 1216×704 at 30 FPS
- 15× faster distilled model variant: ltxv-2b-0.9.6-distilled
- Stochastic inference support for diverse output generation
- Improved motion quality and fine detail generation
- Better overall video coherence
13B Model — Multi-Scale Rendering Pipeline
[.model-v-highlight]v0.9.7[.model-v-highlight]
New Features
- New 13B model with breakthrough prompt adherence and physical understanding
- Multi-scale rendering pipeline for enhanced detail at all resolutions
- Temporal upscalers for improved motion quality
- Spatial upscalers for enhanced resolution detail
Commercial License · Keyframes · Video Extension
[.model-v-highlight]v0.9.5[.model-v-highlight]
Licensing
- Commercial use now permitted under OpenRail-M license
New Features
- Keyframe conditioning — precise control over video content at specific timestamps
- Video extension — extend existing video clips
- Higher resolution capabilities added
- Improved VAE architecture for better temporal consistency
Ecosystem
- LTX-Studio web application launched alongside this release
- Online demo available for users without local GPU access
STG / PAG Support · macOS · Research Paper
[.model-v-highlight]v0.9.1[.model-v-highlight]
New Features
- Spatiotemporal Guidance (STG) implementation for improved quality control
- Perturbed Attention Guidance (PAG) support
- Improved timestep-conditioned VAE decoder for better temporal consistency
- CPU offloading for unused model components — lower peak memory usage
Platform Support
- macOS MPS (Metal Performance Shaders) support added
Research
- Research paper published alongside this release
Initial Release
[.model-v-highlight]v0.9.0[.model-v-highlight]
Capabilities
- Text-to-video generation
- Image-to-video generation
- 5-second video clips at 768×512 resolution
- 5-second videos generated in ~4 seconds on H100 GPU
- Focus on motion consistency and generation speed
Integrations
- ComfyUI integration
- Hugging Face model repository
- fal.ai hosted inference
- Native Python inference
License
- Released under OpenRail-M license