
Using LoRA Adapters with LTX-2.3: A Developer Guide

Use LoRA adapters with the LTX-2.3 open-source pipeline for custom AI video generation. Covers standard, audio-video, and IC-LoRA types with setup instructions.

LTX Team
Key Takeaways:
  • LTX-2.3 supports three LoRA types — standard (styles/effects), audio-video (joint modality training), and IC-LoRA (structural control via pose/depth/edge) — each with a different use case and application method in the pipeline CLI.
  • Standard and audio-video LoRAs load via the --lora-paths argument; IC-LoRA adapters use the ICLoraPipeline and take a reference video as input for control signal extraction.
  • Use the LTX-2.3 trainer for character or style LoRAs when off-the-shelf adapters don’t fit your use case — a minimum of 20–30 high-quality reference images and 500–2000 training steps covers most domain-specific customization needs.

You trained a LoRA adapter — or downloaded one from the community — and now you need to actually use it. LTX-2.3 supports multiple LoRA types with different application methods, and the documentation for how to load and combine them is spread across the repository. This guide consolidates the practical steps for using LoRA adapters with LTX-2.3 — from the open-source pipeline CLI through the hosted API.

LoRA Types Supported by LTX-2.3

LTX-2.3 supports several LoRA types, each suited to different use cases:

Standard LoRAs are trained on video data to encode style, subject identity, or motion patterns. They apply directly to the generation process and influence the visual and motion characteristics of output. Community-trained LoRAs for character subjects, art styles, and specific motion types fall into this category.

Audio-Video LoRAs are trained jointly on audio and video data. They extend standard LoRA training to cover the audio stream as well as the video stream, enabling fine-tuning of both the visual and acoustic output together. This is relevant for workflows that require consistent audio characteristics — a character's voice, a specific environment's acoustic signature, or synchronized motion-sound correlations.

IC-LoRA adapters (Image-Conditioned LoRA) are structural control adapters. They don't encode style or content in the usual sense — they enable conditioning the video generation on structural signals extracted from a reference video: pose skeletons, depth maps, or edge maps. IC-LoRA adapters require the ICLoraPipeline and only work with the distilled model checkpoint.

Applying Standard and Audio-Video LoRAs

Via the Pipeline CLI

Standard LoRAs and audio-video LoRAs load through the pipeline CLI using the --lora-paths argument. Pass the path to your safetensors LoRA file alongside the standard pipeline arguments. For multiple LoRAs, pass multiple paths. LoRA scale controls the blend strength.

Key parameters:

  • --lora-paths: Path or list of paths to LoRA safetensors files

  • --lora-scale: Blend strength, typically 0.8–1.2. Higher values apply the LoRA more strongly; lower values blend it with the base model's behavior
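The parameters above can be sketched as a CLI invocation. This is a minimal sketch: the entry-point module name (`ltx_pipeline`) and the prompt are illustrative assumptions, while `--lora-paths` and `--lora-scale` are the arguments described in this section.

```python
# Sketch of a single-LoRA invocation built as an argument list.
# "ltx_pipeline" is an assumed entry point -- check the repository
# for the real command name.
cmd = [
    "python", "-m", "ltx_pipeline",              # assumed entry point
    "--prompt", "a watercolor city street at dusk",
    "--lora-paths", "loras/watercolor_style.safetensors",
    "--lora-scale", "1.0",                        # default blend strength
]
print(" ".join(cmd))
```

Starting at a scale of 1.0 and adjusting from there matches the typical 0.8–1.2 range noted above.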

Loading Multiple LoRAs

LTX-2.3 supports loading multiple LoRAs simultaneously. Pass multiple paths to --lora-paths. LoRA interactions are additive — both adapt the same model weights, and the effects combine. If two LoRAs were trained on conflicting objectives (e.g., two different character identities), the output may reflect an average rather than either specific intent. Verify multi-LoRA outputs and adjust scales to find the balance point.
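A multi-LoRA invocation can be sketched the same way. The entry-point name and file paths are illustrative assumptions; note the scale is lowered below the single-LoRA default, since additive effects from two adapters often warrant dialing back the blend strength.

```python
# Sketch: two LoRAs passed to --lora-paths, with a reduced scale to
# compensate for additive interaction. Paths and the entry point are
# placeholders, not real files.
cmd = [
    "python", "-m", "ltx_pipeline",              # assumed entry point
    "--prompt", "the character walking through rain",
    "--lora-paths",
    "loras/character.safetensors",
    "loras/film_grain.safetensors",
    "--lora-scale", "0.9",                        # start lower when combining
]
```

If the two adapters pull the output toward an unwanted average, reduce the scale further or drop one adapter to isolate its contribution.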

ComfyUI Integration

LTX-2.3 integrates with ComfyUI through the official ComfyUI-LTX plugin, which provides a visual, node-based interface for LoRA-based video generation and handles conversion between ComfyUI and native LTX-2.3 formats. For the full ComfyUI setup, see the plugin's documentation.

Using IC-LoRA Adapters

Available IC-LoRA Variants

LTX-2.3 includes three IC-LoRA adapters:

Union Control: Combines Canny edge, depth map, and pose skeleton conditioning in a single adapter. Supports multiple simultaneous control signals. Requires the most VRAM of the three.

Pose Control: Dedicated pose skeleton adapter. Extracts body joint positions from a reference video and transfers the motion to generated subjects. Best for motion transfer and character consistency across scenes.

Detailer: Enhances fine visual detail in generated output. Use when the base pipeline output lacks the level of detail needed for the final render.

IC-LoRA adapter weights are available on HuggingFace under the Lightricks organization.

Setting Up the ICLoraPipeline

IC-LoRA adapters use a different pipeline than standard LoRAs: the ICLoraPipeline. This pipeline is only compatible with the distilled model checkpoint (ltx-2.3-22b-distilled.safetensors). Do not use it with the dev checkpoint.

The ICLoraPipeline requires a reference video in addition to the standard text prompt. The reference video is the source from which control signals are extracted — pose skeletons for Pose Control, depth maps for depth conditioning, edge maps for Canny.

Key parameters specific to IC-LoRA:

  • ic-lora-strength: Controls how strongly the control signal influences the generation. Default 1.0. Lower values give the model more generation freedom; higher values enforce stricter adherence to the reference structure.

• IC-LoRA group selection: When using Union Control, you can select which conditioning signals to activate (Canny, Depth, Pose). Running only one group at a time reduces VRAM requirements.
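Putting the constraints above together, an IC-LoRA run can be sketched with a stand-in configuration object. The dataclass below is not the real pipeline API — only `ICLoraPipeline`, the distilled-checkpoint requirement, `ic-lora-strength`, and group selection come from this guide; every field name is an assumption to be checked against the repository.

```python
# Stand-in config illustrating the IC-LoRA constraints described above.
# This dataclass is NOT the real ICLoraPipeline API; field names are
# assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class ICLoraConfig:
    checkpoint: str                  # must be the distilled checkpoint
    reference_video: str             # source for pose/depth/edge extraction
    groups: tuple                    # active Union Control groups
    ic_lora_strength: float = 1.0    # default per the docs above

cfg = ICLoraConfig(
    checkpoint="ltx-2.3-22b-distilled.safetensors",
    reference_video="inputs/dance_reference.mp4",
    groups=("pose",),                # one group at a time keeps VRAM down
)
assert "distilled" in cfg.checkpoint, "IC-LoRA requires the distilled checkpoint"
```

Activating a single group, as here, is the VRAM-saving configuration the next section recommends.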

VRAM Management for IC-LoRA

IC-LoRA adapters require additional VRAM on top of the base model footprint. On GPUs with limited VRAM, run only one IC-LoRA group at a time. The Union Control adapter supports selecting a single group (e.g., Pose only) to reduce memory usage. Disable unused preprocessors explicitly when configuring the pipeline.

Community LoRAs and Format Compatibility

Community-trained LoRAs for LTX-2.3 are shared through HuggingFace and CivitAI. When using community LoRAs, verify format compatibility before loading:

  • Standard LoRAs trained on LTX-2.3 are compatible with both the full and distilled checkpoints, provided the LoRA was trained at the same rank and dimension as the target model configuration

  • IC-LoRA adapters require the distilled checkpoint and the ICLoraPipeline specifically

  • Audio-video LoRAs require audio generation to be enabled in the pipeline to apply correctly

Format incompatibilities produce silent failures or incorrect output rather than explicit errors. If a LoRA isn't producing the expected results, verify the checkpoint type (dev vs distilled) and pipeline compatibility.
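Because these failures are silent, it can help to encode the compatibility rules as a pre-flight check. The helper below is a sketch of that idea, not part of the LTX-2.3 codebase; it simply mirrors the three rules listed above.

```python
# Pre-flight compatibility check mirroring the rules above. This is an
# illustrative helper, not an LTX-2.3 API.
def check_compat(lora_type: str, checkpoint: str, audio_enabled: bool) -> list:
    problems = []
    if lora_type == "ic-lora" and "distilled" not in checkpoint:
        problems.append("IC-LoRA requires the distilled checkpoint and ICLoraPipeline")
    if lora_type == "audio-video" and not audio_enabled:
        problems.append("audio-video LoRA needs audio generation enabled")
    return problems

# A dev checkpoint with an IC-LoRA adapter would fail silently at
# generation time; the check surfaces it up front.
print(check_compat("ic-lora", "ltx-2.3-22b-dev.safetensors", False))
```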

Training Your Own LoRA

The LTX-2.3 trainer provides configuration-driven LoRA training. Common use cases include training a character LoRA for consistent subject identity, a style LoRA for domain-specific visual aesthetics, or an audio-video LoRA for consistent audio-visual relationships.

Training prerequisites:

  • Linux system with CUDA 13+ and an NVIDIA GPU

  • Dataset: minimum 20–30 high-quality reference clips or images

  • Steps: 500–2000 training steps depending on dataset size and desired fidelity

The trainer configuration file specifies model paths, dataset location, training hyperparameters, and output location. Start with the example configuration included in the trainer package and adjust for your dataset.
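A trainer configuration along these lines might look like the sketch below. The field names are assumptions patterned on the prose above, not the trainer's actual schema; start from the example configuration shipped with the trainer and adjust.

```yaml
# Illustrative trainer config sketch -- field names are assumptions.
model:
  checkpoint: ltx-2.3-22b-dev.safetensors
dataset:
  path: ./data/my_character        # 20-30 high-quality clips or images
training:
  steps: 1000                      # 500-2000 depending on dataset size
  rank: 32                         # LoRA rank; must match at load time
output:
  dir: ./outputs/my_character_lora
```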

Using LoRAs with the LTX-2.3 API

For hosted generation at scale, the LTX-2.3 API handles inference without local GPU infrastructure. Standard LoRAs can be applied through the API by passing the LoRA download URL or HuggingFace path in the request payload. Check the API documentation for the current LoRA application parameters and supported formats.
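A request payload for the hosted path might be structured as below. The field names (`prompt`, `lora_url`, `lora_scale`) and the URL are hypothetical placeholders; the actual schema is defined in the API documentation.

```python
import json

# Hypothetical request payload for LoRA application via the hosted API.
# Field names are assumptions -- consult the API docs for the real schema.
payload = {
    "prompt": "a chalk-drawing style city timelapse",
    "lora_url": "https://huggingface.co/<org>/<repo>/resolve/main/style.safetensors",
    "lora_scale": 1.0,
}
print(json.dumps(payload, indent=2))
```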

Practical LoRA Workflow

LoRA adapters give you control over what LTX-2.3 generates without retraining the full model. For production use, the typical workflow is:

1. Identify whether you need style, character, motion, or structural control (IC-LoRA)

2. Check HuggingFace and the community for existing LoRAs that cover your use case

3. Test with the base pipeline before applying LoRAs to establish a baseline

4. Apply the LoRA at default scale, verify output, and adjust scale if needed

5. If no existing LoRA meets your requirements, train one using the LTX-2.3 trainer

For hosted generation at scale, the LTX-2.3 API handles inference without local GPU requirements; for full local control, the open-source pipeline provides the complete LoRA application and training infrastructure.
