Apr 28, 2026

How ElevenLabs and LTX Are Redefining AI-Powered Video Creation

Region:
Global
Company size:
201-500
Industry:
AI & Technology
Solution:
70%
of all video generated within 30 days was done using LTX A2V
2.3x
A2V outpaced every other model combined

Discover how ElevenLabs integrated LTX's Audio-to-Video API and what early adoption data reveals about where AI content creation is heading.


Introduction

When ElevenLabs integrated with LTX, the goal was to close a gap that creators had long worked around: the disconnect between AI-generated audio and AI-generated video. For the first time, a single workflow could take a voice, a sound, or a piece of audio and turn it directly into a finished video. No separate tools, no manual sync, no stitching assets together after the fact.

The results since launch have been hard to ignore.

ElevenLabs is an AI research and product company building tools that transform how people interact with technology, spanning voice, speech, sound, music, and video across 70+ languages. By integrating LTX's Audio-to-Video API, ElevenLabs launched their first third-party audio-to-video model directly in their platform, enabling users to move from audio to finished video without ever leaving ElevenLabs.


The Integration

The collaboration pairs ElevenLabs' expressive voice generation with LTX's performance-driven video synthesis, unlocking an audio-first workflow, from sound to motion, that doesn't exist anywhere else.


What makes this different isn't just convenience. For the first time, audio isn't retrofitted onto video after the fact; the video is generated from the audio itself. Voice, music, or sound effects define the scene structure, pacing, and emotion. The sound drives the visual. That's a fundamental shift in how AI-generated content gets made.


What the Integration Unlocks

Together, the two platforms cover the full audio-to-video pipeline. Creators across both are already putting it to work.

Agencies are feeding client voiceover scripts directly into A2V to generate pitch-ready video concepts in seconds, turning audio briefs into visual drafts before a production team is ever involved. Film and TV creators are using generated dialogue to drive pre-visualization, exploring scene pacing and performance from narrated scripts without manual shot planning. Educators and content creators are converting recorded audio (podcast clips, tutorials, explainers) into animated visual assets ready to publish, with no editing tools required.


The through-line: audio that was already being created now has somewhere to go.


The Impact

The industry's response was immediate. Within 30 days of launch, Audio-to-Video accounted for approximately 70% of all video generation activity on ElevenCreative, ElevenLabs' Image & Video generation platform, outpacing every other model combined by 2.3x. It didn't grow into the top spot. It started there.


That kind of adoption doesn't happen because a feature is novel. It happens because it removes a real bottleneck. When audio stops being something you sync after the fact and starts being something that generates the video itself, the entire production process compresses. Creators move faster. Concepts reach screens in seconds instead of days. And the gap between what you hear in your head and what you can show a client, or an audience, effectively disappears.


For creators and platforms building with AI, the message is clear: the workflows are ready. The question is whether you're built to take advantage of them.


Agencies
Audio briefs → Pitch-ready video
Voiceover scripts feed directly into A2V to generate video concepts in seconds — before a production team is ever involved.
Film & TV
Narrated scripts → Pre-visualized scenes
Generated dialogue drives pre-visualization, exploring scene pacing and performance — no manual shot planning needed.
Educators
Audio → Publish-ready video
Recorded audio (podcast clips, tutorials, explainers) converts into animated visual assets ready to publish, with no editing tools required.
Imogen Mulliner
Growth @ElevenLabs

"LTX is a driving force in AI video, so when the opportunity came up to work together, it felt like a no-brainer. Audio driving video, where sound becomes the control layer, is the first of its kind in AI video generation. So when we launched LTX's model within ElevenCreative and saw such a notable volume in video generations, it wasn't a surprise. Creators were ready, they just needed the tools to catch up."

Luke Harries
Growth @ElevenLabs

“Exclusively providing our users with LTX’s unmatched audio to video generative capabilities enables our community to tap into their incredible creativity, and build professional-grade videos quickly. We are extremely excited about this partnership with Lightricks because we have always believed that AI should empower creators to quickly and easily get past technical roadblocks to achieve their full vision.”
