What 36,388 AI Videos Reveal About How People Actually Use AI Video Generators

Most "state of AI video" numbers come from surveys, vendor benchmarks, or synthetic tests. What's rare is actual production data — real videos that real users built and rendered. That's what this is.
We analyzed a sample of 36,388 Agent Opus projects created by 11,416 distinct users between January 14 and February 23, 2026 — a six-week production window. All metrics are aggregate and anonymized. This is a sample, not our full dataset — a recent, clean slice of production behavior.
TL;DR: The 2026 Production Snapshot
- Median video length: 43 seconds. p90: 103 seconds. AI video is vertical-social shaped, not YouTube shaped.
- Avatar adoption: 24.1%. Voice cloning: 14.2%. Burned-in captions: 9.7%.
- 97.5% of projects feed in images. 46.9% pull from YouTube. Tweets are essentially irrelevant (<0.1%).
- The typical project uses ~14 images, ~4 YouTube clips, and ~5 stock assets.
- 119 languages captured across the sample. English is 57% — the long tail is bigger than people assume.
- "canon," "pastel," and "fuji" are the three most-used visual styles — cinematic, warm, documentary. Not what the "AI slop" discourse would predict.
The headline finding: most people who generate AI video are not trying to make a shorter YouTube video. They're trying to make a better TikTok.
Finding 1: AI Video Is Vertical-Social Shaped
The single most telling number in the dataset: the median video is 43 seconds long. The 90th percentile is 103 seconds. Only 4.7% of projects exceed 120 seconds.
| Length bucket | Share |
|---|---|
| 0–15s | 2.6% |
| 15–30s | 14.1% |
| 30–60s | 54.2% |
| 60–90s | 14.8% |
| 90–120s | 9.6% |
| 120s+ | 4.7% |
More than half of all projects land in the 30–60 second bucket. That's the native TikTok / Reels / Shorts sweet spot — not the 5–10 minute YouTube length people often associate with "AI video."
The mental model of AI video as "automated YouTube" is wrong at the production layer. People are using these tools to make social-shaped content — the same length, roughly the same cadence, as the short-form videos we've all been trained to consume.
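Summary statistics like these fall out of a single percentile-and-histogram pass over raw durations. A minimal sketch with synthetic data — the lognormal draw is an assumption for illustration only, not the real distribution:

```python
import numpy as np

# Illustrative only: synthetic durations in seconds, not the real dataset.
rng = np.random.default_rng(0)
durations = rng.lognormal(mean=np.log(43), sigma=0.45, size=10_000)

p50 = np.percentile(durations, 50)
p90 = np.percentile(durations, 90)

# Bucket shares using the same edges as the table above.
edges = [0, 15, 30, 60, 90, 120, np.inf]
labels = ["0-15s", "15-30s", "30-60s", "60-90s", "90-120s", "120s+"]
counts, _ = np.histogram(durations, bins=edges)
shares = counts / counts.sum()

for label, share in zip(labels, shares):
    print(f"{label:>8}: {share:6.1%}")
print(f"median={p50:.0f}s  p90={p90:.0f}s")
```

The same pass yields both the headline percentiles and the full bucket breakdown, which is why the two always reconcile.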
Finding 2: What People Actually Feed In
| Asset type | Projects using | Share | Avg per project (when used) |
|---|---|---|---|
| Images | 35,487 | 97.5% | 14.1 |
| Stock assets | 28,158 | 77.4% | 5.1 |
| YouTube clips | 17,048 | 46.9% | 4.1 |
| Articles (URL ingest) | 3,462 | 9.5% | 1.5 |
| Tweets | 23 | <0.1% | 1.1 |
Two surprises.
The image is the atom of AI video. Almost every project starts with images, and the typical project feeds in ~14 of them. Any AI-video tool without a first-class image-ingestion flow is operating against the grain.
Tweets are dead as a video source. In 36K projects, only 23 pulled from tweets. Platform policy and workflow friction have killed this as a primary behavior; screenshot-and-upload now beats URL ingest for X content.
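The "share using" and "avg per project (when used)" columns in the table above are a straightforward conditional aggregation. A sketch with made-up rows (the schema here is an assumption for illustration, not the production schema):

```python
import pandas as pd

# Illustrative only: tiny made-up asset log, one row per (project, asset type).
rows = pd.DataFrame({
    "project_id": [1, 1, 2, 2, 3],
    "asset_type": ["image", "youtube", "image", "stock", "image"],
    "n_assets":   [12, 3, 16, 5, 14],
})
total_projects = rows["project_id"].nunique()

# Averages are conditional: computed only over projects that used the asset type.
summary = rows.groupby("asset_type").agg(
    projects_using=("project_id", "nunique"),
    avg_when_used=("n_assets", "mean"),
)
summary["share"] = summary["projects_using"] / total_projects
print(summary)
```

Note the conditional framing matters: "average images per project" over all projects would be diluted by non-users, which is why the table reports the when-used figure.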
Finding 3: Avatars Beat Captions (And That's Weird)
Adoption of the three signature AI-video features in our sample:
- Avatar usage: 24.1%
- Voice cloning: 14.2%
- Burned-in captions: 9.7%
Captions are the least-adopted of the three. This is genuinely surprising, because TikTok and Reels — the platforms these videos are made for — treat captions as table stakes. The creator-economy consensus is that silent autoplay demands captioned video.
A few interpretations:
1. Platform-native captions cover it. TikTok and Reels auto-generate captions; creators may be relying on that layer.
2. AI voices are crisp. Unlike lo-fi creator audio, AI voiceover is clean, reducing the perceived need for captions.
3. The workflow is clunky enough to skip. If enabling captions adds friction, users route around it.
We suspect the third. Caption adoption is a feature-discovery and feature-friction problem, not a user-preference problem. For any AI-video tool looking for a durable wedge, frictionless burned-in captions (auto-on, style-templated) would hit a real creator pain point.
Finding 4: Feature Adoption Varies Massively by Niche
| Niche | Avatar % | Voice-clone % | Caption % | Avg length |
|---|---|---|---|---|
| Lifestyle & Aesthetic | 38.7% | 6.7% | 6.7% | 39.5s |
| Tech & Innovation | 34.8% | 30.7% | 10.4% | 58.7s |
| Finance & Commerce | 25.9% | 16.8% | 9.8% | 57.7s |
| Trends & Commentary | 19.8% | 9.0% | 8.3% | 49.1s |
| Narrative & Documentary | 15.8% | 15.6% | 13.9% | 60.8s |
Several patterns worth calling out:
- Lifestyle & Aesthetic is the avatar category. Nearly 4 in 10 projects use an avatar — the "personal-brand-without-being-on-camera" use case.
- Tech & Innovation leans hard on voice cloning. Developer and tech-explainer creators want their own voice at scale.
- Narrative & Documentary leads on captions at 13.9%, the highest of any category (Tech & Innovation is the only other one above 10%). Long-form narrative benefits from readability.
- Narrative is the longest bucket (60.8s avg). Lifestyle is the shortest (39.5s). Longer videos correlate with more editorial intent.
Finding 5: The "AI Slop" Aesthetic Is a Minority
| Visual style | Projects | Share | Avatar use |
|---|---|---|---|
| canon | 6,309 | 17.3% | 24.4% |
| pastel | 6,139 | 16.9% | 19.7% |
| fuji | 5,951 | 16.4% | 27.1% |
| eggshell | 4,741 | 13.0% | 32.6% |
| parchment | 4,273 | 11.7% | 18.6% |
| graffiti | 3,704 | 10.2% | 19.4% |
| center | 2,651 | 7.3% | 15.5% |
| ugc | 1,595 | 4.4% | 43.3% |
If you read the AI-slop discourse, you'd expect a sea of over-saturated synthetic imagery. What the data actually shows is cinematic, warm, documentary as the dominant modes. canon, pastel, fuji, and eggshell together are nearly two-thirds of the sample.
ugc (mimicking amateur phone-shot creator video) is only 4.4% — smaller than most people assume. When it's used, it's the style with the highest avatar adoption (43.3%). That fits: if you're imitating talking-head creator content, you want a face.
The visual output people are generating trends toward realism and warmth, not away from it.
Finding 6: Language Diversity Is the Biggest Story Nobody's Telling
| Language | Projects | Share |
|---|---|---|
| en-US | 11,370 | 45.0% |
| en-GB | 3,070 | 12.2% |
| nl-NL | 2,200 | 8.7% |
| fr-FR | 979 | 3.9% |
| de-DE | 966 | 3.8% |
| ru-RU | 815 | 3.2% |
| it-IT | 507 | 2.0% |
| uk-UA | 477 | 1.9% |
| zh-CN | 466 | 1.8% |
Of the top 15 languages in the sample (the table shows the top nine), nine are non-English. Two things jump out.
Dutch punches way above its weight. The Netherlands has about 18 million people, well under 1% of the global population, yet it accounts for 8.7% of projects in our sample. Dutch creators are AI-video early adopters at a rate wildly out of proportion to their population.
Ukrainian is present at 1.9%. A country at war, with creators still making content. This is a cultural note that deserves more attention than "AI is automating English slop."
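One way to make "punches above its weight" concrete is an over-representation index: a locale's share of projects divided by its share of world population. Both population figures below are rough outside assumptions, not values from the dataset:

```python
# Rough over-representation index. Population figures are approximate
# assumptions for illustration, not from the dataset.
WORLD_POP = 8_100_000_000  # assumed ~8.1B world population

def over_representation(project_share: float, population: int) -> float:
    """How many times more projects a locale produces than its
    population share alone would predict (1.0 = exactly proportional)."""
    return project_share / (population / WORLD_POP)

# nl-NL: 8.7% of projects, Netherlands ~18M people (assumption).
print(round(over_representation(0.087, 18_000_000), 1))
```

By this crude measure Dutch creators are producing on the order of forty times more projects than proportionality would predict; the index ignores internet access, language-model coverage, and platform marketing, so treat it as directional only.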
Finding 7: The Use-Case Matrix
Crossing niche with use case reveals what people are actually making. The dominant use case across every niche is educational explainer:
- Narrative & Documentary → Educational Explainers: 5,545 projects (1,797 users)
- Finance & Commerce → Educational Explainers: 2,300 projects (1,139 users)
- Tech & Innovation → Educational Explainers: 1,890 projects (815 users)
- Lifestyle & Aesthetic → Educational Explainers: 1,428 projects (623 users)
Not entertainment, not performance marketing, not viral stunts — just "explain something." AI video is most useful when an idea needs to be transmitted quickly, not when it needs to be felt.
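The niche × use-case matrix above is a plain cross-tabulation of project labels. A sketch with made-up rows (labels shortened for illustration):

```python
import pandas as pd

# Illustrative only: made-up project rows with niche and use-case labels.
projects = pd.DataFrame({
    "niche":    ["Narrative", "Narrative", "Finance", "Tech", "Finance"],
    "use_case": ["Explainer", "Explainer", "Explainer", "Explainer", "Ad"],
})

# Rows = niches, columns = use cases, cells = project counts.
matrix = pd.crosstab(projects["niche"], projects["use_case"])
print(matrix)
```

The real analysis additionally counts distinct users per cell (e.g. 5,545 projects from 1,797 users), which guards against a handful of power users dominating a cell.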
What This Implies for the Industry
- The format is short-form social, not long-form YouTube. Building AI-video tooling for "automated podcasts" or "long-form summaries" is swimming upstream.
- Image input is non-negotiable. 97.5% of projects use images. An AI-video tool without strong image ingest is fighting its users' default workflow.
- Captions are under-adopted. Frictionless captions — auto-on, high-contrast, style-templated — would hit a real gap.
- Aesthetic is warmer than the discourse implies. AI slop is a real visual mode but it's not where production volume is concentrated.
- Language is a moat. 119 languages captured. English-only tools cede huge share.
- Educational explainer is the killer use case. Build for explainers first, everything else second.
What This Data Can't Tell You
- Post-publish performance. We have production data, not distribution data. A project that rendered well might never get shared.
- Pro vs hobbyist segmentation. We kept the analysis fully anonymized — no plan-tier cuts.
- Generalization to other AI-video tools. This is one platform, one window. Treat our numbers as "what Agent Opus production looks like in early 2026" — not "what AI video creation looks like everywhere."
Methodology
Based on a sample of 36,388 Agent Opus projects from 11,416 distinct users created between January 14 and February 23, 2026 — a six-week production window. All queries were run against production BigQuery tables; all metrics are aggregate-level, anonymized, and exclude PII. Scene and shot counts come from dwd_storyboard_shot. Project-level metrics come from dim_derived_project (production environment only). This is a subset of Agent Opus's project data — not the full dataset. Different windows, different product states, and different user cohorts may produce materially different numbers.
Start Creating
Agent Opus turns scripts, images, and source material into cinematic AI video — storyboard, assets, avatars, voice, and render in one end-to-end flow. Try Agent Opus →
Frequently Asked Questions
What's the average length of an AI-generated video in 2026?
In our sample of 36,388 Agent Opus projects, the median video length is 43 seconds and the 90th percentile is 103 seconds. Only 4.7% of projects exceed 120 seconds. AI video production is concentrated in the short-form social format (TikTok, Reels, Shorts), not long-form YouTube.
How many people use AI avatars in their videos?
In our sample, 24.1% of projects use an avatar. Adoption varies dramatically by niche: Lifestyle & Aesthetic creators use avatars 38.7% of the time, while Narrative & Documentary creators use them just 15.8% of the time.
What inputs do people use to create AI videos?
97.5% of projects feed in images (median 14 per project). 77.4% use stock assets. 46.9% pull from YouTube clips. Tweets and article URLs are minority inputs. If you're building an AI-video tool, a strong image-ingest flow is non-negotiable.
How many languages are represented in AI video production?
Our sample captured 119 languages. English is 57% of projects (en-US and en-GB combined). Dutch (nl-NL) is 8.7% — dramatically outsized relative to the Netherlands' population. Ukrainian creators represent 1.9% of projects despite the ongoing war.