[GUEST ACCESS MODE: Data is scrambled or limited to provide examples. Make requests using your API key to unlock full data. Check https://lunarcrush.ai/auth for authentication information.]

![karpathy Avatar](https://lunarcrush.com/gi/w:24/cr:twitter::33836629.png) Andrej Karpathy [@karpathy](/creator/twitter/karpathy) on x 1.4M followers
Created: 2025-06-02 20:22:17 UTC

Very impressed with Veo X and all the things people are finding on r/aivideo etc. Makes a big difference qualitatively when you add audio.

There are a few macro aspects to video generation that may not be fully appreciated:

1. Video is the highest bandwidth input to brain. Not just for entertainment but also for work/learning - think diagrams, charts, animations, etc.
2. Video is the most easy/fun. The average person doesn't like reading/writing, it's very effortful. Anyone can (and wants to) engage with video.
3. The barrier to creating videos is -> X.
4. For the first time, video is directly optimizable.

I have to emphasize/explain the gravity of (4) a bit more. Until now, video has been all about indexing, ranking and serving a finite set of candidates that are (expensively) created by humans. If you are TikTok and you want to keep the attention of a person, the name of the game is to get creators to make videos, and then figure out which video to serve to which person. Collectively, the system of "human creators learning what people like and then ranking algorithms learning how to best show a video to a person" is a very, very poor optimizer. Ok, people are already addicted to TikTok so clearly it's pretty decent, but it's imo nowhere near what is possible in principle.

The videos coming from Veo X and friends are the output of a neural network. This is a differentiable process. So you can now take arbitrary objectives, and crush them with gradient descent. I expect that this optimizer will turn out to be significantly, significantly more powerful than what we've seen so far. Even just the iterative, discrete process of optimizing prompts alone via both humans or AIs (and leaving parameters unchanged) may be a strong enough optimizer. So now we can take e.g. engagement (or pupil dilations or etc.) and optimize generated videos directly against that. Or we take ad click conversion and directly optimize against that.
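To make "take an objective and optimize it with gradient descent" concrete, here is a minimal toy sketch in plain Python. Everything in it is a hypothetical stand-in and not how any real video model works: `generate` is a tiny frozen differentiable function playing the role of the generator, `engagement` is a made-up differentiable score, and `W` and `TARGET` are arbitrary numbers. The only point is the loop: the parameters stay fixed and the generator's input is pushed along the gradient of the objective.

```python
import math

# Hypothetical stand-ins. A real generator and a real engagement signal would
# be vastly larger and noisier; only the optimization loop matters here.
W = [[0.8, -0.3], [0.2, 0.9], [-0.5, 0.4]]  # frozen toy "generator" weights
TARGET = [0.6, -0.2, 0.3]                    # toy "what this viewer responds to"


def generate(z):
    """Toy differentiable generator: latent z (2 floats) -> 'video' (3 floats)."""
    return [math.tanh(sum(w * zj for w, zj in zip(row, z))) for row in W]


def engagement(video):
    """Toy differentiable objective: higher when the output is closer to TARGET."""
    return -sum((v - t) ** 2 for v, t in zip(video, TARGET))


def engagement_grad(z):
    """Analytic gradient of engagement(generate(z)) with respect to z."""
    video = generate(z)
    grad = [0.0] * len(z)
    for row, v, t in zip(W, video, TARGET):
        # Chain rule through -(tanh(row . z) - t)^2
        common = -2.0 * (v - t) * (1.0 - v * v)
        for j, w in enumerate(row):
            grad[j] += common * w
    return grad


def optimize(z, steps=200, lr=0.2):
    """Gradient ascent on the generator input; the weights W never change."""
    for _ in range(steps):
        g = engagement_grad(z)
        z = [zj + lr * gj for zj, gj in zip(z, g)]
    return z


z0 = [0.0, 0.0]
z_star = optimize(z0)
print("engagement before:", engagement(generate(z0)))
print("engagement after: ", engagement(generate(z_star)))
```

In this toy the objective is known in closed form, so the gradient can be written by hand. With a real signal such as measured engagement, the objective itself would have to be estimated or modeled first, which this sketch does not attempt.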

Why index a finite set of videos when you can generate them infinitely and optimize them directly.
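The contrast between serving from a finite set and optimizing what gets generated can be sketched the same way, this time with the discrete, gradient-free prompt loop mentioned above. This is a toy under loud assumptions: `VOCAB`, `APPEAL` and `score` are invented for illustration, and `score` stands in for "generate a video from this prompt and measure the response", which in reality is a black box.

```python
import random

# Hypothetical vocabulary and per-word appeal for a single viewer. In practice
# the appeal is unknown and only observable through feedback.
VOCAB = ["cat", "dog", "ocean", "city", "slow", "fast", "neon", "pastel"]
APPEAL = {"cat": 0.9, "dog": 0.4, "ocean": 0.7, "city": 0.1,
          "slow": 0.2, "fast": 0.6, "neon": 0.8, "pastel": 0.3}


def score(prompt):
    """Black-box stand-in for generating from a prompt and measuring engagement."""
    return sum(APPEAL[word] for word in set(prompt))


def best_of_catalog(catalog):
    """Index-and-rank: choose the best item from a fixed, finite set."""
    return max(catalog, key=score)


def hill_climb(prompt, rounds=50, rng=None):
    """Discrete prompt optimization: mutate one word, keep strict improvements."""
    rng = rng or random.Random(0)
    best = list(prompt)
    for _ in range(rounds):
        candidate = list(best)
        candidate[rng.randrange(len(candidate))] = rng.choice(VOCAB)
        if score(candidate) > score(best):
            best = candidate
    return best


catalog = [["dog", "city", "slow"], ["ocean", "pastel", "slow"], ["city", "fast", "dog"]]
served = best_of_catalog(catalog)
optimized = hill_climb(served)
print("best of catalog:", served, score(served))
print("after search:   ", optimized, score(optimized))
```

Ranking can never beat the best item already in the catalog, while the search starts from that item and only accepts changes that raise the score, so it ends at least as high.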

I think video has the potential to be an incredible surface for AI -> human communication, future AI GUIs etc. Think about how much easier it is to grok something from a really great diagram or an animation instead of a wall of text. And an incredible medium for human creativity. But this native, high bandwidth medium is also becoming directly optimizable. Imo, TikTok is nothing compared to what is possible. And I'm not so sure that we will like what "optimal" looks like.


XXXXXXXXX engagements

![Engagements Line Chart](https://lunarcrush.com/gi/w:600/p:tweet::1929634696474120576/c:line.svg)

**Related Topics**
[veo](/topic/veo)
[coins ai](/topic/coins-ai)
[coins entertainment](/topic/coins-entertainment)
[$googl](/topic/$googl)
[stocks communication services](/topic/stocks-communication-services)

[Post Link](https://x.com/karpathy/status/1929634696474120576)
