There's a moment in 1Q84 I keep returning to.
Aomame is in an upscale taxi, locked inside a traffic jam on the Tokyo expressway. The driver tells her about an emergency staircase, then adds a strange line: "Things are not what they seem. There's always only one reality."
She makes a decision. Takes off her heels. Slips them into her bag. Climbs over the steel barrier. Starts walking down the staircase in her stockinged feet.
That scene is heavier than most action scenes I've read.
It's heavy because of the pressure before she moves, not because she moves. Aomame is on her way to carry out something she cannot be late for. Outside, the city has stopped. Inside her, a decision long since made, the kind you cannot take back once it's done. When her bare foot touches the cold metal for the first time, something has already changed.
The emotion in that scene doesn't appear out of nowhere.
It's generated by specific causes. Outer pressure (the dead traffic, the driver's quiet suggestion) meets inner pressure (a task waiting, a decision already made) at one precise point. When the causes are clear enough and strong enough, the emotion arrives as a consequence. It doesn't need to be performed.
That's the difficulty when trying to bring this kind of feeling into sound and image.
Music and image cannot rely on the slow accumulation of context that literature provides. They need to generate their own pressure. A sound can suggest weight. A frame can hold tension. But without a clear reason for the emotion to exist, both often become only atmosphere.
In this piece, I chose to work with a single static image rather than moving footage.
The emotional core of that moment felt closer to stillness than to motion. The decision, the weight, and the sense of stepping out of one reality and into another seemed better held in one frame than dispersed across many.
The music is built around a single pulse. Not as metaphor, but as something literal: a steady rhythm placed at the center, suggesting that even in apparent quiet, something remains alive and under tension.
This isn't an illustration of the scene Murakami wrote.
It is an attempt to isolate what the scene made me feel, then find the most direct way for that feeling to exist in sound and light. The process is slow because this kind of emotion isn't loud. It asks for attention more than stimulation.
Every decision, from the angle of the image to the moment the pulse enters and recedes, has to return to one question: what is the cause of the feeling I'm trying to hold, and does this choice serve it?
Built for stereo. Mono playback is a different feeling.
@XFreeze I have a concern that's been puzzling me: why would most of the co-founders leave a company whose mission is to develop artificial intelligence for the benefit of humanity? Is it that those who left don't see it as a noble idea?
AI can make almost anything look beautiful. Sometimes beautiful enough to stop the scroll. It still can't make you remember it.
This isn't a limitation of AI. It's a problem with how it's being used.
Before talking about prompts or workflows, there's a question almost no AI video creator has asked themselves: what do you want the viewer to feel? Not what they see. What they feel. From which emotional state to which other state are you trying to move them? Without an answer to that, every decision after it, no matter how striking the image, is drifting without direction.
Think about a pop song. Strip the lyrics. What's left still tells a story. The verse builds expectation. The pre-chorus pushes tension higher. The chorus releases it. The bridge breaks the pattern so the final chorus lands harder than the first. No words required. The body responds before the mind catches up. But for any of that to work, the songwriter had to know where they wanted the listener to arrive before choosing a single note.
This isn't music theory. It's how the human nervous system processes experience with depth. Expectation. Tension. Release. The same shape runs through theatre, poetry, and the way stories have been told around fires for thousands of years. Aristotle wrote it down more than two thousand years ago in the Poetics. He didn't invent it. He simply noticed something already built into how humans receive the world.
We have eyes, ears, and a brain. None of that has changed in seventy thousand years. Any creator who ignores how these three actually receive experience will find their work sliding past the viewer, no matter how powerful the tool.
What's happening on X right now goes further than most realise. Many creators no longer work shot by shot. One prompt generates the entire storyboard. That storyboard is pushed straight into another model to render the video. No selection. No cutting. No emotional editing. The act of storytelling itself has been handed over to AI from beginning to end.
The question "what do I want the viewer to feel" hasn't been answered wrong. It was never asked.
The result is work that looks beautiful, moves smoothly, holds its colour, and arrives nowhere. Like hearing a sequence of pleasant chords with no progression. Each moment agreeable on its own. The whole thing flat.
AI doesn't replace the capacity to make work with depth. It only amplifies how you already think. When that thinking starts from a clear emotional intention and is shaped by structure, AI will take you further, faster. When it doesn't, AI just helps you make more beautiful things that no one remembers, faster than ever before.
Your entire creative workflow just collapsed into one infinite canvas.
In @imagine Agent Mode, you can brainstorm, write, generate and edit images, then turn them into videos without leaving the page.
Try it at https://t.co/UF5IvltrQc, on desktop.
I’ve been using X for 5 years, but mostly to read, not to post.
Recently, I changed jobs and finally have more time, so I decided to start creating content based on my 20+ years of experience.
After a week of posting original content, the result: I gained only 1 follower and a few hundred views in total.
Am I doing something wrong?
Making original work has never been easier. Yet many still choose to steal.
With today’s tools, a serious creator can produce a complete original video in just a few hours. AI handles the execution. What remains, and what has always been the hardest part, is the idea itself.
Twenty years as Creative Director, Art Director, DP and Director have taught me this: ideation is still the most expensive and irreplaceable part of any production. Technique can be accelerated. The way you see cannot.
I began making video in the early 2000s with a small Sony digital camera. There was no YouTube, no Vimeo, no instant sharing. Every clip I finished could only reach others by being burned onto a CD and handed to a friend.
It wasn’t that references didn’t exist. Films, music videos, photographs, and books were all around. What didn’t exist was the frictionless copy and paste culture we have today. To learn from something, I had to watch it repeatedly, take notes, internalize it, and then transform it into something that was truly mine. That high friction forced genuine digestion. You couldn’t simply lift and repost.
That era shaped a permanent reflex in me, and in most creators of my generation: if you’re going to put in the work, make sure the result actually belongs to you.
Now X is enforcing rules against reposting, clipping, and context-stripping. I don’t see this as a minor policy adjustment. It is a fundamental redefinition of value, finally closing the gap between those who create and those who merely redistribute.
For accounts like MontageMind that invest real thought and craft into original work, this is the first time in years the field is no longer tilted in favor of whoever can steal the fastest.
Making something that is truly yours has always been difficult. What’s new is that it is finally beginning to be rewarded.
https://t.co/or3EKiQA9P
Important clarification: the account that posted the “original video” on X is not the original creator or owner of that content; Anastasia is. The account you’re calling the “original” post is itself another aggregator that reposted someone else’s content as its own.
So to be very clear: neither Mario nor Context2x created or owns this video, and neither should or will earn Revenue Share for it. Being the first account to repost someone else’s content on X does not make that account the original creator.
Accounts are still free to post content they didn’t create, and can monetize that in other ways if they want to. But Revenue Sharing is specifically designed to reward original, value-add content, content that gives people a reason to follow an account and keep coming back because they expect something unique from that creator, not recycled IP.
For a hundred years, cinema has expanded where cameras can go.
Drones. Probes. Motion control rigs. Deep water housings. Lenses small enough to slip through a keyhole.
The position of the camera is no longer the constraint.
What hasn’t changed is the optics.
Every cinema lens still belongs to a family of focal lengths designed around the human eye.
24mm. 35mm. 50mm. 85mm...
These numbers are not technical accidents. They are calibrated to match how the human visual system reads field of view, depth, compression, peripheral falloff.
A drone shot from a thousand feet up still uses optics tuned to human seeing.
The position changed.
The eye behind the position did not.
This is the constraint nobody talks about because it became invisible the moment cinema agreed on it.
AI is the first medium in the history of the moving image that does not begin from a lens.
It begins from a description.
And a description has no eye behind it.
A description is not bound to 24mm or 85mm.
It does not have to obey the depth of field math that comes from a physical aperture.
A description can compose a shot at a scale no lens can resolve, with a falloff no glass can produce, holding a moment no shutter can catch.
For the first time, the visual medium is unhooked from the optics of the human eye.
Most creators using AI right now have not noticed this.
They prompt AI to generate shots that read as 35mm or 85mm because those are the looks that feel cinematic.
They use a tool with no lens to reproduce a vocabulary built around lenses.
They have a new freedom, and they spend it imitating optics that were always a compromise with biology.
The opportunity is not to make better cinematic shots with AI.
That race will be won by whoever has the most compute.
The opportunity is to ask the question cinema could not ask for a hundred years:
What does the world look like through optics that have never existed?
From inside a structure no lens has entered.
At a scale no glass can focus.
With a depth of field that obeys no physical aperture.
From a position that has no eye behind it at all.
Every time AI generates an unfamiliar look and a creator says
“fix it, make it look like a normal lens,”
a possibility closes.
Every time AI generates an unfamiliar look and a creator says
“what is this showing me,”
a possibility opens.
Cinema has been telling stories through one optical system.
The system was never neutral.
It was always a perspective, calibrated to a single species of eye so common we mistook it for the world.
We are not obligated to keep telling stories through that lens.
The optics that limited us are no longer the only way to see.
The Impressionists didn't fight photography. They studied it.
When the camera arrived in 1839, most painters panicked. Photography reproduced reality more accurately than they could. The academy doubled down on realism, trying to out-compete the machine at its own game. That entire generation of painters is now forgotten.
A small minority did the opposite. They asked a different question. Not "how do we beat photography," but "what is photography showing us about seeing that painting has never noticed?"
Degas owned a camera. He used photographs as references. He developed a candid compositional style, painting figures as if caught unaware. Bodies cut off mid-leg, faces pushed to the edge of the frame, dancers seen from behind in unposed angles. Pre-photography painting placed subjects whole and centered. Photography revealed that real seeing is partial, off-balance, accidental. Degas painted that, and composition was never the same again.
In The Dance Class (1873), a spiral staircase occupies a quarter of the canvas. We see only the legs of dancers descending it. On the right edge, a dancer is cut off by the frame. The dancers, supposedly the main subject, are scattered across the room. The painting feels as if they are being photographed without knowing it, not posing. Composition that pre-photography painting would have refused on every count.
Then there is Dancer Posing for a Photographer (1875). The title says photographer but the camera is not in the frame. What Degas paints instead is the dancer in front of a mirror, checking her pose before the lens captures it. She is rehearsing herself for the photograph. Degas saw what photography had done. Not just to images, but to subjects. The dancer no longer simply dances. She prepares herself to be seen. The painter watching the new medium reshape the people inside it. He didn't look away.
The Impressionists studied light because photography forced them to. The camera made it impossible to keep painting "natural light" as a stylized convention. Monet painted the same haystack at different hours of the day. Light became the subject. That move came directly from contending with what the photograph was teaching.
So here's the question for anyone making AI video right now.
AI is the photography of our moment. The panic response is loud and visible. Some creators are trying to out-compete AI at its own game (faster generation, more output, better prompts). Others are converting completely, abandoning craft to chase prompts.
The third response is rarer. Study what AI is showing us about visual storytelling that cinema has never noticed.
What does AI reveal about composition that no physical camera could? What does it teach us about time, about register, about the boundaries between mediums? What is AI doing now, badly, that points toward something cinema hasn't built vocabulary for yet?
The Impressionists learned from the machine that scared their generation. They didn't copy it. They didn't fear it. They watched what it was doing to seeing, and let that change how they painted.
AI is doing something to seeing right now. Most creators are too busy panicking or copying to notice.
What is AI teaching you that cinema hasn't taught yet?
A question for anyone making AI video right now:
Do you study references, or do you study prompts?
Most creators I see on this platform spend hours collecting prompt techniques. Almost none spend the same hours building visual vocabulary. Studying composition. How light falls in a Caravaggio painting. How Wong Kar-wai used motion blur to make memory feel physical. Anything that wasn't made in the last six months.
Here's what's happening. The model is a translator. It takes your reference and renders it. If your reference is "cinematic, moody, dramatic lighting," the model gives you the average of every clip ever tagged that way. You'll get something competent. Generic. Same as everyone else typing the same words.
But if your reference is the exact chiaroscuro from "The Calling of St. Matthew," the way that single beam cuts diagonally across the room, the way faces emerge from darkness, now the model has something specific to translate. Your output stops looking like everyone else's. Not because the model got better. Because you got more specific.
The prompt isn't where craft lives. The reference behind the prompt is.
This isn't about becoming an art historian. It's about giving yourself something to draw from when you sit down to work. Every senior creative you respect has a private library in their head: paintings, films, photographs, architecture, fashion, design. They've spent years collecting these images. When they brief a project, the references are already there. They don't have to invent from nothing.
Most AI creators are inventing from nothing. Or worse, inventing from the same handful of viral clips everyone else is referencing this week.
Without that library, you're working with the same vocabulary as every other creator who watched the same tutorial. With it, you have a vocabulary almost no one else is using.
The library is open. Almost no one is in it.
Which one are you building next week?
Sameness is where creativity goes to die.
Not bad work. Not amateur work. Sameness. The clip that looks like a hundred others. The aesthetic that's already been done by everyone you follow.
Last week one prompt was making the rounds on this platform: "4x4 grid of candid nostalgic iPhone photos of a young couple on vacation. Camera shake, amateur framing, emotional/vintage aesthetic." I watched dozens of accounts run it and post the result. The outputs were nearly identical. Same warm color register. Same compositional rhythm. Same emotional pitch. Each post framed as discovery.
This isn't an aesthetic. It's a recipe. And when a recipe gets copied at scale, what comes out the other end isn't creative work.
Here's what scares me about this pattern, and it has nothing to do with the clips themselves.
Every time a creator copies a prompt instead of writing their own, they skip the part of the work that actually builds them. The part where you sit with a blank page and have to ask: what do I want to see? What feeling am I after? What reference is alive in my head right now? That muscle is the entire job. And like any muscle, it atrophies the moment you stop using it.
The creators copying prompts today are training themselves to never need to think again. They'll get faster. They'll get more output. They'll also lose, slowly and without noticing, the one capacity that separates a creative mind from an operator. The capacity to imagine something nobody told them to imagine.
A year from now, when the prompts they've been copying stop working, they won't have anything to fall back on. The thinking muscle is gone. They've outsourced it.
The habit of stealing is more dangerous than the theft itself. The theft makes one bad clip. The habit makes a creator who can no longer create anything original, ever.
Everyone on this platform calls themselves creative. But if your work this week came from a prompt someone else wrote, that word doesn't apply yet.
The question isn't whether your last clip looked like someone else's.
The question is: when was the last time you sat with a blank page and figured out what you wanted to see, before anyone else told you what was working this week?
Until you can answer that, you're not creating. You're practicing secondhand seeing.