@icreatelife Me. It's a fun hobby. There are far too many people who are much better than I am, for me to fantasize much about making any money off it.
@mvalsmith @fuckpoasting How is it artificial? It seems more like a supply and demand problem, with data center appetite sucking up the orders. Chip manufacturers are already at capacity, and it takes years and billions to build new fabs.
I agree that better optimization will continue, but RAM supply is what's keeping costs high. Data centers are gobbling up the supply faster than chip manufacturers can ramp up.
It would be great if some magic wand could turn a $1000 computer into an AI workhorse, but I'm not aware of anything like that on the horizon.
@sflorimm Running locally. No subscription, unlimited use, no worry about frontier models disappearing overnight due to politics. That's the new advantage.
Except it's moving fast in the wrong direction.
Six months ago, an RTX 5090 cost ~$1,999-$2,500, while current prices are around $4,000-$4,300+ for new cards due to supply shortages, high AI demand, and GDDR7 memory constraints.
I don't see that situation improving anytime within the next few years.
Too bad RAM is so expensive! It's insane how much the prices have risen even over the last few months. A year ago, I could have afforded an RTX 5090, but not now. Unfortunately, I don't see the situation improving any time soon.
I am running on an RTX 4070 with 8 GB VRAM. It really isn't adequate for serious LTX work, but my new RTX 5060 Ti with 16 GB VRAM will be arriving soon. Hopefully double the VRAM (and 64 GB system RAM) will be sufficient for me to use LTX more.
Testing Grok Imagine Video 1.5 audio speech quality and lip sync. I have three examples:
1. Imagine output unchanged.
2. Imagine audio processed through ComfyUI CosyVoice.
3. Same prompt with LTX-2.3 running locally.
Summary: Grok Imagine 1.5 voice quality is about the same as it was before. Out of 6 renders, this is the only one that came close to quality lip-syncing. Most of the other renders were garbled beyond recognition.
Note: LTX succeeded on the first render.
Processing the Imagine audio through CosyVoice helps improve the overall voice quality and provides consistency between clips.
Prompt: Close-up shot of a man in his late 30s speaking clearly and naturally to camera with perfect lip sync. He says the exact words in an excited voice: "Hi, I am testing the lip sync and voice quality on Grok Imagine version 1.5 right now to see how well it works." Clean simple background, cinematic color grading.
@imagine @xai @grok @ComfyUI @ltx_io
@michaelricks I'll be trying Ingredient as soon as I finish my new 5060 Ti build. It will still be short on VRAM, but given current insane pricing, I can't afford a 5090. Hopefully I'll see some improvement over my existing 4070. Double the VRAM can't hurt.
Possibly in a couple of years AI will be indistinguishable from real, but not yet. It depends on style. AI cartoons are great, because there is no expectation for them to look real in the first place.
However, if the intention is to look real, physics and the rest have to be flawless, because humans are able to pick up on the slightest flaws, and AI video is still rife with subtle and not-so-subtle inconsistencies.
I'm guessing that the best approach for full-length movies is a hybrid approach.
Use AI video for special effects and scenes that would be hard/expensive to do otherwise but still use actors and physical camera work whenever possible for the sake of realism.
At least that is how I see the current AI state of the art.