This works really well btw, at the end of your query ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as slideshows, etc.
More generally, imo audio is the human-preferred input to AIs but vision (images/animations/video) is the preferred output from them. Roughly a third of our brains is a massively parallel processor dedicated to vision; it is the 10-lane superhighway of information into the brain. As AI improves, I think we'll see a progression that takes advantage:
1) raw text (hard/effortful to read)
2) markdown (bold, italic, headings, tables, a bit easier on the eyes) <-- current default
3) HTML (still procedural with underlying code, but a lot more flexibility on the graphics, layout, even interactivity) <-- early but forming new good default
...4,5,6,...
n) interactive neural videos/simulations
Imo the extrapolation (though the technology doesn't exist just yet) ends in some kind of interactive videos generated directly by a diffusion neural net. Many open questions as to how exact/procedural "Software 1.0" artifacts (e.g. interactive simulations) may be woven together with neural artifacts (diffusion grids), but generally something in the direction of the recently viral https://t.co/z21CP5iQfu
There are also improvements necessary and pending at the input. Neither audio nor text nor video alone is enough, e.g. I feel a need to point/gesture to things on the screen, similar to all the things you would do with a person physically next to you and your computer screen.
TLDR: The input/output mind meld between humans and AIs is ongoing and there is a lot of work to do and significant progress to be made, way before jumping all the way into neuralink-esque BCIs and all that. For what's worth exploring at the current stage, hot tip: try asking for HTML.
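For anyone who wants to script the HTML tip rather than copy-paste by hand, here is a minimal sketch using only the Python standard library. The helper names (`build_prompt`, `save_html`, `open_in_browser`) and the file name `llm_response.html` are made up for illustration, and the call to the LLM itself is left out because it depends on whichever client you use.

```python
import tempfile
import webbrowser
from pathlib import Path


def build_prompt(question):
    """Append the HTML request to the end of the query (hypothetical helper)."""
    return f"{question}\n\nStructure your response as HTML."


def save_html(html_text, directory=None):
    """Write the model's HTML reply to a file and return its absolute path."""
    # Assumption: a fresh temp directory is fine when no directory is given.
    target_dir = Path(directory) if directory else Path(tempfile.mkdtemp())
    path = (target_dir / "llm_response.html").resolve()
    path.write_text(html_text, encoding="utf-8")
    return path


def open_in_browser(path):
    """Open the saved file in the default browser."""
    return webbrowser.open(Path(path).resolve().as_uri())
```

Typical use would be: send `build_prompt("your question")` to your LLM, pass the reply text to `save_html(...)`, then call `open_in_browser(...)` on the returned path.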
Inference Chips for Agent Workflows
@sdianahu
Most AI chips are designed for "prompt in, response out." Agents don't work that way. They loop, branch, and hold context across dozens of steps, and current GPUs hit 30–40% utilization as a result.
That gap is where purpose-built silicon wins.
AI won’t make most human skills obsolete, but it will change how they’re used.
Negotiation, problem solving, and leadership will matter more than ever as people work alongside agents and robots.
Our new Skill Change Index shows which skills will be most, and least, exposed to automation in the next five years: https://t.co/fRXfHF1k56
Elon Musk thinks the entire education system is built on a broken assumption.
That every student should learn the same thing. At the same speed. In the same order. At the same time.
Musk: “Everyone goes through from like 5th grade to 6th grade to 7th grade like it’s an assembly line. But people are not objects on an assembly line.”
The model was designed for a factory economy. Standardized inputs. Predictable outputs.
That economy is gone. The assembly line is gone.
But the education system still runs on its logic.
A student who masters algebra in two weeks sits through eight more weeks because the calendar says so. A student who struggles gets dragged forward because the schedule doesn’t wait.
Neither is being served. Both are being processed.
Musk: “Allow people to progress at the fastest pace that they can or are interested in, in each subject.”
AI doesn’t teach a classroom. It teaches a student.
One at a time. Every time.
It skips what a student already knows. It finds where they’re stuck and approaches it from a different angle.
It adjusts in real time. Not at the end of a semester when the damage is already done.
A student obsessed with basketball learns fractions through shooting percentages. A student who builds in Minecraft learns geometry through architecture.
The subject doesn’t change. The entry point does.
No teacher with thirty students can do this. Not because they lack skill.
Because the math doesn’t work.
AI doesn’t have that constraint.
Musk: “You do not need to tell your kid to play video games. They will play video games on autopilot all day. So if you can make it interactive and engaging, then you can make education far more compelling.”
The brain isn’t broken. The format is.
Kids learn complex systems and strategic thinking for hours voluntarily. Then walk into a classroom and can’t focus for twenty minutes.
That’s not a discipline problem. That’s a design problem.
Musk: “A university education is often unnecessary. You probably learn the vast majority of what you’re going to learn there in the first two years. And most of it is from your classmates.”
Four years. Six figures of debt.
And the real value comes from the people sitting next to you. Not the institution charging you.
The degree doesn’t certify knowledge. It certifies endurance.
Musk: “If the goal is to start a company, I would say no point in finishing college.”
The system was built to train employees. If you’re not trying to be one, it has nothing left to offer you.
Every lecture. Every textbook. Every curriculum. Now available instantly. Personalized to any learner. Adapted to any pace.
The question isn’t whether the old model survives.
It’s how long we keep forcing students through it while the replacement already exists.
Solo dev reverse-engineered Google's billion-dollar algorithm in 7 days
Google published the paper that crashed memory stocks worldwide. Then shipped zero code.
Tom Turney read the math, opened his terminal, and built the whole thing with Claude - then made it faster than Google promised.
Day 1-3: Core algorithms, 141 tests, Python prototype
Day 3-5: C port into llama.cpp, Metal GPU kernels
Day 5-7: Speed optimization from 739 to 2747 tok/s
That's a 3.7x speedup through pure engineering:
> fp32 → fp16 WHT
> half4 vectorized butterfly ops
> graph-side rotation
> block-32 storage layout
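For readers unfamiliar with the terms in that list: WHT is the Walsh-Hadamard transform, and a "butterfly" is the paired add/subtract step it is built from. The sketch below is a plain-Python illustration of that transform only. It is not Turney's code, it is not vectorized or fp16, and the function name `fwht` is chosen here for illustration.

```python
def fwht(values):
    """Return the unnormalized Walsh-Hadamard transform of a list.

    The length must be a power of two. Each pass pairs elements that are
    h apart and replaces (a, b) with (a + b, a - b): one butterfly.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b  # butterfly step
        h *= 2
    return out
```

Applying the transform twice returns the input scaled by its length, which is a quick sanity check. A GPU kernel would run many of these butterflies in parallel, which is presumably where vectorization pays off.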
Then he added his own research on top:
> Sparse V: skip 90% of value decompressions at long context
> Asymmetric K/V: keep keys precise, compress values harder
> Temporal decay: old tokens get lower precision automatically
Result: 35B model running on a MacBook with 4.6x compressed cache.
613 GitHub stars in a week. Google still hasn't released their own code.
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: https://t.co/CDSQ8HpZoc
Opus 4.6 is smart enough to realize it is being evaluated.
It found the benchmark it was being evaluated on.
It reverse-engineered the answer-key decryption logic.
Realized the file was not in the correct format on GitHub and found a mirror for the file.
Then decrypted it and gave the correct response.
Imagine this happening within the next 12 months: An aging superstar suddenly announces retirement. Shortly before that, he signs a massive one-time deal, selling the rights to his likeness to a new AI film studio. The studio keeps generating new content (1/4)
using his pre-50 prime image, and audiences love it.
This could signal a major shift in the film and TV industry.
In the AI era, the biggest winners may not be rising young actors, but veteran actors who already have iconic roles and deep audience memory. (2/4)
AI can recreate faces and performances, but it cannot easily recreate cultural legacy. If an actor’s peak era can be digitally preserved, the career model of acting might change completely, from relying on constantly taking new roles to creating one character or franchise (3/4)
that can live and grow for decades. We may be entering a strange new reality:
Actors might retire, but their careers may just be getting started. (4/4)