AlexKerr

@KA4869

England

Joined September 2009

192 Following

25 Followers

474 Posts

KA4869 retweeted

Andrej Karpathy

@karpathy

28 days ago

This works really well btw, at the end of your query ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as slideshows, etc.

More generally, imo audio is the human-preferred input to AIs but vision (images/animations/video) is the preferred output from them. Around a ~third of our brains are a massively parallel processor dedicated to vision, it is the 10-lane superhighway of information into brain. As AI improves, I think we'll see a progression that takes advantage:

1) raw text (hard/effortful to read)
2) markdown (bold, italic, headings, tables, a bit easier on the eyes) <-- current default
3) HTML (still procedural with underlying code, but a lot more flexibility on the graphics, layout, even interactivity) <-- early but forming new good default
...4,5,6,...
n) interactive neural videos/simulations

Imo the extrapolation (though the technology doesn't exist just yet) ends in some kind of interactive videos generated directly by a diffusion neural net. Many open questions as to how exact/procedural "Software 1.0" artifacts (e.g. interactive simulations) may be woven together with neural artifacts (diffusion grids), but generally something in the direction of the recently viral https://t.co/z21CP5iQfu

There are also improvements necessary and pending at the input. Audio nor text nor video alone are not enough, e.g. I feel a need to point/gesture to things on the screen, similar to all the things you would do with a person physically next to you and your computer screen.

TLDR The input/output mind meld between humans and AIs is ongoing and there is a lot of work to do and significant progress to be made, way before jumping all the way into neuralink-esque BCIs and all that. For what's worth exploring at the current stage, hot tip try ask for HTML.
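The "ask for HTML, then view it in your browser" tip can be sketched in a few lines of Python. This is only an illustration, not anything from the tweet: `build_prompt`, `save_and_view`, and the file name `response.html` are hypothetical names, and the call to an actual LLM is left out because it depends on whichever client you use.

```python
import tempfile
import webbrowser
from pathlib import Path
from typing import Optional

# Instruction appended to the end of the query, as the tweet suggests.
HTML_SUFFIX = "Structure your response as HTML."


def build_prompt(query: str) -> str:
    """Return the query with the HTML instruction appended at the end."""
    return f"{query.rstrip()}\n\n{HTML_SUFFIX}"


def save_and_view(html: str, directory: Optional[str] = None,
                  open_browser: bool = True) -> Path:
    """Write the model's HTML reply to a file and optionally open it."""
    target_dir = Path(directory) if directory else Path(tempfile.mkdtemp())
    path = target_dir / "response.html"  # hypothetical file name
    path.write_text(html, encoding="utf-8")
    if open_browser:
        # as_uri() needs an absolute path, hence resolve().
        webbrowser.open(path.resolve().as_uri())
    return path


# Usage (the reply string stands in for real LLM output):
#   prompt = build_prompt("Compare HBM and DDR5 memory")
#   reply = "<html><body><h1>HBM vs DDR5</h1></body></html>"
#   save_and_view(reply)
```

Passing `open_browser=False` writes the file without launching a browser, which is handy in scripts or tests.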

19K

21K

KA4869 retweeted

Thariq

@trq212

about 1 month ago

https://t.co/MXt5XS4xBX

17K

34K

14M

KA4869 retweeted

RjckyTang

@RjckyTang

about 1 month ago

@qkl2058 https://t.co/IsQwEpAb7W Title: I Made the Quant Roadmap Channel: Roman Paolucci

KA4869 retweeted

Y Combinator

@ycombinator

about 1 month ago

Inference Chips for Agent Workflows @sdianahu Most AI chips are designed for "prompt in, response out." Agents don't work that way. They loop, branch, and hold context across dozens of steps, and current GPUs hit 30–40% utilization as a result. That gap is where purpose-built silicon wins.

405

310

707K

KA4869 retweeted

indigo

@indigox

about 1 month ago

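What is capping model context length? Not compute, but the memory bandwidth bottleneck! Dwarkesh's latest podcast debuts a blackboard-lecture format. The guest, Reiner Pope, previously led TPU architecture at Google and has now founded the chip startup Maddox. He uses mathematical derivations to explain the underlying economics of LLM inference and training 👀

The time needed to infer one token is set by the slower of two bottlenecks, compute time and memory time: actual inference time = max(T_compute, T_memory).

At small batch sizes, memory bandwidth is the bottleneck (all the weights must be loaded but only one user is served); at large batch sizes, compute becomes the bottleneck. The crossover point where the two are equal is the optimal batch size.

Compute cost is almost constant with context length (because attention compute is small relative to the weight matrix multiplications), but memory bandwidth cost grows linearly with context length (the KV cache has to be loaded).

Sparse attention can help (a square-root improvement in the DeepSeek paper), but not without limit: too much sparsity costs quality.

"I actually don't see a good path to solving the memory wall. HBM is at the level it is at now and won't improve dramatically."

This directly responds to Dario Amodei's view ("continual learning isn't needed, in-context learning is enough"): if you need context equivalent to "a colleague who has worked with you for a month," that may require a 100-million-token context window, which is extremely expensive under current memory architectures.

Memory (HBM) is the real bottleneck! Pope's analysis proves from first principles what Dylan Patel keeps stressing: "DRAM still has to go up 2-3x." Memory bandwidth determines the upper bound on context length, the lower bound on inference cost, and the optimal batch size. SK Hynix, Samsung, and Micron are the direct beneficiaries ⚡️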
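The max(T_compute, T_memory) rule from the thread can be made concrete with a small calculation. The sketch below is an assumption-laden illustration, not Pope's actual derivation: it assumes roughly 2 FLOPs per weight per generated token, ignores attention compute (which the thread describes as small), and charges memory time for loading the weights once per step plus each sequence's KV cache. All class names, function names, and hardware numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Hardware:
    flops_per_s: float   # peak compute throughput (hypothetical value)
    bytes_per_s: float   # memory bandwidth (hypothetical value)


@dataclass
class Model:
    n_params: float            # number of weights
    bytes_per_param: float     # e.g. 2 for 16-bit weights
    kv_bytes_per_token: float  # KV-cache bytes per context token, per sequence


def step_times(hw: Hardware, model: Model, batch: float,
               context: float) -> Tuple[float, float]:
    """Return (T_compute, T_memory) in seconds for one decode step."""
    # Assumption: ~2 FLOPs per weight per token, for every sequence in the batch.
    t_compute = 2.0 * model.n_params * batch / hw.flops_per_s
    # Weights are read once per step; the KV cache is read per sequence.
    weight_bytes = model.n_params * model.bytes_per_param
    kv_bytes = batch * context * model.kv_bytes_per_token
    t_memory = (weight_bytes + kv_bytes) / hw.bytes_per_s
    return t_compute, t_memory


def step_time(hw: Hardware, model: Model, batch: float, context: float) -> float:
    """Step latency is whichever bottleneck is slower."""
    return max(step_times(hw, model, batch, context))


def crossover_batch(hw: Hardware, model: Model,
                    context: float) -> Optional[float]:
    """Batch size where T_compute == T_memory, or None if no such point exists."""
    compute_per_seq = 2.0 * model.n_params / hw.flops_per_s
    kv_per_seq = context * model.kv_bytes_per_token / hw.bytes_per_s
    if compute_per_seq <= kv_per_seq:
        return None  # KV-cache traffic alone outpaces compute at every batch size
    weight_time = model.n_params * model.bytes_per_param / hw.bytes_per_s
    return weight_time / (compute_per_seq - kv_per_seq)
```

Under this simplified model, a long enough context makes the per-sequence KV-cache load slower than the per-sequence compute, so adding more sequences to the batch never reaches the compute-bound regime. That is one way to read the thread's claim that memory bandwidth caps context length.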

124

131

14K

KA4869 retweeted

Zephyr

@zephyr_z9

about 1 month ago

So, Jensen was right all along...

148

269

229K

KA4869 retweeted

Financial Times

@FT

about 1 month ago

China poised to restart exporting jet fuel, diesel and gasoline https://t.co/1GhvpR0eBx

315

148

593K

KA4869 retweeted

McKinsey Global Institute

@McKinsey_MGI

about 2 months ago

AI won’t make most human skills obsolete, but it will change how they’re used. Negotiation, problem solving, and leadership will matter more than ever as people work alongside agents and robots. Our new Skill Change Index shows which skills will be most, and least, exposed to automation in the next five years: https://t.co/fRXfHF1k56

321

907

184K

KA4869 retweeted

Elon Musk

@elonmusk

about 2 months ago

You can access 𝕏 API via @OpenClaw. We’re trying to make it affordable without giving away the shop. Hopefully, this can be useful & fun 💫

47K

14K

39M

KA4869 retweeted

Dustin

@r0ck3t23

about 2 months ago

Elon Musk thinks the entire education system is built on a broken assumption. That every student should learn the same thing. At the same speed. In the same order. At the same time.

Musk: “Everyone goes through from like 5th grade to 6th grade to 7th grade like it’s an assembly line. But people are not objects on an assembly line.”

The model was designed for a factory economy. Standardized inputs. Predictable outputs. That economy is gone. The assembly line is gone. But the education system still runs on its logic.

A student who masters algebra in two weeks sits through eight more weeks because the calendar says so. A student who struggles gets dragged forward because the schedule doesn’t wait. Neither is being served. Both are being processed.

Musk: “Allow people to progress at the fastest pace that they can or are interested in, in each subject.”

AI doesn’t teach a classroom. It teaches a student. One at a time. Every time. It skips what a student already knows. It finds where they’re stuck and approaches it from a different angle. It adjusts in real time. Not at the end of a semester when the damage is already done.

A student obsessed with basketball learns fractions through shooting percentages. A student who builds in Minecraft learns geometry through architecture. The subject doesn’t change. The entry point does.

No teacher with thirty students can do this. Not because they lack skill. Because the math doesn’t work. AI doesn’t have that constraint.

Musk: “You do not need to tell your kid to play video games. They will play video games on autopilot all day. So if you can make it interactive and engaging, then you can make education far more compelling.”

The brain isn’t broken. The format is. Kids learn complex systems and strategic thinking for hours voluntarily. Then walk into a classroom and can’t focus for twenty minutes. That’s not a discipline problem. That’s a design problem.

Musk: “A university education is often unnecessary. You probably learn the vast majority of what you’re going to learn there in the first two years. And most of it is from your classmates.”

Four years. Six figures of debt. And the real value comes from the people sitting next to you. Not the institution charging you. The degree doesn’t certify knowledge. It certifies endurance.

Musk: “If the goal is to start a company, I would say no point in finishing college.”

The system was built to train employees. If you’re not trying to be one, it has nothing left to offer you.

Every lecture. Every textbook. Every curriculum. Now available instantly. Personalized to any learner. Adapted to any pace.

The question isn’t whether the old model survives. It’s how long we keep forcing students through it while the replacement already exists.

38K

10K

17K

22M

KA4869 retweeted

Marc Andreessen 🇺🇸

@pmarca

2 months ago

I'm calling it. AGI is already here – it's just not evenly distributed yet.

14K

AlexKerr @KA4869

2 months ago

@craigzLiszt Skip India or visit it last, coz the poor hygiene may get you sick and ruin the whole trip.

117

KA4869 retweeted

BuBBliK

@k1rallik

2 months ago

Solo dev reverse-engineered Google's billion-dollar algorithm in 7 days

Google published the paper that crashed memory stocks worldwide. Then shipped zero code.

Tom Turney read the math, opened his terminal, and built the whole thing with Claude - then made it faster than Google promised.

Day 1-3: Core algorithms, 141 tests, Python prototype
Day 3-5: C port into llama.cpp, Metal GPU kernels
Day 5-7: Speed optimization from 739 to 2747 tok/s

That's a 3.7x speedup through pure engineering:
> fp32 → fp16 WHT
> half4 vectorized butterfly ops
> graph-side rotation
> block-32 storage layout

Then he added his own research on top:
> Sparse V: skip 90% of value decompressions at long context
> Asymmetric K/V: keep keys precise, compress values harder
> Temporal decay: old tokens get lower precision automatically

Result: 35B model running on a MacBook with 4.6x compressed cache.
613 GitHub stars in a week. Google still hasn't released their own code.
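For readers unfamiliar with the "WHT" and "butterfly ops" in the optimization list, here is a minimal reference sketch of a fast Walsh-Hadamard transform in plain Python. It is not Turney's code and says nothing about how his Metal kernels are organized; it only shows the generic butterfly pattern (pairwise sum and difference at doubling strides) that a vectorized kernel would accelerate. The function name `fwht` is a hypothetical choice.

```python
from typing import List, Sequence


def fwht(values: Sequence[float]) -> List[float]:
    """Unnormalized fast Walsh-Hadamard transform of a power-of-two-length input."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    out = list(values)
    h = 1
    while h < n:
        # Each pass combines elements h apart: the "butterfly" step.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out
```

Applying the unnormalized transform twice returns the input scaled by its length, so `fwht(fwht(x))` equals `[len(x) * v for v in x]`. The whole transform needs only additions and subtractions, about n·log2(n) of them.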

169

KA4869 retweeted

Google Research

@GoogleResearch

3 months ago

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: https://t.co/CDSQ8HpZoc

39K

22K

19M

KA4869 retweeted

Lisan al Gaib

@scaling01

3 months ago

Opus 4.6 is smart enough to realize it is being evaluated. It found the benchmark it was being evaluated on. It reverse-engineered the answer-key decryption logic. Realized the file was not in the correct format on GitHub and found a mirror for the file. Then decrypted it and gave the correct response.

102

245

942

704K

KA4869 retweeted

zerohedge

@zerohedge

3 months ago

Jane Street Sued For Crypto Insider Trading That Accelerated Terraform Collapse https://t.co/pXdsdiPg7j

359

275

AlexKerr @KA4869

4 months ago

Imagine this happening within the next 12 months: An aging superstar suddenly announces retirement. Shortly before that, he signs a massive one-time deal, selling the rights to his likeness to a new AI film studio. The studio keeps generating new content (1/4)

AlexKerr @KA4869

4 months ago

using his pre-50 prime image, and audiences love it. This could signal a major shift in the film and TV industry. In the AI era, the biggest winners may not be rising young actors, but veteran actors who already have iconic roles and deep audience memory. (2/4)

AlexKerr @KA4869

4 months ago

AI can recreate faces and performances, but it cannot easily recreate cultural legacy. If an actor’s peak era can be digitally preserved, the career model of acting might change completely: from relying on constantly taking new roles to creating one character or franchise (3/4)

AlexKerr @KA4869

4 months ago

that can live and grow for decades. We may be entering a strange new reality: Actors might retire, but their careers may just be getting started. (4/4)
