Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward - Twitter Profile

Pinned Tweet

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

almost 2 years ago

we're so back / it's so over (w/ Little Squeak)

0

11

2

0

4K

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 15 hours ago

Fight or flight?

Nym @nym

1 day ago

The UK is threatening tech bosses with prison if they don't install surveillance software on every phone in Britain. The minister who resigned for not doing this fast enough called it "incremental change."

13

280

87

47

10K

0

27

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 15 hours ago

@SStricklandMMA Making a bet now that one of your drug tests comes up positive for PEDs, and not because you would take any

0

10

chrisbward retweeted

NZ ☄️

@CodeByNZ

about 16 hours ago

GOOGLE JUST COMPRESSED 31GB OF AI MEMORY INTO 4GB. That’s nearly an 8x reduction. Memory has quietly become one of the biggest bottlenecks for running large AI models at scale. Every extra gigabyte limits how many users a system can serve and how much context it can handle. By dramatically compressing memory requirements without meaningfully hurting performance, Google is making it possible to run larger workloads on the same hardware.

0

6

4

1

288

chrisbward retweeted

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 16 hours ago

@marias_martin @KaiXCreator In later tweets he discusses that he reviews the code. So he lied. He misled everyone.

0

1

0

35

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 16 hours ago

@marias_martin @KaiXCreator My bad, he says "forget the code even exists", implying that he never looks at the code, and also mentions he doesn't look at the diffs. Source: https://t.co/7j5TF9mlaU

Andrej Karpathy

@karpathy

over 1 year ago

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

1K

34K

4K

18K

7M

1

0

26

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 16 hours ago

@marias_martin @KaiXCreator Literally says in the original tweet "I don't even look at the code". He lied to everyone.

0

11

chrisbward retweeted

toyxyz

@toyxyz3

about 24 hours ago

https://t.co/doDpfPHInc

2

65

3

96

6K

chrisbward retweeted

Rhys

@RhysSullivan

1 day ago

tbh i never really cared about the agents.md vs claude.md stuff because you can just do ln -s agents.md claude.md and move on with your life

96

2K

46

389

104K
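
The `ln -s agents.md claude.md` trick in the tweet above can also be scripted. The following is a minimal sketch, not anything from the tweet itself: the helper name `link_alias` and its defaults are hypothetical, and it assumes a platform where creating symlinks is permitted.

```python
import os
import tempfile
from pathlib import Path


def link_alias(directory: Path, source: str = "agents.md", alias: str = "claude.md") -> Path:
    """Create `alias` as a relative symlink to `source` inside `directory`.

    Hypothetical helper mirroring `ln -s agents.md claude.md`.
    """
    alias_path = directory / alias
    # Remove a stale link so the function can be re-run safely.
    # A regular file with the same name is left alone and os.symlink will raise.
    if alias_path.is_symlink():
        alias_path.unlink()
    # A relative target keeps the link valid if the directory is moved.
    os.symlink(source, alias_path)
    return alias_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "agents.md").write_text("Shared agent instructions\n")
        link = link_alias(repo)
        # Reading through the alias returns the contents of agents.md.
        print(link.read_text())
```

Both names then resolve to the same file, so only `agents.md` needs editing.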

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 18 hours ago

@NatureUnedited tastes great too https://t.co/pCLMeDBrB2

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 18 hours ago

this is the best one yet @EmberSnacks

0

91

0

75

chrisbward retweeted

sudox

@kmcnam1

about 24 hours ago

14

2K

104

79

26K

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 18 hours ago

all because you didn't create pre-push hooks in your local repo

Branko

@brankopetric00

1 day ago

CI pipeline takes 45 minutes to tell you that you forgot a semicolon.

22

301

22

5

23K

0

21
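
The pre-push hook idea above could look roughly like this. It is a sketch under stated assumptions: the `FAST_CHECKS` list and `run_checks` helper are hypothetical, and the single check shown (byte-compiling Python sources to catch syntax errors) stands in for whatever fast linter a project actually uses. Git runs an executable at `.git/hooks/pre-push` before pushing and aborts the push if it exits non-zero.

```python
#!/usr/bin/env python3
import os
import subprocess
import sys

# Hypothetical fast checks; swap in the project's real linter or compiler.
FAST_CHECKS = [
    [sys.executable, "-m", "compileall", "-q", "."],
]


def run_checks(commands):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"pre-push: check failed: {' '.join(cmd)}", file=sys.stderr)
            return result.returncode
    return 0


# Only act when this file is installed and invoked as the hook itself,
# so importing or running it under another name has no side effects.
if __name__ == "__main__" and os.path.basename(sys.argv[0]) == "pre-push":
    # Git aborts the push when the hook exits with a non-zero status.
    sys.exit(run_checks(FAST_CHECKS))
```

To try it, save the file as `.git/hooks/pre-push` and mark it executable; a syntax error then fails locally in seconds rather than after a long CI run.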

chrisbward retweeted

allisx86 @allisx86

1 day ago

allisx86's tweet photo. https://t.co/dwla5Paqr5

25

20K

400

1K

558K

chrisbward retweeted

Ahmad

@TheAhmadOsman

1 day ago

You don’t “run a model”
You run Kernels

The model is just a graph

The Inference Engine is scheduler / optimizer / executor

But the actual work? That happens in the Kernels

- MatMul Kernels
- Attention Kernels
- RMSNorm Kernels
- KV cache Kernels
- Quantized linear Kernels
- Sampling Kernels
- Fused “please don’t write this back to memory 9 times” Kernels

Same model, same GPU, same VRAM
Wildly different performance

Because one stack is using optimized fused Kernels that understand your hardware

And the other stack is playing hot potato with tensors through 47 tiny launches and pretending the GPU is the problem

Bad Kernels make people say:
“this model is slow”

Good Kernels make people say:
“wait how is this running locally?”

This is why Inference Engines and the Kernels implemented within them matter

The model is the recipe
The hardware is the kitchen
The Kernels are the knives, pans, burners, and the chef not cutting onions with a spoon

Most people benchmark models
The real ones benchmark the Kernels underneath

17

463

56

342

15K

chrisbward retweeted

TracketPacer

@TracketPacer

1 day ago

65

24K

2K

822

306K

Chris B. Ward (e/bored) 🇬🇧🇮🇪🤐 @chrisbward

about 19 hours ago

don't focus on building apps; build platforms instead. after that, markets. after that, standards

@levelsio

about 23 hours ago

I think the challenge is that everyone can now build apps But 1) almost nobody has distribution (like an audience), or 2) the money to pay for distribution (ads or UGC), or 3) the creative genius to get distribution for free (classically called guerilla marketing)

146

5K

323

3K

852K

0

12