kevin @proofofkevin - Twitter Profile

kevin @proofofkevin

1 day ago

@HarrisonSatcher hey

0

2

0

22

proofofkevin retweeted

Dan Robinson

@danrobinson

about 2 months ago

Come to our hackathon to see this guy

2

56

4

8

9K

kevin @proofofkevin

2 months ago

AlphaZero thinks in juicy targets, just won’t admit it

benedict

@bqbrady

2 months ago

Benchmarking Frontier LLMs on Chess Over the weekend I built a series of evals to understand how language models reason about endgames, tactics, and full chess games against strong opponents. Turns out they are getting pretty good! https://t.co/zRRrD3NfMO

https://t.co/zRRrD3NfMO https://t.co/EpogWAr7Nj

18

125

11

61

24K

0

2

0

82

proofofkevin retweeted

MeekMill

@MeekMill

2 months ago

I need a GitHub too! Is it like that or nah?

854

13K

2K

1K

6M

3 months ago

Mexico City is just Japan at home

0

2

0

53

kevin @proofofkevin

3 months ago

@ArenRendell You’re doing me erasure

1

2

0

29

kevin @proofofkevin

3 months ago

@ArenRendell Claude down, Thotiana

0

2

0

368

kevin @proofofkevin

3 months ago

Once again we need a word like screenager but for adults and their ay eye workflows. Non derogatory of course

Aren

@ArenRendell

3 months ago

woke up this morning and realized my work and workflow is 95% llm. thank you @proofofkevin. cannot comment further at this time. more to come.

0

2

0

196

0

2

0

216

kevin @proofofkevin

3 months ago

threaded a needle and it makes sense why we say that

0

1

0

31

kevin @proofofkevin

3 months ago

2 phone

baby keem

@babykeem

3 months ago

3 more to go. 2 sparks otw

251

10K

362

627

1M

0

1

0

89

kevin @proofofkevin

3 months ago

@bqbrady Next customer to persuade is myself

1

2

0

56

kevin @proofofkevin

3 months ago

Aren

@ArenRendell

3 months ago

actually now wondering if @proofofkevin built something that entirely replaces ME but it’s way better and he did it in ~2 days. gulp

0

3

0

1

414

0

3

0

247

kevin @proofofkevin

3 months ago

@ArenRendell 1 day

1

0

36

kevin @proofofkevin

3 months ago

https://t.co/l8jbdisB5j

0

5

2

644

kevin @proofofkevin

4 months ago

Models will keep improving but the risk of one bad prompt wrecking your app will (very likely) remain nonzero. Vintage devs just use git. Go-to-bed-grandpa AI PMs will use history rollbacks. And rollbacks w mobile live previews = 🪬

ROFI

@bidah

4 months ago

Open sourcing the first @expo vibe coding web IDE and SDK: React Native Vibe Code Powered by @claudeai agent SDK, history rollbacks, live web and native app previews, full stack setup by @convex, publish to web w/ @Cloudflare, voice prompting, upload assets to app, add images and files to prompt, model selector, skills loader, visual edits, sandboxing by @e2b, download codebase option, Monaco code editor, fork/remix and a CLI to run locally. The project is a @turborepo running @nextjs hosted on @vercel with streaming powered by @aisdk ◆ try cloud version at https://t.co/v8q8TjUake ◆ github repo: https://t.co/BtrpEKzdA1

27

294

23

316

44K

0

3

0

262

kevin @proofofkevin

4 months ago

@DimitriosMitsos @bqbrady @meridian Re: the article above, biggest gap by far has been verifiability. Context is a bottleneck. But even without pre-indexing, in human readable codebases, exploration feels O(1) relative to the verifiability challenge.

0

2

0

27