prasun /prʌˈsun/ @p_r - Twitter Profile

Pinned Tweet

prasun /prʌˈsun/ @p_r

almost 9 years ago

@StuFlemingNZ

prasun /prʌˈsun/ @p_r

2 days ago

@ShriramKMurthi https://t.co/B9Qf8207HG

Ethan Mollick

@emollick

almost 2 years ago

@karpathy We wrote a paper on this last year based on data from a randomized controlled trial calling this the "jagged frontier" of AI. https://t.co/vTbYZryoJI

prasun /prʌˈsun/ @p_r

2 days ago

@ShriramKMurthi https://t.co/iz4hPqtGlF

Andrej Karpathy

@karpathy

almost 2 years ago

Jagged Intelligence

The word I came up with to describe the (strange, unintuitive) fact that state of the art LLMs can both perform extremely impressive tasks (e.g. solve complex math problems) while simultaneously struggle with some very dumb problems.

E.g. example from two days ago - which number is bigger, 9.11 or 9.9? Wrong.
https://t.co/dUrR6wm8GC

or failing to play tic-tac-toe: making non-sensical decisions:
https://t.co/XarwfUBtod

or another common example, failing to count, e.g. the number of times the letter "r" occurs in the word "barrier", ChatGPT-4o claims it's 2:
https://t.co/xpffK2r0pv

The same is true in other modalities. State of the art LLMs can reasonably identify thousands of species of dogs or flowers, but e.g. can't tell if two circles overlap:
https://t.co/HCXxBxosAu

Jagged Intelligence. Some things work extremely well (by human standards) while some things fail catastrophically (again by human standards), and it's not always obvious which is which, though you can develop a bit of intuition over time. Different from humans, where a lot of knowledge and problem solving capabilities are all highly correlated and improve linearly all together, from birth to adulthood.

Personally I think these are not fundamental issues. They demand more work across the stack, including not just scaling. The big one I think is the present lack of "cognitive self-knowledge", which requires more sophisticated approaches in model post-training instead of the naive "imitate human labelers and make it big" solutions that have mostly gotten us this far. For an example of what I'm talking about, see Llama 3.1 paper section on mitigating hallucinations:
https://t.co/pjuxoIOJCY

For now, this is something to be aware of, especially in production settings. Use LLMs for the tasks they are good at but be on a lookout for jagged edges, and keep a human in the loop.
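
The failure cases listed in the tweet (comparing 9.11 with 9.9, counting the letter "r" in "barrier", testing whether two circles overlap) are all trivial for ordinary code, which suggests one way to watch for jagged edges: cross-check a model's answer against a deterministic computation where one exists. A minimal sketch follows; the helper names are hypothetical and not from the tweet, and "overlap" is assumed to mean that the two filled disks share at least one point.

```python
import math


def larger(a: float, b: float) -> float:
    """Return the numerically larger of two values."""
    return max(a, b)


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of one letter in a word."""
    return word.lower().count(letter.lower())


def circles_overlap(c1, r1, c2, r2) -> bool:
    """True when two filled disks share at least one point (assumed definition).

    c1 and c2 are (x, y) centers; r1 and r2 are radii.
    """
    return math.dist(c1, c2) <= r1 + r2


if __name__ == "__main__":
    print(larger(9.11, 9.9))                        # 9.9
    print(count_letter("barrier", "r"))             # 3
    print(circles_overlap((0, 0), 1, (1.5, 0), 1))  # True
    print(circles_overlap((0, 0), 1, (3, 0), 1))    # False
```

Checks like these only cover questions with a mechanical answer; for everything else the tweet's advice to keep a human in the loop still applies.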

prasun /prʌˈsun/ @p_r

5 days ago

@stevenstrogatz @m_j_wiener
> see his tweets below
I wasn't able to find the tweets without dinner digging. Should be easier to find by following backwards from this tweet https://t.co/DRoCucENdH

Steven Strogatz

@stevenstrogatz

over 5 years ago

@MarcosCarreira @m_j_wiener You guys realize what a precious thing it is to be able to do math together. I love it that we can understand each other, admit our mistakes if and when we make them (nobody’s perfect!) and figure out the truth together.

prasun /prʌˈsun/ @p_r

6 days ago

@alex1craig @doodlestein @ryancarson You ask AI to do it

prasun /prʌˈsun/ @p_r

8 days ago

https://t.co/IMLxl23x2i

prasun /prʌˈsun/ @p_r

almost 6 years ago

$AMD 80.86 #AMD

prasun /prʌˈsun/ @p_r

11 days ago

$AMD 486

prasun /prʌˈsun/ @p_r

8 days ago

@lauriewired One of the reasons that we have silicon transistors is that silicon could run hotter than germanium

prasun /prʌˈsun/ @p_r

8 days ago

@BlrLitFest Jupiter Jones, Pete Crenshaw, and a Robert Andrews

prasun /prʌˈsun/ @p_r

9 days ago

@tennisabstract What do these numbers look like now https://t.co/gePnrmKaZq

prasun /prʌˈsun/ @p_r

12 days ago

@tennisabstract Rafa's going to win this, isn't he

p_r retweeted

prasun /prʌˈsun/ @p_r

10 months ago

@patrickc When John Templeton said that, he was talking about learning from your mistakes instead of thinking "this time it's different" and making the same mistake again.

https://t.co/MvcVU6adJo

p_r retweeted

prasun /prʌˈsun/ @p_r

5 months ago

@kaushikcbasu Roger Penrose discovered Penrose tiles in the 1970s. These tiles use simple shapes to achieve a non-repeating pattern ("aperiodic tiling"). People observed later that this concept of aperiodicity had appeared centuries earlier with Girih tiles used in the Darb-i Imam shrine.

p_r retweeted

prasun /prʌˈsun/ @p_r

3 months ago

@ChShersh Michael J. Flynn, creator of the foundational 1966 Flynn's taxonomy for classifying computer architectures (SISD, SIMD, MISD, MIMD) based on instruction/data streams, passed away on December 24, 2025, at age 91.

p_r retweeted

prasun /prʌˈsun/ @p_r

about 2 months ago

@Thom_Wolf https://t.co/ffItSnuFeX

p_r retweeted

prasun /prʌˈsun/ @p_r

about 1 month ago

@kaushikcbasu There's an app for that https://t.co/AG0dt5X3zD

p_r retweeted

prasun /prʌˈsun/ @p_r

about 1 month ago

@unixterminal ~94 ms = 94 million ns
94 million ns x ~3 cycles/ns x ~3 insts/cycle = ~1 billion instructions
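
The back-of-envelope estimate above can be reproduced in a few lines. This is only a sketch of the same arithmetic: the function name is hypothetical, and the defaults of 3 cycles/ns (a roughly 3 GHz clock) and 3 instructions per cycle are the rough figures from the tweet, not measurements.

```python
def estimate_instructions(elapsed_ms: float,
                          cycles_per_ns: float = 3.0,
                          insts_per_cycle: float = 3.0) -> float:
    """Rough instruction count for a given wall-clock time.

    Defaults are the tweet's assumed figures, not measured values.
    """
    elapsed_ns = elapsed_ms * 1_000_000  # 1 ms = 1,000,000 ns
    return elapsed_ns * cycles_per_ns * insts_per_cycle


if __name__ == "__main__":
    total = estimate_instructions(94)
    print(f"{total:.3g}")  # 8.46e+08, i.e. on the order of 1 billion
```

The exact product is 846 million, which the tweet rounds to about 1 billion.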

p_r retweeted

prasun /prʌˈsun/ @p_r

28 days ago

@jimkxa https://t.co/BMMjZahbCD
