Tarun Reddi @_TeenDifferent - Twitter Profile

27 minutes ago

Hey @CloudflareDev @Cloudflare @CloudflareHelp we’re evaluating AI Gateway for LLM request caching. Can AI Gateway cache/log storage + request processing be region-pinned (US-only)? Docs show AI Gateway isn’t compatible with Regional Services. Is there any Enterprise/private config for this?

0

1

0

13

_TeenDifferent retweeted

Marques Brownlee

@MKBHD

26 days ago

Don't make me tap the sign

186

16K

620

289

663K

_TeenDifferent retweeted

Thinking Machines

@thinkymachines

29 days ago

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. https://t.co/AFJZ5kH7Ku

465

16K

2K

12K

8M

_TeenDifferent retweeted

Tilde

@tilderesearch

about 1 month ago

Introducing Aurora, a new optimizer for training frontier-scale models. We train Aurora-1.1B, which achieves 100x data efficiency on open-source internet data. Despite having 25% fewer parameters, 2 orders of magnitude fewer training tokens, and using fully open-source internet-only data, Aurora matches Qwen3-1.7B on several benchmarks. Aurora was developed after identifying a major failure mode that can occur under Muon, an increasingly popular optimizer that has shown strong gains over Adam(W). We find that Muon can cause a huge percentage of neurons to effectively die early in training, reducing effective network capacity so that many parameters no longer meaningfully contribute to network outputs. By redistributing update energy more uniformly across neurons while preserving Muon’s stability properties, Aurora prevents neuron death and recovers substantial model capacity. What makes this work especially exciting is that it points toward a broader direction for ML research: better optimizers may not come purely from elegant mathematical abstractions, but from understanding and addressing the concrete dynamics and pathologies that emerge inside real training systems.

42

2K

176

1K

525K

Who to follow

ake_六四天安門_

@akedon

肉は正義！ https://t.co/fV27LL6ydN ※背景画像はすき家の牛丼キング種々雑多な話題をツイート・フォローしてます。

CAGTI

@cagti_science

Advancing sustainable innovation through green chemistry and technology

Shaurya Yadav

@Shauryayadav__

Paranoid

_TeenDifferent retweeted

Zyphra

@ZyphraAI

about 1 month ago

Today we're releasing ZAYA1-8B, a reasoning MoE trained on @AMD and optimized for intelligence density. With <1B active params, it outperforms open-weight models many times its size on math and reasoning, closing in on DeepSeek-V3.2 and GPT-5-High with test-time compute. 🧵

ZyphraAI's tweet photo. Today we're releasing ZAYA1-8B, a reasoning MoE trained on @AMD and optimized for intelligence density.

With <1B active params, it outperforms open-weight models many times its size on math and reasoning, closing in on DeepSeek-V3.2 and GPT-5-High with test-time compute. 🧵 https://t.co/URTj1br9tw

101

2K

293

2K

1M

_TeenDifferent retweeted

Aaron

@Norapom04

about 2 months ago

pov RL kernel engineers whos tasked with implementing backward pass for deepseek v4's attention mechanism

2

145

6

18

34K

_TeenDifferent retweeted

kache

@yacineMTB

about 2 months ago

fact check: true

3

87

4

10

10K

_TeenDifferent retweeted

kache

@yacineMTB

about 2 months ago

POV you are coding during the year of 2026.5

73

4K

179

175

102K

_TeenDifferent retweeted

Max Weinbach

@mweinbach

about 2 months ago

Deepseek v4 benchmarks

1

41

4

1

4K

Tarun Reddi

@_TeenDifferent

about 2 months ago

V4 ...it's happening!

0

14

_TeenDifferent retweeted

Lisan al Gaib

@scaling01

about 2 months ago

DEEPSEEK-V4 FLASH AND PRO ITS HAPPENING https://t.co/5U61r1gjUT

10

463

40

56

121K

Tarun Reddi

@_TeenDifferent

about 2 months ago

might just be my take… but OAI did a solid job here. the earlier ones worked fine for me, though they sometimes overexplained or got a bit defensive this one feels more natural… no fluff, straight to the point… i like it nice work to whoever worked on this @OpenAI 👌

0

19

_TeenDifferent retweeted

Lisan al Gaib

@scaling01

about 2 months ago

After some deliberation I think GPT-5.5 is close to Mythos despite being only ~1/5 to ~1/2 the size* VendingBench-2 results are good ARC-AGI results are good FrontierMath 4 results are good CritPt results are very good (sadly no Mythos results to compare) CyberGym results are insane TerminalBench 2.0 results are insane UK AISI cyber range results are insane (these 3 results are all on par with Claude Mythos) but for some reason it absolutely shits the bed on SWE-Bench Pro, which threw me off but should just be discarded as noise or spiky intelligence *I think GPT-5.4 is ~1-2T GPT-5.5 is ~2-5T Mythos is ~10T Mythos pricing does look kind of ridiculous at $125 Mythos might turn out to be Anthropic's GPT-4.5 moment

38

477

14

79

81K

Tarun Reddi

@_TeenDifferent

about 2 months ago

@sama

0

810

_TeenDifferent retweeted

ARC Prize

@arcprize

about 2 months ago

GPT-5.5 on ARC-AGI (Verified) ARC-AGI-2: - Max: 85.0%, $1.87 - High: 83.3%, $1.45 - Med: 70.4%, $0.86 - Low: 33%, $0.35 GPT-5.5 is now state of the art on ARC-AGI-2

58

2K

222

292

291K

Tarun Reddi

@_TeenDifferent

about 2 months ago

@sama @scaling01 idk about him… i’m already in… throw me some credits let’s build something for education with OAI… feels like it’s time

0

872

_TeenDifferent retweeted

kache

@yacineMTB

about 2 months ago

It's over, anthropicbros.... We didn't IPO in time

70

3K

39

362

383K

_TeenDifferent retweeted

OpenAI

@OpenAI

about 2 months ago

Introducing GPT-5.5 A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.

2K

52K

7K

9K

13M

_TeenDifferent retweeted

Radu Soricut @RSoricut

about 2 months ago

Meet Vision Banana 🍌 from @GoogleDeepMind! We provide strong evidence that image generators are generalist vision learners. Traditional computer vision tasks (segmentation, depth estimation, normal prediction) can now be performed at/near SOTA with a single generalist model derived from an image generation model. 🖼️ Explore the results: https://t.co/wgBmLooZTO 📄 See details at: https://t.co/rhkDBRmfLR

RSoricut's tweet photo. Meet Vision Banana 🍌 from @GoogleDeepMind! We provide strong evidence that image generators are generalist vision learners. Traditional computer vision tasks (segmentation, depth estimation, normal prediction) can now be performed at/near SOTA with a single generalist model derived from an image generation model.

🖼️ Explore the results: https://t.co/wgBmLooZTO

📄 See details at: https://t.co/rhkDBRmfLR

28

630

93

276

78K

Tarun Reddi

@_TeenDifferent

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users