Ilia Karmanov @ikdeepl - Twitter Profile

ikdeepl retweeted

2 months ago

Terence Tao's SAIR foundation is doing some really cool work on enabling AI4Maths to be open and collaborative. I'm heaps excited that we now get to work together on bringing projects like their Mathematics Distillation Challenge to the HF ecosystem. Let's go 🚀!

3

63

14

7

10K

ikdeepl retweeted

Londonist

@Londonist

2 months ago

All the places you can see animals in London, mapped https://t.co/rR8jDPw3VT

0

6

5

1

3K

ikdeepl retweeted

vLLM

@vllm_project

3 months ago

Thanks to @AI21Labs for tracking down a silent uint32 overflow in vLLM's Mamba-1 CUDA kernel and contributing the fix. Root cause: `uint32_t` stride × cache_index overflows silently at scale. Fix merged in #35275. The debugging story is worth a read. 🔗 https://t.co/S4XBnEn1uv

0

132

20

48

13K
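
The root cause named in the vLLM tweet, a 32-bit unsigned product of stride and cache index wrapping around, can be illustrated with a minimal sketch. This is not the vLLM kernel or the fix merged in #35275; the function names and the values of `stride` and `cache_index` are hypothetical, and Python integers are masked to 32 bits to emulate `uint32_t` arithmetic.

```python
UINT32_MASK = 0xFFFFFFFF  # 2**32 - 1, the largest value a uint32_t can hold


def offset_uint32(stride: int, cache_index: int) -> int:
    """Emulate a uint32_t multiply: the product wraps modulo 2**32."""
    return (stride * cache_index) & UINT32_MASK


def offset_wide(stride: int, cache_index: int) -> int:
    """Same product without a 32-bit limit (Python ints do not overflow)."""
    return stride * cache_index


# Hypothetical values, chosen only so that the product exceeds 2**32 - 1.
stride = 1 << 20       # assumed number of elements per cache slot
cache_index = 5000     # assumed slot index

wrapped = offset_uint32(stride, cache_index)
exact = offset_wide(stride, cache_index)

# No error is raised on the wrapped path; it just yields a smaller,
# wrong offset, which is why this kind of bug is silent.
print(wrapped, exact, wrapped == exact)
```

Doing the multiplication in a wider integer type is one common remedy for this class of bug; whether that matches the merged fix is not stated in the tweet.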

ikdeepl retweeted

Oli London

@OliLondonTV

3 months ago

Dogs stolen from their owners in China walk 17km along a highway led by a corgi to get back home. The dogs escaped a dog meat truck and walked along a highway in Changchun, Jilin before returning to their village.

46

1K

130

39

31K

Ilia Karmanov @ikdeepl

3 months ago

@RNR_0 When you guys go on international vacations you will unfortunately be so much poorer than the people in your plane from Switzerland or US, or even working in London.

0

1

0

525

Ilia Karmanov @ikdeepl

3 months ago

@chaotichermes Is this bravery or safety?

0

223

ikdeepl retweeted

Bryan Catanzaro

@ctnzr

3 months ago

Announcing NVIDIA Nemotron 3 Super!

💚120B-12A Hybrid SSM Latent MoE, designed for Blackwell
💚36 on AAIndex v4
💚up to 2.2X faster than GPT-OSS-120B in FP4
💚Open data, open recipe, open weights

Models, Tech report, etc. here:
https://t.co/CAYpP1iK3i

And yes, Ultra is coming! https://t.co/QuguMQaC8S

62

1K

205

452

208K

ikdeepl retweeted

Rabeeh Karimi @KarimiRabeeh

3 months ago

Excited to share that Nemotron 3 Super is now released! 🚀

A 120B hybrid MoE model (12B active, 1M context) built for complex agentic systems and long-context reasoning.

Key innovations:
• LatentMoE + Hybrid MoE
• Multi-Token Prediction (MTP)
• NVFP4 pretraining https://t.co/JlmKMmadpA

4

80

4

14

3K

Ilia Karmanov @ikdeepl

4 months ago

@0xSeco Unrealised shouldn't be taxed at all, that's an idiocy of a liquidity crisis waiting to happen. The rate itself should be consistent with whatever prevents opening BVs. Ideally NL should draw inspiration from Switzerland, lower taxes but better infrastructure

0

1

0

13

ikdeepl retweeted

Lucas Beyer (bl16)

@giffmana

5 months ago

In today's episode of "Would You Please Just Look at the Data?"

Eric finds that in MMLU-Pro chemistry and physics subsets, blindly picking the answer that has a leading space is correct pretty often! https://t.co/nzxnk2x358

14

645

35

142

78K
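
The leading-space shortcut described in the MMLU-Pro tweet could be checked with something like the sketch below. The function names and the toy items are hypothetical and made up for illustration; they are not MMLU-Pro data and not Eric's analysis.

```python
from typing import List, Optional, Tuple


def pick_leading_space(options: List[str]) -> Optional[int]:
    """Return the index of the only option that starts with a space, else None."""
    hits = [i for i, opt in enumerate(options) if opt.startswith(" ")]
    return hits[0] if len(hits) == 1 else None


def leading_space_accuracy(items: List[Tuple[List[str], int]]) -> Optional[float]:
    """Accuracy of the shortcut over the items where it makes a pick."""
    picks = [(pick_leading_space(opts), answer) for opts, answer in items]
    fired = [(p, a) for p, a in picks if p is not None]
    if not fired:
        return None
    return sum(p == a for p, a in fired) / len(fired)


# Made-up items: (answer options, index of the correct option).
toy_items = [
    (["2.0 mol", " 4.0 mol", "8.0 mol"], 1),
    (["9.8 m/s", "4.9 m/s", " 19.6 m/s"], 2),
    ([" 1 atm", "2 atm", "3 atm"], 1),
    (["red", "blue", "green"], 0),  # no leading space, so no pick
]

print(leading_space_accuracy(toy_items))
```

An accuracy well above chance on real data would suggest a formatting artifact is leaking the answer.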

ikdeepl retweeted

Syeda Nahida Akter @__SyedaAkter

9 months ago

Most LLMs learn to think only after pretraining—via SFT or RL. But what if they could learn to think during it? 🤔 Introducing RLP: Reinforcement Learning Pre-training—a verifier-free objective that teaches models to “think before predicting.” 🔥 Result: Massive reasoning boosts & gains that COMPOUND after post-training! 📝 Blog: https://t.co/5v4eLVHxRe 🔗Paper: https://t.co/OWnX0L1Wv3 🧵↓

8

255

40

138

20K

ikdeepl retweeted

Grant Sanderson

@3blue1brown

11 months ago

New video on the details of diffusion models: https://t.co/rRjJehNuF3 Produced by @welchlabs, this is the first in a small series of 3b1b this summer. I enjoyed providing editorial feedback throughout the last several months, and couldn't be happier with the result.

40

3K

351

1K

382K

ikdeepl retweeted

Demis Hassabis

@demishassabis

11 months ago

Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced version was able to solve 5 out of 6 problems. Incredible progress - huge congrats to @lmthang and the team! https://t.co/pp9bXF7rVj

193

6K

732

606

1M

ikdeepl retweeted

National Wildlife Federation

@NWF

about 1 year ago

A Sunset to Remember ☀️🌊

“The paddleboarder portrays the peaceful coexistence of people and wildlife,” as captured at sunset in August 2020.

🥇 People in Nature | 📷 Renee Capozzola

2024 National Wildlife Photo Contest Winners 📲: https://t.co/WRNSrUCfbW https://t.co/CabseDxTfY

0

19

7

0

2K

ikdeepl retweeted

Andrew Tao

@drewtao

about 1 year ago

Vision Language Models can be amazing at document understanding. Please check out our Nano sized model. More to come!

0

8

3

0

567

ikdeepl retweeted

Drew Pavlou 🇦🇺🇺🇸🇺🇦🇹🇼

@DrewPavlou

about 1 year ago

The French overseas territory St. Pierre et Miquelon (population 5,800) now has the highest tariff rates in the world at 99%

Their exports are valued at just $3.5 million dollars a year. My guess as to what happened here is that they likely export a tiny amount (like $100 k worth of lobsters) to America without importing anything in return. So the insane White House algorithm (trade deficit/imports) would have produced this insane 99% tariff figure.

So unbelievably stupid, incompetent and insane

437

38K

5K

3K

3M
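
The arithmetic the tariff tweet guesses at (trade deficit divided by imports) can be sketched as follows. The function name is hypothetical, the zero floor for a trade surplus is an assumption, and the dollar figures are the tweet's own guess rather than official trade data.

```python
def tariff_rate(imports_from: float, exports_to: float) -> float:
    """Trade deficit divided by imports, floored at zero (assumed floor)."""
    if imports_from <= 0:
        return 0.0  # nothing imported, so the ratio is undefined; treat as zero
    deficit = imports_from - exports_to
    return max(0.0, deficit / imports_from)


# The tweet's guess: about $100k of goods sold to the US, nothing bought back.
rate = tariff_rate(imports_from=100_000, exports_to=0)
print(f"{rate:.0%}")

# A small amount flowing the other way pulls the ratio just under 100%.
print(f"{tariff_rate(imports_from=100_000, exports_to=1_000):.0%}")
```

With almost nothing flowing the other way, the ratio sits close to 100% however small the absolute amounts are, which is consistent with the 99% figure the tweet reports.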

ikdeepl retweeted

Jeff Dean

@JeffDean

about 1 year ago

We're using a ReLU to set tariffs?

83

5K

429

376

440K

ikdeepl retweeted

Kevin Meng

@mengk20

about 1 year ago

AI models are *not* solving problems the way we think

using Docent, we find that Claude solves *broken* eval tasks - memorizing answers & hallucinating them!

details in 🧵

we really need to look at our data harder, and it's time to rethink how we do evals...