Maximilian Bode @mxpbode - Twitter Profile

@natolambert

@deepseek_ai @AlibabaGroup @crystalsssup Okay okay, due to reasonable feedback we added:
@cohere for their non commercial models
@ServiceNow with Apriel, I like folks there (and pipeline rl)
Motif
@tngtech as a shout out for awesome hacks and merges of big MoEs

This is DEFINITELY right, no take backs https://t.co/GwC2E9Z33R

5

91

16

32

25K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

6 months ago

Hehehe ;-)

2

18

1

0

1K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

6 months ago

https://t.co/kBqoQjJyBg

2

9

4

1

2K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

7 months ago

G1PO, our @UnitreeRobotics humanoid robot, was in a playful showrace against the Munich @Motorworld_de's #GT3RS. Of course, he still got helping hands from his human friends to steer him and thus the kart. @Porsche cars and TNG managers narrowly escaped, and the oil drums got only slightly bumped (yt link in reply ;-).


1

11

4

1

554

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

8 months ago

TNG's Chimera models reached 8th place on the market share list on @openrouter.

Model usage grew to 14.4 billion tokens/day, running on @chutes_ai. https://t.co/bnzPsE85Yw

1

16

6

3

1K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

8 months ago

Zen Driving Études after a long day of robotics work.

0

15

4

1

1K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

10 months ago

News from the Aider discord regarding DeepSeek-TNG R1T2 Chimera's performance in the Aider Polyglot benchmark, courtesy of benchmark wizard neolithic5452 and the magic @UnslothAI quantizations:

- 2 bit UD-IQ2_M: 60.0%
- 4 bit Q4_K_XL: 62.7%
- 8 bit: 64.4%

This seems to be the second highest open-weights result, after @deepseek_ai's R1-0528, which scored 71.4%. It appears to be ahead of Kimi K2 (59.1%) and also @Alibaba_Qwen's Coder-480B, which scored 60.9% in the 4 bit UD-Q4_K_XL version.

On @openrouter, R1T2 is currently the 11th most popular model for Aider. Across all applications, R1T2 processed 3.12B tokens yesterday on OR. On @chutes_ai, it is the tenth-most popular model as of today.

4

33

5

2K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

11 months ago

@elonmusk R1T2 Chimera #3 trending model today
Right behind Grōk and Kimi K2

1

21

6

1

2K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

11 months ago

Fun screenshot. Let's hand over the #1 trending position to #Grok4.

1

31

7

2

5K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

11 months ago

DeepSeek-TNG-R1T2-Chimera is currently the #1 trending model on @openrouter

The platform which @karpathy called the "transfer switch of AI".

Our Assembly-of-Experts method pushes the Pareto frontier between model intelligence and inference cost.

Thanks again to the Open Weights community:

@deepseek_ai @huggingface, @AIatMeta @openrouter @chutes_ai @UnslothAI and @jon_durbin @ping_toven @xlr8harder @reach_vb @alexatallah to name a few

1

67

12

22

12K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

12 months ago

R1T2 Chimera is available on @chutes_ai Thanks to @jon_durbin and team for making it available within 24 hours!

3

60

6

2

6K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

12 months ago

Today we release DeepSeek-TNG R1T2 Chimera.

This new Chimera is a Tri-Mind Assembly-of-Experts model with three parents, namely R1-0528, R1 and V3-0324.

R1T2 operates at a sweet spot in intelligence vs. output token length. It appears to be...

* about 20% faster than R1, and more than twice as fast as R1-0528

* significantly more intelligent than R1 in benchmarks such as GPQA Diamond and AIME-24/25, albeit not quite on R1-0528 level

* much more intelligent than our first R1T Chimera, and also think-token consistent, which is a major improvement

We perceive it as generally well-behaved and a nice persona to talk to. The weights are on @huggingface under the MIT licence. We are looking forward to your experiments and feedback!

Thanks to @deepseek_ai for giving their models to the world, to @chutes_ai and @openrouter for hosting R1T, to @WolframRvnwlf for benchmarking it, to @xlr8harder for beta-testing the new Chimera, and to @natolambert for constructive discussions at @aiDotEngineer.

22

394

87

120

127K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

about 1 year ago

We post our new paper "Assembly of Experts: Linear-time construction of the Chimera LLM variants with emergent and adaptable behaviors" on @huggingface, while waiting for @arxiv.

We explain how we constructed the 671B R1T Chimera child model from the great @deepseek_ai V3-0324 and R1 parent models (谢谢!) in less than one hour of CPU time.

The Chimera research prototype is currently the 4th most-popular LLM on @chutes_ai with about 3.5B tokens/day, 0.9-1.0B of which flow through @openrouter. Over 160B tokens have been processed since release on April 26th.

1

58

9

13

13K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

about 1 year ago

Assembly of Experts: Our linear-time 671B Chimera LLM construction paper should soon appear on https://t.co/dVojsu26Pb.

We are at the @aiDotEngineer fair in SFO until tomorrow, so for a chat -> DM ;-) https://t.co/VcS5TTf6In

0

15

4

3

780

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

about 1 year ago

More evidence for the effectiveness of the Chimera construction method:

Taking DeepSeek's R1-0528 release, we started benchmarking new Chimera variants on AIME-24 and SimpleQA.

R1-0528 significantly improves AIME performance from 79.8 to 91.4 while doubling the amount of output tokens compared to R1. It appears to be a great model overall.

Our R1-0528-Chimera variants of R1-0528 and V3-0324 seem to improve math performance too, with results up to 83.3. Positive aspect: more compact reasoning with output-token count below that of R1. This could be an interesting trade-off for real-world applications.

On SimpleQA R1T-0528-Chimera is close to the V3-0324 results and seems to fare better than R1-0528.

Caveat: these numbers are just a preliminary indicator, since they stem from single benchmark runs and are subject to statistical fluctuation.

We'll continue searching for variants with interesting features and beneficial behavior combinations.

2

83

14

16

9K

mxpbode retweeted

TNG Technology Consulting GmbH

@tngtech

about 1 year ago

https://t.co/MQ1C50pltK

5

32

4

5

2K

Maximilian Bode @mxpbode

about 1 year ago

Having a blast in beautiful Estes Park, CO. It was a real pleasure sharing my perspective on how LLMs are transforming software engineering, and why running your own GPU stack can be a game-changer. Thanks to everyone who came with smart questions and big ideas! #LambdaConf2025

LambdaConf 2025

@lambda_conf

about 1 year ago

Max Bode at the podium! #LambdaCon2025

0

4

1

0

883

0

6

3

0

624
