Dmitry (Dima) Lepikhin @lepikhin - Twitter Profile

moritzkremb's tweet photo. There's finally a proper benchmark for @openclaw model performance.

I just found that @kilocode built an open source benchmark that tests models across 23 real world openclaw tasks like scheduling meetings, writing code, triaging email etc

gpt-5.3-codex is sitting at number one. tbh that matches my experience.

gemini 3 flash in second place. didn't expect that.

curious to see where gpt-5.4 will land on this.

102

588

48

522

78K

Dmitry (Dima) Lepikhin

@lepikhin

4 months ago

@vonderleyen 🤌😘

0

160

Dmitry (Dima) Lepikhin

@lepikhin

4 months ago

@tymrtn It's quite obvious that distillation is there for any non-frontier model. Web has plenty of traces from frontiers.

0

668

Dmitry (Dima) Lepikhin

@lepikhin

4 months ago

@AnthropicAI This is unsurprising, thankfully it's not the tactic that produces new frontier.

0

3

0

1K

Dmitry (Dima) Lepikhin

@lepikhin

4 months ago

@auren Include pot

0

35

Dmitry (Dima) Lepikhin

@lepikhin

4 months ago

@agihippo Well, the issue is, I'd like to do a coffee place like in Italy, hole in the wall, 1 euro caffe normale at the bar stand, maybe occasional cappuccino, but there is no place for it in Bay Area.

0

1

0

184

lepikhin retweeted

Anselm Levskaya @anselmlevskaya

4 months ago

AI folks radically overestimate how much LLMs help for practical bio lab work and so get weirdly fixated on biorisk scifi scenarios. Lab work is gated by a researcher's personal pain tolerance, relentlessness, and a huge body of tacit knowledge passed down by apprenticeship.

0

34

5

13

5K

Dmitry (Dima) Lepikhin

@lepikhin

4 months ago

@andrewchen how much is the latter?

0

163
