Dr. Bobby Gomez-Reino

Verified account

@BobbyGRG

Physics PhD, entrepreneur, ex-CERN innovator, now CEO at Cleverdist. Fast-tracking value to industry with software & generative AI.

Switzerland

Joined April 2010

304 Following

811 Followers

4.6K Posts

Pinned Tweet

Dr. Bobby Gomez-Reino

10 months ago

Small teaser of our IO (Industrial Operator) 2.0, an AI system for industrial control rooms. Version 2 is going to production in early September. Version 3.0 (2025 Q4), in the making, will use a new "Omniscient feature" we are developing based on hierarchical Realtime agents, where the industrial process intelligence is distributed across an intelligent tree. Testing now with the new @OpenAI gpt-realtime... Teaser coming soon too.

Dr. Bobby Gomez-Reino

about 1 hour ago

@scaling01 I'm curious how u r estimating that, roughly. would u mind sharing?

Dr. Bobby Gomez-Reino

about 11 hours ago

Fable 5 already in action?

BobbyGRG's tweet photo: https://t.co/GbgbelUxYT

Dr. Bobby Gomez-Reino

1 day ago

@Teknium @jargoti20 @NousResearch some people quickly blame the harness. It's harness + model that we experience. Many LLMs are not good enough for long-horizon tasks, regardless of how good the harness might be.

Dr. Bobby Gomez-Reino

1 day ago

over prev Anthropic SOTA on SB

Dr. Bobby Gomez-Reino

1 day ago

are we ready for Fable to beat SimpleBench human baseline?

BobbyGRG's tweet photo: https://t.co/IWkdTeuVkX

Dr. Bobby Gomez-Reino

1 day ago

14% increase though

Dr. Bobby Gomez-Reino

1 day ago

As expected, for most of the coding tasks we are asking, we don't see an incredible difference from gpt5.5-medium, which already solves most of what we typically ask for. However, in our IO long-horizon operations, the differences appear in minutes. Benchmarking soon.

Dr. Bobby Gomez-Reino

6 days ago

so when they release Mythos, GPT5.6 etc. and some people start to say that they don't see the difference ... here you see the answer. trivial tasks and routine tasks are saturated, only people working on hard open-ended challenges will notice

BobbyGRG's tweet photo: https://t.co/xtY5t9OOkV

Dr. Bobby Gomez-Reino

2 days ago

Not liking this token-saving strategy from OpenAI when you paste content in ChatGPT. Basically they are creating a temp file with your content and then letting the model use tools to search it, possibly get a summary, etc. I want to have control over that optimization.

Dr. Bobby Gomez-Reino

3 days ago

@tensorqt I know the feeling. multiple sessions open in parallel reading here and there, accepting once in a while...

BobbyGRG's tweet photo: https://t.co/NkfGtAWSYP

Dr. Bobby Gomez-Reino

3 days ago

@balakhonoff @mikeydsoftware human inputs can just be another type of event feeding your agentic system context

Dr. Bobby Gomez-Reino

3 days ago

@_overment @plainionist so this would happen with any harness supporting MCP. It could be a bad MCP toolset implementation, or it could be that the MCP server does wonders and is worth adding when needed.

Dr. Bobby Gomez-Reino

3 days ago

@_overment @plainionist is it? I would assume first that the OP was surprised because, even without using the tools, the number of tokens was double what it is now. But this could be just because he was using MCP servers that add a lot of tools with extensive instructions to the context...

Dr. Bobby Gomez-Reino

3 days ago

So, if we believe in exponentials, we should expect to have a Mythos-level model at <10usd/mtok before xmas,... right? 👀 people are not building for that, or are they?

BobbyGRG's tweet photo: https://t.co/WcFAXwDLh5

Dr. Bobby Gomez-Reino

4 days ago

people asking LLMs to make up disturbing images and then feeling offended about the result is peak dumb

Dr. Bobby Gomez-Reino

5 days ago

maybe. but our programmers use much more than 200usd per month of compute for coding, I guess regardless of coding harness. even I (who had pretty much stopped programming before 2024) now need tokens like oxygen :D I think we would hit a similar cost, without the flexibility of switching to the best available model if needed. migrate rules. you might be totally right on a proper, comparable blind-test setup, but I guess we have other priorities than optimizing that cost. if Cursor keeps improving, the value we generate is massive.

Dr. Bobby Gomez-Reino

5 days ago

>>I gave a 10min voice note to Cursor, left to go eat dinner

We record 15-20 meetings including all relevant stakeholders (feature owner, architect, cyber, ux), then ask for a plan, review on the fly (mostly headlines), correct, run and test (95% success rate if discussion is rich enough). We call it an EVC (Extreme Vibe Coding) session. Never going back to any other thing. Eliminated information loss from legacy processes.

Dr. Bobby Gomez-Reino

5 days ago

@Cool_Goose in fact, maybe this is just explained by some regression in Opus4.7 compared to Opus4.5/4.6. So when 4.7 was internally released, probably many swapped models from 4.6 to 4.7 for trivial tasks. I assume this doesn't mean that Mythos was used for those and performed worse.

BobbyGRG's tweet photo: https://t.co/ikZo3CoU8D

Dr. Bobby Gomez-Reino

5 days ago

@Cool_Goose yes, that is one i cannot really figure out with the info in this chart. maybe someone from ant explained it already?

Dr. Bobby Gomez-Reino

5 days ago

@_overment 60 inference any programmer can do in a day, no? I think Jensen went too far when he mentioned the 250k usd/year per programmer, but 25k sounds maybe quite reasonable?
