Babel @babelDefi - Twitter Profile

I work at a much smaller company and have basically been rewriting our entire company from scratch: a 10+ app ecosystem, trying to keep the whole system in memory and verifying everything (lots of screenshots). And I rarely trust Sonnet for subtasks etc. I guess maybe I need to start trying out the cheaper models?

Babel

@babelDefi

about 17 hours ago

I use up my 200/month limit every single week, usually with a day or two left, so I have to switch to my Codex for those few days. Running round the clock on 1/4th the usage limits doesn’t sound realistic. I could definitely optimize my usage a bit. But you’re talking about 25% of my limit being more than enough

Babel

@babelDefi

about 17 hours ago

@shade_engine @steph_palazzolo Yeah the model that found 10,000 zero-day exploits in nation-critical infrastructure within weeks of being released is probably ass

Babel

@babelDefi

about 17 hours ago

@nzmrldev @elliotarledge See here

Taelin

@VictorTaelin

about 21 hours ago

Just an example on Opus 4.8 outclassing GPT 5.5 in ways that are invisible to benchmarks. (When I post these, this is NOT an attack on OpenAI in any way, quite the opposite, I just want things to improve...)

Left both working on a generic goal: "optimize this file".

After 8 hours:
→ Opus 4.8 landed a solid +17%
→ GPT 5.5 landed +30%

I then checked file sizes.
→ GPT 5.5 *doubled* the file
→ Opus 4.8 grew it by 0.1% (!!!!!)

For most benches, 5.5 would have beat 4.8 here, but clearly Opus did a much better job. GPT produced a short-term win that would stall further progress if I merged it. Opus delivered a no-tradeoff, long-term win. And if I had asked GPT 5.5 to "keep the file size the same", it would just start hacking that, minifying, removing docs, etc. - something Opus 4.8 just doesn't do. Its file is as clear as it was when I set the goal.

---

About this: it is an "HVM5 v2" that is even simpler. Now the whole file is at <14k tokens, and consistently outperforming HVM4 by 5-fold. And this version does not have native constructors, only Unit (`()`), Either (`inl(x)/inr(x)`) and Pair (`tup(x,y)`) as primitives, which is very GPU friendly and means we might actually manage to run SupGen on it!

Opus's overnight progress:

Babel

@babelDefi

about 18 hours ago

@Gabriel78470020 @elliotarledge Wrong. This benchmark will be the only one that really matters moving forward, as we’ve already heard confirmed from basically every lab

Babel

@babelDefi

about 18 hours ago

@nzmrldev @elliotarledge 4.8 was basically tuned for ultra code. The layers of critique and verification make this pretty believable

Babel

@babelDefi

about 18 hours ago

@prelife_ @J_D_Becerra @WNY_Plinker @DCplaysgames @Elvish_Harper Are you fucking retarded? Why come into a Star Wars thread to whine about the fact that it isn’t true history and it’s all made up?

Babel

@babelDefi

about 18 hours ago

@Not_A_DoctorOk @fiago7 @GlynnErnesto Well, I live near an airport. There are airports everywhere in the U.S. I'm having a hard time imagining someone who doesn’t see passenger planes in the air daily.

Babel

@babelDefi

about 18 hours ago

@Not_A_DoctorOk @fiago7 @GlynnErnesto Yeah, seeing a passenger plane fly over (the kind that we see 100 of every day) is just as exciting as B-52 Stratofortress bombers

Babel

@babelDefi

1 day ago

@bobloblaw635654 @TravLCox @SportsJWQuacken I don’t think I’ve heard about tithing in a sacrament meeting more than a few times in many years

Babel

@babelDefi

1 day ago

@haider1 Because it is different architecture lol

Babel

@babelDefi

1 day ago

@KC_Kreative @japan_nobunaga Bumping into someone and telling them “this is unresolved between us” before leaving

Babel

@babelDefi

1 day ago

@AnthonyGalli @ericernerstedt @TheCriticalDri2 House is really good honestly, but it’s not the same genre as Breaking Bad. It is comfort / background watching, like Seinfeld or something

Babel

@babelDefi

3 days ago

@LostMyHats @AmericaOnlycast I was raised LDS, went to seminary, and studied it all through high school etc. Even if I don’t consider myself a member anymore, basically none of what you said is true.
