Emmanuel Galanos @egalanos - Twitter Profile

Emmanuel Galanos @egalanos

about 6 hours ago

@skuffd They said it's the same underlying model as Mythos 5 (non-preview), but with safeguards.

0

2

0

84

Emmanuel Galanos @egalanos

about 6 hours ago

@AsphaltCowb0y @bcherny @binaryminary They previously screwed up the frontmatter name - context: fork actually means a NEW context. What he means by fork: true is that it would inherit the current context (i.e. skill execution would see all history, but caller of skill only sees the result at the end)

0

15

Emmanuel Galanos @egalanos

about 7 hours ago

@joegibbs98 @tszzl Yes. Can just keep breaking down any skill into aspects to verify. Can also recursively break down a skill into sub-skills. I expect the most difficult domains will be things like judging subjective things against current norms (e.g. making a funny joke about current affairs)

0

1

0

5

Emmanuel Galanos @egalanos

about 17 hours ago

@joegibbs98 @tszzl Wherever there's a generator / verifier gap, then an ansemble of focused verifiers can grade to a rubric -> improvement feedback. That could be applied to nearly any domain.

2

1

0

10

Who to follow

Kam 🌱📒

@lifemademore

🌱lifelong learner | process driven | creative Tana Navigator 🧭 | Notion Essentials Certified🏗️ 📝Sharing tips on TfT, PKM and intentional living.

Renée

@reneedefour

+ Freelance operations manager + Right hand woman to YouTube creator entrepreneurs + Dreaming up something wild and improbable

Noodle Nodes

@Noodle_Nodes

Exploring object-oriented notetaking

Emmanuel Galanos @egalanos

about 17 hours ago

@reach_vb @Angaisb_ I find 5.5 over fixates on irrelevant things in long threads and it's hard to get it to let go and focus on what's important

0

1

0

9

Emmanuel Galanos @egalanos

about 18 hours ago

@thsottiaux Only some tasks can be specified enough, even then, it's hard to spec every obvious thing. E.g. /goal Cypress -> Playwright migration was discussed & done, but tests put in wrong places as autistic GPT fixated on `legacy` word and put all tests in a single file named that.

0

1

272

Emmanuel Galanos @egalanos

2 days ago

@badlogicgames @FrameworkPuter You may find Voxtype interesting. Good model support and desktop integration.

0

364

Emmanuel Galanos @egalanos

3 days ago

@embirico Most usage to Codex CLI (as no Linux desktop app). Due to GPT's more thorough & bug free code vs Opus. Not perfect though. GPT overly fixates on the wrong things. Codex CLI missing a lot of QoL features, but at least not the buggy CC mess. Combo of GPT + Opus is best.

0

61

Emmanuel Galanos @egalanos

4 days ago

@antirez @AMD GPU stability was painful for the first few months. It's good now, but you definitely want the newest distro. Fedora 44 has been rock solid. Ubuntu LTS is too old.

0

1

0

135

Emmanuel Galanos @egalanos

5 days ago

@Kappaemme1926 Lack of Linux desktop app. Oh and not being able to rewind/go back in the conversation.

0

6

Emmanuel Galanos @egalanos

6 days ago

@zeeg Any runs with 4.8 on max effort? Curious whether it is capability or laziness

1

0

31

Emmanuel Galanos @egalanos

7 days ago

@tharshan_09 @badlogicgames Just ask Claude to explain how the Workflow tool works. Essentially a workflow is a JavaScript file with a few function primitives for workflow phase, agent, pipeline, parallel. The tool docs tell Claude how to program it and then run it.

0

38

Emmanuel Galanos @egalanos

7 days ago

@ClaudeDevs Now you just need to fix the Skill frontmatter original sin of `context: fork` behaviour of running in a new context.

0

235

Emmanuel Galanos @egalanos

12 days ago

@theo @BrunoBertapeli Their benchmark placing 4.7 > 5.5 clearly isn't aligned with actual real world engineering where GPT is clearly ahead. So 4.8 < 4.7 doesn't have much significance IMO.

0

360

Emmanuel Galanos @egalanos

12 days ago

@antirez Jaggeredness + regressions. GPT-5.2 had better reasoning, attention to detail, debugging. Worse in big picture, EQ, design, TPS, etc. 5.3-codex pulled ahead in core coding, but still generalist gaps where Opus better. 5.4 clear lead w/ CC & Opus regressions. 5.5 mogging.

0

650

Emmanuel Galanos @egalanos

13 days ago

@badlogicgames OpenAI are still deep asleep on B2B - largest team plan is still $25/month (Claude has $140 premium seats). Extra already paid usage credits sitting in org account are impossible to use by members who run out of quota. Tagging/messaging OpenAI peep goes nowhere.

1

0

101

Emmanuel Galanos @egalanos

13 days ago

@ClaudeDevs /btw can't easily escape back to prompt afterwards (no flicker mode). Feels locked up after giving the response, though sometimes with enough mashing of keys, can manage to get back.

0

167

Emmanuel Galanos @egalanos

15 days ago

@badlogicgames Neither based on stated reason. Would have fn scan first with early return of orig array on happy path. It calls seperate mutator fn if payload size exceeded. Or maintain session messages meta obj that tracks size/counts and only call pruner if needed.

0

124

Emmanuel Galanos @egalanos

16 days ago

@trq212 So can Claude desktop be made to run on Linux then :)

0

1

0

23

Emmanuel Galanos @egalanos

17 days ago

@devteamdrew Opus 4.7 peak intelligence is higher than its predecessors, but its lows are worse with making basic errors (even early in context). It's also quite prone to get stuck adding verbose meta-commentary/narrative. My guess: security blunting nerfed its intelligence. Hard to trust.

0

3

0

352

Emmanuel Galanos

@egalanos

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users