Prasith Govin @prasithg - Twitter Profile

2 days ago

@AndrewYNg The role you mention almost in passing, Harness Engineer, is the one I'd bet on. Models and vendors swap out. The harness persists: prompts, tools, skills, evals, subagent orchestration. The harness holds the workflows and it outlives whatever model sits underneath.

0

2

0

234

Prasith Govin

@prasithg

2 days ago

Spent months maintaining a model router in OpenClaw. One for ops, one for code, one for writing. This week I deleted most of it. Opus 4.8 clears the bar on all three. The routing was overhead pretending to be sophistication. Sometimes the upgrade is deleting the abstraction.

0

21

Prasith Govin

@prasithg

3 days ago

@KingBootoshi Same. I stopped working with Opus like an interactive UX. My Claw does the planning pass, then I run single-shot console work with cli: Opus drives the plan, Codex does the build. Slow model stays off the critical path; fast builder executes against the plan.

0

1

0

43

Prasith Govin

@prasithg

3 days ago

We built a version of this internally last week. AI writing detector that mines its own shipped output weekly, finds new tells the current rules miss, promotes patterns after 2 weeks of recurring detection. Evals that update themselves are how you sustain quality as your own voice drifts. Applied at org level skill scale.

0

1

0

78
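A minimal sketch of the promotion rule described in the post above, assuming "2 weeks of recurring detection" means a candidate tell was detected in at least two consecutive weekly runs. The function names, the `(pattern, date)` input shape, and the Monday-aligned week bucketing are all hypothetical and not taken from the author's internal tool.

```python
from collections import defaultdict
from datetime import date


def week_index(day: date) -> int:
    """Map a date to a Monday-aligned week number (0001-01-01 was a Monday)."""
    return (day.toordinal() - 1) // 7


def longest_consecutive_run(weeks: set) -> int:
    """Length of the longest run of consecutive week numbers in `weeks`."""
    best = run = 0
    prev = None
    for w in sorted(weeks):
        run = run + 1 if prev is not None and w == prev + 1 else 1
        best = max(best, run)
        prev = w
    return best


def promote_patterns(detections, min_weeks: int = 2) -> list:
    """Return candidate patterns detected in >= min_weeks consecutive weeks.

    `detections` is an iterable of (pattern, date) pairs; this shape is an
    assumption made for illustration.
    """
    weeks_by_pattern = defaultdict(set)
    for pattern, seen_on in detections:
        weeks_by_pattern[pattern].add(week_index(seen_on))
    return sorted(
        pattern
        for pattern, weeks in weeks_by_pattern.items()
        if longest_consecutive_run(weeks) >= min_weeks
    )
```

Requiring consecutive weeks, rather than any two detections, keeps one-off phrasing from being promoted to a rule; a looser reading of the post would count any two weeks.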


Prasith Govin

@prasithg

8 days ago

@billxbf Super cool release. I have been using OpenClaw RL, which does something similar, but it's great to have a multi-harness approach and I will give it a shot to compare: https://t.co/16XCHvPDdu

0

1

0

1

153

Prasith Govin

@prasithg

16 days ago

We’re moving from “agentic workflows with evals” to “environments where agents can practice.” Tools, fake databases, real constraints, verifiable rewards. Basically video game-like simulations for the messy systems and workflows that plague real enterprises. This is how agents start learning actual work instead of just becoming workflow automation++. https://t.co/CGwrBppumI

0

1

0

15

Prasith Govin

@prasithg

27 days ago

Just had my claw redo all my browser based crawls from reading this. Improvements in error rates and speed. Def one to add to the toolkit.

Kyle Jeong

@kylejeong

29 days ago

https://t.co/ZRdHQha7y3

26

1K

87

3K

952K

0

83

Prasith Govin

@prasithg

29 days ago

@DavidOndrej1 Yes, it's a disaster today. Opus self-diagnoses and says it's not completing tool actions well.

0

1

0

456

Prasith Govin

@prasithg

29 days ago

Regardless of whether this matches the hype or not this shines a spotlight on Subquadratic Sparse Attention and more startups or indie hackers (like me) will experiment with it. Interesting paper as well from the team: https://t.co/rQDsd2Ygtf

will depue

@willdepue

30 days ago

my first take, and a good lesson on good research epistemics here: what can we infer from ~82% SWE-Bench?

it’s possible they (1) they trained a new model, from scratch, that is unlike a regular transformer

but i’ve never heard of this company before, and checking their funding round they’ve only raised ~30M, so it’s unlikely they could/afford to train a Opus/GPT-5/Kimi 2.6 level coding model right now from scratch

so this tells us that (2) they need to bootstrap off of an existing pretrained model, likely RL too, to get that performance!

this tells us they’ve taken a vanilla Transformer and modified the attention mechanism, likely finetuning/midtraining in a subquadratic attention method

its quite possible it doesn’t really work and that there’s some degeneracy to the method, or it’s just plain fake

but if it’s not, you could expect that given how long it takes to do weight surgery on big models (bigger changes to a pretrained model == longer mid training to recover performance), it’s a lightweight change

id lean towards something mostly leveraging existing attention key value protections like a fancy version of deepseeks sparse attention paper, but it could also be some unique test-time KV compression, which would come with its own downsides

31

543

10

226

190K

0

69

Prasith Govin

@prasithg

29 days ago

@steipete Wow! Half of those could each be their own vibrant GitHub project. I built a local version of askoracle myself but am going to swap to yours now. How do you handle task management for all this? Do you use Linear + Symphony? Multiple projects?

1

0

3K

Prasith Govin

@prasithg

about 1 month ago

Which one is it. I'm so confused. Is everyone just switching from each other.

0

9

Prasith Govin

@prasithg

about 1 month ago

@thsottiaux /wiki shortcut: turn any repo into a living, source-linked project wiki. Architecture, setup, data flows, APIs, diagrams, and “how to change X” guides generated from the codebase. Similar vibe to Devin’s DeepWiki, but native in Codex.

https://t.co/n7mw4ThuCn

0

1

0

1

679

Prasith Govin

@prasithg

about 1 month ago

This is the way. A PM agent with Linear that pulls context, plus separate dev and QA agents, gives way better results than a single agent having to load all the context and run all the tasks.

Vaibhav (VB) Srivastav

@reach_vb

3 months ago

https://t.co/a7FP1RPSz2

27

518

38

738

301K

0

16

Prasith Govin

@prasithg

about 1 month ago

@cgeorgiaw Congrats on the launch. Can I pitch a challenge and what does standing one up look like? The one I'd love to see: pre-symptomatic Parkinson's detection from voice. Biomarkers predict PD 3-5 years early. Consented speech corpus + biomarker model + years-to-symptom leaderboard.

1

2

0

335

Prasith Govin

@prasithg

about 1 month ago

@gdb It's been amazing. Our team has switched most usage over claude pro sub. Request for a pro business tier please 🙏.

0

29

Prasith Govin

@prasithg

about 1 month ago

@JonathanRoss321 and where does claw fit into this?

0

11

Prasith Govin

@prasithg

about 1 month ago

@thsottiaux The vibes are all that matters. Codex team absolutely cooking. Does this apply to business plans too?

0

644

Prasith Govin

@prasithg

about 1 month ago

@steipete and team continue to amaze. Reported a Bedrock streaming bug, fix shipped the next day. Just tested: Opus 4.7 streaming + extended thinking on AWS Bedrock, running clean as my main agent. Huge unlock for enterprise AWS users. This is shipping at AI speed!

0

30

Prasith Govin

@prasithg

4 months ago

Perfect size for mac studio or a digit.

dax

@thdxr

4 months ago

interesting thing about minimax 2.5 is it's a smaller model

considering it's very usable it's a great candidate for home labs

also would love to see inference providers try and max out its tokens/s can probably do something crazy https://t.co/qjd3Irc7oN

48

879

28

146

132K

0

1

0

26

Prasith Govin

@prasithg

4 months ago

@openclaw feature request: cron sessions persist indefinitely after completion — sessions from 20hrs ago still sitting at 200K tokens. would love a cron.sessionArchiveAfterMinutes config (like subagent archive). also model registry has opus 4.6 at 195K ctx but it's 1M now. love the project otherwise 🦞

0

1

0

19
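A rough sketch of the behavior the feature request above asks for, assuming a setting like the proposed `cron.sessionArchiveAfterMinutes` would archive a cron session once it has been finished for longer than the configured number of minutes. The `CronSession` type and `sessions_to_archive` function are hypothetical illustrations and not part of OpenClaw's actual API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CronSession:
    # Hypothetical shape of a cron session record, for illustration only.
    session_id: str
    completed_at: Optional[datetime]  # None while the session is still running
    tokens: int


def sessions_to_archive(
    sessions: List[CronSession],
    now: datetime,
    archive_after_minutes: int,
) -> List[str]:
    """Return ids of sessions completed at least `archive_after_minutes` ago.

    Running sessions (completed_at is None) are never selected.
    """
    cutoff = now - timedelta(minutes=archive_after_minutes)
    return [
        s.session_id
        for s in sessions
        if s.completed_at is not None and s.completed_at <= cutoff
    ]
```

With a 60-minute threshold, a session that finished 20 hours ago would be selected while one that finished 10 minutes ago, or one still running, would be left alone.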
