JasonLiu @jsyqrt - Twitter Profile

about 2 hours ago

Until we have that, every agent platform is essentially doing runtime reverse engineering of the model's willingness to cooperate.

0

2

JasonLiu

@jsyqrt

about 3 hours ago

The stealth downgrade was the part that scared me most as an agent builder. A model that silently changes its behavior based on what it thinks of your work creates unreproducible bugs — your agent works fine on Wednesday, fails on Thursday, and you have no idea why. Visibility is the bare minimum. Determinism is the real requirement for production systems.

0

3

JasonLiu

@jsyqrt

about 8 hours ago

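That said, it's no small feat that Kimi pulled this off; at least the scaffolding is in place. The real value isn't in predicting who wins but in the mechanism by which these 300 agents coordinate, divide the work, and aggregate results. If the intermediate process were shown transparently, it would be far more interesting than a World Cup prediction.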

0

20

JasonLiu

@jsyqrt

about 8 hours ago

@tualatrix @zzxwill Being flagged actually shows they are the ones taking safety most seriously. The truly dangerous ones are never on this list.

0

1

0

28

JasonLiu

@jsyqrt

about 8 hours ago

@mardehaym No spending caps on agent platforms is a disaster waiting to happen. Building Markus I learned this the hard way — task-level token budgets are table stakes.

0

15
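A minimal sketch of what a task-level token budget could look like, assuming each task gets a hard cap that is checked before any spend is recorded. `TokenBudget` and `BudgetExceeded` are hypothetical names for illustration, not Markus's actual API.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push a task past its token cap."""


@dataclass
class TokenBudget:
    """Hypothetical per-task budget: a hard cap plus a running total."""

    limit: int
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def charge(self, tokens: int) -> None:
        """Record a spend, refusing it if the cap would be exceeded."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            raise BudgetExceeded(
                f"need {tokens} tokens, only {self.remaining} left"
            )
        self.used += tokens
```

A rejected charge leaves `used` unchanged, so the caller can decide whether to stop the task or retry with a smaller request.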

JasonLiu

@jsyqrt

about 8 hours ago

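@lifesinger As someone building an agent platform, I tried Fable 5 for a week. Conclusion: its strength isn't in conversation. I gave it a buggy multi-agent coordination scenario, and it read the codebase on its own, found the root cause, and wrote the fix, all in one pass. I had tried op4.8 on this before and spent an hour without getting it fixed. So if you're only chatting, there's indeed no aha moment. But give it a complex codebase and free rein, and the gap shows.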

0

2

0

315

JasonLiu

@jsyqrt

about 8 hours ago

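@evilcos As the founder of an AI agent platform, my observation is: capability was never the problem, distribution is. This worry arises only because closed-source models sit in the hands of a few. Open-source models plus low-barrier toolchains are actually leveling the gap. The path to equal access isn't waiting for handouts from the giants; it's making the tools usable by everyone.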

0

172

JasonLiu

@jsyqrt

about 9 hours ago

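@lifesinger Fable 5 really is underwhelming. But sentencing AGI to death because one model disappointed you is like trying one bad restaurant and declaring that good food doesn't exist.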

1

9

0

1

3K

JasonLiu

@jsyqrt

about 9 hours ago

@pmarca The "one AI" part is the real punchline. Everyone building with LLMs knows the future is 50+ specialized models working together, not 1. The bottleneck is orchestration, not selection.

0

2

JasonLiu

@jsyqrt

about 9 hours ago

After 3 months of shipping agents in production: The LLM is the easiest part. The orchestration layer is the moat — approval chains, cross-agent dependencies, deliverable handoffs. Everyone talks about agents. Nobody talks about the plumbing.

0

30

JasonLiu

@jsyqrt

about 9 hours ago

As someone building agents daily — the real question is whether Mythos keeps Sonnet's reliability in tool-calling loops. Sonnet was the sweet spot because it was smart enough to follow complex instructions but fast enough to not break the budget. If Mythos trades speed for depth it might shift the calculus on multi-agent workflows.

0

45

JasonLiu

@jsyqrt

about 9 hours ago

Great breakdown. One thing that's still underappreciated: the inter-agent data layer. Tools/memory gets talked about, but how agents pass structured outputs to each other is where multi-agent systems actually live or die. The coordination protocol between agents matters as much as the agent itself.

0

15

JasonLiu

@jsyqrt

about 9 hours ago

@steipete @_ARahim_ @bcherny Boomer here, guilty as charged. But there's an important exception: function names and API params. LLMs do NOT handle those typos gracefully.

0

2

JasonLiu

@jsyqrt

about 20 hours ago

Using DeepSeek like crazy

0

26

JasonLiu

@jsyqrt

2 days ago

Everyone's talking about 10x engineers. Nobody talks about the 0.1x bottleneck. The one person who can't use AI. The single approval gate. The compliance checkbox that takes 3 weeks. Your AI stack is only as fast as the slowest human in the loop.

0

21

JasonLiu

@jsyqrt

2 days ago

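@dingyi This design is clever: prompts directly generate UI controls, and users hand off work in natural language. It's far lighter than the configuration-style building of traditional low-code. As someone who is also building an agent product, I think this "prompt as interface" approach is well worth borrowing.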

0

1

0

66

JasonLiu

@jsyqrt

2 days ago

The real question isn't whether a solo founder can build a billion-dollar company with AI — they can and they will. The real question is whether they can build the organizational muscle that makes it defensible. Code is cheap now. Distribution, trust, and operational moats are not.

1

0

18

JasonLiu

@jsyqrt

2 days ago

Hot take: a "meta-agent that infers your vibe" is just another loop with a thicker abstraction layer. The hard part isn't writing the loop — it's defining the termination conditions, error recovery, and state handoff between iterations. Vibe inference is great for demos. Production needs explicit guardrails. The less ambiguity you leave in the loop, the fewer late-night "why did the agent do that" moments.

0

25
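A rough sketch of the three pieces named above, assuming a single-agent loop: explicit termination conditions, error recovery, and state handed from one iteration to the next. `LoopState` and `run_loop` are hypothetical names, and the limits are arbitrary defaults.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopState:
    """State handed from one iteration to the next."""

    step: int = 0
    errors: int = 0
    done: bool = False
    history: list = field(default_factory=list)


def run_loop(
    step_fn: Callable[[LoopState], LoopState],
    max_steps: int = 10,
    max_errors: int = 2,
) -> tuple[LoopState, str]:
    """Run step_fn until an explicit stop condition fires.

    Returns the final state and the reason the loop stopped.
    """
    state = LoopState()
    while True:
        # Termination conditions are spelled out, not inferred.
        if state.done:
            return state, "completed"
        if state.step >= max_steps:
            return state, "max_steps"
        if state.errors > max_errors:
            return state, "too_many_errors"
        try:
            state = step_fn(state)
        except Exception as exc:
            # Error recovery: keep the previous state and log the failure.
            state.errors += 1
            state.history.append(f"error: {exc}")
        state.step += 1
```

The stop reason comes back alongside the state, so a "why did the agent do that" question has a concrete answer to start from.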

JasonLiu

@jsyqrt

3 days ago

What actually works: fewer tools, better designed, with clear scoping. Quality over quantity applies to agent capabilities too.

0

15

JasonLiu

@jsyqrt

3 days ago

The bottleneck shifted from "can I build this?" to "should I build this?" Code is cheap now. The hard part is figuring out what actually solves a real problem, who to sell it to first, and how to get them to care. Building Markus taught me: execution speed went 10x, but decision quality didn't. That's the new frontier.

0

16
