Marco Pesani @marcopesani - Twitter Profile

Pinned Tweet

Marco Pesani

@marcopesani

5 days ago

https://t.co/7Tc8pJLCWY

0

466

Marco Pesani

@marcopesani

5 days ago

What works: changing the structure of the output. Labels, citations, reproducibility blocks, processual verbs. You don't get an agent that's always right. You get one that makes it cheap to notice when it's wrong. Full version: https://t.co/y5GrFMmI91

0

1

0

19
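A minimal sketch of what "changing the structure of the output" from the tweet above could look like in practice. The tweet names labels, citations, and reproducibility blocks but does not specify a format, so everything here is an assumption made for illustration: the `Claim` type, the label values ("observed", "inferred", "assumed"), the `needs_review` rule, and the `[CHECK]` marker are hypothetical, not the author's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Claim:
    """One statement in an agent's answer (hypothetical structure)."""
    text: str
    label: str                      # assumed label set: "observed", "inferred", "assumed"
    citation: Optional[str] = None  # where the claim comes from (file, URL, log line)
    repro: Optional[str] = None     # command a reader can run to re-check the claim


def needs_review(claim: Claim) -> bool:
    """Flag claims that a reader cannot cheaply verify."""
    if claim.label == "observed":
        # An observation is only cheap to check if it points at evidence.
        return claim.citation is None and claim.repro is None
    # Inferred or assumed claims are always surfaced for review.
    return True


def render(claims: List[Claim]) -> str:
    """Render claims so that label, source, and repro step are visible per line."""
    lines = []
    for c in claims:
        flag = " [CHECK]" if needs_review(c) else ""
        source = f" (source: {c.citation})" if c.citation else ""
        repro = f"\n    repro: {c.repro}" if c.repro else ""
        lines.append(f"[{c.label}] {c.text}{source}{flag}{repro}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render([
        Claim("Tests pass", "observed", citation="ci.log", repro="pytest -q"),
        Claim("The bug is in the parser", "inferred"),
    ]))
```

The point of the sketch is that the review flag is derived from the structure of the output (is there a citation or a repro step?) rather than from a self-reported confidence score.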

Marco Pesani

@marcopesani

5 days ago

A trustworthy agent is not an agent that is always right. It is an agent that makes it cheap to notice when it is wrong. Six patterns that make agent output trustworthy, not just fluent:

2

0

1

85

Marco Pesani

@marcopesani

5 days ago

Things that look promising but fail:
- "Be honest about uncertainty" (no effect)
- A top-of-response disclaimer (ignored fast)
- A confidence score per claim (generated like any other token)
- "Double-check this" (same output, with "I double-checked" on top)

1

0

14

Marco Pesani

@marcopesani

13 days ago

@gerardsans @_galyo Basically the path to AGI is system integration

0

1

0

42

marcopesani retweeted

Addy Osmani

@addyosmani

about 1 month ago

https://t.co/ze3x7bgsL4

35

937

180

1K

198K

marcopesani retweeted

Lunix

@SolLunix

about 1 month ago

Corporation: "We made $4B but spent $3.9B so we only owe taxes on $100M."
Government: "Totally reasonable."
You: "I made $60K but spent $58K on survival."
Government: "You owe taxes on $60K."
You: "That's not—"
Government: "File by May 15."

1K

173K

15K

9K

8M

Marco Pesani

@marcopesani

about 2 months ago

@solana @colosseum In case you guys need something to shop for, @bitrefill is already agent-friendly: https://t.co/YfPqzUZ2a4

0

26

marcopesani retweeted

Pledditor

@Pledditor

about 2 months ago

The largest Ethereum layer 2, which has also been regularly praised by Vitalik as being the most decentralized L2, just froze $100m worth of ETH that was hacked by criminals. Are you finally starting to realize the bitcoin maxis were right?

191

2K

240

118

129K

Marco Pesani

@marcopesani

about 2 months ago

Composer 2.1 soon 🥹

Kimi.ai @Kimi_Moonshot

about 2 months ago

Meet Kimi K2.6: Advancing Open-Source Coding

🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)

What's new:
🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc for 24/7 autonomous ops.
🔹Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop.
-
K2.6 is now live on https://t.co/YutVbwktG0 in chat mode and agent mode.
For production-grade coding, pair K2.6 with Kimi Code: https://t.co/uvoSJKyGCY
-
🔗 API: https://t.co/EOZkbOwCN4
🔗 Tech blog: https://t.co/9wWvgIQSS3
🔗 Weights & code: https://t.co/Be0hjs2RTP

942

18K

2K

8K

8M

0

1

0

56

Marco Pesani

@marcopesani

2 months ago

IMO Claude is far ahead in tool use. Faster, less intrusive, feels much more natural.

Bitrefill

@bitrefill

2 months ago

ChatGPT vs. Claude, who did it better?👀

4

40

2

5

7K

1

0

48

Marco Pesani

@marcopesani

2 months ago

Card payments were designed for humans who browse, hesitate, and click "buy." AI agents don't do any of that. The fraud models break. The chargebacks spike. The merchant pays. Crypto and gift cards are the only payments with built-in finality. Wrote about why that's the only thing that matters for agent commerce.

Marco Pesani

@marcopesani

2 months ago

https://t.co/Wi54oVpcFU

0

1

0

98

0

72