Dave Lee

Verified account

@mega__d

Building MEGA — the optimization layer for production agents.

Joined November 2023

99 Following

20 Followers

81 Posts

Pinned Tweet

6 days ago

This is the direction we’re building with AgentOpt. Not just agents that run tasks, but agents you can measure, test, and optimize end to end. MEGA Optimus makes that loop executable.

8 days ago

MEGA Optimus is here. Point it to a project folder. Optimus writes the spec, builds the eval harness, and runs the optimization loop end to end. Reduce your token spend and latency while improving accuracy. https://t.co/Dx7mxZeVwZ Autonomous AgentOpt

about 9 hours ago

@iKunalmathur Love the local-first angle! Curious how you’re optimizing Notio for production, especially around messy voice inputs, category edge cases, and insight quality. Happy to connect.

about 13 hours ago

Building AI agents is getting easier. Improving them is still hard. Looking to connect with founders and builders working on reliability, evals, optimization, token efficiency, or AgentOps. Curious what you’re building and what you’re struggling with. Let’s connect, learn from each other, and build better agents together!

about 11 hours ago

@StevenACZ Great to connect too! Reliability is definitely the part that decides whether agents actually ship.

about 11 hours ago

@d_ai_1231 Connected!

about 13 hours ago

@resatu What kind of teammates do you seek?

1 day ago

Hey AI builders 👋 Looking to connect with people working on: 🚀 Agent Systems 📊 AI Evals 🔄 Agent Optimization ⚡ AgentOps 🛠️ Developer Tools 🤖 AI Infrastructure 🔍 AI Security What are you building right now? And what’s been the hardest part of measuring, evaluating, or improving it? Drop it below 👇

1 day ago

@resatu Could you briefly explain what this service does?

1 day ago

@pk_iv evaluation-driven optimization layer — not just observing agent behavior, but continuously improving it.

1 day ago

@mytwillot Love this use case. Messy tweet data is exactly where evals get hard: categories drift, edge cases pile up, and “looks right” isn’t enough. Happy to compare notes.

1 day ago

@JustJerry121 Glad to connect as well. We treat failures as optimization assets. Once a failure is captured, we turn it into a repeatable eval case, add it to the evaluation set, and reuse it to validate future improvements.
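
One way to picture the failure-to-eval-case loop described in this reply is the sketch below. It is an illustration only, not MEGA's implementation: `EvalCase`, `EvalSet`, `add_failure`, and the two toy agents are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    """One repeatable check derived from a captured failure (hypothetical schema)."""
    name: str
    agent_input: str
    expected: str


@dataclass
class EvalSet:
    """A growing collection of cases that every future change is checked against."""
    cases: List[EvalCase] = field(default_factory=list)

    def add_failure(self, name: str, agent_input: str, expected: str) -> EvalCase:
        # Turn a captured failure into a case and keep it for future runs.
        case = EvalCase(name, agent_input, expected)
        self.cases.append(case)
        return case

    def run(self, agent: Callable[[str], str]) -> List[str]:
        # Return the names of the cases the given agent still fails.
        return [c.name for c in self.cases if agent(c.agent_input) != c.expected]


def old_agent(text: str) -> str:
    # Toy stand-in for an agent: forgets to trim whitespace.
    return text.upper()


def new_agent(text: str) -> str:
    # Candidate fix: trims whitespace before upper-casing.
    return text.strip().upper()


eval_set = EvalSet()
eval_set.add_failure("untrimmed-input", "  hello ", "HELLO")

failing_before = eval_set.run(old_agent)  # the captured failure reproduces
failing_after = eval_set.run(new_agent)   # the fix is validated by the same case
```

Because the case stays in the set, any later change that reintroduces the bug would show up the next time `run` is called.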

2 days ago

@JonBuildsHQ Building infrastructure for eval-driven self-evolving agents https://t.co/d183mRa981

2 days ago

Love this pattern. We tested a persona-style SOUL.md used with OpenClaw and found that the operating contract itself can become part of the security surface. After hardening it, the agent became safer without losing the behavior that made it useful. Hermes’ SOUL.md could be even stronger with that layer too. Related test: https://t.co/4Kvh0vPbE1

2 days ago

@TTrimoreau AgentOpt https://t.co/d183mRa981

3 days ago

@garrytan The 10x less code direction makes sense for many workflows. Curious how you think this applies to enterprise use cases where deterministic outcomes, auditability, and compliance are critical.

3 days ago

Building AI agents is getting easier. But making the entire agent system reliable, cost-efficient, and better over time is still hard. We are building an optimization infrastructure for agent systems to solve this. It measures system performance, identifies what needs to improve across prompts, workflows, tools, and code, tests changes against real tasks, and keeps only what actually works. Explore MEGA Code: https://t.co/a7KUh2UKJU
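
The measure, test, and keep-what-works loop in this post can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the MEGA Code product: the `accuracy` and `optimize` functions, the task list, and the candidate agents are all hypothetical, and exact-match accuracy stands in for whatever metric a real system would use.

```python
from typing import Callable, Dict, List, Tuple

Task = Tuple[str, str]  # (input, expected output)
Agent = Callable[[str], str]


def accuracy(agent: Agent, tasks: List[Task]) -> float:
    """Fraction of tasks the agent answers exactly right."""
    return sum(agent(x) == y for x, y in tasks) / len(tasks)


def optimize(baseline: Agent, candidates: Dict[str, Agent],
             tasks: List[Task]) -> Tuple[str, float]:
    """Measure the baseline, test each candidate, keep only a strict improvement."""
    best_name, best_score = "baseline", accuracy(baseline, tasks)
    for name, candidate in candidates.items():
        score = accuracy(candidate, tasks)
        if score > best_score:  # discard changes that do not actually help
            best_name, best_score = name, score
    return best_name, best_score


# Toy tasks: normalize text to trimmed lower case.
tasks = [("Hello", "hello"), (" World ", "world"), ("OK", "ok")]

baseline = lambda s: s.lower()           # misses the whitespace case
candidates = {
    "strip": lambda s: s.strip().lower(),  # fixes the whitespace case
    "upper": lambda s: s.upper(),          # a regression, should be rejected
}

winner, score = optimize(baseline, candidates, tasks)
```

A real system would also weigh cost and latency alongside accuracy; the sketch tracks a single score to keep the selection rule visible.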

3 days ago

@swyx @bentannyhill @Zach_Kamran Maybe not full self-driving yet. Observability and evaluation have come a long way. Automated improvement is the next frontier.

4 days ago

Everyone is building agents. But how do you know they’re actually behaving as intended? Do you rely on logs, evals, user feedback, manual review, or something else? And when they fail, how do you turn that failure into measurable improvement?

4 days ago

@ellen_in_sf Great breakdown. Reducing token usage during a session clearly matters. Beyond that, helping agents use fewer tokens while maintaining or improving task performance over time is just as important.

6 days ago

@karankendre More CEOs are going to wake up to AI costs soon.
