Kevin Gu @KevinGu - Twitter Profile

Pinned Tweet

about 2 months ago

introducing Dex, the first agent system with full operational context and a self-updating knowledge base karpathy's llm knowledge base on steroids Dex ingests raw events from every app in your workspace. every slack message, email, meeting, browser session, task update compounds into one living knowledge base. background agents continuously monitor and enhance it while you sleep so your agents get smarter every day try now at joindex [dot] com

Regina Lin

@reggitales

about 2 months ago

Introducing Dex: the self-driving workspace for operators. Dex is the first agent system with full operational context and a self-updating knowledge base. Every datapoint from your workspace is ingested, synced, and structured into compounding context for agents to take action. Comment "DEX" or tag @dexbythirdlayer access. First 1,000 sign-ups get 7 days free. After, join our rolling waitlist. Sign up at joindex [dot] com for a fun surprise. How dex works (threads)

93

279

27

309

66K

8

110

6

133

20K

kevingu retweeted

Dex

@dexbythirdlayer

about 17 hours ago

four weeks before YC demo day, we killed what we'd been building. it had a 50% CMO response rate, but we couldn’t stop thinking about an idea that nobody had a name for yet… 10,000+ downloads later, that idea is Dex.

2

5

2

1K

Kevin Gu

@kevingu

14 days ago

@danylo_dev nice launch

1

0

156

Kevin Gu

@kevingu

15 days ago

@BainCapVC @kneureither @justoutquan https://t.co/ww5nUd2SHf

0

1

0

107

Kevin Gu

@kevingu

15 days ago

speaking at a panel moderated by @BainCapVC this thursday on predictive + proactive interfaces with @kneureither, @justoutquan, and Jenning Chen. rsvp in comments

kevingu's tweet photo. speaking at a panel moderated by @BainCapVC this thursday on predictive + proactive interfaces with @kneureither, @justoutquan, and Jenning Chen. rsvp in comments https://t.co/HE2f5igG6C

1

8

2

0

292

Kevin Gu

@kevingu

23 days ago

@alexshander03 @JudgmentLabs Congrats on the launch!!

1

15

1

0

634

Kevin Gu

@kevingu

30 days ago

@catherinehyeo @evatuecke @GreylockVC @neo @BoxGroup @Liquid2V @JeffDean congrats!!

0

2

0

78

kevingu retweeted

Alex Lieberman

@businessbarista

about 2 months ago

Someone is going to build a worldclass “Brain” for enterprises & make a stupid amount of money. Why? As @da_fant said, “coding w ai is solved bc all context is in the git repo. knowledge work is difficult bc context is spread out. an ai system that creates a git repo w all context for a knowledge worker will be able to 100% automate the work.” When companies talk about being data ready for AI, this is what they’re implicitly saying. Engineering has been prepared for this moment for a long time because of the deterministic nature of code, the centralization/versioning of data (read: GitHub), and AI tools that are largely build by engineers for engineers. But for the rest of white collar work, there’s a TON of catching up to do to properly harness the power of the technology. The big challenge here, and why no one has truly cracked the code for "an ai system that creates a git repo w all context for a knowledge worker" is because unlike code, most knowledge is 1) distributed, 2) unstructured, and 3) unverifiable. It's distributed: transcripts live in Granola. Documents in Notion. Customer Data in Hubspot. ERP. Emails. Slack messages. Random spreadsheets. SOP docs. Etc. Etc. Building an ingestion engine that connects to all of your disparate data sources and auto-updates based on the shelf-life of the data is the first, and frankly, easiest step of the process. Next, it's unstructured: let's say I want to create a proposal for a potential client. To nail the proposal, I want it to pull important information from a variety of sources. The specific asks & background from our initial sales call. Previous proposals to anchor ourselves to a proven format. And completed sprint boards from Linear, so the pricing & timeline in the document is grounded in truth. Whether it's a thoughtful filesystem (a la Obsidian) or an OpenClaw-esque memory structure, the brain needs to be great at self-organizing in a thoughtful schema. This is very hard, especially if you want to build a generalizable brain that can be shaped to an array of different enterprises. And finally, most knowledge is unverifiable: writing a function, running a unit test, and seeing if the code works is easy. It works or it doesn't. Using AI to accelerate your content creation process is highly subjective. What is a good/bad idea? Is the content in your voice or not? Does it feel like slop or novel? Answering these questions are both difficult and non-verifiable. That same system described above doesn't just have to be great at organizing & forming coherent relationships, but it also has to be great at self-improving based on feedback from the user. Memory systems (like those introduced by OpenClaw) are great to a point, but as you scale the corpus of data within your company's brain, things like compaction and cleaning become wildly important to avoid the needle in the haystack problem. Someone is going to figure out how to solve this problem, and when they do, not only will they make a shit ton of money, but they'll be robinhood for knowledge workers, enabling non-engineers to enjoy the sort of leverage that only technical folks have felt for the last few years.

156

913

72

2K

206K

Kevin Gu

@kevingu

about 2 months ago

@leonardtang_ Cool work, scratchpad/analysis markdowns are nice addition for statefulness for long running optimizations

1

0

189

Kevin Gu

@kevingu

about 2 months ago

@spencermateega congrats!

0

3

0

402

Kevin Gu

@kevingu

about 2 months ago

technical blog post coming soon

0

11

1

0

618

Kevin Gu

@kevingu

about 2 months ago

introducing Dex, the first agent system with full operational context and a self-updating knowledge base karpathy's llm knowledge base on steroids Dex ingests raw events from every app in your workspace. every slack message, email, meeting, browser session, task update compounds into one living knowledge base. background agents continuously monitor and enhance it while you sleep so your agents get smarter every day try now at joindex [dot] com

Regina Lin

@reggitales

about 2 months ago

Introducing Dex: the self-driving workspace for operators. Dex is the first agent system with full operational context and a self-updating knowledge base. Every datapoint from your workspace is ingested, synced, and structured into compounding context for agents to take action. Comment "DEX" or tag @dexbythirdlayer access. First 1,000 sign-ups get 7 days free. After, join our rolling waitlist. Sign up at joindex [dot] com for a fun surprise. How dex works (threads)

93

279

27

309

66K

8

110

6

133

20K

Kevin Gu

@kevingu

about 2 months ago

@reggitales the context system we've all been waiting for

1

2

0

320

kevingu retweeted

Regina Lin

@reggitales

about 2 months ago

Introducing Dex: the self-driving workspace for operators. Dex is the first agent system with full operational context and a self-updating knowledge base. Every datapoint from your workspace is ingested, synced, and structured into compounding context for agents to take action. Comment "DEX" or tag @dexbythirdlayer access. First 1,000 sign-ups get 7 days free. After, join our rolling waitlist. Sign up at joindex [dot] com for a fun surprise. How dex works (threads)

93

279

27

309

66K

Kevin Gu

@kevingu

about 2 months ago

@reggitales DEX

1

2

0

264

kevingu retweeted

Ejaaz

@cryptopunk7213

2 months ago

whoa this is actually fucking sick, a self-improving ai you can use yourself right now (for any task) dude created an ai agent that autonomously upgraded itself to #1 across multiple domains in < 24 hours…. then open sourced the entire thing but here’s why it actually works: - agents fucking suck, not because of the model, because of their harness (tools, system prompts etc) - Auto agent creates a Meta agent that tweaks your agents harness, runs tests, improves it again - until it’s #1 at its goal - best part: you can set this up for ANY task. in this article he uses it for terminal bench (code) and spreadsheets (financial modelling) - it topped rankings for both :) - secret sauce: he used THE SAME MODEL to evaluate the agent - claude managing claude = better understanding of why it failed and how to improve it humans were the fucking bottleneck and this not only saves you a load of time, it’s just a better way to train them for domain specific tasks seriously check it out

cryptopunk7213's tweet photo. whoa this is actually fucking sick, a self-improving ai you can use yourself right now (for any task)

dude created an ai agent that autonomously upgraded itself to #1 across multiple domains in < 24 hours…. then open sourced the entire thing

but here’s why it actually works:

- agents fucking suck, not because of the model, because of their harness (tools, system prompts etc)

- Auto agent creates a Meta agent that tweaks your agents harness, runs tests, improves it again - until it’s #1 at its goal

- best part: you can set this up for ANY task. in this article he uses it for terminal bench (code) and spreadsheets (financial modelling) - it topped rankings for both :)

- secret sauce: he used THE SAME MODEL to evaluate the agent - claude managing claude = better understanding of why it failed and how to improve it

humans were the fucking bottleneck and this not only saves you a load of time, it’s just a better way to train them for domain specific tasks

seriously check it out

37

849

53

2K

230K

kevingu retweeted

Zayne Sprague ✈️ ICLR Rio

@ZayneSprague

2 months ago

Neat project + post. This line of work is making me think of harnesses as transient reasoning scaffolds. Reminds me of earlier “XoT as intermediate structure” work (SatLM, Program-of-Thought, Tree-of-Thought, etc.), but now in an agentic regime with much more room for optimization. I keep wondering about this idea of meta-optimization: how broad should we expect that optimization to get? Should a meta-agent mostly do local search over prompts/tools/hyperparameters, or should it sometimes pursue riskier, longer-horizon interventions? Wilder example, but if we pointed something like AutoResearch at "coding", should we expect it to rediscover higher-level workflows or abstractions akin to Claude Code? My guess is that “meta” agents, or even deeper recursive optimizers, will tend to favor local improvements over sweeping pipeline redesigns. Very targeted and precise changes, even when broader features might help in the long run. Measuring that “meta-scope”, where do agents spend their time optimizing these harnesses, seems worth studying.

2

22

3

17

5K

kevingu retweeted

Kevin Gu

@kevingu

2 months ago

repo available here: https://t.co/I7ajRbatg6

13

137

11

208

32K

kevingu retweeted

0xMarioNawfal

@RoundtableSpace

2 months ago

Send this article to your agent and thank me later https://t.co/mHc2hDCYEK

46

3K

252

8K

846K

kevingu retweeted

tim ganiev

@postimortem

2 months ago

imo the meta-optimization + emergent features are the future of agent engineering. you don't know which concepts the model actually operates on (reminds me of polysemantic neurons), so it ends up improving its harness in ways you'd never hand-engineer pretty cool huh

0

9

2

6

2K

kevingu retweeted

Regina Lin

@reggitales

2 months ago

point AutoAgent at a task domain with evals. 24 hours later it has domain-specific tooling, verification loops, and orchestration logic. all discovered autonomously.

3

16

2

5

3K

Kevin Gu

@kevingu

Last Seen Users on Sotwe

Trends for you

Most Popular Users