AMC @TweetAnnaMarie - Twitter Profile

Pinned Tweet

over 3 years ago

I've been thinking about art (and "not art") for a long, long time. I finally started figuring out how to put some of those thoughts into words this year. Here's a first stab at it: https://t.co/nMTTgZw9So

0

6

2

1

0

AMC @TweetAnnaMarie

11 days ago

@AnthropicAI Big shoutout to the @Zapier team that built this: Robin Salimans, Lukas Bergstrom, Jake Talgard, Michael Haarala, Daniel Shepard, and everyone else who helped make this happen.

0

4

0

107

AMC @TweetAnnaMarie

11 days ago

Zapier benchmarks every major model. Today, for Opus 4.8, Anthropic benchmarked with us. Opus 4.8 is the highest scoring model yet on AutomationBench.

https://t.co/azVi5HCMBu

1

9

2

1

978

AMC @TweetAnnaMarie

11 days ago

AutomationBench tests how models perform on the trickiest, stickiest real-world workflows we know customers are actually trying to automate. 600 tasks, 6 domains, deterministic scoring. And today our scores are featured on @AnthropicAI's official launch scorecard.

https://t.co/KwK3y75WUk

1

5

2

0

220

TweetAnnaMarie retweeted

Logan Kilpatrick

@OfficialLoganK

19 days ago

Gemini 3.5 Flash ranks #1 on Automation Bench (from Zapier), beating every other frontier model at a much lower cost

179

1K

61

123

135K

AMC @TweetAnnaMarie

about 2 months ago

Wanna measure how good models are at executing super challenging, real world workflows? We just open sourced our benchmarks. Check it

Wade Foster

@wadefoster

about 2 months ago

We built an AI benchmark that measures real work.

Today we're releasing it to everyone.

AI evals tell you whether a model can do complex reasoning or generate code. Useful, but usually not the question our customers ask. They want to know: can this model find the right CRM record, send the right follow-up, and not break anything along the way?

We went looking for a benchmark that tested that. Nobody had built one, so we did.

@Zapier’s AutomationBench drops AI models into realistic business environments across six domains (Sales, Marketing, Ops, Support, Finance, HR) and checks whether the work actually got done.

The tasks include live CRM data, inbox threads with ambiguous context, and multi-step tool chains where one wrong call cascades.

Scoring is deterministic: either the right records were updated and the right messages were sent, or they weren't.

It’s useful enough that we're releasing it publicly today. Open task set, open methodology, open leaderboard. Everyone should have access to this.

No model has cracked 10%. Yet.

Try it here: https://t.co/V7qHAGX7Ql

15

132

22

138

35K

1

4

0

240

AMC @TweetAnnaMarie

3 months ago

@lugg Thank you. Your drivers Jessiah and Ricco in SF yesterday (black pickup truck) came by to move some large items. It’s possible they picked it up by accident and still have it in their truck? I am willing to pay a high reward $$$ for the safe return of the sentimental items inside

0

56

AMC @TweetAnnaMarie

3 months ago

SOS… my husband’s backpack was stolen while Lugg was moving an item from my house yesterday. The backpack contains incredibly sentimental items including my child’s first baby teeth. It’s possible the luggers stole it? I need someone to reach out to me ASAP. Will pay $$$ @lugg

2

3

2

0

2K

AMC @TweetAnnaMarie

3 months ago

@The_Coolector … any chance you know how to buy one of these any more? My husband’s backpack was stolen yesterday and it had his favorite jacket in it… this one 😭😭 I’ve been looking on eBay and poshmark, but no luck. https://t.co/Z103umRmLn

0

127

AMC @TweetAnnaMarie

3 months ago

@lugg Do you hire criminals??? https://t.co/1s8e2JS45T

0

50

AMC @TweetAnnaMarie

3 months ago

We will pay $1,000, NO QUESTIONS ASKED for the small satchel that’s inside the backpack (little Peak Design Velcro-close satchel).

0

155

AMC @TweetAnnaMarie

4 months ago

TweetAnnaMarie's tweet photo. https://t.co/2fY3Jb3fQl

0

102

AMC @TweetAnnaMarie

4 months ago

Been a hot second, but I wrote a post this morning about a fun little framework I've been developing at the intersection of workplace productivity and personal growth: Using the ‘The Four Floors of Feeling’ to Give Feedback at Work https://t.co/IrUQzC2r4i

1

2

0

1

152

TweetAnnaMarie retweeted

Brendan Irvine-Broque

@irvinebroque

8 months ago

next Thursday in SF! Come hear from @dballona @TweetAnnaMarie and I about how the way people use MCP and tools is changing. Night and day different than 5 months ago…

https://t.co/G03Kbx4S7E

1

5

1

374

AMC @TweetAnnaMarie

8 months ago

Netflix’ biggest competitor is your sleep. OpenAI’s biggest competitor is your friends.

near

@nearcyan

8 months ago

oh also the 'many hours/day chatgpt session lengths' might seem odd now, but give it a bit. the companion part will be big here, but the music+video+hardware over the next few years should be able to complete it, things take time

3

103

0

5

10K

0

173

TweetAnnaMarie retweeted

Zapier

@zapier

8 months ago

All eyes are on OpenAI DevDay.

Agent Builder was just announced: a new way to design AI-powered workflows right inside OpenAI. But it ships with only a few native integrations, and most businesses run on hundreds of tools.

That’s where Zapier MCP comes in. It instantly connects Agent Builder to Zapier's ecosystem of 8,000+ apps and 30,000+ actions.

Imagine this: an OpenAI Agent analyzes campaign performance data and, through Zapier MCP, updates budgets in Google Ads and syncs new leads to HubSpot.

Together, OpenAI’s Agent Builder and Zapier’s automation layer unlock the next wave of AI-native operations: intelligent logic meets real-world connectivity.

What you get:
- Production-ready connectors maintained by Zapier
- Secure, auditable calls from your agent to the tools your customers already use
- Faster time-to-value and fewer integration backlogs

Try Agent Builder with Zapier MCP today: https://t.co/yHNPiQr5AQ

37

334

29

151

38K

TweetAnnaMarie retweeted

Arthur MacWaters

@ArthurMacwaters

9 months ago

my founding engineer setting me up with claude code and merge permissions

15

2K

67

186

93K

AMC @TweetAnnaMarie

12 months ago

“I guess I’m just a vibe coding maximalist” OH in SF

1

2

0

227

TweetAnnaMarie retweeted

Wade Foster

@wadefoster

about 1 year ago

Interesting milestone at Zapier: We now have more AI agents than employees 🧵

45

857

69

979

300K

AMC @TweetAnnaMarie

about 1 year ago

TIL that @Plaid sells a data product letting employers know if their employees are getting a paycheck from any other company.

1

3

0

251

AMC @TweetAnnaMarie

about 1 year ago

I had the chance to catch @rafalwilinski & @vitorbal’s eval talk and it was 🔥🔥 Love to see all the eval goodness y’all have been cookin’ turn into great content for other agent builders to learn from.

swyx

@swyx

about 1 year ago

Congrats to @aiDotEngineer 2025 Best Speakers!

MCP: @zeeg
Tiny Teams: @alxai_
LLM Recsys: @devanshtandon_
GraphRAG: @danielchalef
Fortune 500 Day 1: @hwchase17
Architects Day 1: @denyslinkov
Infra: @dylan522p
Voice: @bnicholehopkins
Product Management: @bbalfour
Agent Reliability: @itamar_mar
SWE Agents: @bcherny
Reasoning: @natolambert
Evals: @rafalwilinski @vitorbal
Retrieval+Search: @WilliamBryk
Fortune 500 Day 2: @ritakozlov
Architects Day 2: @pk_iv
Security: @renebrandel
Design Engineering: @JohnPhamous
Generative Media: @sharifshameem
Autonomy+Robotics: @nikhilabm
Online Track: @MrAhmadAwais

Overall Best Speaker: @simonw!

For each track's best speakers we actually have a photo plaque printed for each of you with your speaker photo. come collect from me this weekend if you are still in town!

thanks to ALL our keynote, breakout, expo, workshop, attendee and more speakers for generously sharing their knowledge and working on the best AIE talks we've ever had! We appreciate you and are working to get them all edited and up online ASAP.

19

387

42

323

144K

0

3

0

354
