Jeff Wang 👨‍🚀

⌨️ building @beepsdev - on-call for humans and agents together 📚 prev: built @effxhq (acqd by @figma), ex-@airbnb ❤️ reliability, scale & ai

about 2 months ago

welcome Assured Robot Intelligence (ARI) to MSL! excited to build physical with @LerrelPinto @xiaolonw and the whole team!

20

599

45

80

97K

2 months ago

@tanayj are there projections from other schools?

1

0

2K

jffwng retweeted

Xiang Yue @xiangyue96

2 months ago

Great to see that Muse Spark ranks top on Claw-Eval!

4

52

2

4

7K

2 months ago

@jack_w_rae same. used it last night to double-check my taxes

0

1

0

71

jffwng retweeted

Andrew Carr 🤸

@andrew_n_carr

2 months ago

meta muse spark crushes one of my hard benchmarks: "recommend me something good to read that I am certain to have never read before". there's lots of theory of mind involved, most models recommend the same 20 or so pieces of work. everything spark returned was novel, weird, and good. I hadn't heard of most of them and they were fun reads.

8

238

12

50

41K

jffwng retweeted

Scott Wu

@ScottWu46

2 months ago

Craziest part is we all knew each other already in high school! Along with @randomjohnnyh (Perplexity cofounder), @demi_guo_ (Pika CEO), @stevenkplus1 and Andrew (Cognition), and many others. We all grew up in different states but met thru the olympiad scene. Vividly remember this line from @alexandr_wang when we were around 19: "I hear people saying they want to find the next Paypal mafia. Why shouldn't it just be us?" Glad to see @chameleon_jeff get the recognition he deserves :)

57

3K

134

983

435K

jffwng retweeted

2 months ago

honestly I didn’t even know our model could do some of these

63

3K

113

1K

262K

jffwng retweeted

Wei-Ning Hsu

@mhnt1580

2 months ago

Building almost everything from scratch is such a fun and rare experience. It’s truly a pleasure to work with such a talented group with @jhyuxm and push hard on multimodal. This is just the first step. More to come

0

28

1

0

2K

jffwng retweeted

Garry Tan

@garrytan

2 months ago

You can make very impressive one shot arcade games with this!

15

148

5

45

29K

jffwng retweeted

2 months ago

a good writeup about Muse Spark on a few complex queries (multimodal, stock analysis, coding): https://t.co/ngYaXTZ4gW

27

592

54

174

50K

jffwng retweeted

2 months ago

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

alexandr_wang's tweet photo: https://t.co/fThDXdsxwB

747

10K

1K

3K

5M

3 months ago

@thenanyu CPG companies also traditionally follow this model

0

128

3 months ago

@thenanyu Exactly. Code used to be the bottleneck. Now that code is abundant, the bottleneck is choosing which code to keep and which to discard.

0

1

0

180

3 months ago

@kevinyien Extending this: Holds true for other functional pods too. Take comms/marketing/brand. Old lines were never rigid, but still value in each. Will continue so as they expand and deepen.

0

1

0

109

3 months ago

We’re going to see more ai;dr than tl;dr soon.

0

92

jffwng retweeted

andrew chen

@andrewchen

3 months ago

in a world of agents, the product role is going to split into two jobs:
- one that organizes humans (stakeholders, design, eng)
- one that organizes agents (prompts, evals, workflows, etc)

Both will be in pursuit of offering the right products to customers, but how you get there will dramatically change. What happens to the typical product rituals? Instead of PRDs, OKRs, standups, product reviews, we'll need the equivalent for agents. Couple wild ideas here...

instead of standups: the equivalent is that agents will report back to us based on run logs and anomaly flags. no one needs to say what they did yesterday, the system already did thousands of things. the question is where it broke, where it surprised you, and where it got better. Show us the patterns, the trends, the edge cases - particularly the ones the agents didn't fix automatically. the daily ritual becomes reviewing deltas, scanning failures, and deciding which ones matter. less reporting, more triage

instead of OKRs: we’ll need adversarial agents that continuously monitor/grade the system and detect patterns, scoring outcomes on an hourly or daily basis. Rather than setting a quarterly goal of "increase X by 5%" and revisiting slowly -- instead, management will be able to monitor success in real-time and detect trends/patterns towards overall goals

instead of PRDs: we won't need waterfall. Prototyping will rule the day, and we’ll need a living agentic loop that mediates customer feedback/ratings and what's being prioritized and built. you don’t hand it to eng, you deploy it into the agent loop. if it’s wrong, it fails visibly and you can revert. if it’s right, it produces the right output

instead of product reviews: we'll need simulation systems to examine agent behavior in different scenarios. In an agentic world where UI shifts from buttons/menus to agents automatically doing things, you'll want to examine their behavior before you deploy. You rewind decisions, fork alternate paths, and see how different prompts or constraints would have changed outcomes. the review becomes interactive. less storytelling, more counterfactuals.

The PM sits in the middle of this split. On the human side, still aligning taste, risk tolerance, and strategy across people. On the agent side, shaping the actual behavior of the system through prompts, evals, and feedback loops. one side is persuasion. The other is instrumentation. the best ones will collapse the gap, translating intent directly into systems that act on it.

the fascinating part is that the agentic loop will run 10000x faster than the human one, and of course, you can "hire" them faster. Thus the “organizing humans” half starts to feel slow and lower impact unless it directly improves the agent loop. Eventually the PM will shift towards agents and maybe ignore the human coordination altogether...

80

579

54

833

58K