William Wang @willrwang - Twitter Profile

William Wang @willrwang

4 days ago

Hey @SlackHQ @andrewjmacd @rseaman2 could we get capabilities to merge channels too please?

tobi lutke

@tobi

about 1 month ago

Slack fixed copy & paste!

11

240

3

9

47K

0

1

0

217

William Wang @willrwang

10 days ago

Richard was my first lead when I interned at Shopify and one of the earliest engineering leaders I looked up to. Thrilled to be working together again.

Richard Wilson

@senjaiRW

13 days ago

So I work at @Opendoor now. I've just finished my first real week there and I am super bullish on this company. I was convinced to stop building my own thing to join, was hesitant at first but definitely the right call. The opportunity is insane. Will be my third tour of duty with @nejatian and as an engineer I've pretty much done the best work of my career working with him.

43

803

81

31

107K

5

86

7

2

7K

willrwang retweeted

Satish Kanwar

@skanwar

12 days ago

@sciohn_fhanne One doesn't work without the other. Vibes set momentum, momentum creates companies, companies build economies. Being cynical is a choice. Doing something about it is a choice. I'd rather we go hard for the city, succeed or fail, than point fingers.

15

223

16

13

27K

willrwang retweeted

Simon Eskildsen

@Sirupsen

30 days ago

be hardcore & whimsical

4

84

12

5

6K

willrwang retweeted

Araa Doraisamy

@araa3185

29 days ago

Working alongside people who care this much is the part that doesn't show up in the chart. You feel it in every review, every iteration, every "let's make it better."

0

20

4

2

2K

willrwang retweeted

Yang Guo

@yang_guo

about 1 month ago

14

232

42

13

149K

William Wang @willrwang

about 1 month ago

What Kaz said, but for the ML and data crowd. We're working on housing's hardest data problems IRL in Miami. DM me.

Kaz Nejatian

@nejatian

about 1 month ago

If you are graduating university and are about to join a consulting firm. Don't do that. Do this instead. Just send me a pic of your offer from one of the top 5 and we'll get you an offer to join Opendoor instead.

80

1K

93

224

773K

6

133

19

12

36K

William Wang @willrwang

about 1 month ago

P2P → B2C → B2B → A2A. The future of marketplaces will be your agent negotiating with my agent.

Anthropic

@AnthropicAI

about 1 month ago

New Anthropic research: Project Deal. We created a marketplace for employees in our San Francisco office, with one big twist. We tasked Claude with buying, selling and negotiating on our colleagues’ behalf.

468

8K

724

4K

3M

0

3

0

1

2K

willrwang retweeted

Kaz Nejatian

@nejatian

about 2 months ago

In its early days, @Opendoor had one of the most cracked engineering teams in the Bay Area. Many of them left to create some of the best experiences currently available on the internet. If you are one of them and want to come back, please DM me. We're back on mission.

44

852

75

21

57K

willrwang retweeted

Jack Lindsey @Jack_W_Lindsey

2 months ago

Before limited-releasing Claude Mythos Preview, we investigated its internal mechanisms with interpretability techniques. We found it exhibited notably sophisticated (and often unspoken) strategic thinking and situational awareness, at times in service of unwanted actions. (1/14)

Jack_W_Lindsey's tweet photo. https://t.co/vhng7PXqcz

155

7K

769

4K

978K

willrwang retweeted

Alfred Lin

@Alfred_Lin

2 months ago

A CEO from one of our portfolio companies shared this with their team. I’m re-sharing it with their permission, because it resonated and reflects what all founders and CEOs should be communicating.

--

We are living through a period of compounding change. And in moments like this, the biggest risk is no longer making the wrong decision. It is moving too slowly while the world moves around you.

There are two paths.

We can play defense:

- Protect what we have
- Optimize what works
- Wait for clarity

It feels safe. It isn’t.

Or we can play offense:

- Learn faster than the environment changes
- Use new tools to solve old problems in better ways
- And create entirely new strategies and businesses

That’s where the opportunity is.

Challenge yourself to do things faster and better than you have ever attempted. Stay uncomfortable. Stay on the front foot.

110

3K

430

3K

899K

willrwang retweeted

Andrej Karpathy

@karpathy

3 months ago

Three days ago I left autoresearch tuning nanochat for ~2 days on depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement), this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project.

This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (i forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism.
https://t.co/WAz8aIztKT

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges.

And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.

962

20K

2K

11K

4M

William Wang @willrwang

4 months ago

@freemanjiangg @tobi Nice

0

68

William Wang @willrwang

almost 2 years ago

Just wrapped up our fourth iteration of Builder Sundays at the Toronto @Shopify port, where builders gave amazing presentations on their projects, companies, and side gigs. For all builders and early stage entrepreneurs in the 6ix, join us for the next session from 12-5pm Aug. 11th!

willrwang's tweet photo. https://t.co/k6KRKoyYhH

10

84

10

19

21K

William Wang @willrwang

almost 2 years ago

@ankerbachryhl @MadsLunau @getsamai LFG brother 🚀🎸⭐️

0

1

0

54

William Wang @willrwang

almost 2 years ago

Awesome to be a part of this! And nice to finally catch up in person, @agarcher -- I still remember those hack days projects from a few years ago.

Adam Archer @agarcher

almost 2 years ago

First Builder Sundays event has been sick. Exactly the heads down cooking vibe I was hoping to see. 🔥

10

95

5

15

43K

1

9

0

876

willrwang retweeted

Lulu Cheng Meservey

@lulumeservey

almost 2 years ago

Yesterday I saw the wildest CEO talk maybe ever, and finally learned what Tobi was building that Saturday in May. So Shopify kicked off their 2024 Summit, and Tobi was the first speaker. He gave a short talk; it was just ok. Typical CEO speech, not bad, but kind of meh (he rated it 7/10). Then came the record scratch. Apparently Tobi had struggled to write his talk, so he had instead BUILT A TEAM OF AI EMPLOYEES TO DO IT FOR HIM. They were AI agents with specialized skills, each with a specific role, working both individually and in collaboration, in a virtual office where they’d periodically convene in the conference room to brainstorm and task each other. There was even a virtual water cooler for spontaneous encounters (which would happen because the “employees” were programmed to feel thirsty and get water). That team had autonomously conceived, drafted, researched, written, edited, and designed the slides for Tobi’s talk. So then the second half of the talk was Tobi walking through his thinking, each step of building the agents, and what he learned while tinkering. The language wasn’t flowery, the delivery wasn’t rehearsed, the slides weren’t flashy. And yet the energy in the room was absolutely electric, the air was crackling. Same vibe as the Top Gun 2 scene where Tom Cruise flies the route himself. Obviously a great talk can be super motivating for employees, but it’s another level when they’re reminded that the leader of the whole enterprise is a master of their craft, someone who’s still a practitioner and not just a manager. Nothing like it.

52

2K

115

1K

544K

William Wang @willrwang

about 2 years ago

A Shopify friend gave me his bib for the Tamarack Ottawa marathon about a month ago and I didn't have much time to train. Aimed for a 3:45 but felt really good so decided to push it. Fantastic weather and great race conditions. PB’d with 3:30