Harry Uglow @harry_uglow - Twitter Profile

Pinned Tweet

23 days ago

Very grateful to have been named in the Forbes 30 Under 30 Europe AI list!

It was a surprise seeing this come in right as I was stepping back from Dex, but I'll admit I was very happy to see my achievements marked - building a great team and the best voice AI experience in recruitment, and raising $8.4m in funding.

Really though, it was a great reminder of just how early in my career I am. So stay tuned, the best is yet to come.

And yes, I promise this is the only time I'll post about it!

1

14

0

1K

Harry Uglow

@harry_uglow

about 4 hours ago

As someone who has been on Twitter for 15+ years, it is clear to me that the site is the best it’s ever been. The early Elon days were quite rough but I’m very glad it’s bounced back and we aren’t all on Bluesky.

Nikita Bier

@nikitabier

about 13 hours ago

Reports of our death were greatly exaggerated.

799

5K

366

254

3M

0

95

Harry Uglow

@harry_uglow

2 days ago

This is what I was talking about with cost savings coming from open source models btw

Flo Crivello

@Altimor

2 days ago

Pulled the trigger today and switched 100% of Lindy traffic to DeepSeek v4, churning from Anthropic models. Saves us millions of $ and we're actually seeing an *increase* in performance on many core use cases. Transformative for the business.

162

2K

146

1K

837K

0

49

Harry Uglow

@harry_uglow

2 days ago

Lots of excitement around Factory's model router and its 20% cost saving, and rightly so.

Many are saying this is just the start. After digging into the numbers, I think it's close to the ceiling. Here's why 👇

Routing only saves money on work a cheaper model can handle without dropping the ball. Factory's router holds ~99% of Opus 4.7's pass rate while cutting cost by 20%. They also published a Pareto curve of other experiments, showing that when they pushed harder performance suffered. Getting down to ~56% of Opus 4.7 cost dragged the pass rate to 81%. In their research, 20% was the elbow of the curve, i.e. about the most you can save before quality starts to go.

It's also important not to misread this cost saving as "only 20% of tasks could be handed to smaller models". The real share was likely far higher. Firstly, smaller models aren't free - Claude Sonnet is only ~50% cheaper. But most importantly, the hardest tasks are often the long, token-hungry, multi-step ones. So a handful of hard sessions still eat the lion's share of the bill, even if the majority of tasks get routed.
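To make that arithmetic concrete, here is a minimal sketch in Python. The workload figures are hypothetical and chosen purely for illustration (they are not Factory's data), cost is assumed to scale linearly with tokens, and the 0.5 price ratio is the "~50% cheaper" figure above. The names `WORKLOAD`, `CHEAP_PRICE_RATIO` and `routing_summary` are made up for this example.

```python
# Hypothetical workload, chosen only to illustrate the argument above.
# Each entry: (label, number of tasks, tokens per task, routed to cheaper model?)
WORKLOAD = [
    ("easy", 80, 20_000, True),
    ("hard", 20, 120_000, False),
]

# Assumed price of the cheaper model relative to the frontier model
# (the "~50% cheaper" figure from the post).
CHEAP_PRICE_RATIO = 0.5


def routing_summary(workload, cheap_price_ratio):
    """Return (share of tasks routed, share of baseline cost routed, bill saving)."""
    total_tasks = sum(n for _, n, _, _ in workload)
    total_tokens = sum(n * t for _, n, t, _ in workload)
    routed_tasks = sum(n for _, n, _, routed in workload if routed)
    routed_tokens = sum(n * t for _, n, t, routed in workload if routed)

    task_share = routed_tasks / total_tasks
    # Assumption: cost is proportional to tokens, so token share equals cost share.
    cost_share = routed_tokens / total_tokens
    # Only the routed slice of the bill gets cheaper, and only by the price gap.
    saving = cost_share * (1.0 - cheap_price_ratio)
    return task_share, cost_share, saving


task_share, cost_share, saving = routing_summary(WORKLOAD, CHEAP_PRICE_RATIO)
print(f"tasks routed: {task_share:.0%}")
print(f"baseline cost routed: {cost_share:.0%}")
print(f"bill saving: {saving:.0%}")
```

Under these assumed numbers, 80% of tasks are routed, but they account for only 40% of the baseline cost, so the bill falls by 20%.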

Furthermore, any benchmark that Opus 4.7 scores 99% on is forgiving. The tasks we throw at AI in reality are often harder. If you're pushing AI to its limits you'll naturally need to send a higher share of tasks to the smartest models. Hence why I think Factory's numbers form something of a ceiling for cost saving. At least for now...

So why does this matter?

Plenty of startups are now running in-house agents and watching usage spiral. The results are awe-inspiring, but cost is a creeping concern. I've seen teams attempt their own model routing, and if you are, Factory's research should give you pause for thought. What it shows is that you're unlikely to beat ~20% cost reduction. Worse, if you think you have, you've probably traded away performance without realising it.

For Factory's enterprise clients, 20% off a vast bill is real money. For a startup building it yourself, if you ask me, the juice isn't worth the squeeze.

My advice: worry about cost far less than you're tempted to. Put that energy into the product and anything that helps you ship faster. The AI landscape is going to keep shifting and many cost reductions are going to come for free.

Where I'd bet the genuinely dramatic cost reductions will come from (most to least likely):
→ Hardware acceleration letting the frontier labs cut prices
→ Better open-source models and a shift toward local compute
→ Specialised models giving routers cheaper options with best-in-class performance (see Harvey's announcement from the last 24 hours!)

I'm watching the first two very closely this year, and will keep sharing what I find

0

1

0

88

2 days ago

@runaway_vol Bullshit, my shoulders have more room for chips

0

61

Harry Uglow

@harry_uglow

3 days ago

How dare this airline charge a fraction of the price of its competitors and not give me the same level of service! Scandal

0

27

Harry Uglow

@harry_uglow

3 days ago

Lot of ground covered today!

📍 Regent’s Park - Bike ride (failed)
Southwark - Trip to bike shop (unplanned)
Home - Pitch call
South Ken - Meeting
Marylebone - Coffee
Fitzrovia - Coffee, then work + 2 more calls from a friendly VC office

Now off to Poland 🇵🇱 Back next week 🫡

0

3

0

71

Harry Uglow

@harry_uglow

3 days ago

Microsoft announces 7 frontier models… and is down 5% on the news 🤷‍♂️

Mustafa Suleyman

@mustafasuleyman

4 days ago

Super excited to announce seven new world-class MAI models today. They represent what we consider a new era in AI designed to keep you in control and on the frontier.
First is our text foundation model, MAI-Thinking-1, exceptionally strong on reasoning and SWE tasks.
- It’s a 35B active parameter MoE with a 256K context window. Independent human raters on Surge prefer it for overall quality in blind side-by-sides versus Sonnet 4.6, and it’s achieved 97% on AIME 2025, the key measure of its general-purpose reasoning abilities.
- It's at 53% on SWE Bench Pro, placing it right alongside Opus 4.6 on one of the toughest coding benchmarks.
- And since we co-designed our models with our own silicon, MAI-Thinking-1 is optimized on our MAIA 200 chip. Benchmarking head-to-head against the GB200, we see 30% better performance per dollar as well as a 1.4x performance-per-watt gain when running our MAI models on the MAIA 200 end-to-end.

Next is MAI-Image-2.5 and its Flash variant. Two super strong models now at #2 on the leaderboards, surpassing the score of Nano Banana 2 on image editing.

Last for now is MAI-Code-1-Flash, our new inference efficient coding model, especially tuned for VS Code and GitHub Copilot CLI.
- Code-1-Flash achieves 51% on SWE Bench Pro, despite having just 5B parameters, putting it closer to Haiku in size but cheaper in cost.

All of this is the foundation for Microsoft Frontier Tuning. It lets you customize our models to create custom, company-specific agents that only you control. You can make our model, your model. Your data. Your agents. Your moat.

Early adopters are already seeing a difference. When we tuned our models for McKinsey’s tasks, MAI delivered the highest win rate, outperforming GPT-5.5 on quality, while being 10x lower on cost.

Also really excited to be collaborating with the amazing team at Mayo Clinic to jointly train a new frontier AI model for healthcare.

Our announcements today mark another milestone on the road to humanist superintelligence. You can learn more, including about our other new models, in our latest blog: https://t.co/v65eop5Ixq

188

4K

539

1K

1M

0

77

Harry Uglow

@harry_uglow

5 days ago

@isnit0 Joined the waitlist 🫡

0

136

Harry Uglow

@harry_uglow

5 days ago

VC pass notes be like
> it’s not me, it’s you.
> …keep in touch! Will be cheering from the sidelines!

0

22

Harry Uglow

@harry_uglow

5 days ago

I think Claude has had competing products memory-holed from it. Go and directly ask it to compare the difference between OpenClaw and Hermes and watch as it has no idea what they are until it uses the web search tool.

0

48

Harry Uglow

@harry_uglow

5 days ago

@blc_16 @AnthropicAI Hahaha well played bro

0

1

0

88

Harry Uglow

@harry_uglow

7 days ago

@j4ppleby Better yet, why don’t we just go through all possible questions anyone could ask and hard-code all the answers? Who needs AI? Pretty sure the world’s largest switch statement would use less water too /s

0

99

Harry Uglow

@harry_uglow

8 days ago

@contextconor @garrytan 100%

0

12

Harry Uglow

@harry_uglow

8 days ago

@garrytan @contextconor as you are the only person who saw this I want you to know this has lived rent free in my head since you posted it last year. Accurate af

1

0

164

Harry Uglow

@harry_uglow

8 days ago

@garrytan This is the tizz-rizz founder matrix from first principles https://t.co/Gx9NrLwQbd

conor brennan-burke

@contextconor

9 months ago

@jia_seed Tizz/Rizz explainer (filmed on uber ride)

23

397

31

216

86K

1

3

0

409

Harry Uglow

@harry_uglow

9 days ago

Whenever anyone messages me saying “good post” it is always about something that’s had far less engagement than my lower effort posts

0

31

Harry Uglow

@harry_uglow

10 days ago

New office view - is this Southwarkmaxxing?

1

3

0

226

Harry Uglow

@harry_uglow

10 days ago

Congratulations to everyone who applied this batch, and best of luck if you're still interviewing 🤞 follow for more, and do reach out if you're applying or raising your first round!

1

0

1

118

Harry Uglow

@harry_uglow

10 days ago

I met with 50+ founders and reviewed dozens more decks for this cohort of a16z @speedrun. Here are the 3 pitching mistakes I saw founders make most often 🧵

1

0

3

324

Harry Uglow

@harry_uglow

10 days ago

3. Failing to connect traction to the story

Storytelling is one of the most important skills any founder can have. Whatever you're building, the ability to convince people it's not just a smart idea but an important one really matters. Sales, fundraising, hiring - if you can make people care early, everything else gets easier.

Yet I've seen founders with otherwise excellent pitches let their traction slide sit as cold pipeline or revenue numbers. Don't forget the WHY. Why did your first pilots convert? Why are your users coming back week on week? Why is your pipeline full?

Add customer quotes if you can, or metrics that show product success beyond revenue. This lets you pitch beyond the numbers - showing not only that you have early traction, but that your users are happy and there's more coming.

1

0

98
