Sahil @Sahilb07 - Twitter Profile

Pinned Tweet

over 1 year ago

@deepseek_ai R1's open-source launch has sent shockwaves through tech markets, thanks to costs that are far lower than those of giants like @OpenAI or @Google. Yet there’s a bigger picture that few are talking about 🧵

1

0

38

Sahil @Sahilb07

over 1 year ago

In the end, success won’t merely hinge on who has the biggest model but on who best leverages these new approaches to deliver real-world impact. The AI landscape is evolving rapidly—and it’s thrilling to see how these trends will shape the future.

0

35

Sahil @Sahilb07

over 1 year ago

This suggests that companies might pivot to offering specialized products built around 'smaller' models, targeting specific industries. By integrating AI into existing workflows, these solutions might deliver greater value than simply providing access to large, generic models.

1

0

22

Sahilb07 retweeted

Gavin Baker

@GavinSBaker

over 1 year ago

1) DeepSeek r1 is real with important nuances. Most important is the fact that r1 is so much cheaper and more efficient to inference than o1, not the $6m training figure. r1 costs 93% less to *use* than o1 per each API, can be run locally on a high-end workstation and does not seem to have hit any rate limits, which is wild. Simple math is that every 1B active parameters requires 1 GB of RAM in FP8, so r1 requires 37 GB of RAM. Batching massively lowers costs and more compute increases tokens/second, so there are still advantages to inference in the cloud. Would also note that there are true geopolitical dynamics at play here and I don’t think it is a coincidence that this came out right after “Stargate.” RIP, $500 billion - we hardly even knew you.
Real:
1) It is/was the #1 download in the relevant App Store category. Obviously ahead of ChatGPT; something neither Gemini nor Claude was able to accomplish.
2) It is comparable to o1 from a quality perspective although lags o3.
3) There were real algorithmic breakthroughs that led to it being dramatically more efficient both to train and inference. Training in FP8, MLA and multi-token prediction are significant.
4) It is easy to verify that the r1 training run only cost $6m. While this is literally true, it is also *deeply* misleading.
5) Even their hardware architecture is novel and I will note that they use PCI-Express for scale up.
Nuance:
1) The $6m does not include “costs associated with prior research and ablation experiments on architectures, algorithms and data” per the technical paper. “Other than that Mrs. Lincoln, how was the play?” This means that it is possible to train an r1 quality model with a $6m run *if* a lab has already spent hundreds of millions of dollars on prior research and has access to much larger clusters. DeepSeek obviously has way more than 2048 H800s; one of their earlier papers referenced a cluster of 10k A100s. An equivalently smart team can’t just spin up a 2000 GPU cluster and train r1 from scratch with $6m. Roughly 20% of Nvidia’s revenue goes through Singapore. 20% of Nvidia’s GPUs are probably not in Singapore despite their best efforts.
2) There was a lot of distillation - i.e. it is unlikely they could have trained this without unhindered access to GPT-4o and o1. As @altcap pointed out to me yesterday, kinda funny to restrict access to leading edge GPUs and not do anything about China’s ability to distill leading edge American models - obviously defeats the purpose of the export restrictions. Why buy the cow when you can get the milk for free?
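The "simple math" in the tweet above can be checked with a short sketch. This is only a back-of-the-envelope illustration, not anything from the tweet's author: the function name and the FP16 comparison are assumptions added here, and the figure of 37 billion active parameters is the one implied by the tweet's own 37 GB result. It counts weights only and ignores any other memory use, such as activations or caches.

```python
# Rough weight-memory estimate following the rule of thumb quoted in the tweet:
# 1 billion active parameters in FP8 needs about 1 GB of RAM.
# weight_memory_gb is a hypothetical helper written for this illustration.

BYTES_PER_PARAM_FP8 = 1   # FP8 stores each weight in 8 bits = 1 byte
BYTES_PER_PARAM_FP16 = 2  # FP16 comparison is an assumption, not from the tweet


def weight_memory_gb(active_params_billions: float,
                     bytes_per_param: float = BYTES_PER_PARAM_FP8) -> float:
    """Approximate RAM for the active weights alone, in GB (1 GB = 1e9 bytes).

    One billion parameters at one byte each is 1e9 bytes, i.e. 1 GB, so the
    result is simply billions of parameters times bytes per parameter.
    """
    return active_params_billions * bytes_per_param


if __name__ == "__main__":
    # 37B active parameters is the figure implied by the tweet's 37 GB claim.
    print(weight_memory_gb(37))                        # FP8 estimate
    print(weight_memory_gb(37, BYTES_PER_PARAM_FP16))  # same weights in FP16
```

Under these assumptions the FP8 estimate reproduces the tweet's 37 GB, and the same weights at two bytes each would need twice that.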

224

9K

1K

7K

3M

Sahil @Sahilb07

over 1 year ago

Partnership? Acquisition? Or homegrown innovation? OpenAI needs a bold strategy to close the gap. What's their next move?

0

14

Sahil @Sahilb07

over 1 year ago

Will 2025 be make-or-break for OpenAI? As Google and Apple perfect AI hardware integration, OpenAI faces a critical choice: innovate beyond software or risk losing its edge. Is ChatGPT’s Hardware on the Horizon: https://t.co/VVsjtu0LgS

1

0

24

Sahil @Sahilb07

over 1 year ago

Imagine ChatGPT-powered earbuds, smart home hubs that understand context across rooms, or even AI assistants for specific professions.

1

0

19

Sahilb07 retweeted

Phil Bak

@philbak1

almost 3 years ago

I ask this with all sincerity: What is the game plan here?

3K

14K

3K

1K

3M

Sahilb07 retweeted

Chamath Palihapitiya

@chamath

almost 3 years ago

Buffett, Active Investing and Index Funds...
In 2008, Warren Buffett issued a challenge to the hedge fund industry, and a million-dollar bet was made. Buffett's position was that, including fees, costs and expenses, an S&P 500 index fund would outperform a hand-picked portfolio of hedge funds over 10 years. The bet pitted two investing philosophies against each other: passive and active investing. Buffett picked the S&P 500 Index. The hedge funders picked their actively managed funds. At the end of ten years, they looked back and Buffett won.
A recent article in Bloomberg reinforces this point. Only one equity mutual fund, the $7.1B Baron Partners Fund, has outperformed the Invesco QQQ ETF (Nasdaq ETF) over the past 5, 10 and 15 years. Said differently, passively investing in the Nasdaq ETF exposed you to the gains of the best companies of this era without you having to do any work or diligence. All the best companies were part of the ETF. When one of those companies lagged, its weight in the index fell or it was dropped altogether. And when a company did well, its weight in the index would increase, or it was added if it wasn't part of the index beforehand.
Passive investing allowed the ETF manager to define simple rules and then do all the work for you. The companies it picked, because of its rigid rules, turned out to be far superior to those picked by active investors. So much so that only ONE fund (out of thousands) managed to beat the ETF.
The lesson is that most people will find this the superior method for investing in the stock market. Allocate some money (say each month) to a very low cost ETF and then let the ETF manager, natural selection and compounding do the rest.

314

3K

442

1K

898K

Sahilb07 retweeted

Nathan Baugh

@nathanbaugh27

almost 3 years ago

Storytelling is a game of psychology. 10 tricks rooted in psychology to make you a better storyteller: (a visual thread) https://t.co/Sa3N0kuqmC

63

4K

728

8K

1M

Sahilb07 retweeted

Rez Karim

@rezkhere

about 3 years ago

No more hours of video editing. ChatGPT can now create a video commercial with the script, voice-over, music and everything with just two prompts. I will show you how in 4 easy steps 👇

381

19K

4K

31K

5M
