Ben @benc_yi - Twitter Profile

Ben

@benc_yi

2 days ago

@TechEmails same Emil Michael who led deals for Uber. crazy how the tides have turned

0

1

0

253

Ben

@benc_yi

11 days ago

@craigweiss agreed but a lot of laggard industries’ marketing is unfortunately primarily on LinkedIn and nothing else

0

8

0

578

Ben

@benc_yi

13 days ago

@Kochara13 @SpaceX future

0

38

Ben

@benc_yi

13 days ago

@SakanaAILabs isnt this just

OpenRouter

@OpenRouter

22 days ago

Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇 https://t.co/OTUQAdTQjU

725

15K

2K

13K

6M

1

0

123

GRAHAM BRIGHT

@grahambrightt

‘an album in my darkness’

16 days ago

a reminder to my generation: this life is fucking electric, don’t let any doomer shit bring you down. we will live in the most interesting timeline in history and work on some of the hardest, most rewarding problems of all time. https://t.co/PE0uACOlrc

0

51

Ben

@benc_yi

16 days ago

@DavidSHolz @midjourney bullish on the yet u and ur company do some pretty amazing things, thank u for being good human 👍

0

22

Ben

@benc_yi

19 days ago

@coinbase ???

Coinbase 🛡️

@coinbase

19 days ago

Legal stuff: Tokenized stocks will only be available in eligible jurisdictions outside the United States; coming soon.

31

205

18

12

56K

0

35

Ben

@benc_yi

21 days ago

it will be interesting to see how a free market economy survives this shift. right now a lot of core business logic is increasingly routed through a handful of foundation models. those models don’t just process the data - they effectively own it by proxy and can train on it. that concentration feels at odds with our hopes for a free market economy. at scale i expect foundation models to keep getting commoditized and open-sourced. the real advantage will shift to the companies willing to do serious post-training and reinforcement learning on top of them - customizing the models to their actual workflows, protecting their IP, and compounding efficiency in ways generic models can’t touch. that’s the hill-climbing machine Satya was talking about. human capital keeps finding the next hills worth climbing, and the post-trained systems turn those discoveries into durable, compounding advantage. @restorefast shares these views

0

101

Ben

@benc_yi

25 days ago

@sundarpichai stochasticity

Ben

@benc_yi

25 days ago

I wrote about this paradigm shift a few months ago in my research paper "Beyond Transformers," arguing that the autoregressive paradigm is fundamentally backwards: we spend ~99% of compute on deterministic matrix math to approximate a probability distribution, then a trivial PRNG samples from it. That always sounded backwards to me - defining the distribution should be cheap, and sampling should be the native operation. Instead of generating one token at a time, diffusion models start with noise and iteratively refine entire blocks in parallel until they converge. This is closer to how physical systems actually behave: a state evolving toward low-energy configurations through an energy landscape. Their "self-correction" is a digital approximation of attractor dynamics. Their parallel refinement is a digital approximation of a system settling into an energy basin. I wrote in my paper that "diffusion-based language models, energy-based text generation, and continuous-space language models all represent steps in this direction - and may offer a smoother transition path to future stochastic hardware." I think models like this add some backing to my thesis. But the next step isn’t a bigger GPU. The core workload that models like DiffusionGemma run on H100s - iterative denoising over probability distributions - is exactly the type of probabilistic sampling work that would be native on stochastic hardware. A good example of modern stochastic hardware (still early and under active R&D) is @extropic’s Thermodynamic Sampling Units (TSUs), which use physical thermal fluctuations as the computational mechanism.
With further development, the need for heavy matrix math to approximate distributions could be reduced or eliminated, as the physics could handle sampling directly (hint: I am also bullish on TSUs because they could be great in space). I've been working on building a diffusion LLM that runs inference through thermodynamic sampling primitives with learned pairwise coupling between token positions, and on a gradient-free training algorithm that replaces backpropagation with local correlation statistics, designed for hardware where stochasticity is free. Still a work in progress lol but even today it shows promise. DiffusionGemma proves the paradigm is becoming more viable at scale - I think we'll continue to see dLLMs grow in popularity. The question now is what happens when the hardware matches the math.

0

5

0

1K

0

96

Ben

@benc_yi

about 1 month ago

@ykslyy unironically, yes

0

27

Ben

@benc_yi

about 1 month ago

@marcusyul smells like survivorship bias. you are making assumptions based on complaints from companies that are “losing”. the winners are not posting on X about it lol, they are just stacking wins and laying people off, but trust that some companies are winning

1

3

0

1K

benc_yi retweeted

MTS @MTSlive

about 1 month ago

We looked at the jobs pages of 910 early-stage startups from the top accelerator programs, analyzing who they want to hire and for how much. Explore at our latest drop: https://t.co/aDHLO22wtR

2

71

3

30

44K

benc_yi retweeted

cage 🇵🇸 @CAGEtheGEGEG

about 1 month ago

Why arent we doing this?

489

95K

5K

6K

10M

Ben

@benc_yi

about 2 months ago

The fastest way to ruin a beautiful feeling is to make it unlimited. One feeling I try to protect is being genuinely impressed. So I surround myself with people who are hungry, sharp, and operating at a level that makes me uncomfortable. But the trick is: to keep being impressed, I have to keep getting better too. Your environment raises your standards, then your standards force you to grow.

1

2

0

58

benc_yi retweeted

Paul Graham

@paulg

about 2 months ago

There is nothing more powerful than well-informed optimism. It has to be well-informed though. The "everything will be fine" type of optimism may also be somewhat useful, but it's not as useful as the "Hmm, what if we tried x?" kind.

208

3K

317

611

149K

benc_yi retweeted

Erin Price-Wright

@espricewright

about 2 months ago

If you're a naturally anxious person, I recommend pursuing a high stress career path where at least you'll be compensated for anxiety you're going to have anyways.

290

37K

3K

1M

Ben

@benc_yi

about 2 months ago

@Jason isnt the main point of contention that data center water competes directly with municipal water supply (tap), whereas almond water is from agricultural sources (raw, untreated, from underground aquifers, etc)? still a crazy chart though lol

2

10

0

1

863

benc_yi retweeted

RestoreFast @restorefast

about 2 months ago

restorefast's tweet photo. https://t.co/WBay2OTYxF

0

1

0

192
