Omead Pooladzandi @HessianFree - Twitter Profile

Pinned Tweet

2 months ago

your spotify cache is bigger than our largest AI model.

Bonsai: 1-bit weights. 1.7B to 8B params. 14x compression vs bf16. 8x faster on edge. 256 MB to 1.2GB. Based on Qwen 3.

we just came out of stealth. intelligence belongs at the edge and we're going to put it there.

Apache 2.0.

we compressed intelligence. more coming. @PrismML

PrismML @PrismML

2 months ago

Today, we are emerging from stealth and launching PrismML, an AI lab with Caltech origins that is centered on building the most concentrated form of intelligence.

At PrismML, we believe that the next major leaps in AI will be driven by order-of-magnitude improvements in intelligence density, not just sheer parameter count.

Our first proof point is the 1-bit Bonsai 8B, a 1-bit weight model that fits into 1.15 GBs of memory and delivers over 10x the intelligence density of its full-precision counterparts. It is 14x smaller, 8x faster, and 5x more energy efficient on edge hardware while remaining competitive with other models in its parameter-class.
We are open-sourcing the model under Apache 2.0 license, along with Bonsai 4B and 1.7B models.

When advanced models become small, fast, and efficient enough to run locally, the design space for AI changes immediately. We believe in a future of on-device agents, real-time robotics, offline intelligence and entirely new products that were previously impossible.

We are excited to share our vision with you and keep working in the future to push the frontier of intelligence to the edge.

177

4K

591

3K

1M

88

2K

161

1K

205K

HessianFree retweeted

Larry Dial

@classiclarryd

about 5 hours ago

Building momentum at Marin! Upgrading from Dense -> 129B parameter MoEs -> architecture improvements -> optimizer improvements gives our pretraining recipe an estimated 6x cumulative learning speedup, accounting for MFU. Includes community contributions. https://t.co/5dPB9uBiSp

1

95

12

44

14K

Omead Pooladzandi

@HessianFree

about 3 hours ago

@norxornor @stochasticchasm @wen_kaiyue @dlwh Relu^2.3 is goated

0

3

0

42

Omead Pooladzandi

@HessianFree

about 4 hours ago

@varunneal @CevherLIONS the best way isn't clear yet. i usually just say well grads/momentum and hvps are going into a system that should learn to whiten both. then we just precond the momentum.

1

0

28

Omead Pooladzandi

@HessianFree

about 8 hours ago

@SeunghyunSEO7 @torchcompiled initializing zero is equivalent to initializing with orthogonal for muon

0

28

Omead Pooladzandi

@HessianFree

about 8 hours ago

@wen_kaiyue We’ve clearly over biased ourselves towards Adam in the wild and in the modded nano GPT speed run we’ve over biased ourselves towards muon.

0

2

0

130

Omead Pooladzandi

@HessianFree

about 22 hours ago

@KyleLiang5 congrats man!

0

1

0

441

HessianFree retweeted

Kaizhao Liang

@KyleLiang5

about 23 hours ago

it's a pretty good model 😮‍💨 and it's finally out https://t.co/ypFJf5A5hw

4

67

7

13

4K

HessianFree retweeted

Unsloth AI

@UnslothAI

1 day ago

You can now train 120B+ parameter models locally on a laptop! 🔥

We collabed with NVIDIA and Microsoft to bring LLM training on the 128GB unified memory RTX Spark laptop! https://t.co/mKbbIRWh9c

55

1K

105

177

87K

Omead Pooladzandi

@HessianFree

3 days ago

@MainzOnX Xilin Li does the bulk of his work in complex128 for representing speech. That's why PSGD can optimize directly in the complex domain.

1

4

0

272

HessianFree retweeted

Michael Dell 🇺🇸

@MichaelDell

4 days ago

We have the first @DellTech + @nvidia Vera Rubin NVL72 @CoreWeave. Here we go! 🚀

152

3K

397

370

1M

Omead Pooladzandi

@HessianFree

4 days ago

novograd is still a good optimizer

0

2

0

336

Omead Pooladzandi

@HessianFree

4 days ago

not bad at all

1

8

0

788

Omead Pooladzandi

@HessianFree

4 days ago

@norxornor I bias more towards sgd at the start and then allow for more dynamic at the end. If I know I'm going to have a lot of noise I will also adjust eps.

0

2

0

27

Omead Pooladzandi

@HessianFree

4 days ago

@norxornor I usually schedule eps

1

0

69

Omead Pooladzandi

@HessianFree

4 days ago

@SemiAnalysis_ @eraznafre skill issue

0

1

0

1

291

Omead Pooladzandi

@HessianFree

4 days ago

@agarwl_ How is this from 2024

0

1

0

37

HessianFree retweeted

rohan anil

@_arohan_

6 days ago

Starting with Neolab tradition of writing our first blogpost. It’s a very good blogpost sir.

6

162

10

76

20K

HessianFree retweeted

Evan Walters

@evaninwords

6 days ago

Bug fix! Bonsai Image generations on local MacBook MLX will be even better quality. Turns out how you pad text matters 😆 try it out! https://t.co/vJFTG18oNf

1

26

6

9

5K