Max Pagels

@maxpagels

Head of Technology

Finland

Joined March 2007

565 Following

286 Followers

1.7K Posts

Max Pagels @maxpagels

12 days ago

@GaryMarcus I don't fully understand why the assumption was that more experimentation thanks to AI would lead to more successes in a linear fashion. The high cost of programming is what served as a prioritisation gate and generally got rid of most of the bad ideas.

maxpagels retweeted

15 days ago

Introducing DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation https://t.co/c9AvsRKybj

What if we didn’t have to hold an entire neural network in memory to train it?

Standard neural net training optimizes all parameters jointly. As a result, the memory required during training grows linearly with the depth of the network. In our #ICLR2026 paper, we propose DiffusionBlocks, a principled framework to train networks one block at a time, drastically reducing memory requirements while matching end-to-end performance.

With DiffusionBlocks, we split the network into blocks and train them one at a time, so you only need memory for a single block. How? We explicitly assign each block a role: to move the representation a little closer to the target than the block before it did. That role turns out to be precisely what a diffusion model does, step by step. Each block only needs to optimize its own objective and can be trained independently.

We validated this across five different architectures:
• ViT
• DiT
• Masked diffusion
• Autoregressive transformers
• Recurrent-depth transformers

In each case, performance is competitive with end-to-end training while using a fraction of the memory.

This perspective also extends naturally to recurrent-depth (Looped) transformers, which apply the same network iteratively and normally require expensive backpropagation through time (BPTT). Viewed through DiffusionBlocks, we can replace those multiple iterations with a single forward pass during training.

Read our paper and code to learn more.
Paper: https://t.co/CRj96VGYQn
GitHub: https://t.co/eNW0K9Xh8E 🐟

Max Pagels @maxpagels

15 days ago

@calvinfroedge It's enabled folks with no knowledge of a subject to have an opinion on it with no work required. It speaks to human laziness, and that makes it dangerous.

Max Pagels @maxpagels

15 days ago

I've argued that human-LLM pairing isn't going to produce new breakthroughs systematically, and forcing myself to put pen to paper gave me a few ideas why. Such novelty is likely going to need a machine-led approach. Human-LLM pairing won't work. https://t.co/AdOzqabvIs

Max Pagels @maxpagels

7 months ago

@cramforce A surprising number of A/B tests that show no improvement are also simply ignored due to sunk costs and/or internal politics.

Max Pagels @maxpagels

11 months ago

@ID_AA_Carmack All I've seen use (contextual) bandits or similar constructs without credit assignment. a) inference-time performance is paramount, b) offline policy evaluation using data generated by any policy is well understood (rejection sampling, SN-IPS or doubly robust estimators).
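
For readers unfamiliar with the estimators named above, here is a minimal sketch of plain IPS and self-normalised IPS (SN-IPS) for offline policy evaluation. It assumes each logged record carries the probability with which the logging policy chose the action. The record layout, the `ips_and_snips` function and the `always_a` target policy are hypothetical names made up for illustration, not taken from any library or from the thread.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical log layout: (context, action, reward, logging_propensity).
LogRecord = Tuple[str, str, float, float]


def ips_and_snips(
    logs: Sequence[LogRecord],
    target_prob: Callable[[str, str], float],
) -> Tuple[float, float]:
    """Estimate the value of a target policy from logs of another policy.

    target_prob(context, action) is the probability that the target
    policy would choose `action` in `context`.
    """
    # Importance weight: how much more (or less) likely the target policy
    # is to take the logged action than the logging policy was.
    weights = [target_prob(ctx, act) / prop for ctx, act, _, prop in logs]
    weighted_rewards = [w * rec[2] for w, rec in zip(weights, logs)]

    # Plain IPS divides by the number of records: unbiased, high variance.
    ips = sum(weighted_rewards) / len(logs)

    # SN-IPS divides by the sum of weights: slightly biased, lower variance.
    total_weight = sum(weights)
    snips = sum(weighted_rewards) / total_weight if total_weight > 0 else 0.0
    return ips, snips


def always_a(context: str, action: str) -> float:
    """Toy deterministic target policy that always picks action 'a'."""
    return 1.0 if action == "a" else 0.0


# Toy logs from a uniform logging policy over two actions (propensity 0.5).
logs = [
    ("x", "a", 1.0, 0.5),
    ("x", "a", 0.0, 0.5),
    ("x", "a", 1.0, 0.5),
    ("x", "b", 0.0, 0.5),
]
ips_value, snips_value = ips_and_snips(logs, always_a)
print(ips_value, snips_value)
```

On these toy logs the two estimates differ because the logged sample happens to over-represent action `a`. Dividing by the sum of the weights corrects for that, which is the usual argument for SN-IPS when propensities are small or the sample is modest.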

Max Pagels @maxpagels

over 1 year ago

@timneutkens @rauchg @vercel @v0 @nextjs Tried it; out of the box it was next to no help, so I'm checking traces using your tips to understand what I'm doing wrong. https://t.co/UotleRvVyO

Max Pagels @maxpagels

over 2 years ago

@bernhardsson @modal_labs I seem to remember this being done as a MIP in Kubernetes for determining what server to run a container on to satisfy CPU/mem/availability requirements. Hard-constraint optimisation problems pop up in very interesting places!

Max Pagels @maxpagels

over 2 years ago

Another excellent book by @fugueur. Highly recommended, even if you live in Finland.

https://t.co/j9tbuhpony

Max Pagels @maxpagels

about 3 years ago

@jaukia Redis has vector support (a little-known feature). Also, Annoy has Node bindings, and Annoy indexes are mmapped from disk, so it is very memory-efficient.

Max Pagels @maxpagels

over 3 years ago

Here's my nomination for House Speaker.

https://t.co/nJ4mFAutwm

Max Pagels @maxpagels

over 3 years ago

Props once again to @analogue for making a beautiful product in the Pocket. Very refreshing in the age of shoddy build quality and endless digital subscriptions.

Max Pagels @maxpagels

over 3 years ago

You know it's a bad blizzard when Helsinki airport says there may be slight delays in departures.

Max Pagels @maxpagels

over 3 years ago

A little bird told me that version II of the original machine generated whisky recipe is now in stock. https://t.co/JYBfdEnKgT

Max Pagels @maxpagels

over 3 years ago

This is the business equivalent of threatening to beat up kids unless they hand over their lunch money.

over 3 years ago

@GregTSargent: Shockingly, Elon Musk's strategy of berating and coercing private companies into handing over advertising revenues, rather than making Twitter attractive to them, appears to be backfiring in remarkable new ways, according to the Financial Times: https://t.co/a25mAFDmDj https://t.co/HSCcITBlrQ

Max Pagels @maxpagels

over 3 years ago

Reading up on the places in London black cabs are restricted from accessing. Ridiculous. Every time I visit, I take a black cab – it's safe, reliable, and drivers spent more time studying than I did doing my master's degree. They should be allowed on every road and junction.

Max Pagels @maxpagels

over 3 years ago

This is a friendly reminder that many of the most serious early-stage companies aren't at confs like Slush. They are busy working.

Max Pagels @maxpagels

over 3 years ago

@shubham12et1062 @bernhardsson Or VW, depending on your needs. OBP is more scikit-like.

Max Pagels @maxpagels

over 3 years ago

@shubham12et1062 @bernhardsson I'd probably use a library since you get other things for free like offline evaluation and N different algorithms to play around with.
