Isak Westerlund @westis96 - Twitter Profile

about 12 hours ago

@JarmoFriman I bought some myself today as well. The P/E ratio is too low compared with Amazon's and Google's. It also owns 30% of OpenAI plus the IP. IPO coming.

1

0

987

Isak Westerlund

@westis96

7 days ago

@NaivaidyaY66600 @DevBySami It’s ahead in image and video understanding and some general knowledge tasks. Just sucks at agentic workflows and coding.

0

46

Isak Westerlund

@westis96

7 days ago

@apocsnap @specs People don’t realise they are not competing with normal glasses, but with Apple Vision Pro. I would wear this at work, but not the Vision Pro.

1

3

0

148

Isak Westerlund

@westis96

13 days ago

If you work in AI, move to Europe.

0

74

Isak Westerlund

@westis96

15 days ago

@Jason They should have run his face through AI though. Does not really look like someone in their twenties.

0

635

Isak Westerlund

@westis96

15 days ago

@simonw @jeremyphoward Good. This would have set a bad precedent.

0

46

westis96 retweeted

Sergey Levine

@svlevine

16 days ago

Diffusion (or flow) makes for excellent policies, but training them with RL is notoriously hard: BPTT is unstable, RL over diffusion blows up the horizon. In our new paper, we show how we can optimize flow matching actors by using "one weird trick" -- "approximate" the Jacobian of the flow denoising process with the identity matrix. 👇

8

1K

122

957

84K

Isak Westerlund

@westis96

17 days ago

@KenneyNL Really shows the impact you have had in getting people into game dev! 💪

0

4

0

549

westis96 retweeted

Zander Blasingame @zwblasingame

22 days ago

🧵1/9 Rex has been selected for an oral presentation at #ICML2026! 🎉 We make diffusion/flow model solvers exactly reversible (bijection), for both ODEs and SDEs. Last project of my Ph.D. w/ Chen Liu @ClarksonUniv → first project of my postdoc at #AITHYRA!

4

125

25

69

25K

westis96 retweeted

hardmaru

@hardmaru

30 days ago

For over a decade, we’ve accepted that end-to-end backprop is the only way to train deep networks. But holding the entire network in memory all at once is why AI training is hitting a resource wall. We found a new way to break the network into blocks and train them independently. The trick? Treating the network’s forward pass like a diffusion model denoising a signal. This reinterpretation slashes the memory needed to train deep models. In our #ICLR2026 paper (https://t.co/PK5h0mqQSo), we matched end-to-end performance across ViTs, DiTs, and LLMs. We did this while training just one isolated block at a time.

154

6K

640

4K

749K

westis96 retweeted

Alex Rives

@alexrives

30 days ago

Today we're announcing ESMFold2, an open scientific engine to power prediction, design, and discovery across protein biology. The new model delivers state of the art performance on protein interactions, especially antibodies, a critical modality for therapeutics. We have designed and validated miniprotein binders and single chain antibodies across five therapeutic targets that are important in cancer and immunology. We are seeing very high success rates, and affinities at levels consistent with therapeutic activity. We’re also releasing an atlas of 6.8 billion proteins, and 1.1 billion predicted structures. ESMFold2 is built on a state of the art language model that has been trained on billions of protein sequences. A world model of protein biology emerges through language modeling. We’ve used the techniques of mechanistic interpretability developed to understand large language models to understand the concepts ESM uses to represent proteins. The model’s representation space has a compositional organization of features across scales, levels of complexity, and abstraction, that reflects and mirrors the understanding of protein biology developed through a century of empirical science. This understanding emerges without prior knowledge, just from language modeling of protein sequences. Language models are becoming a powerful substrate to understand and program biology. The design of protein interactions is one of the most fundamental problems in biophysics, and has critical implications for the discovery of new medicines. A simple gradient based search with the model was able to discover high-affinity protein binders. I'm excited by the potential this has to accelerate basic science and the understanding of proteins. And especially for the new avenues it opens up for therapeutic design and medicine.

75

2K

452

715

605K

westis96 retweeted

Jueun Kim @jueunkim_0525

about 1 month ago

🚨New Optimizer Paper AMUSE: Anytime MUon with Stable gradient Evaluation AMUSE combines Muon with Schedule-Free-style gradient evaluation for stable anytime training without LR decay. • Stronger 124M / 720M / 1B pretraining • Strong ImageNet / ViT fine-tuning performance.

16

321

40

205

44K

Isak Westerlund

@westis96

about 1 month ago

@dwarkesh_sp David Baker

0

12

Isak Westerlund

@westis96

about 1 month ago

@MozarellaPesto From 2019. Found an old doc:

0

1

0

1

41

Isak Westerlund

@westis96

about 1 month ago

@0xCF88 LAMMPS?

1

0

62

westis96 retweeted

BlinkDL

@BlinkDL_AI

about 1 month ago

RWKV-7 G1g is here: the world's best pure RNN LLM, and a competitive LLM in general. Try https://t.co/k0AivnxCwP for bsz16 7B inference. G1h in June 🙂 p.s. const 15000+tps decoding on single 5090: https://t.co/pHP6bXITjQ

4

117

23

65

25K

westis96 retweeted

Aaron Defazio

@aaron_defazio

about 1 month ago

🚨 New Paper 🚨 ScheduleFree+: Scaling Learning-Rate-Free & Schedule-Free Learning to Large Language Models A few modifications to Schedule-Free Learning make it completely LR tuning free, and allow it to greatly outperform schedules for long duration training! https://t.co/LzjIIsOlG8

7

421

56

303

85K

westis96 retweeted

Yue Song @YueSong48287250

about 1 month ago

[1/3] Excited to share Winfree Oscillatory Neural Network (WONN): a synchronization-based neural architecture built on Winfree dynamics. WONN evolves representations through oscillatory synchronization on a toroidal phase space (S^1)^d. Project page: https://t.co/5dkz6W73jc

7

201

38

147

17K

westis96 retweeted

Sebastian Raschka

@rasbt

about 1 month ago

Gated DeltaNet has been one of my favorite "hybrid attention" newcomers in the good old transformer stack. Excited to see Gated DeltaNet-2. Adding it to my reading stack. In the meantime, I have a primer on Gated DeltaNet here: https://t.co/FoicOLtFE6

24

383

53

229

53K
