Michael Yoo Fatemi @fatemi_michael - Twitter Profile

about 14 hours ago

@aliuahma i think the best balance is to show enough examples of an idea to motivate the abstraction; in general there are lots of abstractions that fit any particular idea but only a few that fit many of them

1

0

33

Michael Yoo Fatemi @fatemi_michael

2 days ago

i miss when youtube showed random videos on the front page when you first visited, too much personalization keeps you from finding new things

0

23

Michael Yoo Fatemi @fatemi_michael

3 days ago

@AlexiGlad what would you predict?

1

0

80

Michael Yoo Fatemi @fatemi_michael

3 days ago

@_fracapuano and your dataloader is only as fast as your video decoder

1

0

54

Michael Yoo Fatemi @fatemi_michael

3 days ago

pov, you're a diffusion model

coscorrodrift @Coscorrodrift

4 days ago

What else do you think this applies to

15

470

13

236

27K

0

1

0

43

Michael Yoo Fatemi @fatemi_michael

8 days ago

@aliuahma yeah i agree, even in the most extreme scenarios where agents one shot problems the code should at least be read; also i feel the more researchy your code becomes, the more unique info is in each line, and the less AI helps lol

1

2

0

53

Michael Yoo Fatemi @fatemi_michael

12 days ago

@felpix_ the best way to learn real analysis is by staring into the abyss yourself, ideally the night the problemset is due

1

2

0

59

Michael Yoo Fatemi @fatemi_michael

12 days ago

@AndrewLampinen Can LLM memory systems be viewed as approximations for sparse attention over extremely long contexts? Therefore, can self-improving agent techniques involving memory be seen as unifying training and in-context learning?

0

1

0

402

Michael Yoo Fatemi @fatemi_michael

13 days ago

@hlslyuri depends on if you consider RLVR synthetic data generation

0

48

Michael Yoo Fatemi @fatemi_michael

13 days ago

@dwarkesh_sp What is a non-verifiable domain? What distinguishes it from a domain where the verifier is just not good enough yet?

0

2

0

241

Michael Yoo Fatemi @fatemi_michael

13 days ago

@furongh checkpoints that are trained through high-throughput video-free dataloaders, which then get combined with video generators through cross-attention in an MoT-style model.

0

46

Michael Yoo Fatemi @fatemi_michael

13 days ago

@furongh In principle, inverse dynamics from video is object segmentation with extra hidden state estimation. Therefore, it's roughly the same difficulty as fine-tuning. This could be a different story if force or tactile data is introduced. Action experts may also be warm-started from 1/

1

2

0

294

Michael Yoo Fatemi @fatemi_michael

14 days ago

@JieWang_ZJUI cool! imo multimodal memory is underexplored, especially considering scenarios like asking a robot to remember what to do with "a specific object"

0

1

0

54

Michael Yoo Fatemi @fatemi_michael

14 days ago

yeah, IMO it's really appealing but also hard to make L0 penalty transformers work simply because hardware is so much better optimized for dense models (in my limited testing, loading a large block of weights has the same effective latency as loading 1 bc cache)

0

2

0

154

Michael Yoo Fatemi @fatemi_michael

20 days ago

@huskydogewoof This is really cool! Do you think the attractors could eventually be represented implicitly by a verifier? It seems like the flow fields are defined explicitly by a neural network here.

1

0

1

184

Michael Yoo Fatemi @fatemi_michael

22 days ago

@VictorTaelin They said there was no orchestration "scaffolded to search proof strategies", but that doesn't necessarily mean there wasn't some other more general memory mechanism involved that is technically broader than mathematical proofs

0

2K

Michael Yoo Fatemi @fatemi_michael

22 days ago

@SungjinAhn_ Very cool! What do you think causes the difference from diffusion models? maybe scaling diffusion steps amounts to finer discretization rather than more exploration? or the lack of explicit "diffusion time" creates a stochastic process with really good mixing?

5

0

92

Michael Yoo Fatemi @fatemi_michael

22 days ago

This is pretty neat. It would be cool to eventually unify the search implicit in Langevin-style sampling and the coarse-to-fine inductive bias known from diffusion models in a noise-prediction-free architecture through some variant of this.

Amir Zamir

@zamir_ar

29 days ago

Test-time scaling, reasoning, and generally search-like processes clearly drive significant gains in LLMs. Largely owed to the structure of language. One would think the same could apply to non-linguistic domains, like image generation, but that obviously depends on whether the structure of the domain's representation lends itself to search. 1D ordered tokens (e.g., image FlexTok, video FlexTok) seem like a natural fit since they enable a step-by-step coarse-to-fine generation. We investigated that and found they indeed enable search and scale far better with test-time compute than 2D grids. See the visuals on the webpage. Appearing in @icmlconf 2026. 🔗 https://t.co/yOFqeIJrEz 📄 https://t.co/WFZCihp1m4

5

138

31

86

15K

0

1

0

147

Michael Yoo Fatemi @fatemi_michael

24 days ago

sometimes i forget how often "cliché" advice is actually good

0

1

0

39

Michael Yoo Fatemi @fatemi_michael

about 1 month ago

Shouldn't the linearizability of softmax attention imply some fundamental bottleneck of transformer retrieval, at least qualitatively? I first saw this in Katharopoulos et al. (2020), although it's older. I wrote a quick derivation here for context.

fatemi_michael's tweet photo. https://t.co/7SeKus89r4
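For readers without the photo, here is a minimal sketch of the linearization idea. It is not the derivation from the tweet. It assumes the finite feature map elu(x) + 1 from Katharopoulos et al. (2020) in place of the exact softmax kernel exp(q·k), whose feature map is infinite-dimensional, and it covers only the non-causal case. The function names are hypothetical. The point it illustrates: once similarity factors as φ(q)·φ(k), every query reads from the same fixed-size summary of all keys and values.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a positive feature map (assumed stand-in for the softmax kernel)
    return np.where(x > 0, x + 1.0, np.exp(x))

def quadratic_attention(Q, K, V):
    # O(N^2): build the full N x N similarity matrix, then normalize each row
    scores = feature_map(Q) @ feature_map(K).T
    return (scores @ V) / scores.sum(axis=1, keepdims=True)

def linear_attention(Q, K, V):
    # O(N): compress all keys/values into a fixed-size state before any query reads it
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv_state = phi_k.T @ V        # shape (d, d_v), independent of sequence length
    k_sum = phi_k.sum(axis=0)     # shape (d,), normalizer state
    return (phi_q @ kv_state) / (phi_q @ k_sum)[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((6, 4)) for _ in range(3))

# Both orderings of the same matrix product give the same output
assert np.allclose(quadratic_attention(Q, K, V), linear_attention(Q, K, V))
```

Here `kv_state` has shape (d, d_v) regardless of how many tokens were summarized, which is one way to picture the retrieval bottleneck the tweet asks about.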

0

3

0

1

100
