M @init_malachi - Twitter Profile

On-Policy Distillation is the most active new research direction being explored in RL for LLMs. Had the chance to discuss how it works with Dwarkesh and why it fits so nicely into large-scale pipelines.

11

608

58

490

45K

init_malachi retweeted

ueaj

@_ueaj

about 16 hours ago

"Attention is just a special case of <abstract math thing> so we generalized it by <neglecting the other 30 abstractions and conditions required for frontier architecture> and we found it performed <p hacking> compared to <naive baseline>"

11

746

35

135

26K

init_malachi retweeted

0.005 Seconds (3/694)

@seconds_0

about 13 hours ago

Launching Sub-Prime Intellect, where i pay my friends in beer to let me SSH into their gaming PCs at night and run stuff on their graphics cards

12

197

5

4

5K

M

@init_malachi

about 7 hours ago

smh

Jiaxin Wen

@jiaxinwen22

about 15 hours ago

recent "generalization" papers be like:
1. use system prompts to generate synthetic data, which functions as a steering vector
2. fine-tune LMs on the synthetic data
3. WOW we see "generalization"
4. WOW we can use rank-1 LoRA to replicate this "generalization"
5. WOW we find a steering vector that can explain, predict, and control "generalization"

7

154

5

125

22K

0

120

init_malachi retweeted

Mingkai Deng

@mdeng34

about 15 hours ago

We agree that the world model should be a simulator that supports decision-making, not rendering beautiful images/videos. Our difference is in how the world state should be represented. Should the world be anchored in Gaussian splats and physics engines for program-as-simulator? Or in learned representations for model-as-simulator? We believe the latter is a more scalable, bitter-lesson-pilled approach. More in our position paper "Critiques of World Models" coauthored with Prof. @ericxing and @jinyuhou0 https://t.co/NqnxGtKNBL

3

36

9

28

4K

init_malachi retweeted

Ben Clavié

@bclavie

about 21 hours ago

*changes title to MTS* *writes a quasi-essay about the nature of research* this can't keep happening

2

24

1

8

3K

init_malachi retweeted

wordgrammer

@wordgrammer

about 10 hours ago

I don’t really think “recursive self-improvement” is a coherent concept. It’s like “a square circle”. A contradiction in terms

22

83

0

9

8K

init_malachi retweeted

mass

@Memetic_Theory

about 8 hours ago

This requires a probability distribution on the capital returns from compute btw We're the first ones to show that btw https://t.co/edRoJLiaIo btw

1

15

1

8

1K

init_malachi retweeted

Eric W. Tramel

@fujikanaeda

about 12 hours ago

we got bought by Nvidia and made a bunch of contributions to Nemotron models and released a lot of successful open source data and software like:
- OpenShell
- NeMo Data Designer
- NeMo Anonymizer
- NeMo Safe Synthesizer
- Nvidia PII detector
…

5

168

5

69

17K

init_malachi retweeted

Arnav Gupta

@championswimmer

about 14 hours ago

When you leave an HFT, they put you on a non-compete for 1 or even 2 years! This is the biggest gift from HFTs to open source world.

Aman Gupta is being paid by Jump Trading (to sit at home) just added multi-token prediction to llama.cpp which speeds up local LLM models by 2x https://t.co/E79MsJiLwF

25

3K

101

1K

215K

init_malachi retweeted

ben hylak

@benhylak

about 14 hours ago

i think models will be good at design once frontier labs can trick people like me into sitting down and labeling data for them all day for a year.

10

128

3

24

9K

M

@init_malachi

about 13 hours ago

being loved by lesbians the true career buff

0

1

0

15

init_malachi retweeted

Ryan Lopopolo

@_lopopolo

about 17 hours ago

Re: token budgets. Give your systems thinkers no limits and let them figure it out for everyone else

4

38

5

15

4K

init_malachi retweeted

arb8020

@arb8020

about 15 hours ago

secret method to instantly validate any attention/architecture/optimizer/quantization from some blog or paper you read

0

10

1

268

init_malachi retweeted

dan

@irl_danB

about 16 hours ago

everyone is building an agent or a tool

you don't want an agent or a tool, you want a reactor

I've been working on something cool and I think you'll like it

it's simple: an agent session DAG that keeps a declared world-model up to date in an efficient (memoized) render

each render node is an agent session: you declare the desired state with OpenProse markdown files

once invoked, each agent session acts as the provider. the agent session uses the open source openai-agents-sdk, extensible however you like with any model (I use with opus, sonnet, haiku)

the facets of the world-state are memoized, so not every agent has to run on every event, saving you on inference

if that sounds a lot like React or dataflow, that's because even in our brave new world the wisdom of the agents holds fast

20

170

13

178

7K

M

@init_malachi

about 15 hours ago

@0xAlaric combat is a sport with pareto boundary-defined rules

0

1

0

56

init_malachi retweeted

kalomaze

@kalomaze

1 day ago

so we solve the constraint satisfaction problem by building representationally invariant structure around the constraints of the physics that we live in. rationalizations of invariants are also not always true. we had to start measuring to figure out a newtonian model of gravity

0

11

1

657

init_malachi retweeted

kalomaze

@kalomaze

1 day ago

data augmentation producing near exact invariants on networks that otherwise would just drift (by default) should tell people something about what the process of iterating on the optimization rule is doing humans don't get these for free either but we do live in a world that asserts invariants on us

1

15

1

1K

init_malachi retweeted

kalomaze

@kalomaze

1 day ago

"transfer learning from geometric structure happens when you distribute the solution to a problem thinly across an iterated structure" is the most parsimonious accounting i worry that hoping for a unified general relativity type discovery is kind of a category error in some way

8

101

7

48

8K

M

@init_malachi

about 15 hours ago

@yacineMTB sounds like you could find the limit of valid simulation numerics

0

2

0

340
