Robert Gordan @goodforzer0 - Twitter Profile

24 days ago

@tszzl yeah I’ve wondered the same. Like at an even more granular level, why did the models all converge on the same stylistic tics (eg em-dash). It’s counterintuitive to me that the path through the loss landscape would be so deterministic

Robert Gordan @goodforzer0

27 days ago

@DaveRBanerjee I mean filtering alone doesn't seem like a huge win if we're already in a data-constrained regime. The real breakthrough would be like synthesizing interesting RL problems for itself to solve.

Robert Gordan @goodforzer0

about 2 months ago

who knew the GPTs read Christina Rossetti

roon

@tszzl

about 2 months ago

everyone is assuming this is some kind of quirk chungus marketing campaign but if you’ve worked with 5.4 and beyond they tend to call everything goblins, gremlins etc and it’s just super noticeable and if you work with them all day you start to get annoyed

Robert Gordan @goodforzer0

3 months ago

@leothecurious @sainingxie why can't you train a language model from nothing? after all, evolution did it

Robert Gordan @goodforzer0

3 months ago

@seconds_0 @willdepue @teortaxesTex I hope not.. The probability of winning is quite low anyways, participating and sharing interesting results along the way is probably a better way to get recognition.

Robert Gordan @goodforzer0

3 months ago

@varunneal @Sam_Acqua Honestly though by default I'm a bit skeptical that there's that much headroom from cross-document TTT, there just aren't that many val tokens.

Robert Gordan @goodforzer0

3 months ago

@varunneal @Sam_Acqua Yeah that's exactly the other bug I mentioned. Isn't clear to me whether the adapter could see the suffix. Author says it can't. https://t.co/4xx2Ma728U

Robert Gordan @goodforzer0

3 months ago

@molochofficial Obviously bad, but how would you describe what it’s doing here? To me: overwritten, stuffed with metaphors that don’t land. Every sentence seems to be similarly structured (independent clause, dependent clause xN)

Robert Gordan @goodforzer0

3 months ago

I'm sure others have said it too but weirdly enough I really love the personality of these coding models.. "I spent a long time trying to find a counterexample.." sounds like something my TA in school would say while giving me partial credit. This is Claude but 5.4 is v similar. https://t.co/sjVwBBUswC

Robert Gordan @goodforzer0

3 months ago

@Butanium_ @voooooogel that’s such a good example of the unintuitive generalization ability that models can have. easy to take it for granted these days but I feel like it’s quite related to what was so magical about the first instruct tuned models

Robert Gordan @goodforzer0

3 months ago

now if only someone can pay for an h100 to see if this works

Robert Gordan @goodforzer0

3 months ago

Sorry bro not ambitious enough. I created auto-autoresearch-research, an agent that optimizes your agent that optimizes your research code. https://t.co/eKaTFS4lUW

Andrej Karpathy

@karpathy

3 months ago

oh yeah i should have linked autoresearch probably https://t.co/YCvOwwjOzF (you don't "use it" directly, it's just a recipe/idea - give it to your agent and apply to what you care about.) and the tweet about it that went mini-viral over the weekend with more context https://t.co/q5eWsvx5p2

Robert Gordan @goodforzer0

3 months ago

not really but I still wrote about it https://t.co/1BntiB2CaQ

Robert Gordan @goodforzer0

3 months ago

Can you replicate an ICML paper with less than $20 in runpod credits? I tried and the answer is

Robert Gordan @goodforzer0

3 months ago

obvious analogy to RLHF unlocking the latent intelligence from older pretraining-only models

Robert Gordan @goodforzer0

3 months ago

More and more I'm agreeing with alignment by default. The pet theory I have is that "alignment" will be more about eliciting the capabilities of next-generation models than about their ethics.

Robert Gordan @goodforzer0

3 months ago

@belindazli Hi Belinda, I think this is a really cool area. Do you think that self-supervised training to improve introspection could generalize to improvements in arbitrary domains?

Robert Gordan @goodforzer0

4 months ago

... We must expect great innovations to transform the entire technique of the arts, thereby affecting artistic invention itself and perhaps even bringing about an amazing change in our very notion of art." -- Valéry, 1928

Robert Gordan @goodforzer0

4 months ago

"Our fine arts were developed, their types and uses were established, in times very different from the present, by men whose power of action upon things was insignificant in comparison with ours..
