David Harmeyer @SecondThread - Twitter User

David Harmeyer @SecondThread

about 16 hours ago

@QPHutu DPPO and GRPO both in blue is confusing

0

2

David Harmeyer @SecondThread

6 days ago

@junkliao @FakePsyho It's an AtCoder contest that starts at a reasonable time in Japan, but 3 AM Pacific time

0

81

David Harmeyer @SecondThread

10 days ago

@cdbrw15 @GuillaumeLample They cut the learning rate probably. You use a larger learning rate at first to learn faster but it's less stable, and then you decrease it to get to the very top of the hill

0

58
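A minimal sketch of the idea in the post above, under stated assumptions: the toy objective (minimizing x^2 with Gaussian gradient noise, the mirror image of climbing a hill), the `step_decay_lr` schedule, and all constants are hypothetical and chosen only for illustration. It compares how closely a constant rate and a decayed rate settle around the optimum.

```python
import random


def step_decay_lr(step, base_lr=0.1, drop_factor=0.1, drop_every=2000):
    """Hypothetical schedule: multiply the rate by drop_factor every drop_every steps."""
    return base_lr * drop_factor ** (step // drop_every)


def noisy_descent(lr_schedule, steps=6000, tail=500, seed=0):
    """Run noisy gradient descent on f(x) = x^2.

    Returns the mean |x| over the last `tail` steps, i.e. how far the
    iterate keeps bouncing around the optimum at x = 0.
    """
    rng = random.Random(seed)
    x = 5.0
    tail_error = 0.0
    for step in range(steps):
        # True gradient 2x plus noise standing in for minibatch variation.
        grad = 2.0 * x + rng.gauss(0.0, 1.0)
        x -= lr_schedule(step) * grad
        if step >= steps - tail:
            tail_error += abs(x)
    return tail_error / tail


constant = noisy_descent(lambda step: 0.1)  # large rate the whole time
decayed = noisy_descent(step_decay_lr)      # 0.1, then 0.01, then 0.001
print(f"constant lr: {constant:.4f}  decayed lr: {decayed:.4f}")
```

With these settings the large rate gets near the optimum quickly but keeps jittering around it, while the decayed run settles much closer. That is one plausible reading of a sudden drop in a loss curve, not a confirmed account of the curve being discussed.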

David Harmeyer @SecondThread

about 1 month ago

Just spent so much time working with loss curves that when I opened up my 401k today, I panicked because I saw it going up and to the right

0

4

0

125


David Harmeyer @SecondThread

about 1 month ago

Yes, looks like it's a combination of that, and not enough variation in samples that killed it.

tanh() is a million times better https://t.co/yP9ArWxy6d

0

1

0

123

David Harmeyer @SecondThread

about 1 month ago

World's ugliest loss curve. Is this just what you get for using sigmoid in hidden layers?

3

2

0

216
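One commonly cited reason sigmoid hidden layers train poorly compared with tanh is gradient scale: the sigmoid derivative never exceeds 0.25, while the tanh derivative reaches 1. The sketch below is only an illustration of that general point, not a diagnosis of the curve in the posts above; the helper `best_case_gradient_scale` is hypothetical and ignores weights entirely.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Derivative of sigmoid; peaks at 0.25 when z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_grad(z):
    """Derivative of tanh; peaks at 1.0 when z = 0."""
    return 1.0 - math.tanh(z) ** 2


def best_case_gradient_scale(activation_grad, depth):
    """Hypothetical helper: product of the peak activation derivative
    across `depth` layers, ignoring the weight matrices."""
    return activation_grad(0.0) ** depth


for depth in (1, 5, 10):
    sig = best_case_gradient_scale(sigmoid_grad, depth)
    tnh = best_case_gradient_scale(tanh_grad, depth)
    print(f"depth={depth:2d}  sigmoid={sig:.2e}  tanh={tnh:.2e}")
```

Even in the best case, ten sigmoid layers scale the backpropagated signal by 0.25 ** 10, which is below one millionth, whereas tanh leaves it unscaled at z = 0. Sigmoid outputs are also centered on 0.5 rather than 0, which is often mentioned as a second disadvantage.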

David Harmeyer @SecondThread

3 months ago

The AI times have begun

1

2

0

140

David Harmeyer @SecondThread

4 months ago

@steipete @openclaw Legendary

0

37

David Harmeyer @SecondThread

4 months ago

@xubinnrencs RAS syndrome

0

25

David Harmeyer @SecondThread

5 months ago

@tsoding At that point the only reason that'd be useful is as precompute-expensive, query-expensive dataset compression. But in the real world, data is cheap and compute is expensive, so nobody would do this unless there's some immensely tight bottleneck on transmittable data.

0

39

David Harmeyer @SecondThread

5 months ago

Sleep is just "compacting conversation..." for humans

0

176

David Harmeyer @SecondThread

6 months ago

@sciencegirl I guarantee if you give this to actual people, they will misjudge the number of steps on the way down and trip

0

2

0

31

David Harmeyer @SecondThread

7 months ago

@zebriez Competence

0

28

David Harmeyer @SecondThread

8 months ago

@hyhieu226 If your reasoning isn't autoregressive, it's probably rationalization, not reasoning. Maybe rationalization makes sense in some cases, but it's a lot more artistic than it is logical.

0

1

0

36

David Harmeyer @SecondThread

8 months ago

https://t.co/wfWuC54hAW is amazing, but the fact that it's not available in the extensions search is confusing. It strikes me as obviously the best way to use Kotlin, but it's so hard to find

0

2

0

299

David Harmeyer @SecondThread

8 months ago

Many people fail because they go off to build a generic platform, and then find use cases afterwards. It's so much more compelling to find a problem, build a solution, and then after that's proven, make the solution generic.

0

3

0

277

David Harmeyer @SecondThread

8 months ago

Register here: https://t.co/E77ljn7sib

0

1

0

1

336

David Harmeyer @SecondThread

8 months ago

Looking forward to Hacker Cup Round 1, starting tomorrow morning!

1

11

1

2K

David Harmeyer @SecondThread

9 months ago

Every time I remember that "m" comes before "n" in the alphabet I do a double-take

0

4

0

391

SecondThread retweeted

Charlie Kirk

@charliekirk11

more than 13 years ago

Good men must die, but death can't kill their names.

1K

109K

27K

6K

0
