@cdbrw15 @GuillaumeLample They probably cut the learning rate. You use a larger learning rate at first to learn faster, but it's less stable, and then you decrease it to get to the very top of the hill.
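A rough sketch of that idea, assuming a simple step-decay schedule on a toy hill f(x) = -x**2 with noisy gradients. The function names, the schedule, and every number here are made up for illustration; this is not the setup from the thread being discussed.

```python
import random


def step_decay_lr(step, base_lr=0.4, drop_every=300, factor=0.1):
    """Hypothetical schedule: multiply the learning rate by `factor` every `drop_every` steps."""
    return base_lr * factor ** (step // drop_every)


def noisy_ascent(schedule, steps=900, seed=0):
    """Climb the hill f(x) = -x**2 (peak at x = 0) using noisy gradients.

    Returns the mean squared distance from the peak over the last 100 steps.
    """
    rng = random.Random(seed)
    x = 5.0
    tail = []
    for step in range(steps):
        grad = -2.0 * x + rng.gauss(0.0, 1.0)  # true gradient plus noise
        x += schedule(step) * grad  # gradient ascent step
        if step >= steps - 100:
            tail.append(x * x)
    return sum(tail) / len(tail)


constant = noisy_ascent(lambda step: 0.4)  # large learning rate throughout
decayed = noisy_ascent(step_decay_lr)      # large at first, then cut twice
print(f"constant lr: {constant:.4f}, decayed lr: {decayed:.4f}")
```

With the large constant rate, x keeps bouncing around the peak because every noisy gradient gets a big step. Once the rate is cut, the same noise moves x much less, so it settles closer to the top.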
@tsoding At that point the only reason that'd be useful is as precompute-expensive, query-expensive dataset compression.
But in the real world, data is cheap and compute is expensive, so nobody would do this unless there's some immensely tight bottleneck on transmittable data.
@hyhieu226 If your reasoning isn't autoregressive, it's probably rationalization, not reasoning.
Maybe rationalization makes sense in some cases, but it's a lot more artistic than it is logical.
https://t.co/wfWuC54hAW is amazing, but the fact that it's not available in the extensions search is confusing.
It strikes me as obviously the best way to use Kotlin, but it's so hard to find.
Many people fail because they go off to build a generic platform, and only then look for use cases.
It's so much more compelling to find a problem, build a solution, and then after that's proven, make the solution generic.