Opus 4.8 is good, I'm liking it a lot more than 4.7. I've been tempted to cancel Claude, but Opus always finds a few improvements in my Codex projects. It has a sense of "the big picture" that Codex doesn't.
@PradyuPrasad For the natural sciences I doubt it. For the social sciences, I think it's likely that some future results will replicate in language models.
@neqyve @DeepDishEnjoyer @AeonCoin It means that a more accurate approximation of a function is possible with a larger network, but doesn't explain why SGD on a larger neural network produces lower validation loss. "A more complex approximation can be more accurate" has been known for hundreds of years.
@DeepDishEnjoyer @AeonCoin I don't think the universal approximation theorem explains why AI works at all. We still don't have a good theory of why large neural networks generalize, which is the important question.
@SamH0816 @LinkofSunshine But if interest is compounded annually, isn’t the solution exactly 4 years and not 3.8, since the interest only accrues once a year and not continuously?
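A minimal sketch of the distinction, assuming the problem is doubling a balance at 20% annual interest. Those numbers are a guess that happens to reproduce the 3.8 figure; they are not stated in the thread, and `years_to_reach` is a hypothetical helper name.

```python
import math

def years_to_reach(multiple, annual_rate):
    """Return (fractional_years, whole_years) needed to grow a balance by `multiple`.

    fractional_years solves (1 + r)^t = multiple for real-valued t.
    whole_years counts discrete year-end compounding steps instead.
    """
    fractional = math.log(multiple) / math.log(1 + annual_rate)

    balance = 1.0
    whole = 0
    while balance < multiple:
        balance *= 1 + annual_rate  # interest is credited once per year
        whole += 1
    return fractional, whole

# Assumed inputs: double the balance at 20% per year.
fractional, whole = years_to_reach(2.0, 0.20)
print(f"{fractional:.2f} fractional years, {whole} whole years")
# prints "3.80 fractional years, 4 whole years"
```

Under those assumptions the balance is 1.728x after year 3 and 2.0736x after year 4, so the first year-end at which it has doubled is year 4.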
@SignaIbat9 @TsukinoYueVT @Kimagure31415 That all happens during slicing. The input model can be as high or low poly as you want, and it’ll end up with similar detail after slicing.
@industriaalist To elaborate, I think you'd want everything to be learned, given enough compute. Evolution didn't need hyperparameters to create intelligence, besides maybe the laws of physics.
@industriaalist And in a high enough compute regime, you'd want zero hyperparameters, as the distinction between a parameter and a hyperparameter vanishes.