Niklas Sheth @niklassheth - Twitter Profile

about 6 hours ago

Opus 4.8 is good, I'm liking it a lot more than 4.7. I've been tempted to cancel Claude, but Opus always finds a few improvements in my Codex projects. It has a sense of "the big picture" that Codex doesn't.

0

17

Niklas Sheth @niklassheth

about 18 hours ago

@PradyuPrasad For the natural sciences I doubt it. For the social sciences, I think it's likely that some future results will replicate in language models.

0

23

Niklas Sheth @niklassheth

about 19 hours ago

@neqyve @DeepDishEnjoyer @AeonCoin It means that a more accurate approximation of a function is possible with a larger network, but doesn't explain why SGD on a larger neural network produces lower validation loss. "A more complex approximation can be more accurate" has been known for hundreds of years.

1

0

16

Niklas Sheth @niklassheth

about 22 hours ago

@DeepDishEnjoyer @AeonCoin I don't think the universal approximation theorem explains why AI works at all. We still don't have a good theory of why large neural networks generalize, which is the important question.

1

0

63

Niklas Sheth @niklassheth

1 day ago

@SamH0816 @LinkofSunshine But if interest is compounded annually, isn’t the solution exactly 4 years and not 3.8, since the interest only accrues once a year and not continuously?

1

0

11

Niklas Sheth @niklassheth

2 days ago

@SamH0816 @LinkofSunshine Doesn't this ignore the compounded annually part?

2

0

449

Niklas Sheth @niklassheth

2 days ago

@jagunanthi @LinkofSunshine I also thought about this. They could've made it a lot more obvious by making it compound monthly.

1

0

69

Niklas Sheth @niklassheth

3 days ago

@teortaxesTex We know for sure they do. This task has been in every system card since Claude 4.

0

3

0

1

357

Niklas Sheth @niklassheth

6 days ago

@Cat_the_worm @cobham_fan People overestimate the energy consumption of AI. This project used the same amount of energy as a few hours of PC gaming.

0

1

0

28

Niklas Sheth @niklassheth

7 days ago

@kalomaze They're working on it, probably comes out with Claude 5

0

3

0

243

Niklas Sheth @niklassheth

7 days ago

@SignaIbat9 @TsukinoYueVT @Kimagure31415 That all happens during slicing. The input model can be as high or low poly as you want, and it’ll end up with similar detail after slicing.

0

2

0

83

Niklas Sheth @niklassheth

8 days ago

@JustinBleuel @ChatGPTapp Chats with LaTeX are very slow on Safari and get worse the longer they are. I have one that takes over a minute to load.

0

1

0

53

Niklas Sheth @niklassheth

8 days ago

@industriaalist To elaborate, I think you'd want everything to be learned with enough compute. Evolution didn't need hyperparameters to create intelligence, besides maybe the laws of physics

0

1

0

37

Niklas Sheth @niklassheth

8 days ago

@industriaalist And in a high enough compute regime, you'd want zero hyperparameters, as the distinction between a parameter and a hyperparameter vanishes.

1

0

184