@RobertH67413713 @DustinOgle33 @pmarca Then pay 100x for the power for the nearest Chinese equivalent of a single GB300. That's about 16MW for the best Huawei equivalent, which is roughly an H200.
@sudoingX @0xSero Is there a way to determine which REAPed model fits in, say, 48G?
I don't find the descriptions on most of them clearly stated, so I'm sure I'm doing it wrong.
Thanks!
@CardilloSamuel @nahiiko So which weights and what kind of tuning (generally) per customer? I am having a hard time seeing which weights would be tuned per customer for most groups of customers.
Cool idea. No one planned for silicon scarcity, and within a few years that will be addressed. Those who didn't prebuy gear that is considered old today have the option of paying inflated prices and hoping they trade out before the scarcity premium expires, while somehow buying their replacements without the premium.
Cool story.
@AirCrow @japan_nobunaga At one restaurant, the manager told me not to tip at the register--the employees don't get it. I tipped in cash directly to the waiter. YMMV.
@TheAhmadOsman I don't understand these charts, my problem, because it's always... how little inference can you get away with?
I'd love someone to benchmark the current way to run the best models, like Nemotron Ultra3 or GLM 5.2, with and without quant (including sharding).
@GhostOfStoneyX2 @newsonstone I would have to have an AI explain this to me, but I'm not in the market. At first blush, it looks beautiful... that is all anyone cares about. As for all of the people telling you how you could have done the job, they can share their work.