@mindragon @Teknium I see XLM as a cheaper, more efficient Ethereum. Bitcoin has nothing of value in my eyes, but the ones with smart contracts are functional at least.
@AliesTaha @part_harry_ That's sad, but I had the chance to test Gemma 4 e4B locally before running Bonsai, and Gemma is awesome. Smaller models did improve after all. We need better, faster small "large" language models for everyone.
@TomilolaNa53446 @awnihannun Forever... until the next model is released. But yes, I'm glad that smaller models get the attention (training) they deserve.
@norpadon I do agree that they could be more open about what their model is. They mention somewhere that it's based on Qwen, and that's the only line that gives it away.
@lompocus @PrismML 8-9B models at Q4 are 4-5 GB. If they can make a 16B model that's 2-4 GB and lies on that line, it would be a win for Bonsai AND the community.
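To put rough numbers on that size comparison, here is a minimal sketch of the arithmetic. It assumes weight storage dominates file size, uses 1 GB = 1e9 bytes, and treats Q4 as roughly 4.5 bits per weight (an assumption to account for stored scales; real formats vary). The function names are made up for illustration.

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring metadata."""
    # params_billion * 1e9 weights * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    return params_billion * bits_per_weight / 8


def bits_per_weight_for(params_billion: float, size_gb: float) -> float:
    """Average bits per weight needed to fit a model into size_gb."""
    return size_gb * 8 / params_billion


# 8-9B models at an assumed ~4.5 bits per weight
size_8b = model_size_gb(8, 4.5)
size_9b = model_size_gb(9, 4.5)

# Bit budget a 16B model would need to land in the 2-4 GB range
bits_low = bits_per_weight_for(16, 2)
bits_high = bits_per_weight_for(16, 4)

print(f"8B: {size_8b:.2f} GB, 9B: {size_9b:.2f} GB")
print(f"16B in 2-4 GB needs about {bits_low:.1f} to {bits_high:.1f} bits per weight")
```

Under these assumptions, a 16B model in 2-4 GB works out to roughly 1 to 2 bits per weight on average.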
@PrismML If you manage to train a 16B model with at least the same intelligence density, that would be huge... or tiny.
I hope the techniques from recent papers, like the ones from DeepSeek, can and will also be applied to the next generation.
@0___________0_v @coinjoined @AndroidDev They want to turn Android into iOS. They can't really do that with AOSP, but phones are always sold with Google Play services, which will turn it into iOS. If I wanted iOS, I would get an iPhone.
@_quanta_ @QuixiAI @Kekius_Sage Yeah, it's like an ant running on a carpet. If you pull and fold that carpet, it will affect the ant. The mass of the black hole is manipulating the carpet; the mass of the ant isn't important here.
@Teknium Hmm, maybe because it doesn't matter what he thinks. It should be more about the definition made by OpenAI or what the court thinks. But I don't think OpenAI had a real definition, right?
@Teknium Isn't the advantage of non-RNN models that they see it all at once? At the time the model predicts a token, it sees everything before it, which includes the whole question. If you enter the question twice, it sees the whole question twice, but it doesn't see anything it didn't see before.
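A toy sketch of what "sees everything before it" means, assuming a standard causal (lower-triangular) attention mask. This only models which positions are visible, not the attention computation itself, and the helper names and example question are made up for illustration.

```python
def causal_mask(n: int) -> list[list[bool]]:
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def visible_tokens(tokens: list[str], position: int) -> list[str]:
    """Tokens that the given position is allowed to attend to."""
    mask = causal_mask(len(tokens))
    return [tok for j, tok in enumerate(tokens) if mask[position][j]]


question = ["why", "is", "the", "sky", "blue", "?"]

# What the last position sees when the question is entered once
once = visible_tokens(question, len(question) - 1)

# What the last position sees when the question is entered twice
doubled = question + question
twice = visible_tokens(doubled, len(doubled) - 1)

print(once)
print(twice)
print(set(once) == set(twice))
```

At the final position, the doubled prompt exposes the same set of distinct tokens as the single prompt, just with each one appearing twice.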