@DarioCpx @edzitron why does he think "it's going to cost just as much money to finish building data centers in 10 years as today"?
the cost of inference is going to come down massively in 10 years
it's probably going to do that in 2-5 years 🤷
@zack_overflow oh no. 🫤
do you know if they've got any plans to improve this?
my system already struggles with the kind of project sizes I work with
I thought they had said it would use *less* memory than current TS 🤷
@therons @sluongng it's just a program
there's no daemon running with elevated privileges like Docker
containers under Podman can't do anything the current user can't do
which is also just so much easier to understand, I think
@jumperz I wonder if that's the trend now
they're at the point where cost & compute largely follows quality
so they'll just start with smaller models and wow you with cost
then leverage brand recognition to ship something bigger and now expensive
@jumperz looks like an introductory offer
the price will double vs M2.7
no duh it's going to be better if you double the cost & compute
Google recently pulled the same stunt with Gemini Flash
just ship a bigger more expensive model, of course it's going to be better 🤷
@thdxr but you can't explain why or how, and you won't be taking questions, right?
I suspect many have been where you are right now and it tends to pass
@midego1 @sluongng no, it's using the container to access the file system on the local computer
yes that's a thing
the docker daemon typically has/needs elevated permissions - it's a known design issue
we should be switching to Podman
US export controls blocking DeepSeek from buying H100/H200s apparently backfired spectacularly. The Huawei-based architecture, essentially a workaround for inferior silicon, can't run on western hardware.
Probably the best open model, and we ship an expensive garbage version. 🤦
Western providers of DeepSeek V4 Pro look really bad on OpenRouter right now - the cheapest provider is 6x more expensive than first-party inference, and badly degraded at 4-bit quantizations.
https://t.co/YujvKOTRLX
It's almost as if they're just teasing the west with an open release of this model. Western providers aren't cutting corners with a 4-bit version - they literally can't fit the full model on the H100 fleets they have.
https://t.co/5p9Cfm3XWm
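a rough sketch of the memory arithmetic behind the "can't fit the full model" point. the parameter count below is a made-up placeholder (the posts above don't give one), the 8-GPU node is an assumption, and this counts weights only, ignoring KV cache and activations. the only hard number is 80 GB of memory per H100.

```python
# Rough weight-memory arithmetic. All model numbers here are hypothetical.
H100_MEMORY_GB = 80   # memory per H100 (80 GB variant)
GPUS_PER_NODE = 8     # assumption: a typical 8-GPU node


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def fits_on_node(num_params: float, bits_per_weight: int) -> bool:
    """True if the weights alone fit in one node's combined GPU memory."""
    node_memory_gb = H100_MEMORY_GB * GPUS_PER_NODE
    return weight_memory_gb(num_params, bits_per_weight) <= node_memory_gb


# Placeholder only, NOT the real parameter count of any DeepSeek model.
HYPOTHETICAL_PARAMS = 1.0e12

for bits in (16, 8, 4):
    gb = weight_memory_gb(HYPOTHETICAL_PARAMS, bits)
    fits = fits_on_node(HYPOTHETICAL_PARAMS, bits)
    print(f"{bits}-bit: {gb:.0f} GB of weights, fits on one node: {fits}")
```

with these placeholder numbers only the 4-bit version squeezes into a single node, which is the kind of pressure that would push a provider toward heavy quantization.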
@thdxr I mostly use Pi now
it has 4 tools
I have just one extension for permissions installed
it works really well for my type of use (which is more "refactor this" and not "build this for me")
what specifically do you think we're missing out on?
@serenaa_ge so I get that the solutions aren't in the public repos you clone for this benchmark
but the codebases are - some of these repos have been public for a decade
LLMs having already been trained on the code they're asked to work on... this doesn't affect the results? 🤔
it does this all the time with more important, technical questions as well - sometimes while giving step by step instructions on shell commands, it'll suggest the wrong thing and then "but wait actually" itself. 🤦
this is just about the worst answer I've seen by Claude
https://t.co/GUvlabcaCq
this nonsense belongs in a <thinking> block where the end user doesn't see it - that's what reasoning is, right?
they've literally fine tuned it to give wrong answers first - why did they do that??
@GoogleDeepMind @Google Holy expensive for a flash model ($1.5 / $9)
Almost same price as Gemini 3.1 Pro Preview
3x - Gemini 3 Flash Preview
5x - Gemini 2.5 Flash
6x - Gemini 3.1 Flash Lite
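the multiples above can be turned back into implied prices. a minimal sketch, assuming "$1.5 / $9" means input / output price in USD per 1M tokens and that the multiples refer to the input price (neither is stated in the post); `implied_price` is just an illustrative helper.

```python
# Assumption: "$1.5 / $9" is input / output price in USD per 1M tokens.
NEW_FLASH_INPUT_PRICE = 1.5

# Multiples quoted in the post; assumed here to apply to the input price.
QUOTED_MULTIPLES = {
    "Gemini 3 Flash Preview": 3,
    "Gemini 2.5 Flash": 5,
    "Gemini 3.1 Flash Lite": 6,
}


def implied_price(new_price: float, multiple: float) -> float:
    """Price of the older model implied by 'new model costs `multiple`x as much'."""
    return new_price / multiple


for model, multiple in QUOTED_MULTIPLES.items():
    price = implied_price(NEW_FLASH_INPUT_PRICE, multiple)
    print(f"{model}: ~${price:.2f} per 1M input tokens (implied)")
```

these are implied figures derived from the quoted ratios, not prices taken from a price list.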