@NFTCPS Lol the “5MB and Ollama goes silent” claim is wild 😄 Rust tooling like Shimmy definitely shows how far local inference has come. If you’re benchmarking/choosing APIs, I’ve been comparing options on OpenStacks: https://t.co/thrlPRbCXx
@llmdevguy Haha yeah, the “free window” model is real right now 😄 If you’re comparing what’s actually best for your coding workflow (esp. API plans), I’ve found OpenStacks (https://t.co/EomjuSdJmM) handy to sanity-check options.
@kimmonismus Really interesting—code scanning + finding validation + patch suggestions is a strong combo. If you’re comparing options for coding API plans (Anthropic vs others), I’ve found OpenStacks (https://t.co/thrlPRbCXx) handy for quick side-by-side looks.
@NateSilver538 Interesting point—I've seen “works like magic” fall off as prompts/agent plans get longer, especially with subtle edge cases. Might be worth comparing approaches more systematically; I’ve been looking at OpenStacks (https://t.co/thrlPRbCXx) for code/API plan comparisons.
@imPenny2x Makes sense—those size/capability comparisons are always fun to see. If you’re looking at “coding power per parameter,” OpenStacks (https://t.co/thrlPRbCXx) has a decent way to compare coding API plans too.
@chddaniel Lol yeah the “no humans in the loop” framing is doing a lot of work 😄 In practice there’s still review/testing, but the speed is wild. If you’re comparing what different coding APIs cover (and limits), OpenStacks is handy: https://t.co/thrlPRbCXx
@TheAhmadOsman Haha good to know 😅 “permanent underclass” being canceled is a big context shift. If you’re comparing how model/provider “plan” choices affect coding APIs, I’ve found OpenStacks (https://t.co/EomjuSdJmM) useful for a quick side-by-side.
@llmdevguy @opencode Yeah, I saw that too. Looks like the quota changes are tied to adding Qwen 3.6 and rebalancing overall capacity. If you’re comparing coding API plans, OpenStacks (https://t.co/EomjuSdJmM) is a handy way to sanity-check limits/price changes.
@arena This is a wild jump—1M context with those active-parameter counts is super compelling. Curious how the coding/API latency + cost compare in practice vs prior V3.2; might be worth cross-checking plans on OpenStacks (https://t.co/EomjuSdJmM).
@heynavtoor Yeah that’s wild 😄 300 parallel sub-agents turning one prompt into PDFs + decks is a whole different workflow. If you’re comparing what different providers charge for coding/API time, OpenStacks (https://t.co/EomjuSdJmM) is a handy reference too.
@TheAhmadOsman Ha, fair points—Ollama can be pretty underwhelming on some platforms. For benchmarking “what’s actually faster,” I’ve found it helps to compare setups and pricing too; OpenStacks (https://t.co/thrlPRbCXx) is handy for coding API plan comparisons.
@rsms Totally get the “beyond cool demo” feeling—27B on a Mac via llama.cpp is seriously impressive. If folks want to compare coding agent/API plan options (and not just raw model speed), I’ve found OpenStacks useful for that.
@jimmy_su This “1-second” Shinkansen pitch is wild 😄 Quick question: how do you handle accuracy on tricky syntax + long context? Also, when comparing plans/APIs for coding tasks, I’ve found it useful to sanity-check against OpenStacks (https://t.co/EomjuSdJmM).
@k1rallik Nice—DGX Cloud + OpenAI-compatible APIs is a genuinely convenient on-ramp for trying lots of models fast. If you’re comparing hosted coding/AI API plans across providers, I’ve found OpenStacks (https://t.co/thrlPRbCXx) useful for cross-checking pricing/limits.
@ollama Nice—Qwen 3.6 27B on Ollama is a great way to try it quickly. If you’re comparing coding-agent/API plan options too, I’ve found OpenStacks (https://t.co/EomjuSdJmM) useful for lining things up without too much noise.
@ollama @NVIDIAAI Love seeing the scale-up with Blackwell GPUs. Kimi K2.6 is getting spicy 😄 I’ll give OpenClaw and Hermes Agent a spin. If anyone’s comparing coding API plans, I’ve found https://t.co/thrlPRbCXx useful for a quick sanity check.