@sudoingX So far it’s really good. Fast and precise. I gave it an old project of mine to refactor and it did a really good job in a short amount of time, and it didn’t even use up the 5h limit on the $20 plan
@sudoingX You are the GOAT. This might be the best combination out there. Best harness + one of the best models for a fraction of the cost. I'm gonna test it for sure.
composer 2.5 is opus 4.7 class coding at 1/10 the cost. but it was cursor only. that just changed.
i just shipped cursor as a hermes agent provider tonight. PR open upstream to nousresearch/hermes-agent, available from my fork right now while it merges.
what this means: composer 2.5 + hermes memory + hermes skills + cron + acp subagents + multi-platform delivery, all in one harness. cheapest frontier coding model + deepest agent runtime. neither alone gets you here.
the math:
- composer 2.5: $0.50 input / $2.50 output per 1M
- opus 4.7: $5.00 / $25.00 (10x cost)
- gpt-5.5: $5.00 / $30.00 (12x cost)
- gpt-5.5 pro: $30.00 / $180.00 (70x cost)
same coding benchmark band (79.8% swe-bench multilingual vs opus 4.7's 80.5%, 63.2% cursorbench v3.1 vs 61.6%) at a fraction of the budget.
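A minimal sketch of how the multipliers above can be checked. The prices are copied from the list; the function names and the default 1M input / 1M output token mix are assumptions made only for illustration, and real workloads will have a different input/output split.

```python
# Prices in USD per 1M tokens, copied from the list above: (input, output).
PRICES = {
    "composer 2.5": (0.50, 2.50),
    "opus 4.7": (5.00, 25.00),
    "gpt-5.5": (5.00, 30.00),
    "gpt-5.5 pro": (30.00, 180.00),
}


def job_cost(model, input_tokens, output_tokens):
    """Cost in USD of one job for the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cost_ratio(model, baseline="composer 2.5",
               input_tokens=1_000_000, output_tokens=1_000_000):
    """How many times more expensive `model` is than `baseline` for the same job."""
    return (job_cost(model, input_tokens, output_tokens)
            / job_cost(baseline, input_tokens, output_tokens))


if __name__ == "__main__":
    for name in PRICES:
        print(f"{name}: {cost_ratio(name):.1f}x")
```

With an equal 1M/1M mix this gives 10x for opus 4.7, about 11.7x for gpt-5.5 and 70x for gpt-5.5 pro. On output tokens alone gpt-5.5 comes out at 12x, so the exact multiplier depends on the mix you assume.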
PR: https://t.co/36LfyN6KHg
fork: https://t.co/5OgS1A9w5w
article with full receipts drops sat ~9pm ICT.
@cyb3rops 5. The real realisation. Before LLMs you couldn’t write code that didn’t have these issues either, and you also couldn’t finish one project in the time it now takes to do 20 of them.
@jappleby Are there any alternatives though? I removed my X account a while ago, explored all the viable alternatives and keep coming back here. Unfortunately that's where the knowledge is (at least for fields I pay attention to)
@LLMJunky @turmyshevd I will give you that for ultra power users or small companies owning a GPU cluster might make sense, but for 90% of the population running a sub in the cloud will be cheaper over a 3-4 year window, with access to a broader range of models.
@LLMJunky @turmyshevd I own an RTX 5090, I'm running Qwen 3.6 27b dense on full 262k context and I'm not getting results even close to GLM5.1 with the same harness (Hermes). We are definitely in different tiers but I can't imagine doing even my use cases on a local model.
@LLMJunky @turmyshevd I'm not even mentioning electricity costs, because everyone is behaving like local models are free once you have a GPU. Not my experience. Easily more than 20 USD a month in electricity costs alone
@LLMJunky @turmyshevd Maybe it’s a skill issue but in my experience Qwen 3.6 is nowhere near Opus 4.5. Also I can run GLM5.1 all day for 20 USD/month on ollama without running into any limits. That 9k card is 45 months' worth of sub. And any current GPU will be obsolete in 3-5 years
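A rough sketch of the break-even arithmetic behind this comparison. The 9k hardware price, the 20 USD/month subscription and the 20 USD/month electricity floor are the figures from the thread; the function name and the simple model (no resale value, no hardware depreciation, no price changes) are assumptions for illustration only.

```python
def break_even_months(hardware_usd, subscription_usd_per_month,
                      electricity_usd_per_month=0.0):
    """Months until buying hardware becomes cheaper than a subscription.

    Returns None when running the hardware costs at least as much per
    month as the subscription, i.e. the purchase never pays for itself.
    """
    monthly_saving = subscription_usd_per_month - electricity_usd_per_month
    if monthly_saving <= 0:
        return None
    return hardware_usd / monthly_saving


if __name__ == "__main__":
    print(break_even_months(9000, 20))        # 20 USD/month plan, free electricity
    print(break_even_months(9000, 200))       # 200 USD/month plan, free electricity
    print(break_even_months(9000, 20, 20))    # electricity eats the whole saving
```

Under this model a 9k card equals 450 months of a 20 USD/month plan; the 45-month figure corresponds to a 200 USD/month plan. Once 20 USD/month of electricity is counted against a 20 USD/month subscription, there is no break-even point at all.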
@LLMJunky @turmyshevd Are you doing API calls to a Qwen-level model? Because 9k won't get you an Opus 4.7 level model. To run Kimi 2.6 locally you'd need at least a 200-300k hardware investment, and then the economics of it look completely different.
@Teknium I thought you were using open-source models to develop Hermes. GLM5.1 doesn't cut it? It works wonders for me with Hermes, but my coding level is probably 1/100th of yours
@sudoingX I had most of these before I saw this post. I don’t have tailscale because I prefer to run my own WireGuard server. Termius is nice, although it wasn’t the gamechanger I expected it to be. Terminal work from a phone is just too inconvenient for me, even with tmux