@rucam365 Deepinfra provides a range of models, with US infrastructure, privacy, and compliance.
API rates are creating budget pressure, so to serve growing demand and consumption we need to identify efficient models. Deepinfra has a good selection, with no questions about data residency, etc.
@realrealcat @slashreboot @mlech26l I down-modeled to Haiku and Sonnet for many tasks, to keep my IT ops workflows under $300/day on API pricing. Payback period of weeks on that rig.
That said, there's a lot of capability in other models and providers, and running locally is hard and relatively expensive.
DSv4 is the better option.
@cryptgreg @Govindtwtt We learn in many ways. By example is one.
But our value proposition is not in knowing bespoke language and syntax. It is in understanding systems, their interaction, and their manipulation.
I don't think we will achieve enlightenment, but we can improve faster with less friction.
@cryptgreg @Govindtwtt Eliminating the friction and the repeatable actions frees up time to invest in higher levels of thinking.
Our value is in strategy, identifying opportunities, prioritizing. There's a huge friction tax on the execution of many tasks, which AI effectively bypasses.
@cryptgreg @Govindtwtt How do you spend your time when all the friction is removed from your workflows?
Old: change config in a GUI that was rearranged for the 4th time this year, or artisanally hand-craft obscure command syntax
New: "write a script to change setting", then read it, dry run, confirm, apply
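The read / dry run / confirm / apply loop can be sketched roughly like this. It is a minimal illustration, not anyone's actual tooling: `plan_changes`, `apply_changes`, and the settings dict are hypothetical names, and a real script would talk to the actual system instead of an in-memory dict.

```python
from typing import Callable

# Hypothetical shape for one pending change: (key, old value, new value)
Change = tuple[str, object, object]


def plan_changes(current: dict, desired: dict) -> list[Change]:
    """List the settings that would change, without touching anything."""
    return [(k, current.get(k), v) for k, v in desired.items() if current.get(k) != v]


def apply_changes(
    current: dict,
    desired: dict,
    *,
    dry_run: bool = True,
    confirm: Callable[[list[Change]], bool] = lambda changes: False,
) -> bool:
    """Apply desired settings in place. Returns True only if something was written."""
    changes = plan_changes(current, desired)
    for key, old, new in changes:
        prefix = "[dry run] " if dry_run else ""
        print(f"{prefix}{key}: {old!r} -> {new!r}")
    # Default to the safe path: a dry run, no changes, or no confirmation writes nothing.
    if dry_run or not changes or not confirm(changes):
        return False
    for key, _old, new in changes:
        current[key] = new
    return True


settings = {"ntp_server": "old.example.com", "dns": "10.0.0.1"}
desired = {"ntp_server": "new.example.com"}
apply_changes(settings, desired)                                          # dry run only
apply_changes(settings, desired, dry_run=False, confirm=lambda c: True)   # confirmed apply
```

Dry run is the default here, so a script has to be told explicitly to write anything.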
@OnlyTerp Led me to look at DeepSeek V4 thru Azure MaaS to mitigate geopolitical/data risks while keeping that DSv4 cheapness, but it's not as strong a bargain that way. Comparable to GPT 5.4 mini for my scenarios.
Can't push real work thru their API directly, but it's a go-to for personal projects.
@OnlyTerp Using it with a DeepSeek V4 Flash backend in Claude Code gets you an incredible amount of usage. 240M tokens, <$6 consumption.
Cool to see how ultracode + workflows work, without hitting my Claude consumption. Very powerful.
Costs are virtually unusable at frontier API rates.
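For scale, the figures quoted above work out to a blended rate like this. A quick sketch: the function name is made up for illustration, and since the spend was "<$6" the result is an upper bound, not an exact price.

```python
def cost_per_million_tokens(total_cost_usd: float, total_tokens: int) -> float:
    """Blended cost per one million tokens (input and output lumped together)."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd * 1_000_000 / total_tokens


# Figures from the post: under $6 for 240M tokens, so this is a ceiling.
blended = cost_per_million_tokens(6.00, 240_000_000)
print(f"at most ${blended:.3f} per million tokens")  # at most $0.025 per million tokens
```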
@realchrisebert @mikejulian On the sub I could go thru $300/day of Sonnet 4.6 usage with 4 Claudes for <$200/month, executing configuration changes, servicing tickets, building and deploying projects, automating infrastructure.
@realchrisebert @mikejulian We just moved from the Teams sub to Enterprise, which I had advised against and provided data on. Year contract, business not prepared for the cost. 70 seats.
I'm finding ways to push things down-model, and trying to stay under budget has taught me how to optimize for efficiency.
@NicolaManzini Would be interested in feedback on how it goes.
I've also found Haiku plainly dumb to talk to, but for reading and searching it's the right answer, and it works well for implementing well-defined scripting tasks too. Sonnet for reasoning/troubleshooting/planning, etc. Opus for architecting.
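That split can be written down as a simple routing table. This is only a sketch of the idea: the task labels, the `pick_model` helper, and the fallback to Sonnet for unlisted task kinds are assumptions for illustration, and the values are tier names, not real API model IDs.

```python
# Tier names only; swap in whatever model identifiers your provider actually uses.
MODEL_BY_TASK = {
    "reading": "haiku",
    "searching": "haiku",
    "scripting": "haiku",        # well-defined scripting tasks
    "reasoning": "sonnet",
    "troubleshooting": "sonnet",
    "planning": "sonnet",
    "architecture": "opus",
}


def pick_model(task_kind: str, default: str = "sonnet") -> str:
    """Return the cheapest tier judged adequate for the task kind.

    Falling back to the middle tier for unknown task kinds is an assumption,
    not something stated in the thread.
    """
    return MODEL_BY_TASK.get(task_kind.lower(), default)


for kind in ("searching", "troubleshooting", "architecture"):
    print(kind, "->", pick_model(kind))
```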
@NicolaManzini Start pushing things to GPT 5.4 mini. Similar capability to Sonnet 4.6, but cheaper tokens = more usage/$
5.4 is also half the cost of 5.5, but mini is a fraction of that.
We aren't all engineering nuclear subs. Spend tokens on planning, save tokens on execution.
@peterom I've found the same in personal projects, but the geopolitical and security risks of relying on the DeepSeek API directly are real - their terms are not similar to Western terms.
I'm pushing my pro tasks to GPT 5.4 mini as a compromise: the right balance of capability and security.
@samuelcolvin My next move is switching execution to GPT 5.4 mini. As an executor I think it has advantages over Sonnet 4.6 beyond being just a fraction of the cost.
@samuelcolvin We just switched from Teams to Enterprise, so I've pushed a lot to Sonnet 4.6. Opus keeps my repo cleaner and catches more things I forget, but I only use it for architecture/strategy planning; all exec goes thru Sonnet to keep my usage costs somewhat under control.