Tell me if I'm close: Using Opus to break down the task and then deciding which cheap OSS model to use for each subtask?
(I'm building something similar! 😁 )
Hey @dharmesh do you think the end state for coding is going to be humans providing the system design and hoping AI follows it when it writes and reviews all code? Or more collaborative so humans know what is going on in their codebase?
Anthropic Product Lead:
"At Anthropic, our engineers are running swarms of 300+ agents daily.
Give your agents 100+ tools - just don’t load them all into context."
In a 30-minute talk, the Anthropic team shows how to deploy agents to production.
Claude + loops + routines + dynamic workflows - that’s the secret.
Watch the talk, then save the playbook below.
Hey @levelsio do you think the end state for AI-driven coding is going to be humans providing system design and hoping AI follows it as it writes and reviews all code? Or less tokenmaxxing and more collaborative so humans know wtf is going on in their codebase?
@JayaGup10 what would your level of interest be in an IDE that prioritizes efficient token spend and human collaboration? This is the opposite of what Codex and Claude Code are building towards.
@JayaGup10 For coding: Frontier models to break down the task into smaller tickets (with a lot of context) which are then resolved by locally hosted models.
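A minimal sketch of that split, assuming a planner step that emits context-rich tickets and a routing table that hands each ticket to a local worker. Every name here (`Ticket`, `plan_with_frontier_model`, `WORKERS`, the worker functions) is hypothetical, and the model calls are stubbed with plain functions so the flow runs without any API:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Ticket:
    title: str
    context: str  # context the planner attaches so the worker can act alone
    kind: str     # hypothetical routing label, e.g. "tests" or "refactor"


def plan_with_frontier_model(task: str) -> List[Ticket]:
    """Stand-in for a frontier-model call; returns hard-coded tickets."""
    return [
        Ticket("Write unit tests", f"Task: {task}", "tests"),
        Ticket("Refactor module", f"Task: {task}", "refactor"),
    ]


def local_test_model(ticket: Ticket) -> str:
    """Stand-in for a locally hosted model specialised in tests."""
    return f"[local-test-model] done: {ticket.title}"


def local_general_model(ticket: Ticket) -> str:
    """Stand-in for a general-purpose locally hosted model."""
    return f"[local-general-model] done: {ticket.title}"


# Routing table: ticket kind -> worker. Unknown kinds use the general model.
WORKERS: Dict[str, Callable[[Ticket], str]] = {"tests": local_test_model}


def resolve(task: str) -> List[str]:
    """Plan once with the expensive model, then resolve each ticket locally."""
    results = []
    for ticket in plan_with_frontier_model(task):
        worker = WORKERS.get(ticket.kind, local_general_model)
        results.append(worker(ticket))
    return results


if __name__ == "__main__":
    for line in resolve("add login flow"):
        print(line)
```

The point of the sketch is only the shape: one expensive planning call, many cheap worker calls, with the ticket carrying enough context that the worker never needs to see the whole task.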
@krandiash @harjtaggar Just to confirm, you are the transcription and narration layer? Once the text is available you call some other LLM (like a 5.4 mini for example)?
Dumb question about reusable rockets: Can’t we just attach a bunch of parachutes that deploy once the rocket’s fuel is burnt? And then airbags just before impact? @SkyrootA
IMO we're going to see a divergence in usage of AI for coding:
1. Fully AI native teams - "New" startups with a handful of people and token costs >> salaries
2. Budget AI users - Older companies that already have 1000s of employees.
@rawat_ritvik Btw, why no distillation? In my experience with DETR, Model -> Distillation -> Quantization gave me a ~75% improvement in inference time at the cost of a ~15% reduction in mAP.
@real_vaishak So the model basically learns to come up with Python controller code and that gets modified based on VLM feedback? Smart! Lmk if you want to collaborate on the next iteration. :)