@garrytan Garry you are too optimistic. From what I gathered with my US and German colleagues, applying AI in healthcare will still take quite some time to get that level of boost.
@jun_song Understanding the vision is one thing, and having taste is quite another thing. I do see much better results with a combination of certain models and the taste skill from @LexnLin. Most of capable models just need a bit kick.
@MiaAI_lab@u1tra_instinct Cool. What's the total tks we can get from the 12 concurrent sessions? Can't wait to put this on when the 2 Sparks arrive in a month.
@sachindetrax@MiaAI_lab@UnslothAI@NVIDIAAI Use Qwen3.6-35B-A3B Q4 quant. Fitting between 16GB VRAM and 16GB CPU RAM is okay with MTP enabled. Speed would take a hit, expect 20~40 tks generation with one session depending on your GPU (I assume 16GB VRAM is an old GPU).
@TheAhmadOsman One sensible way to do the math is how they depreciate IT assets in accounting, 5 years to zero and free milk afterwards. Individuals can fire up old PCs, buy used GPUs and start small. I'm already running local models on a 5 year old company workstation so it's literally free.
Also, obra's superpowers skills are amazing. Getting to know how to dev with Obsidian API & publishing procedure took some time but really interesting to format codes to avoid plugin review warnings & such with coding agents.
Built my first Obsidian plugin with pi agent and Qwen3.6-35B-A3B Q4KM UD served with llama.cpp. This is also my first project developed purely by local model & vibe coding (I don't know even the most basic java/ts). Check it out: https://t.co/h3CCmO7fHN
@s0me1suspicious@catalinmpit The thing is, these are company devices that I cannot take them apart and put the 2 GPUs into one build, hence the separate setup.
@hxiao Local inference for sure is not there yet, but for most users who don't run that many agents on longer tasks, they can satisfy most tasks. With recent developments with Fable, I've started to shift some of my work to rely on local models. This is going to start for most folks.
@TheAhmadOsman Hello Ahmad, today I used only a local model to develop a plugin for Obsidian. This is the first project I developed with only local models and I'm sure it won't be the last. It took me a couple of weeks' spare time to get it up and running. Thanks for the inspiration.
@TheAhmadOsman Felt this on Jun 1st when our corporate's github copilot switched to usage-based billing. Looking for an excuse to ask my boss to buy a GPU for local AI work.
@TheAhmadOsman Our company provides github copilot subscription and it turned into usage-based billing and I burnt through half of monthly credits on the first day. I don't think it's sustainable without a local model at this point, or a cheap Chinese API.
@natolambert@Zai_org Main issue is the lack of compute. It's hard for them to balance between training new models and selling more coding plan. Deepseek is in a much better situation because of their effort to train and build infra on Huawei chips.