@mycoliza @emphaticist recently ServeTheHome reviewed a mikrotik switch with SFP56 ports. afaik the only NICs speaking SFP56 are some mellanox ones and one new announced-but-unavailable AMD NIC. it's a physical manifestation of the phrase "technically the spec also allows..."
@jerhadf I genuinely like doing simple things fast with Haiku (or, in other tools, the fast GLM). It'd be cool if people could get that experience with some mitigation/safety net for when a fast model doesn't appear to be the right tool
@jerhadf Y'all have lots of options now (Haiku no-think...Opus high) w/speed and usage limit tradeoffs, and leave users to guess. 'Auto - balanced', or letting the model suggest calling in a stronger one if stuck or given a hard task (e.g. Haiku w/a multi-file refactor) could help
Huh: this is a 48B model that itself won't do much, but efficient attention that works well'd be pretty nice for "let's have a chat re: these 10k lines of code" and such
Context Arena Update: Added kimi-linear-48b-a3b-instruct [11-08] and kimi-k2 (Thinking) [11-06] to the MRCR leaderboards.
The Linear 48b results are fascinating! It actually outperforms the new Gemini 3.0 Pro Thinking on 4-needle and 8-needle tasks at higher context lengths (512k+). I've added it to 2needle, 4needle, and 8needle.
kimi-k2 (Thinking) lands lower on the leaderboards (Rank #22 for 2-needle AUC @ 128k), with a hard context ceiling around 262k. I did not run it for 2needle and 4needle.
All results at: https://t.co/gLEWzxoXWG
The performance curve for the Linear model is distinct: while it underperforms Gemini 3 significantly at shorter contexts (<=256k) on the difficult 8-needle test, its degradation slope is much flatter. Gemini starts higher and drops fast; Kimi starts lower but holds steady, overtaking Gemini at the higher end.
However, note that kimi-linear-48b has noticeable performance drops past 128k on the easier 2 & 4 needle tests. Additionally, due to lower token efficiency compared to Gemini/GPT, only ~60% of the 1M token tests successfully ran (the rest hit limits/OOM). So treat the results at the 1M level with some caution.
kimi-linear-48b results:
2-Needle Performance (@ 128k / @ 1M):
- AUC: 96.5% (vs Gem 3: 99.5%) / 81.7% (vs Gem 3: 85.5%)
- Pointwise: 96.0% (vs Gem 3: 99.0%) / 77.0% (vs Gem 3: 72.2%)
4-Needle Performance (@ 128k / @ 1M):
- AUC: 85.5% (vs 85.8%) / 62.7% (#1, beating Gem 3: 57.3%)
- Pointwise: 83.7% (vs 80.8%) / 51.5% (#1, beating Gem 3: 34.3%)
8-Needle Performance (@ 128k / @ 1M):
- AUC: 54.9% (vs 73.0%) / 43.8% (#1, beating Gem 3: 39.0%)
- Pointwise: 49.0% (vs 54.2%) / 35.3% (#1, beating Gem 3: 24.5%)
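To make the "starts lower but holds steady" pattern concrete, here is a minimal sketch that recomputes margins and 128k-to-1M drops from the Pointwise figures listed above. The dictionary layout and the helper names `margin` and `drop` are hypothetical, and it assumes the unlabeled "vs" numbers on the 4-needle and 8-needle lines refer to Gemini 3.

```python
# Pointwise MRCR scores in percent, copied from the lists above.
# Each tuple is (kimi-linear-48b, Gemini 3). Assumption: the unlabeled
# "vs" values on the 4-needle and 8-needle lines are Gemini 3 scores.
POINTWISE = {
    "2-needle": {"128k": (96.0, 99.0), "1M": (77.0, 72.2)},
    "4-needle": {"128k": (83.7, 80.8), "1M": (51.5, 34.3)},
    "8-needle": {"128k": (49.0, 54.2), "1M": (35.3, 24.5)},
}

KIMI, GEMINI = 0, 1


def margin(task: str, ctx: str) -> float:
    """Kimi score minus Gemini score, in percentage points."""
    kimi, gemini = POINTWISE[task][ctx]
    return round(kimi - gemini, 1)


def drop(task: str, model: int) -> float:
    """Points lost by one model between 128k and 1M context."""
    return round(POINTWISE[task]["128k"][model] - POINTWISE[task]["1M"][model], 1)


if __name__ == "__main__":
    for task in POINTWISE:
        print(
            f"{task}: margin @128k {margin(task, '128k'):+.1f}, "
            f"margin @1M {margin(task, '1M'):+.1f}, "
            f"drop kimi {drop(task, KIMI):.1f} vs gemini {drop(task, GEMINI):.1f}"
        )
```

On these numbers the Kimi model loses fewer points than Gemini 3 between 128k and 1M on every needle count, which is the flatter slope described above. The caveat about only ~60% of the 1M runs completing still applies to the 1M column.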
A very different architectural approach yielding impressive stability at scale. Because of its current price point, it is very competitive for long context (MRCR).
Enjoy.
@Kimi_Moonshot @GoogleDeepMind @googleaidevs @OpenAI @OpenAIDevs
cool idea. we could execute a restricted set of bytecode instructions for operating on objects stored by the server. i propose the following instruction set:
- GET
- HEAD
- PUT
- POST
- DELETE
- CONNECT
- OPTIONS
- PATCH
A mysterious expert in submersibles was interviewed by the Coast Guard during the Titan investigation. His name is redacted, but we barely get into the interview before it becomes obvious who it is.
@sarah_micheleg yeah, we push new releases to a few instances before the rest (so if there's a bug we catch that first day, it bites fewer folks). so the others will probably be w i d e tomorrow. if folks are not loving the w i d t h, use that support form, we want to hear