rf @rf - Twitter Profile

Pinned Tweet

rf @rf

over 3 years ago

I'm also at @[email protected] / https://t.co/25Ta1JqLbX

rf @rf

5 months ago

@tender_subject @deepimpactcrier @Support jebus 🤦 obvs you want the OG account back either way, but hey, a glorious microblogging future awaits on bsky

rf @rf

5 months ago

@mycoliza @usgraphics @oxidecomputer well hello, 12.8T of switch capacity in 4U

rf @rf

5 months ago

@ID_AA_Carmack I should have expected it would be an Andy Weir book 😂

rf @rf

6 months ago

@mycoliza @emphaticist recently ServeTheHome reviewed a mikrotik switch with SFP56 ports. afaik the only NICs speaking SFP56 are some mellanox ones and one new announced-but-unavailable AMD NIC. it's a physical manifestation of the phrase "technically the spec also allows..."

rf @rf

6 months ago

@jerhadf I genuinely like doing simple things fast with Haiku (or, in other tools, the fast GLM). It'd be cool if people could get that experience with some mitigation/safety net for when a fast model doesn't appear to be the right tool

rf @rf

6 months ago

@jerhadf Y'all have lots of options now (Haiku no-think...Opus high) w/speed and usage limit tradeoffs, and leave users to guess. 'Auto - balanced', or letting the model suggest calling in a stronger one if stuck or given a hard task (e.g. Haiku w/a multi-file refactor) could help

rf @rf

6 months ago

Huh: this is a 48B model that itself won't do much, but efficient attention that works well'd be pretty nice for "let's have a chat re: these 10k lines of code" and such

Dillon Uzar

@DillonUzar

7 months ago

Context Arena Update: Added kimi-linear-48b-a3b-instruct [11-08] and kimi-k2 (Thinking) [11-06] to the MRCR leaderboards.

The Linear 48b results are fascinating! It actually outperforms the new Gemini 3.0 Pro Thinking on 4-needle and 8-needle tasks at higher context lengths (512k+). I've added it to 2needle, 4needle, and 8needle.

kimi-k2 (Thinking) lands lower on the leaderboards (Rank #22 for 2-needle AUC @ 128k), with a hard context ceiling around 262k. I did not run it for 2needle and 4needle.

All results at: https://t.co/gLEWzxoXWG

The performance curve for the Linear model is distinct: while it underperforms Gemini 3 significantly at shorter contexts (<=256k) on the difficult 8-needle test, its degradation slope is much flatter. Gemini starts higher and drops fast; Kimi starts lower but holds steady, overtaking Gemini at the higher end.

However, note that kimi-linear-48b has noticeable performance drops past 128k on the easier 2 & 4 needle tests. Additionally, due to lower token efficiency compared to Gemini/GPT, only ~60% of the 1M token tests successfully ran (hitting limits/OOM). So some caution with the results at the 1M level.

kimi-linear-48b results:

2-Needle Performance (@ 128k / @ 1M):
- AUC: 96.5% (vs Gem 3: 99.5%) / 81.7% (vs Gem 3: 85.5%)
- Pointwise: 96.0% (vs Gem 3: 99.0%) / 77.0% (vs Gem 3: 72.2%)

4-Needle Performance (@ 128k / @ 1M):
- AUC: 85.5% (vs 85.8%) / 62.7% (#1, beating Gem 3: 57.3%)
- Pointwise: 83.7% (vs 80.8%) / 51.5% (#1, beating Gem 3: 34.3%)

8-Needle Performance (@ 128k / @ 1M):
- AUC: 54.9% (vs 73.0%) / 43.8% (#1, beating Gem 3: 39.0%)
- Pointwise: 49.0% (vs 54.2%) / 35.3% (#1, beating Gem 3: 24.5%)

A very different architectural approach yielding impressive stability at scale. Because of its current price point, it is very competitive for long context (MRCR).

Enjoy.

@Kimi_Moonshot
@GoogleDeepMind @googleaidevs
@OpenAI @OpenAIDevs

rf @rf

6 months ago

finally HLVM

neural oscillator of uncertain significance @mycoliza

6 months ago

cool idea. we could execute a restricted set of bytecode instructions for operating on objects stored by the server. i propose the following instruction set:
- GET
- HEAD
- PUT
- POST
- DELETE
- CONNECT
- OPTIONS
- PATCH

rf @rf

7 months ago

@mycoliza @keysmashbandit Catte

rf @rf

7 months ago

@tom7 Hah, I read the name "Tom Murphy" without realizing! To be fair they didn't disambiguate from Toms Murphy 1-6 (or 8+).

rf @rf

8 months ago

@copyconstruct Where is it? Searched for phrases and so on, no luck

rf @rf

8 months ago

@mycoliza computer ebay, fs, just some of the infohazards offered to you on the internet

rf retweeted

Jake M. Grumbach @JakeMGrumbach

8 months ago

And look at the ages of donors to AOC vs Hakeem Jeffries

rf retweeted

Scott Manley

@DJSnM

8 months ago

A mysterious expert in submersibles was interviewed by the Coast Guard during the Titan investigation. His name is redacted, but we barely get into the interview before it becomes obvious who it is.

https://t.co/qtZ3Gntp1P

rf @rf

8 months ago

@ManishEarth Mexico City's 19 Sep 2017 quake was two hours after an earthquake drill (on the anniversary of the very bad 19 Sep 85 earthquake)

rf retweeted

Séamus Malekafzali

@Seamus_Malek

8 months ago

Got sent this clip from the Iranian comedy show "Bachelors". Subtitled it because I thought it was hilarious

rf @rf

8 months ago

@sarah_micheleg yeah, we push new releases to a few instances before the rest (so if there's a bug we catch that first day, it bites fewer folks). so the others will probably be w i d e tomorrow. if folks are not loving the w i d t h, use that support form, we want to hear

rf @rf

8 months ago

@TheToriParadox lameMORIAAAAA deUNAYER!!

rf @rf

8 months ago

@francoisfleuret ...he needed to cut 1/7th out, so his new cookie would be the same size as the ones in the box
