@lhl@randomfoo.net @lhl - Twitter Profile

Pinned Tweet

over 3 years ago

Just an FYI, I've migrated to posting on my own Fediverse instance: https://t.co/X2j4mjYivQ (for those interested, I've also written up my setup: https://t.co/5fesHd4TXc which is running https://t.co/HWnYRp44Ki)

1

12

2

3

0

@lhl@randomfoo.net @lhl

about 1 month ago

@MarkWal06578936 @0ddette @xunzic_beetle After a long chat referencing Pompeian frescoes, Roman mosaics, surviving literary evidence, and considering how refined in anatomy, rendering, and sculptural subtlety the statues themselves are, these generations can't be less likely than the clown-car/paint-by-numbers versions

lhl's tweet photo: https://t.co/Uj8EXMvR9W

0

9

@lhl@randomfoo.net @lhl

about 2 months ago

@Birdyword The main issue isn't visual, but the 24/7 chronic noise. Low freq rumble is covered in ANSI S12.9 and ISO 1996-2 and there's no *technical* reason this couldn't be solved, just inadequate noise ordinances, and data center builders deciding not to due to cost and well, not caring.

0

3

0

399

@lhl@randomfoo.net @lhl

3 months ago

@giffmana The raw score is just ChromaDB embedding results. Don't get me wrong, I think it's great that people can just do things, but my Claude found a lot of issues w/ a lot of the README claims (not just evals): https://t.co/zcRFBVlnEb

0

3

0

2

590

@lhl@randomfoo.net @lhl

3 months ago

@andrewfarah @AsifVabche @hnshah Here’s my Python version btw - does bookmarks, likes, posts, articles, archives: https://t.co/ra2TMZjCY4

0

2

0

1

30

@lhl@randomfoo.net @lhl

4 months ago

@A_y_u_s_h_i_X @deredleritt3r @GaryMarcus @FT Pro cites both S.15 and S.192 (and S.89) in its response but while I'm not an Indian tax professional, I believe you are conflating employee and employer tax burden https://t.co/lv2eyQ0cYC - anyway, I think this supports @deredleritt3r 's Pro claim more than you think it does.

1

3

0

105

@lhl@randomfoo.net @lhl

8 months ago

@TheZachMueller While you're doing RW tests, would you mind running attention-gym/nvbandwidth/memtest_vulkan on these if they're easy to script? (I think the repo/dataset is actually great, especially if it's easy for people to fork/PR into)

0

280

@lhl@randomfoo.net @lhl

8 months ago

@AliTavallaie @rasbt @dontfearai @lmsysorg Not full support. If you want aotriton (FA) you have to manually build, and even then it still doesn’t get through a full attention-gym benchmark run. CK btw only compatible w gfx9 - ROCm on CDNA != ROCm on RDNA (much worse)

1

0

83

@lhl@randomfoo.net @lhl

10 months ago

@sparkycollier @ClementDelangue Of course I think our Shisa V2 and especially our 405B release was pretty cool https://t.co/nOnnfelin8

0

34

@lhl@randomfoo.net @lhl

10 months ago

@sparkycollier @ClementDelangue I don’t have a post but I’ve done evals on virtually every single major JA model: https://t.co/tHBdZcjVJ1 . There’s also maybe https://t.co/pe9BBnieGA if you ignore the scores (largely don’t reflect capabilities).

1

0

74

@lhl@randomfoo.net @lhl

about 1 year ago

I don't post much here anymore, but maybe this is worth an exception. I've spent basically all year working on an open model that is incredibly strong in Japanese. For those interested, full details published here: https://t.co/nOnnfelin8

shisa.ai @shisa_ai

about 1 year ago

We're incredibly proud to release the newest and most powerful member of our open, bilingual (JA/EN) Shisa V2 family: Llama 3.1 Shisa V2 405B The strongest model ever trained in Japan, it points to how even small Japanese AI labs can compete globally! 🤗 https://t.co/L2SXHEM0OH

shisa_ai's tweet photo: https://t.co/2Bf8klrscF

1

26

8

5

4K

0

6

1

0

856

@lhl@randomfoo.net @lhl

about 1 year ago

@VictorTaelin Largely non-actionable but I have a fair amount of research on bacterial meningitis, infection, and rehab/recovery from a few years ago that may be useful later: https://t.co/HXbwMHyhES

0

55

@lhl@randomfoo.net @lhl

about 1 year ago

@gosrum @2022_technology Ah, as expected, gemini-2.0-flash-exp is a lenient grader 😂 If you're interested, here are the scores for Qwen 3 8B and 30BA3B as evaluated with GPT-4.1: https://t.co/yaFisiPZJ0

2

0

1

147

@lhl@randomfoo.net @lhl

about 1 year ago

@typedfemale It’s more than that. DYOR, but for laser, T-CAT based TransPRK is almost always better than LASIK. ACD willing, and if you can afford the outpatient procedure with an experienced surgeon, I found that V5 ICL was the best option for risk and outcomes.

0

3

0

1

334

@lhl@randomfoo.net @lhl

over 1 year ago

@nisten For bs=1 llama.cpp does better than vLLM. For anything more you should be using sglang.

0

11

3

2K

@lhl@randomfoo.net @lhl

over 1 year ago

@realGeorgeHotz @AMD I'm not so sure on the 7900 XTX hardware - need VOPD w/ no stalls to hit peak FP16, L1 cache is shared between 2 WGPs, DMA seems weak (can't hit anywhere near peak MBW even on simple bs=1 inference). High throughput, low latency, high concurrency LLM inference is nontrivial, btw.

0

1

0

96

@lhl@randomfoo.net @lhl

over 1 year ago

@cognitivecompai @reguile1 @realGeorgeHotz @growing_daniel 7900XTX has 123 FP16 TFLOPS but only w/ dual issue VOPD. 3090 is 71.2 TFLOPS (142 w sparsity). 3090 also does 284/568 INT8 TOPS (7900 has no native INT8). For FP16 it may be possible to make 7900 XTX faster w/ perfect pipelining, but no one has done it yet.

0

2

0

251

@lhl@randomfoo.net @lhl

over 1 year ago

@sdw @Duderichy Helps being in Tokyo. Anytime I go to Hands or Loft, get assaulted w new choices and need to go do research, lol

0

1

0

75

@lhl@randomfoo.net @lhl

over 1 year ago

@Duderichy @sdw I often see people mention the G-1008 but I’m a G-1111 fan (has a slidable catch, much nicer file and design), or if you like the squarer look, the G-1305 has a magnetic catch.

1

5

0

2

555

@lhl@randomfoo.net @lhl

over 1 year ago

@nisten @Vultr For single-user speed `-tp 8` vs `-tp 4` should further decrease TPOT. You can also trade off some TTFT for better throughput & TPOT w/ something like `--num-scheduler-steps 8`. The most important thing I found for perf on MI300X was VLLM_USE_TRITON_FLASH_ATTN=0 (use CK FA)

0

2

0

93

@lhl@randomfoo.net @lhl

over 1 year ago

@JFPuget jokes/memes aside, I pretty much stick to mamba/conda these days if I need different CUDA versions, eg: `mamba install -c "nvidia/label/cuda-12.1.1" cuda-toolkit -y` (and set CUDA_PATH/HOME) gets me stood up in a 12.1 env in about 30s.

0

176
