⚠️ Fragile @thatmelilv - Twitter Profile

about 1 month ago

Introducing SubQ - a major breakthrough in LLM intelligence. It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do. That's nearly 1,000x less compute and a new way for LLMs to scale.

1K

23K

3K

19K

13M

thatmelilv retweeted

𝙫𝙞𝙫𝙞 ₊˚⊹ᰔ @smthlikevivi

about 2 months ago

we found his rolex

1

1K

539

21

58K

thatmelilv retweeted

Panya JT. @panyajt_pers

about 2 months ago

panyajt_pers's tweet photo. https://t.co/F0He11eLst

12

3K

6K

544

695K

thatmelilv retweeted

(･_･ 🚂 ᴬᴿᴵᴿᴬᴺᴳ @pbbung

2 months ago

P'Aoey has stirred things up now. The way the account owner replies to the comments is so adorable hahaha. He went from "the little one in the white shirt" straight to "papa" hahaha

2

1K

3K

110

117K

thatmelilv retweeted

2 months ago

41

338

15

36

63K

thatmelilv retweeted

Albert Gu

@_albertgu

3 months ago

The newest model in the Mamba series is finally here 🐍 Hybrid models have become increasingly popular, raising the importance of designing the next generation of linear models. We've introduced several SSM-centric ideas to significantly increase Mamba-2's modeling capabilities without compromising on speed. The resulting Mamba-3 model has noticeable performance gains over the most popular previous linear models (such as Mamba-2 and Gated DeltaNet) at all sizes. This is the first Mamba that was student led: all credit to @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9, and of course @tri_dao!

41

2K

312

844

447K

thatmelilv retweeted

ท๊าดา @TadaaVoila

3 months ago

So it turns out the apology back then had nothing to do with sympathy for the victim. They only posted because the trouble had finally landed on them.

3

3K

11K

182

338K

thatmelilv retweeted

小

@lostyouinmay

3 months ago

When I have a secret crush, I don't say "I'm flirting with you," but I say:

3

4K

3K

320

235K

thatmelilv retweeted

• win 🏳️‍🌈

@i_winnn

4 months ago

– Bangkok Art and Culture Centre (BACC) – Museum of Contemporary Art (MOCA) – Bangkok National Museum – Jim Thompson House Museum – Bank of Thailand Museum – PTT Metro Forest Learning Center – Museum Siam – Wat Phra Kaew – Wat Arun Ratchawararam – Wat Saket – Rot Fai Park (Wachirabenchathat Park) – Santichaiprakan Park – Benjakitti Park – Chatuchak Weekend Market – Taling Chan Floating Market – Talat Noi – Kwan Riam Floating Market – Khlong Lat Mayom Floating Market – Khlong Luang Floating Market – Sampeng Market – Yaowarat Road – Wang Lang area – Bangkok Planetarium

13

16K

12K

7K

960K

thatmelilv retweeted

Skyboyz

@Skyboyz15

4 months ago

A salary of 120,000 baht, yet had to sell 3 cars and has no money to spend .. the price of becoming an MP 🥺 ▪️Car payments: 25,000 baht ▪️Fuel: 20,000 baht ▪️Money for wife: 10,000 baht ▪️Left to spend: 30,000 baht ▪️Eats at parliament #ท็อปชุติพงศ์ #พรรคประชาชน #เลือกตั้ง69 #เลือกตั้ง2569

77

2K

123

408K

thatmelilv retweeted

Inyas Yamal

@Inyas2525

4 months ago

He did everything right. Loved her, supported her, carried her through the darkest parts of her life. And life still served him the cruelest ending. Loyalty doesn’t always guarantee a happy story. This is why people say relationships are a gamble. You can give someone your youth, your money, your career, your emotional strength… and still lose.

20

7K

327

366

558K

thatmelilv retweeted

SundayBoy

@SundayBoyInvest

4 months ago

Bro said he kinda mid 😭😭😭 Seriously, lay out a normal distribution: does Nai Arm really fall on the left side? Go touch some grass. You don't have to take an L on every drama, you know hahaha

21

861

1K

118

166K

thatmelilv retweeted

น้องติ๊งโหน่ง #LEMONADE @TiNGNONGNOMSOD

5 months ago

TiNGNONGNOMSOD's tweet photo. https://t.co/k3m1pAO3o0

0

4K

5K

367

3M

thatmelilv retweeted

Dimitris Papailiopoulos

@DimitrisPapail

8 months ago

A comical interpretation of this is that all RL+LLM papers using BF16 are... wrong. What's that? 5,000 papers in 2025?

17

583

35

274

85K

thatmelilv retweeted

(✿ ✿)° @chayahyunx

about 1 year ago

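The news says that after the #แผ่นดินไหว (earthquake), aftershocks will hit at such-and-such exact times. In reality, the timing of an aftershock cannot be predicted precisely in advance; it depends on many factors. Aftershocks will definitely occur, though, and their intensity will decrease from the mainshock according to Omori's law.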

9

13K

37K

903

3M

thatmelilv retweeted

Avi Chawla

@_avichawla

about 1 year ago

A collection of awesome MCP servers for AI Agents:

53

4K

528

7K

383K

thatmelilv retweeted

Tom Yeh

@ProfTomYeh

over 1 year ago

AlphaFold by hand✍️ Excel ~ I designed this exercise to show (1) MSA multi-head attention, (2) Pair triangular update, two key components of the EvoFormer architecture.👇Join the AI Math community. Download xlsx.

2

1K

207

748

105K

thatmelilv retweeted

AK

@_akhaliq

over 2 years ago

Meta presents Deconstructing Denoising Diffusion Models for Self-Supervised Learning paper page: https://t.co/7FURzDfd73 examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to deconstruct a DDM, gradually transforming it into a classical Denoising Autoencoder (DAE). This deconstructive procedure allows us to explore how various components of modern DDMs influence self-supervised representation learning. We observe that only a very few modern components are critical for learning good representations, while many others are nonessential. Our study ultimately arrives at an approach that is highly simplified and to a large extent resembles a classical DAE. We hope our study will rekindle interest in a family of classical methods within the realm of modern self-supervised learning.

2

480

105

239

131K

thatmelilv retweeted

Monkol Pitak @MngkhlPitakv

over 2 years ago

@MVDeepQuotes If you're under 15, you don't go to prison even if you kill someone, but if you violate Section 112, you definitely go to prison.

3

2K

12K

56

466K

thatmelilv retweeted

Phattarapong Leelaphat @PLeelaphat

over 2 years ago

I offer my condolences to the families of those who died and wish the injured a speedy recovery. I raised this issue not long ago, both with the NBTC and on the day of the Prime Minister's policy statement. I hope our country will finally get an official warning system for disasters and violent incidents. #cellbroadcast #พารากอน #สยามพารากอน

18

2K

27

354K
