alvin @e3he0 - Twitter Profile

2 days ago

The internet tells you to validate before you build. Sometimes the fastest way to validate is to build the damn thing. You learn more from 1 week of shipping than 3 months of planning.

0

1

0

16

alvin @e3he0

3 days ago

@JakeOJeffYT getting fat and bald during the process but hey we are shipping something at least lol

0

1

0

19

alvin @e3he0

7 days ago

@suni_code https://t.co/qdJanozrXw

0

11

alvin @e3he0

about 2 months ago

anyone at yc??? wld love to meet you!

1

2

0

77

alvin @e3he0

2 months ago

yhhh okayy we’ll be at @ycombinator soonn

0

1

0

58

alvin @e3he0

3 months ago

which can be loaded into cache somehow, but even if we managed to do that, we could only do it for one token. after the activation vector is multiplied with the weights, it's useless and should be disregarded because it's in cache, and you need it again when you do the same ops for another token

0

2

0

42

alvin @e3he0

3 months ago

Ran a 1B parameter LLM on my CPU and profiled it. 30 seconds total. 0.058 seconds of actual compute. 39.76% cache miss rate. The CPU spent 99.8% of the time waiting for data. Inference isn't slow because your CPU is weak. It's slow because weights can't move fast enough from RAM.

https://t.co/QrHJCioDff

2

1

0

83
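A rough back-of-envelope check of the numbers in the tweet above, as a minimal sketch. The function names, the 2 bytes per weight (fp16), and the 25 GB/s RAM bandwidth are illustrative assumptions, not figures from the original profile run.

```python
def stall_fraction(total_s: float, compute_s: float) -> float:
    """Share of wall-clock time not spent on actual compute."""
    return (total_s - compute_s) / total_s


def min_seconds_per_token(n_params: float, bytes_per_param: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Lower bound on time per token if every weight must be read
    from RAM once per token (a memory-bound estimate)."""
    return n_params * bytes_per_param / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Figures quoted in the tweet: 30 s total, 0.058 s of compute.
    print(f"stalled: {stall_fraction(30.0, 0.058):.1%}")

    # Assumed values: 1B params, fp16 weights, 25 GB/s RAM bandwidth.
    bound = min_seconds_per_token(1e9, 2, 25e9)
    print(f"bandwidth-bound floor: {bound:.3f} s/token")
```

Under these assumptions the floor is set by how fast the weights stream from RAM, not by how fast the CPU can multiply.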

alvin @e3he0

3 months ago

Modern computer architecture works on the assumption that if you access X, then you'll prob access X again soon, and it works like magic. fuck, that was genius. but when it comes to LLMs, the weights' size is in megabytes

1

2

0

51
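A toy illustration of the locality point above, assuming a simple LRU cache and weights read front to back once per token. The `lru_hit_rate` function and the block counts are hypothetical; real CPU caches are set-associative and use hardware prefetching, so this only sketches the idea.

```python
from collections import OrderedDict


def lru_hit_rate(n_blocks: int, cache_blocks: int, n_tokens: int) -> float:
    """Hit rate when each token reads weight blocks 0..n_blocks-1 in order
    through an LRU cache that holds cache_blocks blocks."""
    cache = OrderedDict()
    hits = accesses = 0
    for _ in range(n_tokens):
        for block in range(n_blocks):
            accesses += 1
            if block in cache:
                hits += 1
                cache.move_to_end(block)  # mark as most recently used
            else:
                cache[block] = True
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict least recently used
    return hits / accesses


if __name__ == "__main__":
    # Weights fit in cache: only the first pass misses.
    print(lru_hit_rate(n_blocks=8, cache_blocks=16, n_tokens=4))
    # Weights are larger than the cache: each block is evicted before reuse.
    print(lru_hit_rate(n_blocks=32, cache_blocks=16, n_tokens=4))
```

When the weights exceed the cache, every block is evicted before the next token needs it, so "you'll access X again soon" stops paying off.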

alvin @e3he0

3 months ago

wrote forward propagation by hand on paper today just to actually understand it. not gonna lie, derivatives and the idea of slope finally clicked. building toward making inference cheaper on my RX 7600. Maybe JUST maybe it's a pipe dream but whatever...

0

3

0

37

e3he0 retweeted