hooved @0xhooved - Twitter Profile

0xhooved retweeted

Softwired Technologies @softwiredtech

7 months ago

A technical overview of the internals of the @__tinygrad__ #webgpu backend. https://t.co/Oxd5ZmeeVm

2

36

10

44

13K

hooved @0xhooved

about 1 year ago

@hsu_steve Here’s a live demo of llama-3.2-1B that runs in the browser on newer phones like iPhone 15, that I made for tinygrad: https://t.co/xQoO9aq6mL

3

38

5

22

16K

0xhooved retweeted

the tiny corp @__tinygrad__

about 1 year ago

Try YOLO in your browser! tinygrad is probably the easiest framework to do this sort of stuff with.

5

280

8

83

12K

hooved @0xhooved

about 1 year ago

The kernel graph for Llama inference in @__tinygrad__

0

38

5

12

5K

hooved @0xhooved

over 1 year ago

You can force usage of the WASM backend (even if WebGPU is enabled) using this link: https://t.co/v6GKem3KUa The model uses 1.2 GB of memory, with the llama-3.2-1B weights having been quantized to int8 (with float32 scales) using tinygrad.

0

2

0

411
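The int8-with-scales quantization mentioned in the tweet above can be illustrated with a minimal sketch. This is not tinygrad's or tinychat's actual code: the function names, the symmetric scheme, and the per-row granularity are all assumptions made for illustration. The idea is that each weight is stored as one byte plus a shared floating-point scale, which is roughly consistent with a memory footprint near 1.2 GB for a model with about a billion parameters.

```python
# Hypothetical illustration of symmetric int8 quantization with one scale per row.
# Names and granularity are assumptions, not taken from tinygrad or tinychat.

def quantize_int8(row):
    """Map a list of floats to integers in [-127, 127] plus a single scale."""
    max_abs = max(abs(x) for x in row)
    # Avoid dividing by zero for an all-zero row.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(x / scale))) for x in row]
    # A real implementation would store `scale` as float32; Python floats are float64.
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and their scale."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the usual trade-off for cutting storage to one byte per weight.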

hooved @0xhooved

over 1 year ago

Check out tinychat, a browser LLM app built with @__tinygrad__, which runs llama-3.2-1B locally on both WebGPU and WASM, including on newer phones such as iPhone 15. 🔗👇 🧵 https://t.co/jg8O7ec1fk

3

57

4

22

7K

hooved @0xhooved

over 1 year ago

For best performance, try https://t.co/xQoO9apyxd on a PC/laptop with WebGPU enabled. If WebGPU isn't enabled (or your device doesn't support large enough WebGPU buffers), then the app will automatically fall back to using WASM, which still works but is slower.

2

3

0

1

446
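The fallback behaviour described in the tweet above amounts to a small decision rule, sketched here in Python for illustration only. The real app runs in the browser, so this is not its code: `choose_backend`, its parameters, and the `force_wasm` flag are hypothetical names, and the exact buffer-size check the app performs is an assumption.

```python
# Hypothetical sketch of the WebGPU-to-WASM fallback decision.
# All names and parameters are assumptions, not the app's actual API.

def choose_backend(webgpu_available, max_buffer_bytes, required_buffer_bytes,
                   force_wasm=False):
    """Return "webgpu" when it is usable, otherwise fall back to "wasm"."""
    if force_wasm:
        # The user explicitly asked for the slower but more portable backend.
        return "wasm"
    if not webgpu_available:
        return "wasm"
    if max_buffer_bytes < required_buffer_bytes:
        # The device exposes WebGPU but cannot allocate large enough buffers.
        return "wasm"
    return "webgpu"
```

For example, a device that reports WebGPU support but a buffer limit below what the model needs would be routed to WASM, matching the "still works but is slower" path.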
