I wrote an LLM inference engine from scratch in Rust.
No PyTorch. No ONNX. No Python.
Just one binary that downloads a model and runs GPU-native inference on Apple Silicon + NVIDIA.
It beats llama.cpp at Q4 on Apple Silicon. Here's how it works 🧵
Built solo to understand GPU inference end-to-end.
I also document where it loses today (e.g. CUDA prefill trade-offs) so benchmarks stay honest.
If this is useful, star it:
https://t.co/CC9msIZ7lq
#RustLang #LLM #LocalLLaMA
@helloitsaustin People in the higher echelons, such as distinguished fellows, typically don't get interviewed in traditional ways. They decide whether the company and its mission are worth their time.
The US federal budget is $7 trillion.
There are 535 members in the Senate and House.
They collectively allocate that money.
Specifically: $7,000 billion / 535 ≈ $13 billion per official.
Which means AOC is a political billionaire.
She allocates far more than any market billionaire.
Indeed she allocates ten billion liquid, per year.
A market billionaire has one billion, illiquid, per life.
So: AOC spends 10-100X what a market billionaire has.
It's not even close.
She's right about one thing, though.
AOC didn't earn the tens of billions she spends.
The state taxed that money, by force.