‼️TO ANYONE IN MY NETWORK‼️
tl;dr: fund travel to SF for me + my friend, and we'll pay you back if we win a hackathon 🙏
My friend and I really want to participate in YC's Web Agents Hackathon: https://t.co/dPkl31kRr4. We've both been admitted, but because the event is next weekend, flight tickets are costing us upwards of $500 each.
We have a lot of amazing ideas, from browser-embedded recommendation systems for reading research papers, to AI safety guardrails that prevent malicious prompting on the user layer itself. If we go, we're entirely committed to building something incredible.
I have experience with agentic infra + customized server architectures for multimodal inference workloads, and my friend is a genius at all things computer and GPU architecture (he can tell you exactly how the chiplet structure of a Blackwell works).
I can't guarantee we'll win... but our odds are pretty good. For any amount we win that is >= $1k, we'll give you 50% of ALL our winnings.
If you're interested in gambling on a pair of college kids and willing to buy us round-trip flight tickets from BOS to SFO for $1k...hit me up.
repost (for reach) if you want :)
‼️🙏🔨
My friend (and roommate) Vishnu is super cracked at robotics and is looking for a job
He’s been doing work on VLAs and simulation farms at Dexmate and runs a research team back at a Berkeley lab
Insanely hard worker, I literally never get to see the guy, and he’s a passionate conveyer of ideas
Please dm @VishnuSangli if you'd like to chat!
Years ago, MIT students got a police car on top of the Great Dome. To this day, people still don't know how they pulled it off.
I live by this now.
Do something so crazy people are left wondering how you even pulled it off.
Paradigm Agent and Model Evaluations
We designed a robust evaluation system to assess and improve the quality of the data our agents generate.
More on what we did to ensure top performance across all of our key verticals (30+ use cases) below.
It isn't a Microsoft Excel killer.
It's a new user interface for AI.
Thanks @annarmonaco for giving us an up-close look at @paradigmai
Recorded on my Apple Vision Pro.
i spent the last 3 months building https://t.co/6HZis0g5SC, a minimal TPU that runs both batched inference and training fully on-chip!
here's a breakdown of how it was built 🧵
Paradigm is the AI-native spreadsheet to eliminate menial work. Thousands of users have saved 10,000+ hours with Paradigm, and you can be next.
Get your first month free today, then plans start at just $20/month.
📢 Introducing ServingBench, a research-to-production integration benchmark for ML serving engines
A little project I’ve been brewing:
In 12 months, traditional ML will feel antiquated—every “small” model will just be another LLM.
We’re on the cusp of a surge in hyper-minimized NVIDIA Blackwell B200s with FP4 support. To actually leverage them, you need to transform PyTorch prototypes into SRAM-/VRAM-savvy kernels.
And this will be vital for ANY “AI-native” product, from JARVIS-style assistants to true on-edge companions.
Yet there’s always a sizable lag between paper and production in vLLM/TensorRT-LLM. Months are lost hand-crafting and tuning kernels, whether by individuals productionizing their own work or by the framework maintainers themselves.
But researchers should stick to theory, not sweat over “magic” tile sizes for their tiled matmuls!
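To make the tile-size point concrete, here's a minimal pure-Python sketch (not ServingBench code, and nothing like a real GPU kernel). The names `tiled_matmul` and `sweep_tile_sizes` are hypothetical, made up for illustration. The result is the same for every tile size; only the loop blocking changes, which is exactly why the "best" value ends up being found by trial and error:

```python
import time


def tiled_matmul(a, b, tile=2):
    """Multiply a (n x k) by b (k x m), visiting the work in tile x tile blocks."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # block row of the output
        for j0 in range(0, m, tile):      # block column of the output
            for p0 in range(0, k, tile):  # block along the shared dimension
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = out[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        out[i][j] = acc
    return out


def sweep_tile_sizes(a, b, candidates):
    """Time each candidate tile size; returns {tile: seconds}. Hypothetical helper."""
    timings = {}
    for tile in candidates:
        start = time.perf_counter()
        tiled_matmul(a, b, tile)
        timings[tile] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    size = 48
    a = [[float(i + j) for j in range(size)] for i in range(size)]
    b = [[float(i - j) for j in range(size)] for i in range(size)]
    for tile, seconds in sweep_tile_sizes(a, b, [4, 8, 16, 48]).items():
        print(f"tile={tile:>2}: {seconds:.4f}s")
```

On real hardware the sweep would be over kernel launch configs instead of Python loops, but the shape of the problem is the same: same math, different knob values, measured one by one.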
HuggingFace dataset: https://t.co/nZ16907XL2
Code: coming soon
Much like Kevin’s concerns, full vLLM implementations are even scarcer than CUDA / Triton kernels, so dataset curation is slightly challenging, although it shouldn’t be too hard a task with deep-dive agents.
TODO:
- Keep improving ServingBench with more examples
- Run these on an actual Blackwell (Modal environment) for the final step of each evaluation, although i'd like to get more validation before spending more money lol
Ideally, with a larger dataset, a fine-tuned LLM would have more context on which parameters / knob values work best, and what remains is some well-guided gradient descent.