New weekend hack:
Remember those "Pepsi vs Coke" blind taste tests? I built that for LLMs.
It’s fun to see how our opinions about which LLM is "better" change when you remove bias caused by branding/logos/etc.
Try it out 👉 https://t.co/PTeASgb5UG
@TheGeorgePu Remote isn't unproductive. Companies mandate RTO because they are on 10-20 year office leases, so they're trying to avoid sunk costs
Also, local municipalities put pressure on companies to do RTO to keep local economies running (e.g. restaurants, public transit, etc.)
@agnelnieves @shadcn Would be great, but adoption might be difficult because those platforms like having their name in the repo for branding/awareness purposes
"Vibe coding" does not mean any coding with AI.
If you are vibing, you're not looking at the code. You're building something throwaway or a prototype.
That is very different from using AI to write code that is reviewed, tested, and maintained over time.
https://t.co/0rOB769AEs
Model versions and benchmark scores no longer map cleanly to the improvements we actually perceive between LLM versions. Maybe we need a "perceptual curve" benchmark to measure the leaps humans actually notice (like what the Fletcher-Munson curve does in the audio world).
Making more progress on my agentic ML engineer project (it creates an ML pipeline and trained model for you given a prompt + data)
I have the basic flow working locally. Next step is moving it out of local dev into a scalable cloud env!
@mehulmpt This part here is very important. -> "even if you don't know the answer but have strong fundamentals, you can come up with the answer and explanation."
Curious how you'd assess a candidate who provides a well-reasoned response but misses the target?
Teaching myself Google's new MLE-STAR agent by building a tiny app that takes your plain-language prompt & dataset and returns a full ML pipeline + trained model. Learning a lot, can’t wait to share more! 🚀