@swyx @METR_Evals Imo this is the most important coding eval atm. I'm honestly so tired of explaining to my fellow AI researchers why I don't merge their PRs even though they solve the issue
@mil000 > It ends up pushing founders into predatory hostels and other bad housing situations
Is this why some angry founder pulled a gun on partners lol?
@Nick_Davidov I guess that's what happened:
- 4.6 is very creative, but tends to ignore ~10% of instructions in complex prompts, some patterns are overfit
- Anthropic tried to make it follow prompt instructions more literally
- turns out a lot of engineers like when Claude thinks for them
@garrytan Coverage is "executed unique lines of code by the test suite / total non-test lines of code". Your way of calculating coverage is incorrect
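To make that definition concrete, here is a minimal sketch of the ratio. The function name `line_coverage` and the `(file, line_number)` tuples are assumptions made for illustration; real tools such as coverage.py collect this data for you.

```python
def line_coverage(executed_hits, non_test_lines):
    """Unique executed non-test lines / total non-test lines."""
    non_test = set(non_test_lines)
    if not non_test:
        return 0.0
    # Deduplicate hits and ignore lines that belong to test files.
    unique_executed = set(executed_hits) & non_test
    return len(unique_executed) / len(non_test)


# Hypothetical data: a 10-line app.py, with hits recorded during a test run.
non_test_lines = [("app.py", n) for n in range(1, 11)]
executed_hits = [
    ("app.py", 1),
    ("app.py", 2),
    ("app.py", 2),       # same line hit twice, counted once
    ("app.py", 3),
    ("test_app.py", 1),  # test code, excluded from both sides of the ratio
]
ratio = line_coverage(executed_hits, non_test_lines)
```

Repeated hits and lines inside test files change neither the numerator nor the denominator, which is the point of "unique" and "non-test" in the definition.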
@Nick_Davidov From what I see, there's a strong negative correlation between how much effort someone puts into managing their social graph and their actual social status
I don't have a strong opinion on RLS overall, but exposing databases to agents via tools and relying on RLS for security is definitely a terrible idea.
The best pattern I've seen is using hydrated copies with sync engines + isolated getters and setters
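One possible reading of that pattern, as a rough sketch: the agent never gets a database connection, only a pre-filtered copy of one user's rows plus narrow accessor functions, and writes are queued for a trusted sync step. Everything named here (`AgentWorkspace`, `hydrate`, the `notes` rows) is hypothetical, and the sync engine itself is left out.

```python
from dataclasses import dataclass, field


@dataclass
class AgentWorkspace:
    """Hypothetical per-user copy handed to an agent instead of DB access."""
    user_id: str
    notes: dict                      # hydrated copy: only this user's rows
    pending: list = field(default_factory=list)  # writes queued for sync

    def get_note(self, note_id):
        # Isolated getter: can only read what was hydrated.
        return self.notes[note_id]

    def set_note(self, note_id, text):
        # Isolated setter: can only touch rows already in the copy.
        if note_id not in self.notes:
            raise KeyError(f"note {note_id!r} is not in this workspace")
        self.notes[note_id] = text
        self.pending.append((note_id, text))


def hydrate(db_rows, user_id):
    """Trusted code filters rows by owner before the agent sees anything."""
    own = {r["id"]: r["text"] for r in db_rows if r["owner"] == user_id}
    return AgentWorkspace(user_id=user_id, notes=own)


# Hypothetical table contents.
db_rows = [
    {"id": "n1", "owner": "alice", "text": "draft"},
    {"id": "n2", "owner": "bob", "text": "secret"},
]
ws = hydrate(db_rows, "alice")
ws.set_note("n1", "updated")
```

In this sketch the filtering happens once, in trusted code, so a mistake in a policy rule cannot leak another user's rows to the agent; a separate sync step (not shown) would validate and apply `ws.pending` to the real database.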
RLS was a mistake, and exposing that level of complexity to less technical users is asking for trouble.
It was a mistake in Firebase. It's a mistake in Supabase. It will be a mistake in the next product too.
I personally - even knowing how to secure it - would never touch it. It's the worst security footgun you can imagine. One small mistake and your data is available to the world.
LLMs are great for human-in-the-loop applications, but fail at deterministic developer tasks.
@interfaze_ai is a new AI model that outperforms general LLMs on high-accuracy tasks like OCR, object detection, web scraping, speech-to-text, classification and more.
Congrats on the launch, @yoeven and @khurdula!
https://t.co/vWEQt4h5Kg
@lorecirstea @effectfully you said "what if the company doesn't make enough money". I corrected you by saying the company doesn't need to make money to give equity to employees. This correction is valid regardless of everything else we discuss in this thread.