@SamarthMakhija @jshobrook @difficultyang wdym by watching claude code hill climb? like successive model releases improving on an eval? do you have an eval you like?
@michaeljmcnair a lot of future real-world capability improvements will come from RL training on proprietary evals/environments. not clear that these capabilities can be distilled without the environment(s) as well
@finn_hulse afaict their early mission of AI interviewing for real jobs was, and still is, a bad idea. then they pivoted to a more upscale version of scale, which was already proven
@finn_hulse also, you’re appealing to consensus by listing these current unicorns while also arguing for being anti-consensus. think a better set of examples could be picked
@finn_hulse re: mercor, i don’t see why what they’re doing CAN’T be vibecoded? micro1 and handshake are doing the same thing, also at unicorn status, and all of their expert hiring processes are automated.