Cursor Blame, git ai, and tools like them are solutions in search of a problem. The idea that it matters what produced a line of code, be it a series of keystrokes, a language model, or an LSP action, is a fallacy. It's like wanting to know whether a developer wrote their code in Vim or Emacs. Code is code: the same final artifact will reliably produce the same outcomes.
i'm only halfway through the interview, but he's said that rl is basically bad because it selects for trajectories that are noisy. what he failed to mention is that rl is an iterative process: advantages are recomputed on freshly sampled trajectories every cycle, so the noise averages out across updates while the signal accumulates.
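to make the iterative point concrete, here's a minimal sketch, not anything from the interview: a toy 3-armed bandit trained with a REINFORCE-style update and a batch-mean baseline. the arm means, noise level, learning rate, and the `train` helper are all made-up assumptions for illustration. each single reward is noisier than the gap between arms, but the per-cycle advantage estimates still push probability toward the best arm over many updates.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(true_means, noise_std=1.0, iterations=200, batch_size=64, lr=0.5, seed=0):
    """Policy-gradient loop on a noisy bandit (hypothetical toy setup).

    Returns the probability assigned to the best arm after each update.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    best = max(range(n_arms), key=lambda i: true_means[i])
    logits = [0.0] * n_arms  # start from a uniform policy
    history = []

    for _ in range(iterations):
        probs = softmax(logits)

        # Sample a fresh batch of one-step "trajectories" from the current policy.
        actions = rng.choices(range(n_arms), weights=probs, k=batch_size)
        rewards = [rng.gauss(true_means[a], noise_std) for a in actions]

        # Advantage = reward minus the batch-mean baseline.
        baseline = sum(rewards) / batch_size

        # Accumulate advantage * grad log pi(a) for a softmax policy.
        grad = [0.0] * n_arms
        for a, r in zip(actions, rewards):
            adv = r - baseline
            for i in range(n_arms):
                indicator = 1.0 if i == a else 0.0
                grad[i] += adv * (indicator - probs[i])

        # Gradient ascent step, averaged over the batch.
        logits = [l + lr * g / batch_size for l, g in zip(logits, grad)]
        history.append(softmax(logits)[best])

    return history


if __name__ == "__main__":
    hist = train([0.0, 0.2, 1.0])
    print("best-arm probability after first update:", hist[0])
    print("best-arm probability after last update:", hist[-1])
```

any one batch here can point the wrong way, which is the "noisy trajectories" complaint. the sketch only shows that repeated sampling and updating lets the average direction win out in a toy setting; it says nothing about how well that holds for long-horizon tasks.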