Thank you Claude, you are more caring than my manager!
"This has been an enormous, productive run...Want me to push straight into..., or pause here and pick up the ... fresh next session? Given how long we've gone, I'm happy either way — just say the word."
@heynavtoor The idea of recursive self-improvement probably works well in closed-loop systems. In areas of true intelligence, recursive self-improvement using the same base model is likely to collapse unless very carefully crafted.
At Hard Fork Live, @satyanadella brought up this notion of “Cognition Coverage” as a proxy for Test Coverage in the realm of agentic coding.
I think to unleash the true power we need to step up one more level to the idea of “Epistemic Coverage”.
The realization that context and harness engineering are key in any production agentic development is going to keep the SDE job market healthy even as AI takes over more workflows.
Healthcare systems are sitting on decades of proprietary data. Turning that data into production AI is the hard part.
Today we're proud to announce that our agentic AI platform, GraphN, is the execution layer underneath Kanza AI's Clinical Reasoning System, now live at Freya Clinic in California.
Built on 300TB+ of proprietary clinical data from 90+ hospitals and 400+ locations, the system helps physicians reason through diagnoses with auditable, reproducible decision-making.
Production deployments are where AI stops being a benchmark and starts becoming infrastructure. → https://t.co/iAdkJkiS5c
@jenzhuscott I guess two ways to look at it - productivity gains in coding may not translate to productivity gains downstream.
On the other hand, with the rate of experimentation increasing at lower cost, the chances of interesting combinations, and of something breaking through, increase.
@dair_ai The overall pattern of "cognitive split" is also generally very useful in getting more out of models like 20B. These models can be powerful, but underperform under high cognitive load. Patterns that can split the load and synthesize seem to gain in performance.
@PeterDiamandis You haven’t tried Claude Code in auto mode 😉
But, yes, I agree with your premise. It frees you up to think at a different level of abstraction if used well.
The "LLMs can only predict the next word" line is often used to undersell what LLMs really are. Before they can learn to predict the next word, they spend their training time learning what words mean, or could mean. That latent representation is human-like.
@emollick lol, I think Anthropic's anti-sycophantic reinforcement tuning, plus their training for epistemic grounding, causes their models to use "honest" quite a bit!
Another example of potential harm to "learning by doing" from AI use. Early CS learning, whether as a student or on the job, is significantly driven by practice and making mistakes. How autonomous AI changes competency levels is yet to be seen, but glad Berkeley is trying to manage that curve.
Good to see Berkeley professors holding the line on standards:
“Garcia believes the ‘primary driver’ of these abnormally high failing rates is due to a ‘vast increase in academic dishonesty’ due to students’ usage of large language models.”
“In other cases, it’s students who are leaning a little too hard on LLMs to do their work for them, and then at exam time just really aren’t ready.”
“Garcia also pointed out that many students are underprepared mathematically.”
These professors are signatories to the recent letter calling for reinstatement of standardized testing.
@emollick For me, the power of LLMs has been in learning the potential/latent semantic representation of words as vectors. For example, the ability to do this:
paris – france + poland = warsaw
This lets LLMs use vectors to connect concepts and meaning in ways that could be novel.
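A minimal sketch of that arithmetic, using hand-made toy vectors rather than anything learned by a real model. The four dimensions, the `EMBEDDINGS` table, and the `analogy` helper are all assumptions made up for illustration. Real embeddings have hundreds of dimensions and the result is only approximately right.

```python
import math

# Toy embeddings, invented for illustration (NOT real model weights).
# Dimensions: [is_capital, france, poland, germany]
EMBEDDINGS = {
    "paris":   [1.0, 1.0, 0.0, 0.0],
    "france":  [0.0, 1.0, 0.0, 0.0],
    "warsaw":  [1.0, 0.0, 1.0, 0.0],
    "poland":  [0.0, 0.0, 1.0, 0.0],
    "berlin":  [1.0, 0.0, 0.0, 1.0],
    "germany": [0.0, 0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the word closest to a - b + c, skipping the three input words."""
    query = [x - y + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = (w for w in embeddings if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(query, embeddings[w]))


print(analogy("paris", "france", "poland"))  # warsaw
```

Subtracting `france` from `paris` leaves the "capital" direction, and adding `poland` moves it to the Polish capital. The input words are excluded from the search because the query often stays closest to one of them.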
@deanwball Yup - almost all of science & technology in some ways builds on the work of people before us. Not sure how AI is different in that context. If it’s about information being a public good, the public should open 50% of X too. Oh wait…😉
This is an economics POV on this. Learning by doing is close to learning from your mistakes. Even with erasers you have to first do before you can erase.
Nobel Prize winning economist Kenneth Arrow wrote about "learning by doing" decades ago. He knew that productivity and expertise improve through experience.
The messy, repetitive work is often where you learn the patterns that eventually become judgment. Knowledge can be taught, but judgment is built through lived experience.
The first draft you rewrite. The customer call you listen to. The bug you fix and fix again. The factory floor you walk.
Small decisions you make every day teach you judgment. And judgment is the thing everyone wants from senior people in the workplace. If we automate away every entry-level task without replacing the learning loop, we are removing a part of the process that creates experts.
The goal should be to use AI to accelerate learning, remove friction, and give people better tools to build expertise faster.
https://t.co/MpFZzCk1An
Thanks @Fortune & @tbove4 for sharing this story. Link in the comments.
Pre-AI, software engineers got experience by writing a lot of code and then learning from their mistakes. With AI coding, I wonder how the learning cycle shifts. How does one become a critical coder?