GPT-5.6 Sol Leaks: July 7 Release
- OpenAI is targeting a July 7 launch, with July 8-9 as the likely fallback if there are any last-minute delays.
- All three models (Sol, Terra, and Luna) are to be available with a subscription.
- GPT-5.6 usage limits are to be much more generous, with stronger safeguards, though likely not as strict as Fable 5.
- GPT-5.6 Sol is already appearing for some Codex users, and Sol, Terra, and Luna are all present in the app's code, a strong sign the rollout is being prepared.
@bryan_johnson Is there any link between satiety and volume of food with autoimmune diagnosis, because lack of volume in our food when it reaches our tummy must also calm the nervous system and not just supply nutrition in a dense way for our body?
The best application for models to run locally on hardware you own would be personal robots. There’s no way anyone is going to get comfortable streaming their home to a server. And when this happens, the local hardware will also become a token faucet for your digital tasks.
KARPATHY JUST DROPPED THE 9 RULES THAT REPLACE EVERY CLAUDE PROMPT YOU'VE WRITTEN
configs are permanent. prompts are temporary. rewriting the same setup is the tax nobody names.
he moved the instruction from the prompt into the config.
you write it once, it reads every session -> stops guessing, stops half-shipping, stops touching what it shouldn't.
the throughline is the same in every rule: intent before syntax. forbid by default. delete more than you add.
CLAUDE.md is the memory the model does not have.
most people are still tuning prompts. this turns Claude into an engineer that already works there.
I haven't written the same setup twice in weeks. one file does what fifty prompts used to.
here is the document from Karpathy explaining every rule
As an AI engineer, please learn:
>Harness engineering, not just prompt engineering
>Context engineering, not just long prompts
>Prompt caching vs. semantic caching tradeoffs
>KV cache management, eviction, reuse, and memory pressure at scale
>Prefill vs. decode latency and why they optimize differently
>Continuous batching, paged attention, and throughput optimization
>Speculative decoding vs. quantization vs. distillation tradeoffs
>INT8, INT4, FP8, AWQ, GPTQ, and when quantization hurts quality
>Structured output failures, schema validation, repair loops, and fallback chains
>Function calling reliability, tool contracts, argument validation, and idempotency
>Agent guardrails, loop budgets, tool budgets, and termination conditions
>Model routing, graceful fallback logic, and degraded-mode UX
>RAG architecture: chunking, embeddings, hybrid search, reranking, and freshness
>Retrieval evals: recall, precision, grounding, attribution, and citation quality
>Evals: golden sets, regression tests, adversarial tests, LLM-as-judge, and human evals
>LLM observability as a first-class discipline: traces, spans, tokens, latency, errors, and drift
>Cost attribution per feature, workflow, tenant, and user journey, not just per model
>Safety engineering: prompt injection defense, data leakage prevention, and permission boundaries
>Multi-tenant isolation, cache safety, and cross-user context contamination prevention
>Fine-tuning vs. in-context learning vs. RAG vs. distillation, and when each is the wrong tool
>Latency, quality, cost, and reliability tradeoffs across the full inference stack
>Production failure modes: hallucinated tool calls, malformed JSON, stale retrieval, runaway agents, and silent eval regressions
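A few of the items above (schema validation, repair loops, fallback chains, loop budgets) fit into one small Python sketch. Everything named here is an assumption for illustration: the `tool`/`args` schema, the `call_with_repair` helper, and the `flaky_model` stub that stands in for a real model call. It is one possible shape, not a reference implementation.

```python
import json
from typing import Callable

# Hypothetical tool-call schema: field name -> required Python type.
REQUIRED_FIELDS = {"tool": str, "args": dict}


def validate(raw: str) -> dict:
    """Parse raw model output and check it against the hypothetical schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"field {field!r} must be of type {expected.__name__}")
    return data


def call_with_repair(model: Callable[[str], str], prompt: str, max_attempts: int = 3) -> dict:
    """Retry with the validation error fed back; fall back once the budget is spent."""
    current = prompt
    for _ in range(max_attempts):  # loop budget: never retry forever
        raw = model(current)
        try:
            return validate(raw)
        except ValueError as err:
            # Repair step: tell the model what was wrong with its last reply.
            current = f"{prompt}\n\nYour last reply was invalid ({err}). Reply with JSON only."
    # Fallback chain end: a safe, explicitly degraded result.
    return {"tool": "none", "args": {}, "degraded": True}


def flaky_model(prompt: str) -> str:
    """Stand-in for a real model: malformed first, valid after a repair hint."""
    if "invalid" in prompt:
        return '{"tool": "search", "args": {"query": "kv cache"}}'
    return '{"tool": "search", "args": '


print(call_with_repair(flaky_model, "Find docs on KV caches."))
print(call_with_repair(lambda p: "not json", "Find docs on KV caches."))
```

The first call recovers on the second attempt; the second call exhausts its budget and returns the degraded fallback instead of raising, which is the behaviour a degraded-mode UX would build on.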
Stanford professor Judy Fan went on stage at MIT and broke down why humans are so good at making the invisible visible...
And why AI hasn't actually learned to "see" the way we do.
It completely changes how you think about human intelligence vs. artificial intelligence:
1. Nature never gave us straight lines or sharp corners. The number line, the coordinate plane, even basic geometry are all human inventions. We created tools that do not exist in nature simply because we needed a way to think more clearly.
2. The coordinate system Descartes invented solved a problem that had stumped mathematicians for centuries: doubling the volume of a cube. Once invented, this tool became so indispensable that virtually every math curriculum on Earth still depends on it.
3. Humans have been doing this for at least 30,000 to 80,000 years. The story of human progress is inseparable from the story of marking up our environment, from cave walls to Galileo's telescope to Feynman diagrams of particles we will never see with our own eyes.
4. Every major scientific breakthrough relied on a visual tool that made something invisible visible. Darwin needed side-by-side illustrations of finches to see variation that was otherwise too subtle to notice. Cajal needed detailed drawings of neurons under a microscope to map how the nervous system was wired.
5. Fan's research group studies something deceptively simple: how people decide what to put into a drawing and what to leave out. When two people played a drawing game, sketchers used far more detail when the target object had close competitors than when it stood alone, all the way down to using fewer strokes and less time when more detail was not necessary.
6. People are not just copying what they see. They are making constant judgment calls about what level of detail actually serves the goal of communication, and they do this naturally without ever being taught the theory behind it.
7. There is a real difference between drawing something so someone can identify it and drawing something so someone can understand how it works. In one study, participants drew explanatory diagrams that emphasized moving, causal parts of a machine while depictive drawings emphasized background and overall appearance, even though both were drawing the exact same object.
8. Explanatory drawings were genuinely better at helping someone figure out how to operate a machine, but worse at helping someone identify which machine it actually was. You cannot optimize a single drawing for both goals at once. Communication always involves tradeoffs.
9. AI vision models trained on photographs generalize surprisingly well to simple, sparse sketches, suggesting that resemblance-based recognition is not just a story we tell ourselves. It is something modern neural networks can replicate with real accuracy.
10. But there remains a large, measurable gap between how confidently AI models recognize sketches and how confidently humans do, even when both groups answer the same questions about the same images. Humans are simply far more reliable and far more consistent in their judgments.
11. When researchers compared human-made sketches to AI-generated sketches under tight stroke budgets, both were similarly recognizable at higher budgets, but diverged sharply as the budget shrank. Humans and AI systems simplify drawings in fundamentally different ways once resources get scarce.
12. Reading a graph is not one single skill. It involves perception, knowing where to look, mapping that visual information onto the actual question being asked, and then translating that mapping into an answer. Each of these steps can independently break down, and people fail for very different underlying reasons even when they land on the same wrong answer.
13. When tested directly against humans on graph-reading tasks, leading multimodal AI models, including GPT-4V, showed a meaningful performance gap. Even when a model's overall accuracy approached human levels, its pattern of mistakes looked nothing like how humans actually get things wrong.
14. People choose entirely different types of charts depending on what specific question they are trying to answer, not out of a generic preference for bar charts or scatter plots. Their chart choices closely tracked which visualization would genuinely help someone answer that specific question correctly.
15. Two of the most widely used graph literacy tests in education research turned out to correlate strongly with each other, suggesting they measure overlapping skills. But when researchers dug into the actual error patterns, the standard categories used in textbooks, like "find the maximum" or "identify a cluster," failed to explain why people got things wrong nearly as well as a more basic, underlying four-factor model did.
16. The deepest goal behind all of this research is not just academic curiosity. It is to eventually help students and everyday people develop genuine literacy with the visual tools that science and modern decision-making increasingly depend on, because every generation should be able to see further than the last by standing on the visual tools the previous generation built.
Follow @yasminekho for more ideas on thinking better, becoming clearer & building a more intentional life.
"Designing Experiments and Analyzing Data: A Model Comparison Perspective" [Third Edition]: https://t.co/tm90R0zKMs
Amazon summary of the book's numerous pedagogical features:
🟡Examples of published research demonstrate the applicability of each chapter’s content.
🟡Flowcharts assist in choosing the most appropriate procedure.
🟡End-of-chapter lists of important formulas highlight key ideas and assist readers in locating the initial presentation of equations.
🟡Useful programming code and tips are provided throughout the book and in associated resources available online.
🟡Extensive sets of exercises help develop a deeper understanding of the subject.
🟡Detailed solutions for some of the exercises and realistic data sets are included on the website https://t.co/Z0PsOJGyUN
The person who built Claude Code mass-leaked the thinking behind it.
45 minutes of design decisions, mistakes, and where it's all going.
This is rare. Creators at this level don't usually talk this openly.
Stanford dropped their latest course on Parallel Programming, GPU, and CUDA.
24 hours, 19 lessons.
this is one of the hottest skills that AI labs are looking for. it covers:
> GPU architecture and CUDA
> performance optimization
> multi-core processors and architectures
watch here: https://t.co/sI9LgpnzD1
We’ve received notice that the Department of Commerce has lifted export controls on Claude Fable 5 and Mythos 5.
We'll begin restoring access tomorrow, and will share an update soon.
We’re grateful to our users for their patience, and to everyone who worked with us on redeploying the models.
Linux Finally Killed strncpy. It Took Six Years, 362 Patches, and 70 Contributors.
The story of a 40-year-old C function that looked safe but wasn't, and the codebase-wide work that removed it from the kernel entirely.
Article Link: https://t.co/EKq81ZS24z
In nearly 5 years of modern generative AI, this is the first book I’ve seen with this level of coverage and comprehensiveness.
> language modelling
> inference optimisation
> RL and its methods
> system scaling
> applied concepts like agentic AI, RAG, memory
> environments and benchmarking
These fields have subtle boundaries between them, but they ultimately overlap in modern applications. Agents require system scaling, memory needs inference optimisation, and RL requires an understanding of environments and benchmarks.
For the first time in my experience, it's all in one place. Found this on paperswithcode[.]co
Do yourself a favour
> go to https://t.co/auQJoYhm7b
> find “most cited” list of papers
> read the top 10 papers
> one or two papers per week
> read, read again, break it down, code it and write it back
Some of the most influential and transformative work of the last decade can be found here. It will be an amazing experience!!