@karpathy Running this pattern for months. The interesting wall I hit: cross-domain pollination — when insights from one KB should automatically inform another. Did you run into this yet? How do you handle it with multiple topic wikis?
U mnie to działa - wiem, jak to brzmi 😁 Po wykorzystywaniu GitHub Projects ( choć kazdy ai woli issues) wróciłem do epików w Markdown i działa to lepiej. Mam też mocne poczucie, że projekty w JS i Pythonie są lepiej ogarniane przez Claude Code. Z praktyki widzę, że język ma znaczenie. I tak, zarządzanie agentami to nie to samo co zarządzanie ludźmi.
“The Leftwing nut jobs at Anthropic have made a DISASTROUS MISTAKE trying to STRONG-ARM the Department of War” - Donald Trump
This is the best news I wasn’t waiting for. Anyone who knows Dario Amodei’s story, leaving OpenAI because AI safety mattered more than growth, saw this coming. The man is consistent to the bone.
Today thousands of AI engineers around the world are probably losing their minds that the best model just got kicked out of the Pentagon. But this is exactly the moment that separates companies with principles from companies with slide decks about principles.
Values in business cost real money. That’s exactly why they matter, you pass that test only when it actually hurts.
If anyone deserves an award for courage in tech it’s Amodei. Also Skynet is not coming as fast as I thought, because OpenAI models are definitely not Cyberdyne Systems.
#Anthropic #AI #Claude #Skynet
We’re building an LLM chip that delivers much higher throughput than any other chip while also achieving the lowest latency. We call it the MatX One.
The MatX One chip is based on a splittable systolic array, which has the energy and area efficiency that large systolic arrays are famous for, while also getting high utilization on smaller matrices with flexible shapes. The chip combines the low latency of SRAM-first designs with the long-context support of HBM. These elements, plus a fresh take on numerics, deliver higher throughput on LLMs than any announced system, while simultaneously matching the latency of SRAM-first designs. Higher throughput and lower latency give you smarter and faster models for your subscription dollar.
We’ve raised a $500M Series B to wrap up development and quickly scale manufacturing, with tapeout in under a year. The round was led by Jane Street, one of the most tech-savvy Wall Street firms, and Situational Awareness LP, whose founder @leopoldasch wrote the definitive memo on AGI. Participants include @sparkcapital, @danielgross and @natfriedman’s fund, @patrickc and @collision, @TriatomicCap, @HarpoonVentures, @karpathy, @dwarkesh_sp, and others. We’re also welcoming investors across the supply chain, including Marvell and Alchip.
@MikeGunter_ and I started MatX because we felt that the best chip for LLMs should be designed from first principles with a deep understanding of what LLMs need and how they will evolve. We are willing to give up on small-model performance, low-volume workloads, and even ease of programming to deliver on such a chip.
We’re now a 100-person team with people who think about everything from learning rate schedules, to Swing Modulo Scheduling, to guard/round/sticky bits, to blind-mated connections—all in the same building. If you’d like to help us architect, design, and deploy many generations of chips in large volume, consider joining us.
I’m an AI & cybersecurity specialist. What I see daily terrifies me. I’ve waited for someone to show the world what’s coming. “The AI Doc: Or How I Became an Apocaloptimist” — from the Oscar-winning creators of EEAAO. Most important film of the decade. Share this. Algorithms suppress it.
https://t.co/88m8aipAqn
I'm Boris and I created Claude Code. Lots of people have asked how I use Claude Code, so I wanted to show off my setup a bit.
My setup might be surprisingly vanilla! Claude Code works great out of the box, so I personally don't customize it much. There is no one correct way to use Claude Code: we intentionally build it in a way that you can use it, customize it, and hack it however you like. Each person on the Claude Code team uses it very differently.
So, here goes.
New research just dropped: this prompting technique cuts AI hallucinations by 50%.
It's called Model-First Reasoning.
Instead of asking "How do I solve [xxx] problem?"
You first force the AI to list: what's involved, what can change, what actions are possible, and what's not allowed.
THEN you ask it to solve using only what it wrote down.
So what makes this different from Chain-of-Thought?
CoT lets the AI think and solve, but at the same time.
It sounds smart. It flows well. But it makes stuff up along the way.
Model-First Reasoning creates a hard wall instead.
Define first. Solve second. No mixing.
The AI can ONLY use what it wrote down in step one. That's the trick.
The researchers tested it on medical scheduling, route planning, resource allocation, and logic puzzles.
Same pattern everywhere: fewer broken rules, more consistent outputs.
Why it works:
✦ LLMs make things up because they assume stuff you never told them.
✦ When you force them to write everything down first, there's nowhere to hide.
✦ It makes a stronger case for why "Human-in-the-loop" works much better, too: we make sure every step is validated before going to the next.
You can read the paper here: https://t.co/vTuQvsNyCk.
OpenAI, Anthropic, and Google AI engineers use 10 internal prompting techniques that guarantee near-perfect accuracy…and nobody outside the labs is supposed to know them.
Here are 10 of them (Save this for later):
I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year and a failure to claim the boost feels decidedly like skill issue. There's a new programmable layer of abstraction to master (in addition to the usual layers below) involving agents, subagents, their prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations, and a need to build an all-encompassing mental model for strengths and pitfalls of fundamentally stochastic, fallible, unintelligible and changing entities suddenly intermingled with what used to be good old fashioned engineering. Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession. Roll up your sleeves to not fall behind.
I love the expression “food for thought” as a concrete, mysterious cognitive capability humans experience but LLMs have no equivalent for.
Definition: “something worth thinking about or considering, like a mental meal that nourishes your mind with ideas, insights, or issues that require deeper reflection. It's used for topics that challenge your perspective, offer new understanding, or make you ponder important questions, acting as intellectual stimulation.”
So in LLM speak it’s a sequence of tokens such that when used as prompt for chain of thought, the samples are rewarding to attend over, via some yet undiscovered intrinsic reward function. Obsessed with what form it takes. Food for thought.
A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task
Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task
This represents a ~390X efficiency improvement in one year
Azure VM internet access changes coming Sept 30, 2025! Default outbound access for Internet will be removed. Learn new routing options: Set “defaultOutboundAccess” to false & configure explicit outbound connectivity.
#Azure#Networking#CloudSecurity https://t.co/bL8wqWS0Fe
here is sora, our video generation model:
https://t.co/CDr4DdCrh1
today we are starting red-teaming and offering access to a limited number of creators.
@_tim_brooks@billpeeb@model_mechanic are really incredible; amazing work by them and the team.
remarkable moment.
Do you store your "DNS dynamic update registration credentials" in a DHCP?
Cute, it means I have a new tool for you 😁😈
Enjoy the DHCP Server DNS Password Stealer. The C source code, and the compiled exe, as usual: https://t.co/iJnBMc6qRs
Explore the frontier of AI with #Google's Gemini 👾. Dive into its revolutionary capabilities and what it means for the future of tech. 🔗 https://t.co/duC1Ht6jKC #ArtificialIntelligence#Innovation
Need the Azure Icons to draw your architectures? Here is how to download the new Azure Architecture Icons ⚡
https://t.co/M7nRWjYso6
#Microsoft#Azure#MicrosoftAzure