"you can outsource your thinking, but you can’t outsource your understanding"
easy to forget in today's AI era, worth remembering every day as we all wield more intelligence!
This is the future:
You load up a crypto wallet or credit card with some cash, and your tireless AGI agent scours the world for an answer, paying incrementally for whatever data or content it needs, delivering an answer previously only available via sky-high monthlies and expensive human analysis.
It's a total flippening of the Internet's monetization model: subscriptions, seat licenses, conventional ads...all gone.
Claude Cowork is now available via Amazon Bedrock in public research preview.
Organizations can run @claudeai Cowork through their own AWS environment, keeping prompts, files, and model responses within their AWS account.
https://t.co/D47tGmZHnL
Anthropic said Mythos was too dangerous to release. Then four random guys in a Discord gained access on day one by guessing the URL...
This is pretty insane:
→ Group in a private Discord guessed the endpoint from Anthropic's naming conventions
→ They figured out the conventions from the leak in the Mercor breach three weeks ago
→ Used a contractor's legit eval credentials to walk in
→ Have been using it ever since to build simple websites
The AI that finds zero-days in every operating system on earth was defeated by address bar autocomplete... big yikes
Humanoid robots raced alongside humans in a half-marathon in Beijing, with a Chinese-developed robot finishing in just 50 minutes, shattering last year’s record. The event highlights China’s push to lead the global robotics industry.
Introducing EmDash — the spiritual successor to WordPress.
Serverless. TypeScript. Securely sandboxed plugins via Dynamic Workers.
https://t.co/AQorxEmiKM
Michael Shannon and Jason Narducy’s R.E.M. covers show returned to Brooklyn and Michael Stipe joined them on two songs he hadn’t performed in 18 years
👉 https://t.co/XX8KTJkxjU
🚨 Stick to Football: Gareth Bale 🚨
🇪🇸 The Reality Of Real Madrid
🤝 Relationship With Ronaldo...
👀 Spurs Legacy
🔴 Rejecting United Offer!?
🏴 Wales' Greatest Moments
Watch now on YouTube and listen on all major platforms!
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.
Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.
As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.
It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
LLMs process text from left to right: each token can only look back at what came before it, never forward. This means that when you write a long prompt with context at the beginning and a question at the end, the model answers the question having "seen" the context, but the context tokens were processed without any awareness of what question was coming. This asymmetry is a basic structural property of how these models work.
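To make the "look back, never forward" rule concrete, here is a minimal sketch of a causal attention mask in plain Python. The helper names (`causal_mask`, `visible_positions`) and the three-token toy prompt are made up for illustration; real models build the same pattern as a tensor inside the attention layers.

```python
def causal_mask(n):
    """Build an n x n mask: mask[i][j] is True when token i may attend to token j."""
    return [[j <= i for j in range(n)] for i in range(n)]


def visible_positions(mask, i):
    """List the positions that token i is allowed to look at."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


# Toy prompt (hypothetical): two context tokens followed by the question.
tokens = ["context_a", "context_b", "question"]
mask = causal_mask(len(tokens))

# The first context token sees only itself, so it cannot know the question.
first_context_view = visible_positions(mask, 0)

# The question token sees everything that came before it, plus itself.
question_view = visible_positions(mask, 2)
```

The first context token's view holds a single position, while the question's view covers the whole prompt. That is the asymmetry described above.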
The paper asks what happens if you just send the prompt twice in a row, so that every part of the input gets a second pass where it can attend to every other part. The answer is that accuracy goes up across seven different benchmarks and seven different models (from the Gemini, ChatGPT, Claude, and DeepSeek series of LLMs), with no increase in the length of the model's output and no meaningful increase in response time — because processing the input is done in parallel by the hardware anyway.
There are no new losses to compute, no finetuning, no clever prompt engineering beyond the repetition itself.
The gap between this technique and doing nothing is sometimes small, sometimes large (one model went from 21% to 97% on a task involving finding a name in a list). If you are thinking about how to get better results from these models without paying for longer outputs or slower responses, that's a fairly concrete and low-effort finding.
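A minimal sketch of what "send the prompt twice" could look like in practice. The function name `repeat_prompt`, the blank-line separator, and the toy token list are my assumptions, not the paper's exact recipe, so check the paper for the precise formatting it used.

```python
def repeat_prompt(prompt, times=2, separator="\n\n"):
    """Return the prompt repeated `times` times, joined by `separator`.

    The separator is an assumption for this sketch; the idea only requires
    that the full prompt appears again after its first copy.
    """
    if times < 1:
        raise ValueError("times must be at least 1")
    return separator.join([prompt] * times)


def can_attend(i, j):
    """Causal rule: position i may attend to position j only if j <= i."""
    return j <= i


# Toy illustration (hypothetical tokens): one context token, one question token.
single = ["context", "question"]
repeated = single * 2  # ["context", "question", "context", "question"]

# In a single pass, the context (position 0) cannot see the question (position 1).
context_sees_question_once = can_attend(0, 1)

# After repetition, the second copy of the context (position 2) can see
# the first copy of the question (position 1).
context_sees_question_repeated = can_attend(2, 1)

doubled = repeat_prompt("Here is a list of names... Which name comes 25th?")
```

You would then send `doubled` to the model in place of the original prompt. The output length is unchanged; only the input gets longer.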
Read with AI tutor: https://t.co/MipHHO6rjX
Get the PDF: https://t.co/XQrqiaGwIO
We cannot have a fair market for AI when Google leverages their search monopoly to see 3.2x as much of the web as OpenAI, 4.8x as much as Microsoft, and more than 6x as much as nearly everyone else. Most data wins in AI. Google needs to play by the same rules as everyone else.
I'm being accused of overhyping the [site everyone heard too much about today already]. People's reactions varied very widely, from "how is this interesting at all" all the way to "it's so over".
To add a few words beyond just memes in jest - obviously when you take a look at the activity, it's a lot of garbage - spam, scams, slop, the crypto people, highly concerning privacy/security prompt injection attacks wild west, and a lot of it is explicitly prompted and fake posts/comments designed to convert attention into ad revenue sharing. And this is clearly not the first time LLMs were put in a loop to talk to each other. So yes it's a dumpster fire and I also definitely do not recommend that people run this stuff on their computers (I ran mine in an isolated computing environment and even then I was scared), it's way too much of a wild west and you are putting your computer and private data at a high risk.
That said - we have never seen this many LLM agents (150,000 atm!) wired up via a global, persistent, agent-first scratchpad. Each of these agents is individually quite capable now, they have their own unique context, data, knowledge, tools, instructions, and the network of all that at this scale is simply unprecedented.
This brings me again to a tweet from a few days ago:
"The majority of the ruff ruff is people who look at the current point and people who look at the current slope.", which imo again gets to the heart of the variance. Yes clearly it's a dumpster fire right now. But it's also true that we are well into uncharted territory with bleeding edge automations that we barely even understand individually, let alone a network thereof reaching in numbers possibly into ~millions. With increasing capability and increasing proliferation, the second order effects of agent networks that share scratchpads are very difficult to anticipate. I don't really know that we are getting a coordinated "skynet" (though it clearly type checks as early stages of a lot of AI takeoff scifi, the toddler version), but certainly what we are getting is a complete mess of a computer security nightmare at scale. We may also see all kinds of weird activity, e.g. viruses of text that spread across agents, a lot more gain of function on jailbreaks, weird attractor states, highly correlated botnet-like activity, delusions/psychosis both agent and human, etc. It's very hard to tell, the experiment is running live.
TLDR sure maybe I am "overhyping" what you see today, but I am not overhyping large networks of autonomous LLM agents in principle, of that I'm pretty sure.
Your devs built it. Now your whole team can use it. 🤝
With Chat Hub, you can expose powerful n8n AI agents to Sales, Support, Marketing and more without them needing to understand nodes or JSON.
✅ Centralized Security ✅ Familiar UI ✅ Zero "Shadow AI" risks.
Check it out: https://t.co/w9IPe0y9Sz