"Autodata: An agentic data scientist to create high quality synthetic data"
If there's auto-research, shouldn't there also be auto-data generation?
In this new Meta paper, the authors propose Autodata, which makes synthetic data generation work more like a data scientist, with an agent that creates tasks, tests them on weak and strong models, studies what failed, and revises the data until it gives the target model a useful learning signal.
And it's not just about making harder data; it's about data that is just right to learn from. In their experiment, a 4B model beat standard Self-Instruct training and even outperformed a larger 397B baseline on legal reasoning.
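A rough way to picture the create, test, study, revise loop described above is the toy sketch below. It is not the paper's implementation: the `Task.difficulty` knob, the `min_gap` acceptance rule (keep a task once the strong solver beats the weak one by a margin), and the stub solvers are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Task:
    prompt: str
    difficulty: int  # hypothetical knob the agent can turn when revising


# A solver maps a task to a pass rate in [0, 1]; real ones would call models.
Solver = Callable[[Task], float]


def refine(task: Task, weak: Solver, strong: Solver,
           min_gap: float = 0.5, max_rounds: int = 10) -> Optional[Task]:
    """Revise a task until the strong solver beats the weak one by min_gap."""
    for _ in range(max_rounds):
        weak_rate, strong_rate = weak(task), strong(task)
        if strong_rate - weak_rate >= min_gap:
            return task  # separates the two solvers, so keep it
        if strong_rate < 0.5:
            # Even the strong solver fails: too hard, so ease off.
            task = replace(task, difficulty=task.difficulty - 1)
        else:
            # The weak solver keeps up with the strong one: too easy.
            task = replace(task, difficulty=task.difficulty + 1)
    return None  # no useful version found within the budget


def make_stub_solver(max_difficulty: int) -> Solver:
    """Toy stand-in for a model: solves tasks up to a fixed difficulty."""
    return lambda task: 1.0 if task.difficulty <= max_difficulty else 0.0


weak_model = make_stub_solver(3)
strong_model = make_stub_solver(6)

too_easy = refine(Task("toy legal question", 1), weak_model, strong_model)
too_hard = refine(Task("toy legal question", 9), weak_model, strong_model)
print(too_easy, too_hard)
```

In this toy setup, a task that starts too easy gets harder and one that starts too hard gets easier, and both end up in the band where only the strong solver succeeds. That is one reading of "just right to learn from"; the actual agent presumably revises the content of the task rather than a single number.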
Andrej Karpathy joined Anthropic five weeks ago.
Yesterday my friend on his team sent me the Claude.md file he actually uses.
It completely changed how I work with Claude.
From the very first message, the difference was obvious.
With this file, Claude finally stops fighting me and starts working exactly the way I need it to.
Bookmark it before it gets taken down.
Read it now, then check the article below.
In nearly 5 years of modern generative ai, this is the first book I’m seeing with a super high level of coverage and comprehension.
> language modelling
> inference optimisation
> RL and its methods
> system scaling
> applied concepts like agentic ai, rag, memory
> environments and benchmarking
These fields have a subtle boundary differentiating them, but ultimately overlap in modern applications. Agents require system scaling, memory needs inference optimisation, rl requires understanding of environments and benchmarks.
For the first time in my exp, all in one place. Found this on paperswithcode[.]co
We stress tested many frontier AI models for multimodal medical reasoning (including GPT-5, Claude 3.5, Gemini 2.5 Pro). They’re not ready. Faulty reasoning, use of inappropriate shortcuts, hallucinations. Published today @NatureMedicine https://t.co/P6eHZEmfbW
Research from SCI members @GarryPNolan, @GuolanLu, and colleagues introduces single-cell spatial pharmacobiology, revealing how stromal barriers can limit antibody delivery and inform precision cancer therapies. https://t.co/rv5kn3wQRP
Excited to launch Biomni everywhere: web, desktop, mobile, and MCP!
Our vision is simple: biological research should happen wherever inspiration strikes. Whether you’re at your bench or riding the subway, Biomni is there when the idea hits. Start a new binder from your phone, then continue seamlessly across devices.
We're also launching Biomni MCP, so you can access Biomni’s biology capabilities wherever you work:
BREAKING: The most controversial article of the year, claiming that early morning immunotherapy works better than in the afternoon, is now retracted.
After reading the authors' responses to the inconsistencies raised on the web, the @NatureMedicine editors no longer have confidence in the integrity of the results. The only prospective evidence that time-of-day matters for immunotherapy is now gone.
https://t.co/aXUY6aekhl
To me, this means (at least) two things.
First, it confirms that prudence on this topic was and remains critical. As inexpensive as it may be to give a drug earlier or later in the day, doing so carries a much more relevant cost: that of scientific integrity. We owe it to our patients to make decisions based on solid data. We should not give up this practice too easily, particularly in the presence of several concerning red flags.
Second, this retraction should also prompt a broader reflection on the current state of peer review, in which unpaid reviewers struggle to keep up with a steady rise in submitted papers. Journals need to improve the process by implementing a formal, consistent, in-depth review of each paper by paid professionals. A practice that, in this case, may have avoided a retraction arriving after 22 citations and after inclusion of this study in at least one meta-analysis. And possibly, after some physicians had already changed their practice in IO administration.
For a thoughtful recap of this story, I recommend this well-written new piece in @ScienceMagazine by Laura Agudelo. I’m grateful to Laura for including my perspective in the article.
https://t.co/qHM5fMjwQ3
Autodata: An agentic data scientist to create high quality synthetic data
"We introduce Autodata, a general method that enables AI agents to act as data scientists who build high quality training and evaluation data."
Data creation stage + data analysis stage + meta-optimization
RETRACTED!
TL;DR: This study, which claimed that time of day had massive effects on immunotherapy efficacy and which already appeared fraudulent, now seems certain to have been fraudulent.
Many thanks to the editors at Nature for handling this quickly and correctly.
Claude just became a cracked video game designer.
With the launch of Unreal Engine's MCP server last week, you can now build entire video games just by talking to Claude.
I spent the past few days building with it, and I'm telling you, this is going to forever change how video games get made and who gets to make them.
In this video I show you exactly how to set up the Unreal Engine MCP yourself and run through three demos: building a full playable city, cloning a real city from Google Earth, and creating custom buildings in Blender.
Here's the agent harness I mention too: https://t.co/mos9EwnZ2h
Intro
What I built in a few hours
Setting up the Unreal MCP server
Fixing the port 8000 connection issue
The agent harness that avoids the pitfalls
Demo 1: Building a city with City Sample
Demo 2: Cloning a real city from Google Earth with Cesium
Demo 3: Custom buildings with Blender headless
Outro
A senior Anthropic engineer just dropped an 11-page PDF on "Loop Engineering" for agentic systems.
The shift: you stop prompting the agent. You build the system that prompts it instead.
Schedule → Discover → Build → Verify → Repeat
Every loop runs one turn, five moves:
• Discovery: it finds its own work - failing CI, open issues, recent commits - instead of being handed a list.
• Handoff: each task gets an isolated git worktree so parallel agents don't collide.
• Verification: a second agent, told to assume the code is broken, reviews the first. The "thing that can say no."
• Persistence: results get written to disk, never left in a context window that gets flushed.
• Scheduling: an automation wakes it on a timer. That's what makes it a loop.
The key insight: an agent grading its own work always praises it.
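To make the five moves concrete, here is a minimal sketch of a single turn. It is an illustration under assumptions, not the PDF's code: `run_turn`, `worktree_cmd`, the branch and path naming, and the stub agents in `demo` are all hypothetical. The `git worktree add -b <branch> <path>` command is only constructed, never executed, and scheduling is left to whatever timer (cron or similar) calls `run_turn`.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def worktree_cmd(task_id: str) -> list[str]:
    """Handoff: command that would give a task its own isolated worktree."""
    # Branch and path naming here are assumptions, not a convention from the PDF.
    return ["git", "worktree", "add", "-b", f"agent/{task_id}", f"../wt-{task_id}"]


def run_turn(discover: Callable[[], list[str]],
             build: Callable[[str], str],
             verify: Callable[[str, str], bool],
             log_path: Path) -> list[dict]:
    """One turn: discover work, hand off, build, verify, persist."""
    results = []
    for task_id in discover():              # Discovery: find its own work
        cmd = worktree_cmd(task_id)         # Handoff: built here, not run
        patch = build(task_id)              # Build: first agent proposes a change
        approved = verify(task_id, patch)   # Verification: separate reviewer
        results.append({"task": task_id, "worktree": cmd[-1], "approved": approved})
    # Persistence: append results to disk instead of keeping them in context.
    with log_path.open("a") as log:
        for row in results:
            log.write(json.dumps(row) + "\n")
    return results


def demo() -> list[dict]:
    """Run one turn with stub agents; the reviewer rejects anything untested."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "loop_log.jsonl"
        results = run_turn(
            discover=lambda: ["fix-ci", "issue-12"],
            build=lambda task_id: "tests pass" if task_id == "fix-ci" else "untested",
            verify=lambda task_id, patch: "tests pass" in patch,
            log_path=log_path,
        )
        assert len(log_path.read_text().splitlines()) == len(results)
    return results


print(demo())
```

The point of keeping `verify` as a separate callable is the one made above: the builder never grades its own output, so something in the loop can actually say no.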
This 11-page PDF changed how I'm building agentic systems today.
Read it now, then explore the article below.