Roadmap to FrontierCS 2.0 is live.
If continual learning and AI auto-research are going to matter, benchmarks need to test more than one-shot answers. FrontierCS 2.0 moves open-ended evaluation toward feedback-driven environments, repo-level tasks, and controlled evaluator interaction.
Example: Try your own agent on the Erdős unit-distance conjecture disproven by @OpenAI in FrontierCS 2.0:
I took a nonlinear path into tech.
I had 4 majors in 7 years:
psychology + math in undergrad,
CS + data science in grad school.
And I didn’t take any coding classes in undergrad.
Now I’m a SWE at big tech, exploring AI agents.
At first, tech felt like a practical path. Compared with my undergrad majors, it seemed to offer better pay and better work-life balance.
So I applied to CS programs designed for people from nontraditional backgrounds. I had very little coding experience when I started.
Then the market changed in 2023.
Hiring slowed down, and even companies like Amazon and Meta had hiring freezes.
I had weak coding fundamentals and very few projects.
Unsurprisingly, my first internship search was rough. I only got a handful of interviews, and the only internship offer I landed was unpaid.
The next year, I optimized for recruiting.
- Built projects.
- Did mock interviews.
- Found friends going through the same process.
- And did a lot of LeetCode (I don’t think I was naturally talented at coding interviews. I did 400+ problems before they finally started to make sense).
Eventually, after about 1.5 years, I got SWE internship offers from companies including Amazon, Apple, and Salesforce.
That grind helped me break into tech.
But it also taught me something:
Breaking into tech is one milestone.
Figuring out which problems I actually care about solving is another.
That’s part of why I’m exploring AI and agents now.
At first, I chose tech for practical reasons.
Now I’m trying to find the problems in tech that I genuinely want to understand and build around.
If you’re on a nonlinear path too, I hope this makes it feel a little less lonely.
Feel free to ask anything :)
Small disclaimer: I can’t really tell you whether you should switch to CS though. This is just my path, and everyone’s situation is different.
This is the real shift!
Dynamic Workflows = parallel subagents + summarized verification
We're moving from "chat with one model" to "manage an execution system"
New in Claude Code (research preview): dynamic workflows.
Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks.
Use the word "workflow" in a prompt to get started.
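To make "parallel subagents + summarized verification" concrete, here is a rough sketch of the pattern in plain Python. This is not how Claude Code implements dynamic workflows. `run_subagent` and `run_workflow` are hypothetical names, and the "subagent" is a stub where a real system would call a model.

```python
import asyncio


async def run_subagent(task: str) -> dict:
    """Hypothetical subagent stub; a real one would call a model or tool."""
    await asyncio.sleep(0)  # stand-in for model or tool latency
    # Assumption for illustration: a blank task counts as a failure.
    return {"task": task, "ok": bool(task.strip()), "output": f"done: {task}"}


async def run_workflow(tasks: list[str]) -> dict:
    # Fan out: all subagents run concurrently.
    results = await asyncio.gather(*(run_subagent(t) for t in tasks))
    # Summarized verification: collapse per-task results into one short report.
    failed = [r["task"] for r in results if not r["ok"]]
    return {
        "total": len(results),
        "passed": len(results) - len(failed),
        "failed": failed,
    }


report = asyncio.run(run_workflow(["refactor auth", "update tests", " "]))
print(report)
```

The point of the summary step is that you read one report instead of every subagent transcript.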
Hi, I’m Jane — a SWE exploring AI and agent systems 👋
I’m curious about how useful and reliable agents are actually built and shipped.
I’ll share what I’m learning from papers, coffee chats, experiments, side projects, and the occasional random fun thing.
I took a nonlinear path into tech, studied across 4 fields in 7 years, and didn’t take any CS classes in undergrad — so I know what it feels like to learn from scratch, and I care about making technical ideas easier to understand.
If you’re new to CS, AI, agents, or coding, welcome. If you’re already deep in the field, I’d love to learn from you too.
Let’s learn together :)
Bigger context windows won’t make AI agents reliable.
A model with a 200K-token context window can still forget the goal, retrieve the wrong memory, or drift after 100 turns.
The real problem isn’t memory size.
It’s context control:
- what the model sees,
- what it forgets,
- and how the harness keeps it aligned.
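A tiny sketch of what those three controls could look like in a harness. Everything here is an assumption for illustration: `ContextManager`, `max_turns`, and `build_prompt` are hypothetical names, and real harnesses usually summarize or retrieve rather than simply dropping old turns.

```python
from dataclasses import dataclass, field


@dataclass
class ContextManager:
    goal: str                      # pinned so it is never trimmed away
    max_turns: int = 4             # budget for recent turns (hypothetical default)
    turns: list[str] = field(default_factory=list)

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        # What it forgets: drop the oldest turns once over budget.
        excess = len(self.turns) - self.max_turns
        if excess > 0:
            del self.turns[:excess]

    def build_prompt(self) -> list[str]:
        # What the model sees: the goal first, then only the recent turns.
        # Re-stating the goal on every call is how this harness keeps it aligned.
        return [f"GOAL: {self.goal}"] + self.turns


ctx = ContextManager(goal="fix the failing test", max_turns=2)
for i in range(100):
    ctx.add(f"turn {i}")
prompt = ctx.build_prompt()
print(prompt)
```

Even after 100 turns, the prompt still starts with the goal. A bigger window alone doesn't guarantee that.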
Full breakdown: