It could be possible to build an automated robotic factory around Mercury that builds ever more solar satellites and puts their energy output into producing antimatter. After 20 years of exponential growth we would have 1000 tons of antimatter, which is enough to send a 1000-ton ship (enough for 10 people) to Alpha Centauri in 20 years.
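A rough sanity check of the energy side of this claim, as a sketch only. It assumes "tons" means metric tonnes, takes Alpha Centauri to be about 4.37 light years away, and compares the kinetic energy of the ship at the average speed needed for a 20-year trip with the energy released by annihilating the antimatter against an equal mass of ordinary matter. It ignores deceleration at arrival, the mass of the fuel itself, and all engine and production losses, so it is an optimistic bound rather than a mission design.

```python
import math

C = 299_792_458.0            # speed of light, m/s
LIGHT_YEAR_M = 9.4607e15     # metres in one light year
YEAR_S = 365.25 * 24 * 3600  # seconds in one Julian year


def cruise_kinetic_energy(ship_mass_kg, distance_ly, trip_years):
    """Relativistic kinetic energy (J) at the average speed the trip requires."""
    v = distance_ly * LIGHT_YEAR_M / (trip_years * YEAR_S)
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma - 1.0) * ship_mass_kg * C ** 2, beta


def annihilation_energy(antimatter_kg):
    """Energy (J) from annihilating antimatter with an equal mass of matter."""
    return 2.0 * antimatter_kg * C ** 2


ship_kg = 1_000_000.0  # assumption: 1000 metric tonnes
fuel_kg = 1_000_000.0  # assumption: 1000 metric tonnes of antimatter

ke, beta = cruise_kinetic_energy(ship_kg, 4.37, 20.0)
budget = annihilation_energy(fuel_kg)

print(f"average speed needed: {beta:.3f} c")
print(f"kinetic energy needed: {ke:.2e} J")
print(f"annihilation energy available: {budget:.2e} J")
print(f"available / needed: {budget / ke:.0f}x")
```

Under these assumptions the energy budget exceeds the cruise kinetic energy by a wide margin, so the hard parts are the ones left out here: producing and storing the antimatter, and turning it into thrust efficiently.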
Degrees of open source for models:
1. Open development: everyone can see and contribute to the process
2. Everything is open source except the development process, which is closed
3. Weights and code are open but the data is closed
4. Only API
5. Only internal usage
6. Only the model can use itself
GLM 5.2 is 3
Opus, Gemini and GPT are 4
Mythos is 5
Hopefully we don't get to 6!
It would be amazing to have more projects at degree 1. An example outside of models is Linux, where you not only get the code and can contribute, but can also read the emails discussing the development process and understand *how* Linux is being built.
Open source has benefits and drawbacks, but I think it helps to think about how open a project is and whether that is the optimal choice.
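One way to make the scale above easy to compare against is to write it down as an ordered enum. This is only a sketch: the member names are labels invented here for illustration, and the model classification is copied from the post, not independently checked.

```python
from enum import IntEnum


class Openness(IntEnum):
    """Degrees of openness from the list above; a lower number is more open."""
    OPEN_DEVELOPMENT = 1  # process is visible and anyone can contribute
    OPEN_ARTIFACTS = 2    # everything open except the development process
    OPEN_WEIGHTS = 3      # weights and code open, data closed
    API_ONLY = 4
    INTERNAL_ONLY = 5
    MODEL_ONLY = 6        # only the model can use itself


# Classification as stated in the post.
MODELS = {
    "GLM 5.2": Openness.OPEN_WEIGHTS,
    "Opus": Openness.API_ONLY,
    "Gemini": Openness.API_ONLY,
    "GPT": Openness.API_ONLY,
    "Mythos": Openness.INTERNAL_ONLY,
}


def at_least_as_open_as(level):
    """Names of models whose degree is `level` or more open (numerically <=)."""
    return sorted(name for name, degree in MODELS.items() if degree <= level)


print(at_least_as_open_as(Openness.API_ONLY))
```

Using an `IntEnum` keeps the ordering explicit, so "is this project at least as open as that one?" becomes a plain comparison.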
Humans are still much better at roadmaps and plans than agents.
Why is that?
How are you scaling the planning phase now that it's so easy to spawn a project every hour with coding agents? Agents can't keep track of plans and don't have good taste.
Should people spend most of their time doing planning?
Agents should be able to understand reality.
I have this model of planets with gears. It's a toy for kids so most likely it's not completely correct.
I'm asking different LLMs and agents whether they can measure the speeds of the planets from a video of it.
Results in next messages.
I sent Claude's analysis to ChatGPT and ChatGPT's to Claude, and asked each to keep working to find the truth, since they disagree.
They both basically held their positions, but ChatGPT provided this image showing how the planets move in the video, which I think is pretty convincing.
Also, I think the correct solution should include the agent asking me to take more videos (from specific angles it requests) for a better estimate.
Clearly ChatGPT doesn't have it quite right though: it missed Mercury!
I think we need agents to be able to crack this kind of problem very easily if we want any hope of agents being useful in the world.
From a quick try like this, my thought is that this is more of a harness problem than a raw LLM capability problem. I'll try different agents and see what other LLMs and agents can do on this.
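For reference, once a harness can hand the agent tracked positions, the measurement itself is small. The sketch below is not what Claude or ChatGPT did. It assumes some tracker (hypothetical here) has already produced `(t, x, y)` samples for each planet, that the center of rotation is known, and that the camera looks straight down so the paths are circles. It unwraps the angle around the center and fits a line to angle versus time.

```python
import math


def angular_speed(samples, center=(0.0, 0.0)):
    """Least-squares angular speed (rad/s) from (t, x, y) samples.

    Assumes consecutive samples are less than half a turn apart.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    cx, cy = center
    times, angles = [], []
    prev_raw = None
    for t, x, y in samples:
        raw = math.atan2(y - cy, x - cx)
        if prev_raw is None:
            unwrapped = raw
        else:
            # Shortest signed step between angles, so crossing +/-pi does not jump.
            step = (raw - prev_raw + math.pi) % (2 * math.pi) - math.pi
            unwrapped = angles[-1] + step
        prev_raw = raw
        times.append(t)
        angles.append(unwrapped)
    t_mean = sum(times) / len(times)
    a_mean = sum(angles) / len(angles)
    den = sum((t - t_mean) ** 2 for t in times)
    if den == 0:
        raise ValueError("samples must span more than one timestamp")
    num = sum((t - t_mean) * (a - a_mean) for t, a in zip(times, angles))
    return num / den


def synthetic_track(omega, fps=10, seconds=30, radius=50.0):
    """Fake tracker output for a planet turning at `omega` rad/s (test data only)."""
    return [
        (i / fps, radius * math.cos(omega * i / fps), radius * math.sin(omega * i / fps))
        for i in range(fps * seconds)
    ]


inner = angular_speed(synthetic_track(0.5))
outer = angular_speed(synthetic_track(0.2))
print(f"speed ratio inner/outer: {inner / outer:.3f}")
```

Speed ratios between planets do not depend on pixel scale, which makes them a convenient thing to compare against the gear ratios. An oblique camera angle would distort the circles into ellipses and bias the fit, which is one reason an agent asking for a better video angle would help.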
@doodlestein @satyanadella Makes a lot of sense. I think we also need to figure out how the RL env part links to skills and CLIs, and make the skill hill-climbing machines more self-contained, so you can point any new LLM at one and ask it: "look at this skill, improve it".
Do you have any thoughts on that?
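To make the idea concrete, here is a toy sketch of a self-contained hill-climbing loop over a skill. Everything in it is hypothetical: `propose` stands in for an LLM call that rewrites the skill, `score` stands in for the RL env or eval that ships with the skill, and the keyword checks are a placeholder for real tests.

```python
import random


def hill_climb(skill, propose, score, steps=20):
    """Keep a candidate only when it scores strictly higher than the current best."""
    best, best_score = skill, score(skill)
    history = [best_score]
    for _ in range(steps):
        candidate = propose(best)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        history.append(best_score)
    return best, history


# Hypothetical checks a good CLI skill description might need to satisfy.
CHECKS = ["--help", "--json", "exit code"]
rng = random.Random(0)  # fixed seed so the toy run is repeatable


def toy_score(skill_text):
    """Stand-in for the eval bundled with the skill: count satisfied checks."""
    return sum(check in skill_text for check in CHECKS)


def toy_propose(skill_text):
    """Stand-in for an LLM edit: append a line mentioning one random check."""
    return skill_text + "\nDocument " + rng.choice(CHECKS) + "."


improved, history = hill_climb("Run the tool.", toy_propose, toy_score)
print(history)
```

The point of the shape is that the skill, its proposer interface, and its scorer travel together, so swapping in a new LLM only means replacing `propose`.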
Imagine if
- other labs do not catch up to Anthropic
- Anthropic keeps making better models
- these new models are even better at science
- Anthropic does not allow these models to be used for science (the restriction they introduced today)
Then science will only be possible for people who work at Anthropic.
Or in other words, it will be the end of science.