This changes how we think about training data
OpenBMB open-sourced the world's largest synthetic pre-training datasets
L3-tier refined data that powered MiniCPM5-1B
Plus a 15M+ scale SFT dataset with reasoning chain annotations.
100% open source
Here's what matters 🧵
Built a data extraction service with MiniMax M3!
Fed it a half-baked prompt.
It generated the complete FastAPI service: endpoints, Pydantic validation, batch processing, Swagger UI.
Here's the step-by-step build 🧵
Turn any git repo into an AI Agent!
Meet GitAgent, which turns any git repo into a portable AI agent.
Drop two files into any repo:
→ agent.yaml (manifest)
→ SOUL.md (identity)
100% Open Source
More details 🧵
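To make the "drop two files into any repo" idea concrete, here's a minimal sketch that checks whether a directory already has both files. The file names come from the post. The helper names and the assumption that both files sit at the top level of the repo are hypothetical, and this is not GitAgent's own code or API.

```python
from pathlib import Path

# File names are taken from the post; requiring them at the repo's
# top level is an assumption made for this illustration.
REQUIRED_FILES = ("agent.yaml", "SOUL.md")


def missing_agent_files(repo_root):
    """Return the required file names that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def has_agent_files(repo_root):
    """Return True when both files named in the post are present."""
    return not missing_agent_files(repo_root)
```

Calling `missing_agent_files(".")` from a repo root lists whichever of the two files still needs to be added.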
This is huge!
Claude Fable 5 is now available in GitHub Copilot.
The first model in Anthropic's Mythos class, designed for long-horizon, autonomous coding and knowledge-work tasks
BUT Claude Fable 5 requires data retention to operate.
More details 🧵