Parallax scales seamlessly from 4 to 256 GPUs, maintaining a scheduling overhead of under 10ms for model allocation and path selection.
Unlike systems confined to controlled LAN environments, Parallax sustains high throughput and tight latency even over variable WAN connections.
models to scale cleanly across infrastructure, and requests to be dynamically routed along the most efficient path. Ultimately, these features position Parallax as a truly hyper-scalable operating system for sovereign AI applications. @Gradient_HQ
and real-time DAG routing via RTT profiling and Lattica NAT traversal, Parallax functions as a robust production service rather than a research prototype. It allows hosts to join from anywhere.
A year ago today @Gradient_HQ demonstrated Parallax for the first time.
Serve open models and build clusters from anywhere in the world. A heterogeneous mix bringing accessibility to another dimension. GPUs and Macs in one
Lattica is the unified communication link that helps execute AI workloads over globally distributed swarms.
Heterogeneous FLOPS in different parts of the globe can transport, serve and train. Happy one year demo anniversary @Gradient_HQ
GLM-5.2 (High) hits 73% on agentic coding at ~45K output tokens, matching Claude Opus 4.8 (Max) at 70% while using comparable compute. GLM-5.2 (Max) pushes to 75% at 85K tokens, within striking distance of Claude Opus 4.8's ceiling.
Previous generation GLM-5.1? Peaked at 57%.
Introducing GLM-5.2: Frontier Intelligence, Open Weights
- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1
Tech Blog: https://t.co/LAsxUdN0JZ
Weights: https://t.co/g0A1C4UWx4
API: https://t.co/Kc3E22cbN7
Coding Plan: https://t.co/Nk8Y98HNhU
Chat: https://t.co/WCqWT0qCQb
GLM-5.2 High achieves near-Opus-4.8 performance at roughly half the token cost. If you're running agentic coding workflows at scale, that's a meaningful economics difference.
Evaluated on Terminal-Bench 2.1, DeepSWE, and SWE-Atlas QnA via Claude Code 2.1.167.
The open-weight beast with 1M context, strong multimodal support (text → text + vision/audio/video), and seriously competitive pricing:
• Input: $0.30 / 1M tokens
• Output: $1.20 / 1M tokens
• Max input: ~512K tokens
Is on Commonstack.
This is one of the strongest open-weight releases lately: massive context, capable multimodal, and accessible pricing. Perfect for long-context RAG, agents, and serious workloads.
https://t.co/HKIkNFDvxJ
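For a rough sense of what those per-token prices mean per request, here is a minimal Python sketch. Only the two prices ($0.30 input, $1.20 output, per 1M tokens) come from the post above; the function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any provider's API.

```python
from decimal import Decimal

# Prices quoted in the post, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.30")
OUTPUT_PRICE_PER_M = Decimal("1.20")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes flat per-token pricing with no caching discounts or tiers.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_PRICE_PER_M * Decimal(input_tokens) / TOKENS_PER_M
    output_cost = OUTPUT_PRICE_PER_M * Decimal(output_tokens) / TOKENS_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical long-context RAG request: 200K tokens in, 4K tokens out.
    print(estimate_cost(200_000, 4_000))
```

Under these assumptions, a 200K-token prompt with a 4K-token answer costs a little over six cents, with the input side dominating. Real bills may differ if the provider applies caching, tiered pricing, or minimum charges.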
"Intelligence should be open"--> well said. GLM-5.2 brings strong coding + 1M context, now rolling out to plans with full MIT open-source next week. Real momentum for accessible, buildable frontier AI. The future is open and sovereign.
Intelligence should be open, accessible, and ready to build with, empowering every developer, everywhere.
GLM-5.2 is now available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans.
https://t.co/AedZACyzej
As our new flagship model, GLM-5.2 delivers powerful coding capabilities, usable 1M-context support, and continued strengths in long-horizon tasks.
API and Chatbot services will launch next week. The model will also be officially open-sourced next week under the MIT License.
The future of AI is open, and it belongs to the people.