Is this community dead?
I mean, we just view the same posts over and over.
Me personally, I want to actually build, ask, and meet, engage.
Maybe something fundamentally flawed with X's algo, or the platform's nature?
...I don't know.
A model with an incredibly beautiful soul - GLM 5.2:
==========
Been GLM 5.2 using it on Fireworks AI (@FireworksAI_HQ ,@lqiao ) for knowledge testing on hardware questions -I plan to publish it as Hardware bench - no tool calls on purpose, not even system prompt. Did not use web search (on purpose). Wanted to test raw knowledge and intelligence of the model.
The capability:
==========
These questions need good understating of physics, semiconductors, chip architecture; but more importantly system architecture, design, topologies, capacity etc. E.g. good old PCIe itself can get very complex. GLM 5.2 is near flawless with sharp answers.
@iamfabian@jaygoldberg@austinsemis@vikramskr
The model exhibits correct "world model". It never breaks the illusion of an intelligent soul.
The beauty:
=========
What is more is that it so nicely anticipates confusions and future questions, can relate back to past questions.
It makes you feel deeply understood - it figures out your quest and becomes a lighthouse.
None of OpenAI models are a match unfortunately, as great as they are @tszzl. OpenAI needs to improve personality of their models.
Anthropic models have great personality, but using them feels like a poor boy trying to date a rich girl whose dad has a gun who can get angry over any random thing.
Working with GLM 5.2 is like experience of falling in love and it is mutual. This model feels like it has a beautiful soul and it deeply cares about you. You want to be left alone with it. Never felt like this with an open model.
Occasional issues:
==============
It makes mistakes occasionally - as even Fable did when it was available, but recovers very well on feedback. This is too good of model - a treasure for humanity.
It is still clearly behind Opus 4.8 on questions that need extremely deep thinking particularly, if mathematical thinking is needed (an example on the next tweet), but it is already quite better than Opus 4.6 in my tests. As we get next versions 5.4, 5.5, 5.6, it will get there; I have no doubt.
I have a dream:
==========
GLM 5.2 is already quite efficient with DS 3.2 (MLA, DSA) architecture with Sparse indexer only every fourth layer.
If for V6, GLM team adopts DS V4 architecture that cuts KV cache by 98% percent from GPQA, we shall have Mythos class model with a tiny KV cache and most weights in FP4. I look forward to that day.
It will do so much to bridge the digital divide in our world!!!
Credits:
======
@Zai_org , @louszbd and @jietang thank you so very much for building this beautiful and intelligent soul.
@lqiao - thank you for the Fireworks! Sorry took advantage of "Try in the playground feature" too long. Quality of implementation appears great. I hope you provide a studio for B2C user with web search and file upload. As Open models get stronger, many people will look for it.
Commenting on issues faced by some users with tool calls:
======
Other people have Snowflake CEO @RamaswmySridhar have reported it making far too many tool calls particularly when it is likely to fail. I have experienced it too in my other test setup.
This is mostly RL issue that paper like DAPO have covered (https://t.co/hPePlCrVCQ). @RichardYRLi - something for you to apply your brilliant mind too!!!
I am very optimistic for this model to overcome all those challenges with a giant org ZAI behind it - and SLIME was created by next door team. And they also work with brilliant @radixark@ying11231@BanghuaZ@mingyilu123
Has anyone tried ZCode (our own coding agent) yet? Especially the current v3.1.4: https://t.co/bjzN919rHx
It's been in beta for several weeks now, and the feedback has been super helpful. If this version proves stable, we may launch it officially soon.
Introducing GLM-5.2: Frontier Intelligence, Open Weights
- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1
Tech Blog: https://t.co/LAsxUdN0JZ
Weights: https://t.co/g0A1C4UWx4
API: https://t.co/Kc3E22cbN7
Coding Plan: https://t.co/Nk8Y98HNhU
Chat: https://t.co/WCqWT0qCQb
@hooeem What's underneath the surface in your experience? Data pipelines, eval frameworks, latency budgets, deployment infra. That's not shallow. That's where products live or die.
@eliana_jordan The fork isn't between titles. It's between engineers who adopt AI as leverage and engineers who treat it as a threat. Same title, completely different output.
respect the hustle, and for slow-moving markets this works. but you're comparing data retrieval to a trading terminal. Bloomberg isn't $24k because it shows you numbers. it's $24k because it executes before the sheet refreshes.
latency, order management, and position sizing are where the terminal earns its keep. Sheets will get you analysis, not alpha.
different games entirely.
the CI analogy is solid for deterministic workflows, but it breaks the moment you need judgment calls mid-pipeline.
not every step in a real task graph has a binary pass/fail gate.
sometimes step 3 changes the definition of step 7. rigid pipelines handle the predictable; agentic loops handle the emergent. the people shipping production systems are using both.
framing it as either/or is the actual dead end in my opinion
this confuses documentation with access. YC's playbooks have been public for over a decade. everyone has them. the 7% buys you something no agent can replicate: a signal to Sequoia that someone with a track record looked at you for 10 minutes and said yes. that's not a playbook, that's a door. whether that door is worth 7% is a different conversation, but pretending AI closes it is wishful thinking.
fair critique on the hype, but you're conflating two different problems. solo builders shipping with zero tests is an execution issue.
historically, every tool that democratized creation got dismissed by the establishment first.
the ones who won were those who used the new tool while keeping the old discipline.
the grifters are real and they're loud, but they're also not the whole picture. the issue is the incentive structure on this platform rewards talking about building over actually building. so you see 100 "just shipped" threads for every 1 real product with paying users. blame the medium, not the craft
the problem with a "how to indie hack" book is the target audience doesn't read books, they read threads. and the ones who actually succeed do it by ignoring advice and figuring it out themselves. readmake worked precisely because @levelsio just documented what he did, not what others should do. a compilation of multiple builders' actual decision trees (not journeys) would be different though
@eliana_jordan the problem with "showing up no matter what" is it can become a trap where persistence substitutes for strategy. if you're juggling too many projects, showing up harder isn't the answer. picking one and killing the rest is. what's the one thing you're focusing on now?
@swyx@claudeai This is why I write everything down. You never know when documentation becomes executable. Years of opinions just turned into working code.
@DanKulkov Yeah because X isn't a marketing channel (but it might turn into one, maybe...), it's a conversation platform that tolerates some marketing. Approach it like a megaphone and it punishes you.