While everyone keeps talking about how good Fable 5 is, Japan has already left them behind.
Sakana Fugu Ultra is not just another interesting model. It is currently the best model in the world and most people still haven’t realized it.
The industry remains obsessed with Fable 5 and Mythos as if they were the ceiling, when a Japanese lab just proved there is a superior way of doing things.
Here is the real potential it is offering right now:
It surpasses or matches Fable 5 on several of the toughest coding and agentic reasoning benchmarks.
It uses intelligent multi-agent orchestration instead of depending on a single giant model.
It delivers frontier performance without the export control problems affecting American models.
It offers a single API that does the heavy lifting of coordinating agents, verifiers, and specialists for you.
It changes the logic: it is no longer just about who has the biggest model, but who orchestrates it better.
It opens the door to a new paradigm where collective intelligence beats individual intelligence.
The ones still tied to a single model thinking that is the maximum are already becoming obsolete.
Japan didn’t ask for permission. They just won.
#SakanaFugu #AI #JapanAI #FrontierModels
# Building a Conversational Lead Qualification Agent — Where I Got Stuck
Hey everyone, sharing something I've been working on and where I hit a wall. If you are building support or conversational intake agents, I would love your input.
I am building a text-first agent that qualifies leads via chat (multi-turn, data capture, emergency handling, Q&A). I wired up a vector retrieval dataset plus an LLM for response drafting.
The problem: everything connected, tests green, but in real chat it still sounded like a form. It was closing emergency leads with just 2 data points, ignoring user questions mid-flow, and asking for the same field in fragments.
The root cause I found: the dataset was feeding how the agent sounded, not what it decided. The LLM was drafting a well-written version of a wrong business decision. Result: a dumb question, written beautifully.
How I am handling it. I split responsibilities:
Retrieval informs · Policy engine decides · Flow executes · Guardrails protect · LLM only humanizes
The LLM never decides whether a lead is ready. That is handled by a deterministic engine. And I split emergency handling into two stages: ready for callback is not the same as lead complete. Capture the minimum for a human to call immediately, but keep enriching without losing urgency.
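Here is a minimal sketch of what "the deterministic engine decides" could look like, with the two emergency stages kept separate. Everything in it is an assumption for illustration: the field names, the `CALLBACK_MINIMUM` set, and the function names are hypothetical and not taken from my actual codebase.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Set


class LeadStage(Enum):
    COLLECTING = "collecting"
    READY_FOR_CALLBACK = "ready_for_callback"  # a human can call right now
    COMPLETE = "complete"                      # nothing left to capture


# Hypothetical field lists; a real deployment would define its own.
FIELD_ORDER = ["name", "phone", "issue_summary", "address", "preferred_time"]
CALLBACK_MINIMUM = {"name", "phone", "issue_summary"}
FULL_PROFILE = set(FIELD_ORDER)


@dataclass
class Lead:
    is_emergency: bool = False
    fields: Dict[str, str] = field(default_factory=dict)


def captured_fields(lead: Lead) -> Set[str]:
    """Names of the fields that hold a non-empty value."""
    return {name for name, value in lead.fields.items() if value.strip()}


def decide_stage(lead: Lead) -> LeadStage:
    """Rules only: the LLM is never asked whether the lead is ready."""
    have = captured_fields(lead)
    if FULL_PROFILE <= have:
        return LeadStage.COMPLETE
    if lead.is_emergency and CALLBACK_MINIMUM <= have:
        return LeadStage.READY_FOR_CALLBACK
    return LeadStage.COLLECTING


def next_field(lead: Lead) -> Optional[str]:
    """Next missing field in a fixed order, or None when nothing is missing."""
    have = captured_fields(lead)
    for name in FIELD_ORDER:
        if name not in have:
            return name
    return None
```

In this sketch an emergency lead with the callback minimum moves to `READY_FOR_CALLBACK`, but `next_field` still returns the next gap, so the flow can keep enriching while a human is already calling. The LLM would only be handed the chosen field and asked to phrase the question.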
It has improved a lot, but there are still fine details to work out: tone, handling user questions without breaking the flow, knowing when to close.
My question to anyone who has done this in production:
How do you balance determinism vs LLM in flow control?
Where do you draw the line between the model decides and the rules decide?
What bit you when you first shipped it?
Any advice or references are appreciated 🙏 @btaylor
The uncomfortable truth about AI agents:
A smart LLM does not mean a useful agent.
You can have the best model in the world.
But if you let it respond on its own, without context, without structure, and without operational memory, it starts failing in real situations.
It repeats phrases.
It misses the intent.
It asks the wrong questions.
It sounds robotic.
And when the user is under pressure, the agent breaks.
That is exactly what we are fixing in GcClaw.
We are not building “another chatbot.”
We are adding real industry metadata.
We are classifying intents.
We are organizing scenarios.
We are creating specific context for each type of contractor.
We are using RAG, few-shot examples, and operational signals so the agent does not just respond…
It understands what is happening.
Because in construction, customers do not always write perfectly.
Sometimes they say:
“My roof is leaking”
“I need someone today”
“Not sure what happened”
“How much will it cost?”
And the agent has to know if that means emergency, estimate, repair, inspection, or a hot lead.
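To make the classification problem concrete, here is a deliberately naive keyword baseline. It is not how GcClaw works (that relies on RAG, few-shot examples, and operational signals); the keyword lists and the mapping of phrases to intents are assumptions made up for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword lists, for illustration only.
INTENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "emergency": ("leaking", "flood", "today", "urgent", "asap"),
    "estimate": ("how much", "cost", "price", "quote"),
    "inspection": ("not sure", "inspect", "check"),
}


def classify_intents(message: str) -> List[str]:
    """Return every intent whose keywords appear in the message."""
    text = message.lower()
    matched = [
        intent
        for intent, keywords in INTENT_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]
    return matched or ["unknown"]
```

A baseline like this can tag each of the four messages above, but it has no idea whether a leaking roof is an emergency or a routine repair for this particular contractor. That gap is what the metadata and context are meant to close.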
That is where most agents fail.
Not because they lack intelligence.
Because they lack context.
The future of AI agents is not just bigger models.
It is better metadata.
Better classification.
Better context.
Better workflows.
Better memory.
And more human responses.
GcClaw is being built for that:
An agent that understands the industry before trying to sell to the customer.
Because if your AI does not understand the business, it is just improvising.
And in construction, improvising costs money.
We keep building.
#AI #ConstructionTech #Contractors #AIAgents #BuildInPublic #Startup
It’s good to be back.
I was sick for a few days, and I also needed to make an important decision about GcClaw.
The launch was planned for Monday, but GcClaw has not launched yet.
I want to apologize to everyone who has been following the project and supporting the cause.
The reason is simple:
We want to launch fast, yes.
But we don’t want to launch something that does not meet the standard the construction market requires.
An agent for contractors cannot be just a good-looking demo.
It has to understand the industry.
It has to capture real leads.
It has to detect urgency.
It has to ask the right questions.
It has to work when a customer writes poorly, gives incomplete answers, or changes intent.
And most importantly, it has to help contractors stop losing jobs.
We were able to get a little more time to keep polishing the product before opening it to the public.
I would rather be transparent now than sell hype later.
Over the next few days, I’ll be sharing more about what GcClaw can already do, the tests we’re running, and why we’re making these decisions.
Thank you to everyone who continues to support the project.
We keep building.
GcClaw was not canceled.
GcClaw is getting ready to launch the right way.
#AI #ConstructionTech #Contractors #Startup #BuildInPublic
Most contractors keep asking the same question: how much does AI cost?
I just got out of a Slack meeting and heard it again.
If you're still starting there in 2026, your business will fall behind within the next 3 years. Not because of budget. Because you're asking the questions in the wrong order.
Before asking what it costs, this is what a business needs to do when it wants AI:
1. Ask if you actually need it. If your operation is chaos, AI doesn't fix it. It accelerates it.
2. Map your process first. Where time gets lost, where leads get lost, where money leaks. Without this, you're buying technology blind.
3. Pick ONE measurable bottleneck. Not "I want AI everywhere." One. The lead that comes in Sunday at 9 PM and nobody answers until Monday, for example.
4. Calculate what THE PROBLEM costs you, not the tool. If you lose 3 jobs a month from responding late, that's thousands of dollars. That's what the investment gets compared against.
5. Small pilot, clear metric, 30 days. It works, you scale. It doesn't, you adjust. That's how you implement — not with an annual contract for something nobody will use.
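The math in step 4 can be sketched in a few lines. The dollar amounts and the recovery share in the usage note are hypothetical placeholders, not real figures; plug in your own numbers.

```python
def monthly_cost_of_problem(lost_jobs_per_month: int, avg_profit_per_job: float) -> float:
    """What the bottleneck costs each month, before any tool is involved."""
    return lost_jobs_per_month * avg_profit_per_job


def pilot_pays_off(
    lost_jobs_per_month: int,
    avg_profit_per_job: float,
    recovered_share: float,
    monthly_tool_cost: float,
) -> bool:
    """True if the profit recovered by the pilot exceeds what the tool costs.

    recovered_share is the fraction (0.0 to 1.0) of lost jobs the pilot wins back.
    """
    recovered = monthly_cost_of_problem(lost_jobs_per_month, avg_profit_per_job) * recovered_share
    return recovered > monthly_tool_cost
```

For example, assuming 3 lost jobs a month at a made-up $2,000 profit each, `monthly_cost_of_problem(3, 2000.0)` gives 6000.0. If a 30-day pilot wins back half of those jobs and the tool costs an assumed $500 a month, `pilot_pays_off(3, 2000.0, 0.5, 500.0)` is `True`.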
The right question was never how much AI costs.
It's how much the problem you don't want to look at is costing you.
#ConTech #AI #Construction #AgenticAI
This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time.
I feel a lot of things changing as working software increasingly comes out on tap. The Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!
# The Biggest Threat to AI in Construction Isn't the Technology. It's the Cost.
Over the last two years, we've seen a race to implement AI.
More agents.
More copilots.
More automation.
More models.
Now the market is asking the question that actually matters:
Where is the ROI?
Sam Altman recently acknowledged that AI budgets have become a growing concern for many companies.
And this will have a major impact on the construction industry.
Because construction doesn't need more impressive AI demos.
It needs:
✅ Less rework
✅ Fewer RFIs
✅ Fewer project delays
✅ Lower cost overruns
✅ Higher productivity per project manager
✅ Better margins on every project
The next competitive advantage in construction won't come from who deploys the most AI.
It will come from who can turn AI into measurable business outcomes.
$1 invested in AI → $10 in operational return.
We're entering a new phase.
The era of AI adoption is ending.
The era of AI ROI is beginning.
In construction, the winners won't be the companies with the most advanced AI.
They'll be the companies that use AI to reduce costs, accelerate decisions, improve project execution, and increase profitability.
Everything else is noise.
#ConstructionTech #ConTech #AI #AgenticAI #ConstructionInnovation #DigitalConstruction #FutureOfConstruction #AEC #BuiltEnvironment
The integration experience in @gohighlevel is unnecessarily complex for tasks that should be straightforward. What appears to be a simple setup often requires multiple configuration steps, validations, and technical adjustments. The documentation does not always reflect the real-world challenges users face during implementation. As a result, adding an integration becomes a frustrating and resource-intensive process that negatively impacts productivity and user experience.
We're running some tests with Llama 4 Scout instead of Maverick.
We chose Scout because we need the agent to be realistically deployable with controlled costs in on-demand setups. It gives us a better balance between performance and practicality for the kind of solution we're building.
We're specializing it as a vertical agent for construction, integrating RAG with real project documentation, BIM tools, structured generation of RFIs and change orders, and multimodal capabilities to work with jobsite photos.
We're already testing it with real project data. The goal is to build agents that actually integrate into contractors' day-to-day operations.
If your company is facing similar challenges adopting AI in construction, we'd like to connect.
#ConTech #ConstructionAI #VerticalAI #Llama4
These are the 10 operational workflows every serious construction company should be automating with AI agents right now:
1. RFIs
2. Submittals
3. Change Orders
4. Daily Reports
5. Meeting Minutes
6. Document Control
7. Schedule Updates
8. Pay Applications
9. Punch Lists
10. QA/QC + Safety Reports
An actual AI agent doesn’t just “read documents.”
It analyzes drawings, specifications, emails, field photos, and project notes… and then it takes action.
- Prepares an RFI ready to send
- Compares a submittal against the specs
- Flags deviations before they cost money
- Detects scope changes that are quietly killing your margin
- Generates daily reports automatically
- Summarizes meetings and extracts action items
- Alerts you when documents are outdated
- Organizes closeout without driving everyone insane
And this is where many people get confused:
“Claude is better.”
“ChatGPT is better.”
“Gemini is better.”
Of course they’re good.
They are private companies with billions in infrastructure.
But in construction, the real advantage is not always using the most famous model.
The advantage is building specialized workers connected to your data, your documents, your workflows, and your internal rules.
Lower cost.
More control.
More context.
More ownership.
The next wave of ConTech will not be another beautiful piece of software.
It will be **Agentic Construction Operations**.
Which of these workflows would you automate first in your company?
#ConTech #Construction #AIAgents #AgenticAI #ConstructionTechnology #ProjectManagement
Real tips to get the most out of Grok Build 🔥
I just tested the new coding agent from xAI in depth and here are my tips on what really works:
1. Install it with a single command and run it inside your project (`cd your-project && grok`). Always start with clear prompts like “Explain this repo” so it understands the full context.
2. Use Plan Mode for complex tasks. It shows you the step-by-step plan and clean diffs, and lets you approve, comment, or rewrite before it touches the code. This avoids costly mistakes.
3. Take advantage of parallel sub-agents. For large projects Grok launches several specialized agents at the same time (git, tests, research, etc.) and you can see their progress in real time.
4. Create custom Skills with the /skillify command. Turn repeated flows into reusable commands and save a lot of time.
5. Add an AGENTS.md file to your project. Grok reads it automatically and respects your conventions, rules and style preferences.
6. Activate always-approve when you already trust the flow to go faster, but use it carefully.
7. Use headless mode (`grok -p "your prompt"`) to integrate it into scripts, CI/CD or automations.
8. Install plugins and connect MCP servers (Linear, Postgres, browsers, etc.) so Grok can interact with your external tools.
9. At the end of each session use /feedback to help the xAI team improve it quickly.
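For tip 7, here is a rough sketch of calling headless mode from a Python script. The only part taken from the tool itself is the `grok -p "your prompt"` invocation; the wrapper function names, the timeout, and the assumption that the result is printed to stdout with a non-zero exit code on failure are mine and may not match how the CLI actually behaves.

```python
import subprocess
from typing import List


def build_headless_command(prompt: str) -> List[str]:
    """Argument list for the headless invocation: grok -p "<prompt>"."""
    return ["grok", "-p", prompt]


def run_headless(prompt: str, project_dir: str = ".", timeout_s: int = 600) -> str:
    """Run one headless prompt inside project_dir and return whatever it printed.

    Assumes the CLI writes its result to stdout and exits non-zero on failure.
    """
    result = subprocess.run(
        build_headless_command(prompt),
        cwd=project_dir,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raises CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the prompt as a list element instead of building a shell string avoids quoting problems when the prompt contains quotes or special characters.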
Grok Build is not just another code assistant. It is a complete agent that plans, builds, tests and deploys with you. If you are on SuperGrok or X Premium+ and develop seriously, try it now.
Are you using Grok Build? What tip has helped you the most?
#GrokBuild #xAI #CodingAgent #BuildFaster
My first time using Grok Build and it blew my mind.
I hope the @xai team and @elonmusk keep improving Grok Build nonstop.
I tried it for the first time today and it is clear to me: it is the agent you need to take your project from idea to production quickly and seriously.
It helps you think, build, iterate, and deploy as if you had an ultra-capable technical cofounder by your side 24/7.
If you are building something and want to get serious, try Grok Build.
#Grok #xAI #BuildInPublic #GrokBuild
Traveling to Mississippi today.
I have a friend who builds around 150 homes per year with his company.
I offered to help him implement AI back in January.
The fear of adopting something new is understandable.
When you run a serious company, you don’t bring in technology just because it is trending.
First, you need to understand whether it can truly protect margin, reduce friction, improve speed, and give the team more control.
We started with basic workflows.
Nothing exaggerated.
No empty promises.
Just starting to organize processes, understand where time is being lost, and identify where AI could create real impact.
Today, he finally decided to implement AI seriously.
And that is what I enjoy the most about these types of projects.
Learning the real workflows.
Studying how the company operates from the inside.
Seeing where information gets lost.
Where projects get delayed.
Where margin leaks.
Where teams keep making decisions late because the data is scattered.
In construction, AI is not about having a nice chatbot.
It is about connecting estimating, scheduling, purchasing, change orders, warranty, field updates, and communication to make better decisions.
The data also shows that we are still early.
In homebuilding, much of AI adoption is still concentrated in marketing and planning.
But the strongest opportunity is in operations.
Production.
Costs.
Schedule.
Margin.
And for serious companies, this has to be done the right way.
With local agents.
Private data.
Security.
Connected workflows.
Because you cannot throw sensitive project, client, vendor, and financial information into any tool and expect that to scale.
The future of AI in construction will not be one chatbot for everything.
It will be secure, connected, workflow-specific agents running on real processes.
Real work.
Real pain.
Real opportunity.
#ConstructionAI #Homebuilding #AgenticAI #ConTech