AI service firms are commanding 30x multiples right now. Yes, thirty.
That's why a16z, Sequoia, and YC are chasing services, not SaaS.
Most agencies will see this and reach for the wrong move. They'll keep selling hours, bolt on AI, and cut headcount to pad the margin.
But that's playing the small game.
Here's why:
00:00 Why Services Beat SaaS
01:13 The $1 Software vs $6 Services Opportunity
02:52 Why Managed Growth Loops Matter
04:49 Agents, Loops, and Human Judgment
06:43 How Single Brain Powers AI Service Businesses
07:22 The Services-as-Software Manifesto
08:41 The New AI-Native Org Chart
10:13 Building Outcome-Based Offers
11:13 Final Thoughts
@WhoIsFishie interesting .. my approach was to do a 75mx75m grid search on google maps for businesses and then search the names - i was reluctant to go the three letter combo. seems like that’s the way to go
🎉 Introducing https://t.co/rz7zaXEFY9 : The AI-Powered Dhivehi Content Generation Tool! 🚀
(Our Voice input feature is in beta and is improving rapidly)
Today marks a big milestone toward the rebirth of supersonic passenger flight. For the first time in decades, a civil supersonic airplane has taken to the skies. The @boomsupersonic XB-1 met all of its first flight objectives—and we’re only going further, higher, and faster from here.
Groq is serving the fastest responses I've ever seen. We're talking almost 500 T/s!
I did some research on how they're able to do it. Turns out they developed their own hardware that utilizes LPUs instead of GPUs. Here's the skinny:
Groq created a novel processing unit known as the Tensor Streaming Processor (TSP), which they categorize as a Language Processing Unit (LPU). Unlike traditional GPUs that are parallel processors with hundreds of cores designed for graphics rendering, LPUs are architected to deliver deterministic performance for AI computations.
The LPU's architecture is a departure from the SIMD (Single Instruction, Multiple Data) model used by GPUs and favors a more streamlined approach that eliminates the need for complex scheduling hardware. This design allows every clock cycle to be utilized effectively, ensuring consistent latency and throughput.
For developers, this means that performance can be precisely predicted and optimized, which is critical in real-time AI applications.
Energy efficiency is another area where LPUs shine. By reducing the overhead of managing multiple threads and avoiding the underutilization of cores, LPUs can deliver more computations per watt.
Groq's innovative chip design allows multiple TSPs to be linked together without the traditional bottlenecks found in GPU clusters, making them extremely scalable. This enables linear scaling of performance as more LPUs are added, simplifying the hardware requirements for large-scale AI models and making it easier for developers to scale their applications without rearchitecting their systems.
So what does this all mean? LPUs could provide a massive improvement compared to GPUs for serving AI applications in the future! If nothing else, it will be great to have alternative high-performing hardware, since A100s and H100s are in such demand.
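To put "almost 500 T/s" in perspective, here's a quick back-of-the-envelope sketch. It only converts a steady tokens-per-second rate into wall-clock time for a response. The function name, the 500-token response length, and the 50 T/s comparison rate are all assumptions picked for illustration, not measured numbers from Groq or any GPU.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to stream num_tokens at a steady decode rate.

    Ignores time-to-first-token and network latency (a simplification).
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    response_tokens = 500  # hypothetical response length
    # 500 T/s is the rate quoted above; 50 T/s is a made-up comparison point.
    for rate in (50.0, 500.0):
        seconds = generation_seconds(response_tokens, rate)
        print(f"{rate:.0f} T/s -> {seconds:.1f} s for {response_tokens} tokens")
```

At that rate a full 500-token answer streams in about a second, which is why it feels instant.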
Beam 1.5.0 is live and it's by far my favorite...
Select any area on your screen and ask ChatGPT about it.
One click, as with everything in Beam. It does not get any easier than that...