AMD just landed a heavy blow in local AI.
Lisa Su walked on stage holding a mini PC the size of a thick book in one hand and ran a 235-billion-parameter model live. No datacenter. No cloud. No rented GPUs.
The star is the Ryzen AI Max+ 395 (Strix Halo). It is the first x86 chip to combine CPU and GPU with 128 GB of unified memory. On Linux, the GPU can use up to ~110 GB of that memory.
For context: an RTX 5090 has 32 GB and a 4090 has 24 GB. This small machine offers more than three times the memory available for large models, in a compact chassis.
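The memory comparison can be sanity-checked with a rough sketch. It counts model weights only, ignoring the KV cache and runtime overhead. The bit widths below are illustrative assumptions, since the post does not say which quantization the demo used. The function names are hypothetical.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weights_gb(params_billions, bits_per_weight) <= memory_gb


GPU_VISIBLE_GB = 110  # from the post: ~110 GB usable by the GPU on Linux
RTX_5090_GB = 32      # from the post

print(f"Memory ratio vs RTX 5090: {GPU_VISIBLE_GB / RTX_5090_GB:.2f}x")

# Assumed quantization levels for a 235B-parameter model.
for bits in (16, 8, 4, 3):
    size = weights_gb(235, bits)
    verdict = "fits" if fits(235, bits, GPU_VISIBLE_GB) else "does not fit"
    print(f"{bits:>2}-bit weights: {size:.1f} GB -> {verdict} in {GPU_VISIBLE_GB} GB")
```

Under these assumptions, a 235B model needs weights quantized below roughly 3.7 bits per parameter to fit in ~110 GB, before accounting for context.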
In specific inference tests (such as DeepSeek R1), it delivered more than 3x the performance of an RTX 5080 when the model does not fit in the Nvidia card's VRAM.
The street price of a 128 GB machine (GMKtec EVO-X2) is usually between $1,800 and $2,500 depending on deals (AMD's official kit costs more).
For heavy AI users, this changes the math: instead of paying hundreds of dollars a month in subscriptions (Claude, ChatGPT Pro, Cursor, etc.), you can run powerful models locally with Ollama, LM Studio, or similar tools. Full privacy, no token limits, and no service cutoffs at 3 a.m.
Subscriptions are not going to disappear tomorrow, but for many use cases (RAG over private documents, prototypes, local agents, etc.) this option becomes very attractive.
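A back-of-the-envelope break-even sketch for the subscription argument. The hardware prices come from the post. The monthly subscription spend and the electricity cost are hypothetical placeholders, and the function name is illustrative.

```python
import math
from typing import Optional


def breakeven_months(hardware_usd: float,
                     monthly_subscriptions_usd: float,
                     monthly_power_usd: float = 0.0) -> Optional[int]:
    """Whole months until the hardware pays for itself, or None if it never does."""
    monthly_saving = monthly_subscriptions_usd - monthly_power_usd
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_usd / monthly_saving)


# Hardware price range from the post; subscription spend is an assumption.
for hardware in (1800, 2500):
    for subscriptions in (200, 400):
        months = breakeven_months(hardware, subscriptions)
        print(f"${hardware} box vs ${subscriptions}/month: break-even in {months} months")
```

This ignores any quality gap between local and hosted models, so it is a lower bound on the decision, not the whole picture.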
Are we seeing the start of a new era of accessible, powerful local AI?
$pltr today is $nvda in 2016
1. it owns a monopoly-like position in digital twins
2. it has captured less than 0.01% of its total addressable market
3. its growth accelerates as it scales, making it a true Singularity Scaler.
$APLD backlog went from $7B to $36B in twelve months. The deck still calls the Polaris Forge 2 tenant an unnamed hyperscaler.
The bond documents say $ORCL, and lenders re-priced the debt 250 basis points cheaper on that name. $ORCL
$PLTR: the “government dependency” narrative is dead and buried. Q1 2026 U.S. commercial revenue surged 133% YoY, with full-year 2026 guidance raised to 120% growth (implying >$3.22B scale). Proof: its AIP (AI Platform) has hit full-scale monetization in the enterprise via its proprietary Bootcamp model.
On June 4, PLTR officially launched on Google Cloud Marketplace, enabling deep two-way integration between Foundry ↔ BigQuery and Gemini ↔ AIP. It also announced an exclusive partnership with top law firm Kirkland & Ellis to rebuild private equity fundraising with AI; expanded construction AI with McCarthy; and upgraded its SAP collaboration with proprietary enterprise data migration tools.
1. From “PPT Hype” to “Cold Hard Cash”: The Ultimate AI Beneficiary
The AI value chain follows three stages: Compute (NVDA, Clouds) → LLMs (OpenAI, Anthropic) → End-Use Apps & Software. By 2026, markets are tired of pure concept plays and laser-focused on who actually makes enterprises money with AI.
Palantir’s Ontology architecture solves the critical pain point: LLMs cannot natively process enterprises’ messy, siloed internal data. It is one of the very few companies that seamlessly weave AI Agents into daily enterprise workflows to drive real productivity — the chokepoint layer of the AI software era.
2. Dual Engines: Commercial + Government = Unshakable Moat
Enterprise Side: The Bootcamp model delivers custom AI solutions in days, driving sky-high retention & expansion (e.g., 133% U.S. commercial growth).
Gov/Defense Side: Geopolitical tensions make defense AI non-negotiable. Palantir holds the highest IL5/IL6 security certifications (extended to edge deployment). UK MoD multi-year £100M+ renewals and full deployment of Ukraine’s Brave1 Dataroom fueled 84% Q1 growth for defense — an ironclad earnings floor.
3. S&P 500 Weighting Premium + Explosive Free Cash Flow
PLTR is consistently GAAP-profitable (net income positive for multiple quarters), with industry-leading gross margins generating massive quarterly free cash flow. Its inclusion in the S&P 500 has unlocked passive fund buying that provides long-term share price support.
#NEWS: Another one. 💥
Applied Digital has signed a 210 MW lease at Delta Forge 2, expanding our AI Factory franchise model to a fifth campus. $APLD
⚡ 1.4 GW of total leased capacity
⚡ 5 campuses across multiple states
⚡ Multiple hyperscaler customers
⚡ ~$36B+ contracted lease revenue
⚡ ~$86B+ including renewal options
Applied Digital continues to demonstrate that its AI Factory franchise model can be replicated across geographies while maintaining the execution required to support the next generation of AI.
Read the Press Release → https://t.co/BdWyWY2qd7
We're Hiring → https://t.co/aHXa23ZKp4
Hey @ChatGPTapp how does $BB QNX work?
QNX is a real-time operating system (RTOS) developed by BlackBerry. Think of it as the software foundation that sits between a device’s hardware and the applications running on it.
Simple Example: Modern Car
A modern vehicle may have dozens of computers controlling:
Dashboard displays
Navigation
Cameras
Driver-assistance systems
Battery management (EVs)
Audio and infotainment
QNX manages these functions and ensures critical tasks happen on time, every time.
Why QNX Is Different
Most consumer operating systems (Windows, Android, Linux) are designed for flexibility and user features.
QNX is designed for:
✅ Reliability
✅ Safety
✅ Security
✅ Real-time response
For example, if a vehicle camera system needs a response within milliseconds, QNX guarantees that timing.
The “Microkernel” Design
Unlike traditional operating systems, QNX uses a tiny core called a microkernel.
Critical functions are isolated from noncritical functions.
Conceptually:
The microkernel handles only essential functions.
Drivers and services run separately.
If one component crashes, the rest of the system often keeps running.
This is why automakers and industrial companies like it.
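The isolation idea can be illustrated with a small analogy. This is plain Python using ordinary OS processes, not QNX code or any QNX API. The service names and their behavior are hypothetical. Each "service" runs in its own process, so a crash in one does not take down the supervisor or the others.

```python
import subprocess
import sys

# Hypothetical services, each executed in its own OS process.
SERVICES = {
    "camera": "print('camera frame ok')",
    "infotainment": "raise RuntimeError('audio driver crashed')",
    "dashboard": "print('dashboard ok')",
}


def run_service(source: str) -> bool:
    """Run one service in a separate interpreter; True if it exited cleanly."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def supervise(services: dict) -> dict:
    """Run every service and record which ones crashed, without stopping early."""
    status = {}
    for name, source in services.items():
        status[name] = "ok" if run_service(source) else "crashed (isolated)"
    return status


status = supervise(SERVICES)
for name, state in status.items():
    print(f"{name}: {state}")
```

The infotainment process fails, yet the camera and dashboard still report in. That is the property the list above describes, shown here at the process level rather than inside a real-time kernel.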
Where QNX Is Used
Vehicles from companies such as Mercedes-Benz, BMW, Volkswagen, and many others
Medical equipment
Factory robots
Railway systems
Industrial automation
Aerospace systems
How BlackBerry Makes Money
BlackBerry typically licenses QNX to manufacturers.
A simplified flow:
Automaker buys QNX licenses.
QNX is embedded in millions of vehicles.
BlackBerry collects licensing fees and royalties.
Additional tools, support, and safety certifications generate more revenue.
$AMD's heading to $5T MC LT| Lowest $/M tokens 🧵
The real reason why Institutions are FOMOing into AMD while other Semi stocks are underperforming ($NVDA $AVGO)
Not Financial Advice! DYOR!
Under Dr. Lisa Su’s leadership, @AMD has transformed from a distant challenger into a formidable force in AI infrastructure, delivering the industry’s most compelling TCO story for high-volume inference. Her clear vision (open ecosystems, aggressive annual roadmaps, rack-scale innovation, and a relentless focus on tokens-per-dollar) has positioned AMD’s Helios racks as the go-to solution for hyperscalers and AI natives struggling with exploding token costs, bringing the cost down to $0.0003-$0.0005/M tokens. I will link various threads extending this analysis to supply chain and wafer ratio if you are interested in understanding the full picture.
In the last 3-4 months, explosive agentic AI demand has significantly increased inference demand, especially for agentic setups running 5-10 agents. If you listen to CNBC or Bloomberg, you should know that enterprises are complaining about the cost of tokens and how it is spiking too much to make sense. Most data centers today run on $NVDA chips, where the cost is way too high for training or inference.
1. Token cost
Here are some quick comps, so you understand why $META, @OpenAI, @AnthropicAI, $MSFT, $AMZN, Softbank, $GOOGL, and many more small to medium AI natives are buying as many AMD CPUs and GPUs as they can get, to the point that AMD chips are pretty much sold out for the next 3-5 years.
Inference (Cost per Million Tokens)
~$NVDA B200 / HGX: ~$0.02–$0.08 on optimized workloads (FP4/MXFP4, speculative decoding). Significant improvement over Hopper but still premium-priced. GB200 NVL72 rack-scale: $0.05–$0.25+
~$AMD Helios Racks: $0.0003-$0.0005 per M tokens, dramatically lower than NVIDIA equivalents in owned infra. MI355X node-level: up to 40% more tokens per dollar vs. competing solutions (B200), driven by higher memory capacity (up to 288GB+ HBM), strong bandwidth, and lower acquisition costs.
Training
~$NVDA Rubin Rack is estimated at $0.7-$1.2/M tokens
~$AMD Helios Rack is estimated at $0.65-$1.0/M tokens
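To see what the quoted inference figures would mean for a bill, here is a minimal sketch that takes the thread's per-million-token ranges at face value. They are the author's estimates and are not verified here. The monthly token volume is a hypothetical workload, and the names are illustrative.

```python
# USD per million tokens for inference, as quoted in the thread (unverified).
INFERENCE_USD_PER_M = {
    "NVDA B200/HGX": (0.02, 0.08),
    "AMD Helios": (0.0003, 0.0005),
}

MONTHLY_TOKENS = 1_000_000_000_000  # hypothetical: 1 trillion tokens per month


def cost_range(usd_per_m: tuple, tokens: int) -> tuple:
    """Return the (low, high) cost in USD for the given number of tokens."""
    low, high = usd_per_m
    millions = tokens / 1_000_000
    return (low * millions, high * millions)


for platform, price in INFERENCE_USD_PER_M.items():
    low, high = cost_range(price, MONTHLY_TOKENS)
    print(f"{platform}: ${low:,.0f} to ${high:,.0f} per month")
```

At this assumed volume, the quoted ranges imply a gap of tens of thousands of dollars per month versus a few hundred, which is the scale of difference the thread is arguing for.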
2. Why Hyperscalers and AI Natives Are Choosing AMD
Token consumption (especially agentic) is outpacing even NVIDIA’s efficiency gains, making diversification mandatory for economic viability. Massive deals reflect this reality: $META, @OpenAI, $MSFT, Softbank, $AMZN, Oracle, LumaAI, G42...
Dr. Lisa Su’s Vision in Action: Since taking the helm, Su has driven AMD’s turnaround with disciplined execution, annual GPU cadence (MI300 → MI350 → MI400), full-stack software (ROCm 7), open ecosystems (UALink, OCP designs), and customer-centric rack-scale solutions like Helios. Her emphasis on “tokens per dollar” and TCO has turned AMD into the pragmatic choice for sustainable AI scaling.
Power/Energy Efficiency:
~Helios rack-level power is estimated at 120-140 kW, with 50% more HBM4, which matters for both inference and training cost
~Rubin rack-level power is estimated at 160-230 kW
AMD Helios shines in owned TCO, memory density, and energy flexibility at hyperscale.
Cost to build 1GW data center
1GW Helios Rack full build is estimated at $30-$35B
1GW Rubin Rack full build is estimated at $45-$55B
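The rack power and build-cost estimates above can be combined into a rough sketch. It assumes every watt of a 1 GW site goes to racks (no cooling or facility overhead), which overstates the rack count. All input figures are the thread's estimates, and the function names are hypothetical.

```python
def racks_per_site(rack_kw_low: float, rack_kw_high: float, site_gw: float = 1.0) -> tuple:
    """(min, max) whole racks that fit in the site's power budget."""
    site_kw = site_gw * 1_000_000
    return (int(site_kw // rack_kw_high), int(site_kw // rack_kw_low))


def usd_millions_per_mw(build_usd_billions: float, site_gw: float = 1.0) -> float:
    """Build cost expressed in millions of USD per megawatt."""
    return build_usd_billions * 1000 / (site_gw * 1000)


# Rack power (kW) and 1 GW build cost (USD billions) as estimated in the thread.
ESTIMATES = {
    "Helios": {"rack_kw": (120, 140), "build_busd": (30, 35)},
    "Rubin": {"rack_kw": (160, 230), "build_busd": (45, 55)},
}

for name, est in ESTIMATES.items():
    fewest, most = racks_per_site(*est["rack_kw"])
    low, high = (usd_millions_per_mw(b) for b in est["build_busd"])
    print(f"{name}: {fewest:,} to {most:,} racks per GW, ${low:.0f}M to ${high:.0f}M per MW")
```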
3. Superior CPUs to pair with GPUs at massive scale (5-10-20 GW)
Agentic AI (autonomous, multi-step workflows with orchestration, tool use, parallel agents, data movement, and enterprise integration) has dramatically increased the importance of strong host CPUs alongside GPUs. This shifts the CPU-to-GPU ratio higher and makes balanced systems critical, toward 1:1 to 5:1, as enterprises test more than 5-10 agents.
AMD EPYC Venice excels
~Leadership core density (up to 256 Zen 6 cores per socket) for running many agents in parallel, orchestration layers, and high-throughput control-plane tasks.
~Superior performance-per-core and power efficiency (up to 2.1x higher perf/core and 2.26x better SPECpower vs. NVIDIA Grace in benchmarks).
~Tight integration in Helios: One Venice CPU + multiple MI450 GPUs per node, enabling efficient data feeding to GPUs ("zero-copy"), parallel execution, and full rack utilization for complex agentic loops.
Hyperscalers (Meta, Microsoft, Amazon, Google, Softbank) and AI natives (OpenAI, Anthropic...) are adopting high-core EPYC at scale specifically for these agentic demands, as CPUs now handle a larger share of non-model work (orchestration, policy enforcement, tool calls). This complements AMD’s lower-cost GPUs for overall TCO wins.
Conclusion:
NVIDIA’s Vera Rubin cannot compete with even a two-year-old EPYC Turin, while AMD under Dr. Lisa Su has engineered the lowest cost-per-million-tokens, highly competitive energy-efficient solutions, and superior CPU orchestration for agentic AI at scale with Helios. Dr. Su has championed this shift since at least 2023, foreseeing the rise of agentic workflows that demand far more orchestration, parallel agents, and balanced compute well before the industry fully embraced it. Her long-term vision of AI moving from simple prompts to always-on, multi-agent systems has driven AMD’s investments in high-core EPYC CPUs and integrated rack-scale solutions, perfectly positioning the company for today’s realities.
Hyperscalers and AI natives effectively have no choice but to buy more AMD systems for agentic AI, given AMD’s leadership in economical, power-aware, high-volume internal and agentic use. However, supply is far behind demand, so a multi-vendor reality, along with in-house chips, will drive faster industry progress, lower overall costs, and better sustainability.
Not Financial Advice! DYOR!
Video source: Microsoft Build 2026
https://t.co/AYcvtohfE5
Enterprise AI ready where you are. Supermicro PCIe GPU solutions featuring AMD Instinct™ MI350P GPUs simplify on-prem deployment with scalable AI performance.