OpenInfer

about 2 months ago

60 days. Three deals. One bet: agentic inference will not run on one chip.

→ Feb 24: Intel Xeon + SambaNova
→ Mar 13: NVIDIA Rubin + Groq LPX (disaggregated inference)
→ Last week: Meta + millions of AWS Graviton CPU cores

Three stacks. Three processor mixes. One pattern.

━━━━━━━━━━

NVIDIA said it bluntly: "prefill and decode place different demands on hardware." Layer agentic behavior on top (tool calls, planning, retrieval, verification, multi-agent coordination) and the demands shift again on every step.

The future of AI infrastructure isn't more GPUs. It's more kinds of compute, coordinated across a topology.

━━━━━━━━━━

Three things change once multi-processor agentic inference is the default:

→ Accelerator door opens. Every credible silicon player gets a seat.
→ Tail latency becomes an architectural decision, not a tuning problem.
→ Scalability shifts axis. Agentic inference > model inference.

━━━━━━━━━━

At @openInfer we call this vertical disaggregation. First proof point: Intel Xeon CPU + NVIDIA GPU. +50% capacity, zero additional GPUs.

The harder problem is dynamic workloads, multiple models, across aggregated hardware. That's what agentic inference actually is: multiple SLAs, multiple models, dynamic behavior changes, served by multiple compute topologies.

→ Intel: CPU + accelerator layer
→ NVIDIA: GPU + LPU layer
→ Meta: CPU layer
→ Next race: the orchestration layer that knits them together

Sources:
Intel + SambaNova: https://t.co/nkhzNnMyJs
NVIDIA Groq 3 LPX: https://t.co/dc158IEZTX
Meta + AWS Graviton: https://t.co/72ulJNw4Wu
Vertical Disaggregation (OpenInfer): https://t.co/io7e1kL2I6
#A


about 2 months ago

Before reaching for another GPU, audit the silicon you already have.


about 2 months ago

@bastani_behnam @chamath OpenInfer is the inference stack for this new era.


openInfer retweeted

2 months ago

We just published how we unlocked +50% inference capacity on a 27B model — no new GPUs, no new nodes, at a fraction of the cost. Turns out the CPU sitting next to your GPU isn't dead weight. We just had to stop treating it like it was. Full breakdown ↓



2 months ago

We got +50% inference capacity on Qwen 3.5 27B. no new GPUs. no new nodes. just stopped ignoring the CPU. https://t.co/fUiXI205Pr


openInfer retweeted

2 months ago

@AnthropicAI's @openclaw Restrictions Exposed a Deeper Infrastructure Problem. OpenInfer Is the Fix, and it's free. We at @openInfer unlock and route background agents to hardware topologies that are designed for them: same model, fraction of the cost. Drop-in replacement. Zero code changes. FREE beta live now https://t.co/7BI9oiRD1l https://t.co/m1cyOO2DnC #openinfer #openclaw #anthropic #agenticAI


2 months ago

@sfumato_v @madzadev @grok @openclaw More information here: https://t.co/I25BKeTNRO Also available through our website: https://t.co/4SIF995psx


2 months ago

@bastani_behnam Run openClaw for Free: Try our beta: https://t.co/4SIF995psx


2 months ago

Come and try our beta (FREE): OpenClaw's restriction is fixed. We are opening up our beta (https://t.co/4SIF995psx), hosting openClaw background tasks on lower-end, complex cloud topologies and demonstrating the value of an inference system built for the agentic world.


3 months ago

We have been building our infrastructure with agentic flows in mind. What we saw with the @AnthropicAI announcement demonstrates that agents need to be treated differently from conversational AI.

3 months ago

@EricBuess @bcherny probably an infra that is built for agentic applications. thoughts? something we have been cooking @openInfer https://t.co/crttugJNm1



openInfer retweeted

4 months ago

@OpenAI tripling revenue. @AnthropicAI at $14B ARR. @nvidia at $130B. Who's paying? Enterprises. But change is coming — cheaper AI, new competitors, margin compression. Some companies will get devalued. Others will explode. That's exactly why we built @openInfer Featured in @CIOonline https://t.co/aixhP4nMpc


5 months ago

The "wow" phase of AI is over. We’ve entered the era of 𝗣𝗿𝗮𝗰𝘁𝗶𝗰𝗮𝗹 𝗔𝗱𝗼𝗽𝘁𝗶𝗼𝗻.

I’m excited to share my latest interview with @technewsworld. Special thanks to @jpmello for the great conversation on @OpenAI's 2026 strategy.

Key focuses:
𝗔𝗜 𝗮𝘀 𝗜𝗻𝗳𝗿𝗮𝘀𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲: Moving from novelty to a foundational operating layer.
𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗙𝘂𝘁𝘂𝗿𝗲: AI agents solving real-world problems in health and science.
𝗗𝗲𝗹𝗶𝘃𝗲𝗿𝗶𝗻𝗴 𝗥𝗢𝗜: Scaling to meet global enterprise needs.

Now, the real work of transforming how the world functions begins, driven by the need for transformational infrastructure and @openInfer 𝗯𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗶𝗻𝗳𝗿𝗮 to support this new era.

Full interview: https://t.co/uLVnGwg6wC
#OpenAI #openInfer #AI #TechTrends @TechNewsWorld


6 months ago

𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗶𝘀 𝗮 𝘀𝘆𝘀𝘁𝗲𝗺 𝗽𝗿𝗼𝗯𝗹𝗲𝗺, 𝗻𝗼𝘁 𝗮 𝗺𝗼𝗱𝗲𝗹 𝗼𝗻𝗲. This is the shift we are building OpenInfer for. #openinfer #edgeai #inference

6 months ago

𝗧𝗵𝗲 𝗻𝗲𝘅𝘁 𝗔𝗜 𝗰𝗼𝗺𝗽𝘂𝘁𝗲 𝘀𝗵𝗶𝗳𝘁 𝗶𝘀 𝗶𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲. Edge data is exploding. Inference must move to data. This requires a ground-up system, not a model upgrade. NVIDIA + Groq is an early signal. 2026 is when inference infrastructure becomes the battleground.


openInfer retweeted

7 months ago

🎙️ New Podcast Episode

I joined The Software Leaders UNCENSORED Podcast to talk about why the future of AI is at the edge and how we are building OpenInfer to make reliable, secure, and energy efficient physical AI possible.

Here is what I cover:
• How my experience across @Meta, @Google, and @Roblox shaped @openInfer's edge first mission
• Why AI needs to run where data is created and how our unified stack makes that real
• How we push innovation through custom inference and system mementos to bring datacenter level AI to the edge
• What I learned from 250 enterprise leaders on why most AI projects fail
• How to stay ahead in a field that changes every 90 days

Full episode link in the comments.
#AI #openinfer #edgeai #inference


7 months ago

Imagine a world where physical AIs could recall the past, come together, and build stronger reasoning. To make inference happen at the edge, memory constraints need to be addressed at the system architecture level (HW+SW). @openinfer is sharing a glimpse of what is possible when an edge system is designed to recall the past. Check us out: https://t.co/HOOIOqKeGs #openinfer #physicalai #inference #edgeai

7 months ago

Bringing inference to the edge requires massive innovation in the memory system. We are restructuring how inference runs at the edge and sharing a capability that addresses the lack of meaningful on-device memory. Our latest release lets models hold persistent context, reason over larger spans, and collaborate intelligently, all running locally on the OpenInfer engine. This is how we break past edge memory limits. 🔗 https://t.co/YxEf5mJuUn #edgeAi #openinfer #mementos #inference



8 months ago

Exciting news: OpenInfer is now proudly part of @MicrosoftforStartups and @IntelPartnerAlliance! Together, we’re accelerating the future of #EdgeAI - bringing intelligence to every physical surface, from cloud to edge. Low latency, limited bandwidth, offline-ready, and zero cloud cost! https://t.co/kI8BDerJcU #EdgeAI #IntelPartnerAlliance #MicrosoftforStartups #BuiltwithMfS #openinfer


9 months ago

Here’s how it works:
1️⃣ Submit a one-pager idea by Oct 3, 2025 → [email protected]
2️⃣ We review & select top concepts
3️⃣ Finalists present live in San Mateo
4️⃣ Winners pitch to top VCs + access OpenInfer early!
