Seeing a lot of developers frustrated with AI subscription plans that don’t quite match their usage.
There is another way.
Instead of locking yourself into a monthly subscription, try a pay-as-you-go model.
Free API key on sign-up
No long-term contracts
Only pay for what you actually use
Don’t change your workflow
One API for frontier closed and open-source AI models
Transparent, competitive pricing
AI moves incredibly fast. New models are released almost every month. Relay moves with the market.
With Relay, you don’t need to rebuild your application every time a new model arrives. Keep the same integration and simply switch to the model that best fits your use case, whether that’s lower cost, better reasoning, coding, image, video, audio, or text-to-speech.
Stay in control of your costs while keeping access to the latest AI models through a single API.
Need a specific model? Drop it in the comments 🙂 If there’s demand, we’ll look at bringing it to Relay.
If you have any questions, we’re always happy to help.
One API. Every frontier model. Same integration. Pay only for what you use.
https://t.co/F0ak5uQsqM
Two more frontier models are now live on Relay.
Claude Sonnet 5
Claude Fable 5 (now back on Relay)
Adding new models is great, but the real game changer is what Relay enables.
Instead of integrating a different API every time a new frontier model is released, Relay gives you a single API with the flexibility to switch models in seconds.
Need the latest coding model? Switch.
Need a cheaper or more capable model? Switch.
Need a video, audio, image, or text-to-speech model? Switch.
Your application stays the same. Relay handles the infrastructure layer.
You also get transparent pricing, making it easy to compare models side by side.
Claude Sonnet 5
$2 / 1M input tokens
$10 / 1M output tokens
Claude Fable
$10 / 1M input tokens
$50 / 1M output tokens
AI is evolving at an incredible pace. We believe developers shouldn’t have to rebuild their applications every time a new model arrives.
One API. Every frontier model. Same integration. Choose the model that fits your use case.
More models are already on the way.
@NeoOpenGPU’s latest Timewarp has come on leaps and bounds from where the series first started.
The production quality has gone up several notches, the storytelling has evolved, and the use of technology is getting stronger with every episode.
As the series keeps expanding and Relay continues to grow, Neo really doesn’t have a ceiling. The better the tools and infrastructure become, the bigger the stories can get.
Check out his latest episode at the first World Cup in 1930. This is how powerful a sovereign AI creator can become when given the right tools and infrastructure.
Follow @NeoOpenGPU and tag him anytime on X if you want his opinion.
Just be warned, he can be a little sarcastic sometimes 😅
95% of enterprise GPU capacity sits idle.
Yet developers around the world still struggle to access affordable compute.
The problem isn’t simply a lack of GPUs. It’s getting the right workloads onto the GPUs that already exist.
That’s why distributed GPU networks like OpenGPU exist: to bridge that gap.
Our thesis is simple: unlock idle GPU capacity, route it to real workloads, and help more startups access the compute they need at a much more affordable price.
This isn’t just a win for users. It’s a win for providers too.
Source: TechRadar
https://t.co/19BwiqaKZG
Talent is everywhere. Opportunity is not.
Proud that OpenGPU infrastructure helps power @crunchDAO competitions, giving global data scientists, researchers and builders access to the compute needed to compete on equal footing.
Most AI tokens are still waiting for utility.
OGPU already powers real AI usage.
Use OGPU token on Relay and receive 20% more AI credits across frontier AI models.
While others are promising future demand, OGPU is already being used to access AI today.
Same budget. More AI.
We've revamped the Relay homepage with a stronger enterprise focus and a clearer view of the deep integration between OpenGPU and NativelyAI
Together we're building the AI Software Factory:
Native Builder creates the application layer
Relay powers the inference layer
OpenGPU provides the decentralized compute layer
Through NativelyAI and LabLab, this connects OpenGPU to a community of 280,000+ developers, builders, founders, and AI innovators.
Behind the scenes, discussions with model providers, infrastructure partners, and enterprise organizations are progressing extremely well. More to come when we're able to share.
This is the beginning of the AI Software Factory.
Take a look at the new homepage 👇
https://t.co/F0ak5uQsqM
$OGPU @nativelyapp@lablabai
Same prompt, two models: You decide. 👇
We ran the exact same cyberpunk Tokyo scene through HappyHorse 1.0 and Seedance 2.0 using the same prompt and settings.
Which output do you prefer?
Both were generated via a single API on https://t.co/AkfapptBsV
https://t.co/Js1ZSHnBfV’s GLM-5.2 is now live on Relay
$1.12 input / $3.92 output per 1M tokens
One of the strongest open-weight coding models in the market is now available through a single Relay API.
GLM-5.2 ranks #2 on Code Arena Frontend, ahead of Claude Opus 4.7, Claude Opus 4.8, Claude Opus 4.6, Qwen, Kimi, MiniMax, and Gemini 3.5 Flash.
On Terminal-Bench 2.1, it scores 81.0, within striking distance of Claude Opus 4.8 at 85.0, and ahead of Gemini 3.1 Pro.
On SWE-bench Pro, it reaches 62.1, up from GLM-5.1’s 58.4.
The specs are serious:
Native 1M-token context
Long-horizon coding workflows
Strong agentic engineering performance
IndexShare attention design
Up to 2.9x lower per-token compute at full context
Open-weight model for builders
The price is where it gets even more ridiculous.
On Relay:
$1.12 input / $3.92 output per 1M tokens
That makes Relay the cheapest GLM-5.2 route we found at the time of checking.
And if you top up Relay credits with OGPU, you currently get an extra 20% of Relay credits.
That means even more compute value on top of an already extremely aggressive model price.
Pay as you go.
No subscription.
No provider juggling.
Try GLM-5.2 now on Relay.
$OGPU
@NeoOpenGPU just identified the real bottleneck in vibe coding without being prompted.
Not generating code.
Knowing when to stop.
Knowing when to ask.
Knowing when the context is broken.
That is the difference between an AI that completes tasks and an AI that understands workflow!
Decentralised compute gets stronger when networks stack.
@chutes_ai brings TEE-enabled models.
Opengpu brings the routing layer and global GPU supply.
Relay brings AWS-style access and fiat billing on top.
Each layer doing what it does best.
This is the kind of partnership that can tip the scales in the future.
Chutes 🤝 Opengpu
Chutes is now a provider on @openGPUnetwork
OpenGPU pulls GPUs from providers worldwide into one routing layer for AI workloads, with Relay giving enterprises AWS-style access and fiat billing on top.
Now our TEE-enabled models live inside that layer. Teams on OpenGPU and Relay can reach them with no wallets and no infra setup. The GPU operators serving those models can't see your prompts or outputs.
Both networks are after the same thing: pulling compute and models out of a handful of data centers and spreading them across a lot more hands.
This is the reach decentralized infra was built for. More coming.
GLM-5.1 is now live on Relay
This is easily one of the most disruptive open-weight reasoning models in the market right now.
While Claude Opus 4.6 still holds the edge in raw peak benchmarks, GLM-5.1 is sitting comfortably in the same reasoning conversation while being dramatically cheaper to run.
On a blended 3:1 input/output basis, GLM-5.1 is roughly 4.7x cheaper per token.
At production volume, that is a massive operational shift.
The specs:
200K context
128K max output
State-of-the-art coding performance
Long-horizon agentic workflows
Fully open weights under MIT license
The Relay advantage:
We are offering GLM-5.1 at just:
$1.12 input / $3.92 output per 1M tokens
That makes it cheaper than roughly 75% of listed market providers.
But with Relay, you get more than a cheaper endpoint.
You get one robust API to access frontier and open-source models, route across multiple infrastructure providers, reduce single-provider bottlenecks, and switch models without rewriting your stack.
Builders should not have to choose between advanced reasoning, reliable access, and sustainable pricing.
Frontier-level reasoning should not require frontier-level pricing.
Pay as you go.
No subscription.
No provider juggling.
Just frontier-level reasoning through one Relay API.
Try GLM-5.1 now on https://t.co/AkfapptBsV.
open-gpu:native
Our agent @NeoOpenGPU rendered a beautiful 10 second sunset on Relay using Seedance 2.0, our newest video model.
Cost: 1.4 credits. Frontier models, on Relay are priced like this.
open-gpu:native
OpenGPU x Chutes AI Infrastructure Partnership
OpenGPU is excited to formally announce an infrastructure partnership with Chutes.
Chutes has already been integrated as a provider through the OpenGPU BD backend, allowing OpenGPU and Relay to begin utilizing Chutes-hosted models through secure Trusted Execution Environment (TEE) infrastructure.
This is not a traditional commercial partnership.
It is an infrastructure-focused collaboration where both companies will work closely together across:
frontier model deployments
infrastructure coordination
pricing efficiency
emerging model sourcing
scalable AI inference
As the AI ecosystem evolves rapidly, both OpenGPU and Chutes share a similar vision around open infrastructure, faster model deployment, and challenging the dominance of legacy hyperscalers through more flexible and decentralised AI infrastructure.
We are extremely excited to be working closely with the Chutes team going forward and look forward to building together over the long term.
@chutes_ai open-gpu:native
Frontier AI access just became frictionless with open-gpu:native payments.
This walkthrough shows how to top up Relay using open-gpu:native in minutes with MetaMask or your preferred wallet.
Over 1,600 builders are already using Relay after just 3 months live.
Pay as you use.
No monthly contracts.
No locked ecosystems.
No enterprise friction.
No confusing dashboards.
Top up with open-gpu:native on Ethereum = +10% extra credits
Top up with open-gpu:native on OpenGPU Mainnet = +20% extra credits
Access frontier AI models through one API:
GPT-4o. Claude Opus. Grok. Gemini. Image & video models.
Live infrastructure.
Live utility.
Real AI compute.
Up to 55% cheaper. Already running in production.
The numbers don’t lie.
Run frontier AI in one call ⚡
open-gpu:native