Top Tweets for #InferenceEngine
TokenSpeed: A fast LLM inference engine for agentic workloads (feat. Kimi K2.5, NVIDIA Blackwell)
(by 9bow)
https://t.co/g9EjZERlyZ
#llminference #kimik2 #tensorrtllm #blackwell #inferenceengine #agenticworkload #mlaattention #tokenspeed
An inference engine developed entirely in Rust
A fully developed inference engine in Rust offers efficient and reliable performance. Explore its potential for various applications. Learn more: https://t.co/OYmi9NBwwZ
#RustProgramming #InferenceEngine #TechInnovation
Inference Engine in AI (Complete Guide You Need) https://t.co/kmf9YyQpKp via @YouTube
#AI #ArtificialIntelligence #AIEngineering #InferenceEngine #MachineLearning #DeepLearning #AITools #AIRevolution #ChatGPT #TechEducation #TAJ #AIYING
AI is NOT the model. It’s the system behind it.
If you don’t understand inference…
you’re just using AI — not building with it.
👉 Watch here: https://t.co/jS5cGtvKkH
#AI #AIEngineering #InferenceEngine #ArtificialIntelligence #TAJ #AIYING

AI feels instant.
But inside → a full system is running.
Input → Process → Output
That’s the Inference Engine.
That’s where AI actually works.
https://t.co/idt1x1cOY1
#AI #ArtificialIntelligence #InferenceEngine #AIEngineering #MachineLearning #DeepLearning #TAJ #AIYING

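A front-end engineer vibe-coded an inference engine in Rust from scratch in two days. They had never written a GPU kernel; the whole implementation was driven by natural-language prompts to AI. 811 tok/s, TTFT 4.6× SGLang.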
blog https://t.co/nb1DEYzMGZ
https://t.co/aW8CtUz3lv
#LLM #VibeCoding #Rust #CUDA #InferenceEngine #OpenSource #AIEngineering #BuildInPublic
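A front-end engineer built an inference engine in two days with vibe coding. They had never written a GPU kernel; the whole implementation was driven by natural-language prompts to AI. 811 tok/s, TTFT 4.6× SGLang.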
https://t.co/nb1DEYzMGZ
#LLM #VibeCoding #Rust #CUDA #InferenceEngine #FlashInfer #AgentOS #OpenSource #AIEngineering #BuildInPublic
A full range of Rust models and a GUI platform with a highly intelligent AI Assistant (AGI) for all screens and mobile is coming soon. Enjoy 😎
#Rust #LLM #AI #MachineLearning #OpenSource #EdgeAI #InferenceEngine #blockchain #AIagent
🤯 Feeling lost with #AI deployment terms? Our latest blog breaks down 10 key concepts you need to know, like #AIInfrastructure, #inferenceengine, #ModelOps, #MLOps, #LLMOps, #MaaS, and more!
Let’s fast-track your AI journey 👉 https://t.co/TdCJKa3CRC

logicalinference.com: live auction at DropCatch
#LogicalInference
#ArtificialIntelligence
#AIReasoning
#MachineReasoning
#InferenceEngine
#LogicAI
#DeepLearning
#AIML
#CognitiveAI
#KnowledgeGraph
#NeuralReasoning
#SmartAI
#FutureOfAI
#AIInnovation
#DataIntelligence

🚀Project Number 1 - Inference Engine 🔥
'High-performance cloud engine for real-time AI'
by @gmi_cloud
#aiagents #ai #opensource #automation #researchtools #cloudai #videagents #medtech #learningtools #InferenceEngine
Here’s another solid option:
✨ Inference is where real AI power shows up, turning complex models into real-time results. Smarter systems, sharper performance. The future is live. #AI #InferenceEngine #Innovation
@inference_labs
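Inference Engine 2.0 has been updated.
Major LLMs such as Kimi K2Thinking, Minimax M2, DeepSeek V3, GPT OSS, and QWen 3 run in a single console.
One API.
An integrated environment that covers image, motion, and text generation, with smooth model switching.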
#GMICloud #InferenceEngine #AI #LLM #MultimodalAI
What is Inference Engine 2.0?
It unites text, images, audio, and video in one spot!
Learn more in this quick blog—under 5 minutes!
Blog link - https://t.co/vFOXx0DKOr…
#GMICloud #InferenceEngine #AI
What exactly is Inference Engine 2.0?
How does it unite text, image, audio, and video in one place — and why does it matter?
This blog breaks it all down in under 5 minutes: https://t.co/dAtO9ISqtR
#GMICloud #InferenceEngine #AI #Innovation #MultimodalAI #ML #LLM #Videogen #GenerativeAI

You’ve seen the launch. Now see it in motion.
Every video model — Kling v2.5, Hailuo 2.3, Sora 2, Veo 3.1, Flux-kontext-pro — running live on Inference Engine 2.0
@Kling_ai @Hailuo_AI @Alibaba_Wan @LumaLabsAI
#GMICloud #InferenceEngine #AI #MultimodalAI
#VideoAI
Best models. One playground.
Welcome to Inference Engine 2.0 - came from infra, moved with creation.
Where speed meets intuition — and builders become inventors.
1.46× faster, 49% more throughput, one API for all
Inference Engine 2.0 on @GMICloud = insanely fast ⚡
Perfect for devs or creators using AI, image, and video tools.
Here’s what it can do 👇
#GMICloud #InferenceEngine #ad
- Multimodal-native by design
- NVIDIA Cloud Lepton Partner performance
- Elastic scaling for real-time responsiveness
This isn’t theoretical speed.
It’s production-grade performance - built for scale.
#AI #InferenceEngine #GMICloud
⚡ In the GLM-4.6 inference speed benchmark, GMI Cloud recorded 73 tokens/sec and ranked second!
The high-performance Inference Engine delivers stable processing even for large-scale reasoning models.
📊 Source: Artificial Analysis
#GMICloud #InferenceEngine #GLM46 #GenerativeAI #AIインフラ
GLM-4.6 providers overview: we are benchmarking API endpoints offered by Baseten, GMI, Parasail, Novita, Deepinfra
GLM-4.6 (Reasoning) from @Zai_org is one of the most intelligent open weights models, with intelligence close to GPT-OSS-120b (high), DeepSeek V3.2 Exp (Reasoning) and Qwen3 235B 2507 (Reasoning), and is attracting strong interest for agentic coding in particular. It’s now being served by a range of cloud inference providers.
Key benchmarking takeaways
➤⚡ Speed: Baseten is serving the fastest GLM-4.6 endpoint at 104 output tokens per second, followed by GMI (73 t/s), Parasail (68 t/s), and Novita (66 t/s)
➤⏳ Latency: We track TTFT (time to first token) and TTFAT (time to first answer token). For reasoning models, TTFAT is the key metric as it marks when users first see usable output. Generation of reasoning tokens has the same performance profile as generation of answer tokens, so the main driver of TTFAT is output speed - as opposed to standard TTFT which is driven by prefill (input processing) performance. Baseten leads on TTFAT with a 19.4s result, with GMI following at 28.4s TTFAT despite having the longest TTFT
➤$ Pricing: GLM-4.6 (Reasoning) is priced very consistently across providers, costing $0.6/M input tokens and $1.9/M output tokens on Deepinfra, $0.6/$2 on GMI, $0.6/$2.1 on Parasail, and $0.6/$2.2 on Novita and Baseten
➤🪟 Context window: All providers support the full 200k token context window
➤🧰 Supported tools: All providers support tool calling with GLM-4.6, with JSON mode availability confirmed on GMI, Novita, Parasail and Baseten
@baseten @gmi_cloud @parasail_io @novita_labs @DeepInfra
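
As a rough illustration of the latency and pricing points in the thread above, the sketch below models TTFAT as TTFT plus the time spent generating reasoning tokens, and computes per-request cost from the quoted per-million-token prices. The output speeds, TTFAT figures, and prices are the ones quoted in the thread. The function names, the 1-second TTFT, the 2,000-token reasoning length, and the 10,000/3,000-token request are hypothetical values chosen for illustration, not measurements.

```python
# Simplified model (an assumption, not Artificial Analysis's methodology):
#   TTFAT ~= TTFT + reasoning_tokens / output_speed
# It assumes a constant output speed for reasoning and answer tokens.

# Figures quoted in the thread.
OUTPUT_TPS = {"Baseten": 104.0, "GMI": 73.0, "Parasail": 68.0, "Novita": 66.0}
TTFAT_S = {"Baseten": 19.4, "GMI": 28.4}
PRICE_PER_M = {  # (input USD per 1M tokens, output USD per 1M tokens)
    "Deepinfra": (0.6, 1.9),
    "GMI": (0.6, 2.0),
    "Parasail": (0.6, 2.1),
    "Novita": (0.6, 2.2),
    "Baseten": (0.6, 2.2),
}


def estimate_ttfat(ttft_s: float, reasoning_tokens: int, output_tps: float) -> float:
    """Estimate time to first answer token, in seconds, under the model above."""
    if output_tps <= 0:
        raise ValueError("output_tps must be positive")
    return ttft_s + reasoning_tokens / output_tps


def max_reasoning_tokens(ttfat_s: float, output_tps: float) -> float:
    """Upper bound on reasoning tokens implied by a TTFAT, taking TTFT as zero."""
    return ttfat_s * output_tps


def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1 s TTFT, 2,000 reasoning tokens.
    for name, tps in OUTPUT_TPS.items():
        print(f"{name}: estimated TTFAT {estimate_ttfat(1.0, 2000, tps):.1f} s")

    # Upper bound on reasoning tokens implied by the quoted TTFAT figures.
    for name, ttfat in TTFAT_S.items():
        bound = max_reasoning_tokens(ttfat, OUTPUT_TPS[name])
        print(f"{name}: at most about {bound:.0f} reasoning tokens")

    # Hypothetical request: 10,000 input tokens, 3,000 output tokens.
    for name, (p_in, p_out) in PRICE_PER_M.items():
        print(f"{name}: ${request_cost_usd(10_000, 3_000, p_in, p_out):.4f}")
```

Under this simplified model, the two quoted TTFAT figures imply similar upper bounds on the number of reasoning tokens (about 2,000 each), which fits the thread's point that output speed, rather than prefill, is the main driver of TTFAT.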

⚡ One line of text, and the footage starts moving.
A director reveals what goes on behind the scenes of GMI Cloud Inference Engine.
Words create scenes; AI paints the world.
#GMICloud #InferenceEngine #生成AI