#InferenceEngine - Twitter Hashtag

23 days ago

TokenSpeed: a fast LLM inference engine for agentic workloads (feat. Kimi K2.5, NVIDIA Blackwell) (by 9bow) https://t.co/g9EjZERlyZ #llminference #kimik2 #tensorrtllm #blackwell #inferenceengine #agenticworkload #mlaattention #tokenspeed

0

21

✪ 𝕱𝖆𝖍𝖆𝖉

@fad_777

26 days ago

An inference engine developed entirely in Rust. A fully developed inference engine in Rust offers efficient and reliable performance. Explore its potential for various applications. Learn more: https://t.co/OYmi9NBwwZ #RustProgramming #InferenceEngine #TechInnovation

0

19

Ravindra Dastikop @RavindraDa26088

2 months ago

Inference Engine in AI (Complete Guide You Need) https://t.co/kmf9YyQpKp via @YouTube #AI #ArtificialIntelligence #AIEngineering #InferenceEngine #MachineLearning #DeepLearning #AITools #AIRevolution #ChatGPT #TechEducation #TAJ #AIYING

0

10

Ravindra Dastikop @RavindraDa26088

2 months ago

AI is NOT the model. It’s the system behind it. If you don’t understand inference… you’re just using AI — not building with it. 👉 Watch here: https://t.co/jS5cGtvKkH #AI #AIEngineering #InferenceEngine #ArtificialIntelligence #TAJ #AIYING

0

6

Ravindra Dastikop @RavindraDa26088

2 months ago

AI feels instant. But inside → a full system is running. Input → Process → Output That’s the Inference Engine. That’s where AI actually works. https://t.co/idt1x1cOY1 #AI #ArtificialIntelligence #InferenceEngine #AIEngineering #MachineLearning #DeepLearning #TAJ #AIYING

0

2

chen @kailunchen5

2 months ago

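A front-end engineer vibe-coded an inference engine from scratch in Rust in two days. Had never written a GPU kernel; the whole implementation was done by AI, driven by natural language. 811 tok/s, TTFT 4.6× SGLang. blog https://t.co/nb1DEYzMGZ https://t.co/aW8CtUz3lv #LLM #VibeCoding #Rust #CUDA #InferenceEngine #OpenSource #AIEngineering #BuildInPublic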

1

2

1

4

2K

chen @kailunchen5

2 months ago

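A front-end engineer used vibe coding to build an inference engine in two days. Had never written a GPU kernel; the whole implementation was done by AI, driven by natural language. 811 tok/s, TTFT 4.6× SGLang. https://t.co/nb1DEYzMGZ #LLM #VibeCoding #Rust #CUDA #InferenceEngine #FlashInfer #AgentOS #OpenSource #AIEngineering #BuildInPublic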

0

2

0

99

Q O R A N E T @Qora_Net

3 months ago

A full range of Rust models and a GUI platform with a highly intelligent AI assistant (AGI) for all screens and mobile is coming soon. Enjoy 😎 #Rust #LLM #AI #MachineLearning #OpenSource #EdgeAI #InferenceEngine #blockchain #AIagent

0

40

SmartX @smartx_hq

3 months ago

🤯 Feeling lost with #AI deployment terms? Our latest blog breaks down 10 key concepts you need to know, like #AIInfrastructure, #inferenceengine, #ModelOps, #MLOps, #LLMOps, #MaaS, and more! Let’s fast-track your AI journey 👉 https://t.co/TdCJKa3CRC

0

10

Chouaib @PremiumN21

4 months ago

logicalinference. com live auction at dropcatch #LogicalInference #ArtificialIntelligence #AIReasoning #MachineReasoning #InferenceEngine #LogicAI #DeepLearning #AIML #CognitiveAI #KnowledgeGraph #NeuralReasoning #SmartAI #FutureOfAI #AIInnovation #DataIntelligence

0

11

ManuAGI 🤖 - ( ManuIn )

@ManuAGI01

6 months ago

🚀Project Number 1 - Inference Engine 🔥 'High-performance cloud engine for real-time AI' by @gmi_cloud #aiagents #ai #opensource #automation #researchtools #cloudai #videagents #medtech #learningtools #InferenceEngine

1

0

22

FrankieJonshon @FrankieJonshon

6 months ago

Here’s another solid option: ✨ Inference is where real AI power shows up, turning complex models into real-time results. Smarter systems, sharper performance. The future is live. #AI #InferenceEngine #Innovation @inference_labs

0

1

0

2K

GMI Cloud Japan @gmicloudJapan

7 months ago

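Inference Engine 2.0 has been updated. Major LLMs such as Kimi K2 Thinking, Minimax M2, DeepSeek V3, GPT OSS, and QWen 3 run in a single console. There is one API. It is an integrated environment with smooth model switching, covering image, motion, and text generation. #GMICloud #InferenceEngine #AI #LLM #MultimodalAI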

0

213

ANKIT PATEL 🇮🇳 | AI

@Ankit_patel211

7 months ago

What is Inference Engine 2.0? It unites text, images, audio, and video in one spot! Learn more in this quick blog—under 5 minutes! Blog link - https://t.co/vFOXx0DKOr… #GMICloud #InferenceEngine #AI

GMI Cloud

@gmi_cloud

7 months ago

What exactly is Inference Engine 2.0? How does it unite text, image, audio, and video in one place — and why does it matter? This blog breaks it all down in under 5 minutes: https://t.co/dAtO9ISqtR #GMICloud #InferenceEngine #AI #Innovation #MultimodalAI #ML #LLM #Videogen #GenerativeAI

2

11

1

3

518

1

2

1

0

117

GMI Cloud

@gmi_cloud

7 months ago

You’ve seen the launch. Now see it in motion. Every video model — Kling v2.5, Hailuo 2.3, Sora 2, Veo 3.1, Flux-kontext-pro — running live on Inference Engine 2.0 @Kling_ai @Hailuo_AI @Alibaba_Wan @LumaLabsAI #GMICloud #InferenceEngine #AI #MultimodalAI #VideoAI

GMI Cloud

@gmi_cloud

7 months ago

Best models. One playground. Welcome to Inference Engine 2.0: came from infra, moved with creation. Where speed meets intuition, and builders become inventors. 1.46× faster, 49% more throughput, one API for all

81

179

66

669K

5

44

11

8

11K

Josh

@altphotos_pl

7 months ago

Inference Engine 2.0 on @GMICloud = insanely fast ⚡ Perfect for devs or creators using AI, image, and video tools. Here’s what it can do 👇 #GMICloud #InferenceEngine #ad

3

72

3

10

43K

Farhan Azad Shuvra

@Faazsh

7 months ago

- Multimodal-native by design
- NVIDIA Cloud Lepton Partner performance
- Elastic scaling for real-time responsiveness

This isn’t theoretical speed. It’s production-grade performance - built for scale. #AI #InferenceEngine #GMICloud

1

3

0

109

GMI Cloud Japan @gmicloudJapan

7 months ago

⚡ In the GLM-4.6 inference speed benchmark, GMI Cloud recorded 73 tokens/sec and ranked 2nd! The high-performance Inference Engine delivers stable processing even for large reasoning models. 📊 Source: Artificial Analysis #GMICloud #InferenceEngine #GLM46 #GenerativeAI #AIインフラ

Artificial Analysis

@ArtificialAnlys

8 months ago

GLM-4.6 providers overview: we are benchmarking API endpoints offered by Baseten, GMI, Parasail, Novita, Deepinfra

GLM-4.6 (Reasoning) from @Zai_org is one of the most intelligent open weights models, with intelligence close to GPT-OSS-120b (high), DeepSeek V3.2 Exp (Reasoning) and Qwen3 235B 2507 (Reasoning), and is attracting strong interest for agentic coding in particular. It’s now being served by a range of cloud inference providers.

Key benchmarking takeaways

➤⚡ Speed: Baseten is serving the fastest GLM-4.6 endpoint at 104 output tokens per second, followed by GMI (73 t/s), Parasail (68 t/s), and Novita (66 t/s)

➤⏳ Latency: We track TTFT (time to first token) and TTFAT (time to first answer token). For reasoning models, TTFAT is the key metric as it marks when users first see usable output. Generation of reasoning tokens has the same performance profile as generation of answer tokens, so the main driver of TTFAT is output speed - as opposed to standard TTFT which is driven by prefill (input processing) performance. Baseten leads on TTFAT with a 19.4s result, with GMI following at 28.4s TTFAT despite having the longest TTFT

➤$ Pricing: GLM-4.6 (Reasoning) is priced very consistently across providers, costing $0.6/M input tokens and $1.9/M output tokens on Deepinfra, $0.6/$2 on GMI, $0.6/$2.1 on Parasail, and $0.6/$2.2 on Novita and Baseten

➤🪟 Context window: All providers support the full 200k token context window

➤🧰 Supported tools: All providers support tool calling with GLM-4.6, with JSON mode availability confirmed on GMI, Novita, Parasail and Baseten

@baseten @gmi_cloud @parasail_io @novita_labs @DeepInfra
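
To make the pricing and latency points in the post above concrete, here is a minimal Python sketch. The per-million-token prices are the ones quoted in the post. The function names, the token counts in the usage note, and the simple model `TTFAT ≈ TTFT + reasoning_tokens / output_speed` are illustrative assumptions, not Artificial Analysis's methodology.

```python
# Prices in USD per million tokens (input, output), as quoted in the post.
PRICES_PER_M = {
    "Deepinfra": (0.6, 1.9),
    "GMI": (0.6, 2.0),
    "Parasail": (0.6, 2.1),
    "Novita": (0.6, 2.2),
    "Baseten": (0.6, 2.2),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request (hypothetical helper, not a provider API)."""
    input_price, output_price = PRICES_PER_M[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def estimate_ttfat(ttft_s: float, reasoning_tokens: int, tokens_per_s: float) -> float:
    """Rough time to first answer token, in seconds.

    Assumption: reasoning tokens are generated at the same speed as answer
    tokens, so the wait is the TTFT plus the time spent emitting the
    reasoning tokens. Real endpoints may behave differently.
    """
    return ttft_s + reasoning_tokens / tokens_per_s


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Provider with the lowest cost for the given token counts."""
    return min(
        PRICES_PER_M,
        key=lambda p: request_cost(p, input_tokens, output_tokens),
    )
```

For example, `request_cost("GMI", 10_000, 2_000)` prices a hypothetical request with 10,000 input tokens and 2,000 output tokens, and `estimate_ttfat(1.0, 2_000, 100)` shows how a long reasoning phase can dominate the wait even when TTFT is short. The TTFT, token-count, and speed values in these calls are made up for illustration.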