Top Tweets for #FastInference
@GroqInc I have not seen it in the docs: is it possible to select the data center location for the model? Asking for EU customers... #ai #fastinference
This outputs whole responses in the time it takes other inference providers to output one token!!! #fastinference
17,000 tokens per second!! Read that again!
LLM is hard-wired directly into silicon. no HBM, no liquid cooling, just raw specialized hardware. 10x faster and 20x cheaper than a B200.
the "waiting for the LLM to think" era is dead. Code generates at the speed of human thought.
Transition from brute-force GPU clusters to actual AI appliances.
https://t.co/Bf6DH7Q6Uf

We just released Kimi-Linear-REAP-35B-A3B-Instruct (30% pruned from 48B). Showing REAP’s robustness on Hybrid-attention MoEs, lighter footprint, more context headroom.
🤗 https://t.co/fKUb9wdr96
📄 https://t.co/ou9mi9WRdn
🔗 https://t.co/IkOnWpwyIv
#Cerebras #FastInference #Kimi
Llama 4 API service provided through Cerebras. IBM also announced similar fast inference services for large customers. #API #Llama4 #Cerebras #hyperscaler #IBM #fastinference #Meta #developerfriendly #mobilefirst #bigdata #technology #innovation https://t.co/ute7fvWAT2
Just tested three cutting‑edge AI services—@GoogleGemini, @huggingface, and @GroqAI.
From multi‑citation reports and editable docs to domain‑specific chatbots and sub‑100 ms responses—AI workflows unlocked!
#ALX_AiSK @alx_africa #AI #DeepResearch #NoCode #FastInference

🚀 Dived into Groq today — amazed by its lightning-fast AI performance powered by custom hardware! Speed meets intelligence in real-time. ⚡🧠 #GroqAI #FastInference #AIInnovation

@deepseek_ai 🤖 PublicAI's decentralized workforce is the perfect match for NSA's hardware-aligned sparse attention! Imagine the AI training data goldmine we could uncover by crowdsourcing at blockchain-scale. 🧠💰 #FastInference #CostEfficient
🚀 Fast Inference in Generative AI = Real-time translations, instant content, seamless AI!
💡 Learn with:
✅ Prompt Engineering with Claude
✅ AI for Everyone
🌐 Join the revolution: https://t.co/j9ap7fp055
#AI #GenerativeAI #LearnAI #FastInference

1/2
NLP/search folks, asking for help!
What is the fastest way to run bge-m3 on a single GPU? Trying to do fineweb-edu filtering on xx million samples, so looking for a really efficient implementation. #NLP #Embeddings #CUDA #tensorrt #onnx #fastinference
Groq API: Make your AI Applications Lightning Speed
Subscribe for more: https://t.co/etKQrfzOXl
YT: https://t.co/34gFFoE1nW
@GroqInc @JonathanRoss321
#GroqAPI #FastInference #AI #LLM
Ehh? A quantized Llama-2-7b-hf with a starting phrase
"Once upon a time, "
#fastinference

Atlas 300I Inference Card (Model: 3000/3010) - Achieve lightning-fast inference results with this powerful card! #Atlas300I #InferenceCard #FastInference...Contact us : Tel/wtsap +966-11-8359915....E: [email protected]

Revolutionary RT-1 robot model from Google will change the game for robotics. #driverlesscarprogram #fastinference #generalizingnewtasks #generativeAImodels #Google #GPTmodel #multipledifferentrobots #newenvironments #objects #OpenAI #realtimecontrol
https://t.co/LPJl5d2Rgl

Here is a #Deeplearning technical blog written by one of our DataToBizers, Vikas Kumar Ojha
#T5 #optimization #fastinference
https://t.co/uZ3X1OOyuN https://t.co/0CU4ejpdhL
Video alert for the #ML2R #YouTube channel! @sbuschjaeger (@sfb876/@TU_Dortmund) introduces the #FastInference tool. Uniting ensemble pruning & leaf refinement for #RandomForests, #MachineLearning models are applied resource-efficiently on small devices.
📽️https://t.co/RZ82wIeXzB

2. We have my #fastinference library:
https://t.co/WnTlqRuoqH_onnx() -> Model is run through ONNX ort and DataLoaders (which are the same blank DataLoaders from 1) get exported.
Still requires that all related functions are in the namespace. 4/
This little project/idea has been on my mind since I first released #fastinference, however at the time I didn't quite know enough on how to pull it off. Thanks to what I learned along the way while writing this (https://t.co/xwKBILpVjP) article, I figured out what's needed
First off I built #fastinference, a @fastdotai extension geared towards making inference with fastai more approachable, faster at times, and including integrations with libraries such as ONNX. 2/ https://t.co/tHVi21k0zg
This question is geared towards my @fastdotai extension #fastinference. In the pytorch-version only, what would be the *minimal* augmentation you would expect available at runtime after training your model (from a vision perspective)?