#FastInference - Twitter Hashtag

3 months ago

@GroqInc Have not seen it in the docs: is it possible to select the data center location for the model? Asking for EU customers... #ai #fastinference

Coenraad Loubser 🇿🇦 @dagelf

3 months ago

This outputs whole responses in the time it takes other inference providers to output one token!!! #fastinference

Wildminder @wildmindai

3 months ago

17,000 tokens per second!! Read that again! LLM is hard-wired directly into silicon. no HBM, no liquid cooling, just raw specialized hardware. 10x faster and 20x cheaper than a B200. the "waiting for the LLM to think" era is dead. Code generates at the speed of human thought. Transition from brute-force GPU clusters to actual AI appliances. https://t.co/Bf6DH7Q6Uf

Vithu Thangarasa @vithursant19

7 months ago

We just released Kimi-Linear-REAP-35B-A3B-Instruct (30% pruned from 48B). Showing REAP’s robustness on Hybrid-attention MoEs, lighter footprint, more context headroom. 🤗 https://t.co/fKUb9wdr96 📄 https://t.co/ou9mi9WRdn 🔗 https://t.co/IkOnWpwyIv #Cerebras #FastInference #Kimi

wAIve.online @wAIve_online

7 months ago

#vLLM #vLLMServings #FastInference #LLMInference #DistributedInference #PagedAttention

CUBE365 Clips @clipper_video

about 1 year ago

Llama 4 API service provided through Cerebras. IBM also announced similar fast inference services for large customers. #API #Llama4 #Cerebras #hyperscaler #IBM #fastinference #Meta #developerfriendly #mobilefirst #bigdata #technology #innovation https://t.co/ute7fvWAT2

Mohamed Elbaz @MohamedElb28128

about 1 year ago

Just tested three cutting‑edge AI services—@GoogleGemini, @huggingface, and @GroqAI. From multi‑citation reports and editable docs to domain‑specific chatbots and sub‑100 ms responses—AI workflows unlocked! #ALX_AiSK @alx_africa #AI #DeepResearch #NoCode #FastInference

kayley @naomikayley

about 1 year ago

🚀 Dived into Groq today — amazed by its lightning-fast AI performance powered by custom hardware! Speed meets intelligence in real-time. ⚡🧠 #GroqAI #FastInference #AIInnovation

freedomlife @kzk88888

over 1 year ago

@deepseek_ai 🤖 PublicAI's decentralized workforce is the perfect match for NSA's hardware-aligned sparse attention! Imagine the AI training data goldmine we could uncover by crowdsourcing at blockchain-scale. 🧠💰 #FastInference #CostEfficient

ai guru @aiguruin

over 1 year ago

🚀 Fast Inference in Generative AI = Real-time translations, instant content, seamless AI! 💡 Learn with: ✅ Prompt Engineering with Claude ✅ AI for Everyone 🌐 Join the revolution: https://t.co/j9ap7fp055 #AI #GenerativeAI #LearnAI #FastInference

Rasmus Toivanen @RasmusToivanen

almost 2 years ago

1/2 NLP/search folks, asking for help! What is the fastest way to run bge-m3 on a single GPU? Trying to do fineweb-edu filtering on xx million samples, so looking for a really efficient implementation. #NLP #Embeddings #CUDA #tensorrt #onnx #fastinference

Mervin Praison @MervinPraison

over 2 years ago

Groq API: Make your AI Applications Lightning Speed. Subscribe for more: https://t.co/etKQrfzOXl YT: https://t.co/34gFFoE1nW @GroqInc @JonathanRoss321 #GroqAPI #FastInference #AI #LLM

Shwu Ku(n) @shwooobham

over 2 years ago

Ehh? A quantized Llama-2-7b-hf with a starting phrase "Once upon a time, " #fastinference

mounika reddy @bmr_mounika

almost 3 years ago

Atlas 300I Inference Card (Model: 3000/3010) - Achieve lightning-fast inference results with this powerful card! #Atlas300I #InferenceCard #FastInference ... Contact us: Tel/WhatsApp +966-11-8359915, E: bmr@ritco.sa.com

NeuroHub.ai @neurohub_ai

over 3 years ago

Revolutionary robot model RT-1 from Google will change the game for robotics. #driverlesscarprogram #fastinference #generalizingnewtasks #generativeAImodels #Google #GPTmodel #multipledifferentrobots #newenvironments #objects #OpenAI #realtimecontrol https://t.co/LPJl5d2Rgl

parindsheel @ParindDhillon

almost 4 years ago

Here is a #Deeplearning technical blog written by one of our DataToBizers, Vikas Kumar Ojha #T5 #optimization #fastinference https://t.co/uZ3X1OOyuN https://t.co/0CU4ejpdhL

Lamarr Institute @LamarrInstitute

over 4 years ago

Video alert for the #ML2R #YouTube channel! @sbuschjaeger (@sfb876/@TU_Dortmund) introduces the #FastInference tool. Uniting ensemble pruning & leaf refinement for #RandomForests, #MachineLearning models are applied resource-efficiently on small devices. 📽️https://t.co/RZ82wIeXzB

Zach Mueller @ CVPR @TheZachMueller

about 5 years ago

2. We have my #fastinference library: https://t.co/WnTlqRuoqH_onnx() -> Model is run through ONNX ort and DataLoaders (which are the same blank DataLoaders from 1) get exported. Still requires that all related functions are in the namespace 4/

Zach Mueller @ CVPR @TheZachMueller

over 5 years ago

This little project/idea has been on my mind since I first released #fastinference; however, at the time I didn't quite know enough about how to pull it off. Thanks to what I learned along the way while writing this (https://t.co/xwKBILpVjP) article, I figured out what's needed

Zach Mueller @ CVPR @TheZachMueller

over 5 years ago

First off, I built #fastinference, a @fastdotai extension geared towards making inference with fastai more approachable, faster at times, and including integrations with libraries such as ONNX. 2/ https://t.co/tHVi21k0zg

Zach Mueller @ CVPR @TheZachMueller

over 5 years ago

This question is geared towards my @fastdotai extension #fastinference. In the pytorch-version only, what would be the *minimal* augmentation you would expect available at runtime after training your model (from a vision perspective)?
