Instantly summarize any article on Twitter into bullet points 👇
1. Reply "@getsummari summarize" to the tweet with the article link
2. We'll reply with the summary
🔥 We have two huge announcements today! 🔥
We’re thrilled to announce early access to our new Baseten Hybrid offering and our launch on the @googlecloud Marketplace!
Links to the announcement blogs and waitlist for Hybrid in the comments below 👇
With Baseten Hybrid, you have complete control over your policies and workloads with the flexibility to scale up as needed on our cloud. This solves a common problem: you want to self-host certain workloads to meet compliance requirements, but when push comes to shove, you need to tap into extra compute.
Run sensitive workloads securely in your VPC, meet specific data residency requirements, and fully utilize existing investments in providers like @googlecloud. When you need extra compute, effortlessly spill over to Baseten Cloud—zero engineering effort required.
With our launch on the Google Cloud Marketplace, it’s now easier than ever for Google Cloud users to leverage Baseten’s platform directly within their cloud ecosystem.
Our growing partnership provides seamless interoperability, secure data management, and the ability to quickly spin up high-performance AI applications in your Google Cloud environments.
Get early access to Baseten Hybrid or get started today with Baseten Cloud on the Google Cloud Marketplace!
We're excited to introduce our new Engine Builder for TensorRT-LLM! 🎉
Same great @nvidia TensorRT-LLM performance—90% less effort.
Check out our launch post to learn more: https://t.co/8hUhkRAeX6
Or @philip_kiely's full video: https://t.co/OB7IjfSSZZ
We often use TensorRT-LLM to support custom models for teams like @Get_Writer. For their latest industry-leading Palmyra LLMs, TensorRT-LLM inference engines deployed on Baseten achieved 60% higher tokens per second.
We've used TensorRT-LLM to achieve results including:
📈 3x better throughput
📉 40% lower time to first token
📉 35% lower cost per million tokens
While TensorRT-LLM is incredibly powerful, we and our customers repeatedly faced tedious, lengthy, resource-intensive builds. We created the TensorRT-LLM Engine Builder to eliminate hours of manual work and bring the power of TensorRT-LLM to more teams. 💪
Now, you can automatically build optimized model-serving engines for open-source and fine-tuned LLMs in minutes!
Leveraging our new Engine Builder on Baseten gives you full control to customize your model server, dedicated deployments with automatic traffic-based scaling, logging and metrics observability, and leading security and compliance. TensorRT-LLM is compatible with 50+ LLMs, including foundation models like Llama, Mistral, Whisper, and their fine-tuned variants.
If you have any questions about how to get the best possible performance for LLMs in production, we'd love to help! Try out the TensorRT-LLM Engine Builder with $30 in free credits 🚀 https://t.co/9ADjB2cPfK
Launching today 🎉
Double your throughput or halve your latency for @MistralAI, @StabilityAI + others?
Do both at ~20% lower cost with @nvidia H100s on Baseten.
Here’s how 👇
1/
"The personal agent, that’s the big thing" - Bill Gates
Introducing LLynx🐈, a building block to help enable action-oriented AI Agents. It’s fast, accurate & small (3B params)
🙏 Built on resources from @langchain @hwchase17 @huggingface @_philschmid @chipro @lmoroney
Demo👇
@renval23 @mitsmr @SaveToNotion The approach is particularly effective at teaching employees soft skills like perspective-taking and emotional intelligence, which are crucial to creating a psychologically safe environment where employees feel comfortable sharing ideas and information. (3/3)
@renval23 @mitsmr @SaveToNotion The skills-based approach to culture change is more effective than other approaches because it infuses new culture components into the existing culture of the company. (2/3)
@pogue25 @hinklej Patrick explained that the bill was always intended to be signed, but there was a "deal in the works" at the end of the regular session. (3/4)
@renval23 @HarvardBiz @SaveToNotion Before expressing disagreement with someone in a position of power, it is crucial to evaluate the potential risks and consequences. (5/5)