Yay, our team has just published a new paper, “Shift Parallelism: Low-Latency, High-Throughput LLM Inference for Dynamic Workloads”
https://t.co/hpN1RvO6es
Shift Parallelism is a new inference parallelism strategy that can dynamically switch between Tensor Parallelism and Sequence Parallelism, delivering:
- Up to 1.5x lower latency in interactive workloads
- Up to 50% higher throughput under heavy traffic
The technique shows robust performance across dynamic, real-world traffic patterns.
The authors are: Mert Hidayetoglu, Aurick Zhou, Michael Wyatt, Jeff Rasley, Yuxiong He, and Samyam Rajbhandari @ Snowflake AI Research
This work extends Arctic Inference to further optimize LLM inference under dynamic, real-world traffic patterns. The working code is here: https://t.co/r8Am2NaqXq
@davideborall @RamaswmySridhar We have this fixed in preview for notebooks. Please DM your details and we can enable it for you.
Also please check out this demo site for Cortex Analyst, https://t.co/PbIamYHkuT, which allows business users to use natural language to query their data.
Cortex Agents are out, with @AnthropicAI as a key LLM powering them.
➡️ Orchestrate across structured and unstructured data sources
➡️ State-of-the-art retrieval: Cortex Analyst with agentic text-to-SQL for high-quality structured data retrieval, and Cortex Search with state-of-the-art search quality.
➡️ All running next to data, within the unified governance and security of @SnowflakeDB
https://t.co/YxifLf8TPW
@SnowflakeDB partners with top AI companies to bring cutting-edge models to users through Cortex AI.
Excited to add DeepSeek’s R1 model running on Snowflake, with data never leaving Snowflake. Let’s push AI innovation forward!
https://t.co/YAJ0Ux5M8b
Excited to share @Snowflake’s One Million Minds + One Platform initiative -- a $20 million commitment to upskill 1M+ people in AI and data by 2029.
Hear more from our @RamaswmySridhar on how we're empowering the next wave of data-driven AI innovators.
🌎 Introducing @SnowflakeDB's One Million Minds + One Platform - where we’re empowering one million minds with AI and data skills 🧠
I couldn’t be prouder to support the next generation of builders, AI users, developers, and data experts with this initiative. 🚀
https://t.co/fFLt4cJ7Z5
Snowflake, with Hao AI Lab, placed No. 1 on the Spider 2.0 leaderboard, a new benchmark that offers a more realistic test of LLMs’ performance on complex enterprise tasks. Exciting possibilities ahead.
Snowflake customers get immense value from the open source models we host in Cortex AI. For example, open source models like @AIatMeta’s Llama have proven to be highly capable and suited for enterprise-grade AI work. I see it first-hand every day.
Exciting win for open source embeddings - Arctic Embed 2.0 is out, state-of-the-art multilingual embeddings for retrieval tasks, with high inference efficiency.
Available as part of Snowflake Cortex or on Hugging Face under an Apache 2.0 license.
We’re thrilled to announce the release of Arctic Embed 2.0, the next iteration of our frontier embedding models.
Arctic Embed 2.0 builds on the robust foundation of our previous releases, adding multilingual support without sacrificing English performance or scalability.
“Multilingual embedding models are crucial for enabling people worldwide — not just English speakers — to become AI builders,” said CEO of Hugging Face @ClementDelangue. “By releasing these state-of-the-art models as open source on Hugging Face, Snowflake is making a tremendous contribution to AI and the world.”
Learn more about Arctic Embed 2.0, available on @huggingface: https://t.co/7WPfvjf7Wn
🚀 I am thrilled to introduce @SnowflakeDB's Arctic Embed 2.0 embedding models! 2.0 offers high-quality multilingual performance with all the greatness of our prior embedding models (MRL, Apache 2.0 license, great English retrieval, inference efficiency) 🌍 https://t.co/hEcd0niVyr
Get up to speed on our latest AI innovations 🏎️
Our Head of AI @barisg joins @jacobeffron on the Unsupervised Learning podcast for a deep dive into our AI ecosystem 🎧
Our Head of AI @barisg joined the @NVIDIAAI podcast to discuss how we are using AI to enable customers to access and manage data at scale.
By separating the storage of data from compute, Snowflake allows organizations across the world to connect via cloud technology and work on a unified platform — eliminating data silos and streamlining collaborative workflows.
Listen now 🎧: https://t.co/eLskoWjubv
#Snowflake Cortex Analyst, an agentic AI system for self-serve analytics, is now available. It achieves a remarkable 90%+ SQL accuracy, almost 2x better in text-to-SQL accuracy than GPT-4o.
https://t.co/flS5e3wcgr
Our Head of AI Baris Gultekin (@barisg) sat down on the Office Hours with Tomasz Tunguz (@ttunguz) podcast to dig into what Snowflake’s been up to in the AI space — including a debrief of our latest Llama announcement, how our world-class AI Research Team is pushing the boundaries of innovation, Snowflake’s shift to developing in the open, and more.
World-class search needs world-class metrics 🚀
And great metrics need to be constantly evolved to avoid overfitting 🤟
As we build out a great search experience at @SnowflakeDB AI research, we are excited to join forces with @lintool and the University of Waterloo 🙌
Our focus will be on BEIR v2, a rethink of MTEB/BEIR, and TREC-RAG ...
Another #SnowflakeArctic update - our document-focused model beats #GPT4 in document question answering with a model that is orders of magnitude smaller (and so cheaper)!
Snowflake’s Arctic-TILT model, powering our Document AI, beats GPT-4 with just 0.8B parameters, securing a top spot on DocVQA, the standard benchmark for document understanding.
Excited to release #SnowflakeArctic, a state-of-the-art open source foundation model for enterprise tasks like SQL generation, coding, and instruction following.
https://t.co/p3AavXrjB1