Enterprise AI agents need context—not scraping scripts.
Cortex Knowledge Extensions are now GA, bringing verified, licensed content from @AP, @WashingtonPost, @StackOverflow, and more into your Snowflake-powered AI. 📚🧠
Use Snowflake Intelligence—now in Public Preview—to connect this content to agents or prompt it directly through Cortex AI APIs using your LLM of choice.
No hacks. No headaches. Just answers you can trust: https://t.co/X3i88eL5Cu
Congratulations to @Cisco for launching AI Defense today! We worked with Cisco to build the guardrail models that power this end-to-end solution to safeguard enterprise AI transformation.
https://t.co/YkqnSDfpu7
As we launch into 2025, our Public Sector team reflects on our work over the past year. We focused on strengthening national security by equipping U.S. agencies with the best-in-class commercial AI technologies.
Highlights include:
✅Deepening ties to Missouri’s geospatial and defense tech industry
✅Work with the U.S. government to develop and deploy new AI capabilities, including a T&E framework for @DeptofDefense and novel perimeter security technology with @DIU_x and @usairforce
✅Collaborations with @anduriltech, @CISAgov, & Holos
✅Product launches like Defense Llama and major upgrades to Scale Donovan
Read more 👉 https://t.co/tXOyli9Np1
📣 Introducing Llama 3.2: Lightweight models for edge devices, vision models and more!
What’s new?
• Llama 3.2 1B & 3B models deliver state-of-the-art capabilities for their class for several on-device use cases — with support for @Arm, @MediaTek & @Qualcomm on day one.
• Llama 3.2 11B & 90B vision models deliver performance competitive with leading closed models — and can be used as drop-in replacements for Llama 3.1 8B & 70B.
• New Llama Guard models to support multimodal use cases and edge deployments.
• The first official distro of Llama Stack simplifies and supercharges the way developers & enterprises can build around Llama to support agentic applications and more.
Details in the full announcement ➡️ https://t.co/1bnEeLY9qf
Download Llama 3.2 models ➡️ https://t.co/DZoTQvESbG
These models are available to download now directly from Meta and @HuggingFace — and will be available across offerings from 25+ partners that are rolling out starting today, including @accenture, @awscloud, @AMD, @azure, @Databricks, @Dell, @Deloitte, @FireworksAI_HQ, @GoogleCloud, @GroqInc, @IBMwatsonx, @Infosys, @Intel, @kaggle, @NVIDIA, @OracleCloud, @PwC, @scale_AI, @snowflakeDB, @togethercompute and more.
With Llama 3.2 we’re making it possible to run Llama in even more places, with even more flexible capabilities. We’ve said it before and we’ll say it again: open source AI is how we ensure that these innovations reflect the global community they’re built for and benefit everyone. We’re continuing our drive to make open source the standard with Llama 3.2.
Let's say you’re a pre-revenue tech startup with a shiny new product that’s almost ready for primetime. You’ve landed a couple of hot leads that could make or break your credibility—and get investors pumped for your next round.
Here’s where I’d focus my energy right now:
1. Value Prop Check
Nail down your sales pitch, but be ready to adapt as real-world feedback rolls in and shows you where your message hits—or misses. You won't get it right the first time, so plan to iterate.
2. Scrappy Sales Team
Assemble a small, agile team that includes you (the CEO or Founder) and one or two sharp folks from engineering with a decent customer UI. At this stage, it's as much about being good learners than about being good closers.
3. Test and Tweak Sales Process
Start with a basic sales playbook, but treat it like a lab experiment—track everything, learn fast, and adjust on the fly. *Always* ask tons of questions first before you pitch your product.
4. Find Your First Believers
Go after the early adopters who feel the pain your product solves, and dig deep into their world to refine your ideal customer profile.
5. Pilot with Partners
Lock in some pilot deals for low initial dollars that let you test your product in the wild—think of it as R&D with potential revenue. Always charge something to test willingness to pay.
6. Fast Feedback Loops
Set up quick and dirty feedback loops between your customers and the product team—iterate like your runway depends on it. Learn and adapt.
7. Insights Over Instincts
Make every customer interaction a chance to gather intel—use it to tweak your product and pitch until you’ve got a winner.
Focus on these 7 key areas, execute relentlessly, and stay adaptable. Nail this phase and you'll transform from pre-revenue to market-ready before you know it.
Google DeepMind have produced a program that in a certain sense has achieved a silver-medal peformance at this year's International Mathematical Olympiad. 🧵
https://t.co/DIcsYXUv97
Unlock the secrets to mastering generative AI with our AI Readiness Report. Based on a survey of over 1,800 experts, discover how leading organizations are fine-tuning models, tackling data challenges, and implementing robust evaluation frameworks to drive AI success.
We’re excited to see @cohere launching Command-R today!
Enterprises can test and evaluate Cohere’s models on Scale GenAI Platform, incorporate these models into their RAG pipelines, and use these models to build custom GenAI apps built on their own data 💪
Scale GenAI Platform is now available on the @Azure Marketplace!
Azure customers can purchase Scale GP through the Azure Marketplace, leveraging their existing @Microsoft Azure Consumption Commitment (MACC).
👉 https://t.co/QDoc2Xy0Qi
👉 https://t.co/UzmhklAMET
We are excited that @scale_AI has now added Cohere’s Command model and Rerank technology to their GenAI platform to customize business applications with fine-tuning and use RAG for accurate responses. This move will help unlock the powerful potential of AI for enterprise customers.
📣Today, we're launching a major update to the Scale GenAI Platform, the full-stack platform to transform your data into customized enterprise-ready Generative AI applications. Any use case, any model, anywhere 🧵
https://t.co/kMAwSBnWPC
Here's a list of my favorite LLM papers I read this month:
1/ Zephyr LLM - a 7B parameter model with competitive performance to ChatGPT on AlpacaEval; applies distilled supervised fine-tuning to improve task accuracy and distilled direct performance optimization on AI feedback data to better align the model; shows performance comparable to 70B-parameter chat models aligned with human feedback.
https://t.co/9ihZGCm62I
2/ LLMs Meet New Knowledge - presents a benchmark to assess LLMs' abilities in knowledge understanding, differentiation, and association; benchmark results show
https://t.co/ftjswZsg0a
3/ Llemma - an LLM for mathematics which is based on continued pretraining from Code Llama on the Proof-Pile-2 dataset; the dataset involves scientific paper, web data containing mathematics, and mathematical code; Llemma outperforms open base models and the unreleased Minerva on the MATH benchmark; the model is released, including dataset and code to replicate experiments.
https://t.co/3gz2Kr5XKP
4/ LLMs for Software Engineering - a comprehensive survey of LLMs for software engineering, including open research and technical challenges.
https://t.co/mwSf8nPuZM
5/ Self-RAG - presents a new retrieval-augmented framework that enhances an LM’s quality and factuality through retrieval and self-reflection; trains an LM that adaptively retrieves passages on demand, and generates and reflects on the passages and its own generations using special reflection tokens; it significantly outperforms SoTA LLMs.
https://t.co/3lJ7IJU2dk
6/ Instruct-Retro - introduces Retro 48B, the largest LLM pretrained with retrieval; continues pretraining a 43B parameter GPT model on an additional 100B tokens by retrieving from 1.2T tokens.
https://t.co/EORkgCWUJu
7/ Overview of Factuality in LLMs - a survey of factuality in LLMs providing insights into how to evaluate factuality in LLMs and how to enhance it.
https://t.co/OAlEbzd2o8
8/ LLMs Represent Space and Time - discovers that LLMs learn linear representations of space and time across multiple scales; the representations are robust to prompt variations and unified across different entity types; demonstrate that LLMs acquire fundamental structured knowledge such as space and time, claiming that language models learn beyond superficial statistics, but literal world models.
https://t.co/XnoQ0EQKUv
9/ StreamingLLM - a framework that enables efficient streaming LLMs with attention sinks, a phenomenon where the KV states of initial tokens will largely recover the performance of window attention; the emergence of the attention sink is due to strong attention scores towards the initial tokens; this approach enables LLMs trained with finite length attention windows to generalize to infinite sequence length without any additional fine-tuning.
https://t.co/0YCZULlAsW
10/ Retrieval meets Long Context LLMs - compares retrieval augmentation and long-context windows for downstream tasks to investigate if the methods can be combined to get the best of both worlds; an LLM with a 4K context window using simple RAG can achieve comparable performance to a fine-tuned LLM with 16K context; retrieval can significantly improve the performance of LLMs regardless of their extended context window sizes; a retrieval-augmented LLaMA2-70B with a 32K context window outperforms GPT-3.5-turbo-16k on seven long context tasks including question answering and query-based summarization.
https://t.co/BWD4qXq2fV
You can find more interesting papers for this and past months here: https://t.co/Ffrj4b12zX
Good intro/overview to LLM retrieval (RAG) from @c_viernickel and Bryan Zhu on Scale's blog — it's not RAG vs. fine-tuning, it's tuning for RAG; for interesting tasks you want both: