Why is no one talking about this?
@nvidia is offering around 80 AI models via hosted APIs absolutely for free.
You get access to MiniMax M2.7, GLM 5.1, Kimi 2.5, DeepSeek 3.2, GPT-OSS-120B, Sarvam-M etc.
This plugs straight into OpenClaude, OpenCode, Zed IDE, Hermes agent and even with Cursor IDE.
Setup:
– Grab API key: https://t.co/Wfdclm0hY2
– base_url = "https://t.co/VOGC10LmGP"
– api_key = "$NVIDIA_API_KEY"
– select model (e.g. minimaxai/minimax-m2.7)
If you’re building or experimenting, this is basically free inference.
Lock in and start building today anon.
Thank me later.
Nice paper combining the strength of Skills and RAG.
Most RAG systems retrieve on every query, whether the model needs help or not. This is wasteful when the model already knows the answer, and often too late when it does not.
New research introduces Skill-RAG, a failure-state-aware retrieval system. It uses hidden-state probing to detect when an LLM is approaching a knowledge failure, then routes the query to a specialized retrieval strategy matched to the gap.
Evaluated on HotpotQA, Natural Questions, and TriviaQA, the approach improves over uniform RAG baselines on both efficiency and accuracy.
Why does it matter?
RAG is moving from a single monolithic pipeline to a suite of skills an agent selects between. Knowing when to retrieve and what kind of retrieval to run will matter more than raw retriever quality as agents take on multi-step reasoning, where a single bad lookup derails the whole chain.
Paper: https://t.co/GcHFJrO17E
Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
NEW Research from Google.
Integration test failures are painful because the signal is buried in messy logs. Massive output, heterogeneous systems, low signal-to-noise ratio, and unclear root causes.
This paper introduces Auto-Diagnose, an LLM-based tool deployed inside Google's Critique code review system.
Auto-Diagnose analyzes failure logs, summarizes the most relevant lines, and suggests the root cause in the developer workflow where the failure is already being reviewed.
The deployment numbers are notable. In a manual evaluation of 71 real-world failures, Auto-Diagnose reached 90.14% root-cause diagnosis accuracy.
After Google-wide deployment, it was used across 52,635 distinct failing tests. User feedback marked it "Not helpful" in only 5.8% of cases, and it ranked #14 in helpfulness among 370 Critique tools.
Paper: https://t.co/KJq3LgETpa
Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
Happy to share a new article after some time! 🚀
Exploring how to make web apps agent-ready with WebMCP, the start of an exciting new journey in AI tech.
Read here @towards_AI :
https://t.co/LkhdynVN15
A Comprehensive Survey of Small Language Models
Nice survey on small language models (SLMs) and discussion on issues related to definitions, applications, enhancements, reliability, and more.
https://t.co/S7XxpW8QZM
Programming is changing so fast... I'm trying VS Code Cursor + Sonnet 3.5 instead of GitHub Copilot again and I think it's now a net win. Just empirically, over the last few days most of my "programming" is now writing English (prompting and then reviewing and editing the generated diffs), and doing a bit of "half-coding" where you write the first chunk of the code you'd like, maybe comment it a bit so the LLM knows what the plan is, and then tab tab tab through completions. Sometimes you get a 100-line diff to your code that nails it, which could have taken 10+ minutes before.
I still don't think I got sufficiently used to all the features. It's a bit like learning to code all over again but I basically can't imagine going back to "unassisted" coding at this point, which was the only possibility just ~3 years ago.
Transformer Explainer
Really cool interactive tool to learn about the inner workings of a Transformer model.
Apparently, it runs a GPT-2 instance locally in the user's browser and allows you to experiment with your own inputs. This is a nice tool to learn more about the different components inside the Transformer and the transformations that occur.
Tool: https://t.co/9oG15dpBEQ
The journey starts to get knowledge about #prompt and optimizations of #LLM LLMs. Find some basic information I collected when starting as an #engineering#manager in the field of #ai to identify the right approaches to customize #LLMS output.
https://t.co/2IStBwWUbh
Our Beta for TypeScript 4.0 is out now!
🔪 Sophisticated Slicing and Dicing on Tuples
🦺 Safer Exception Handling
❓❓= New Operators
🔎 Control-flow on Class Properties
🏃♀️ Faster Initial Editor Startup
⚠ /** @deprecated */ Editor Support
and more!
https://t.co/B5t6ddzeXb
A few ways to manage day-to-day bias in your team:
- Set up a rotation for office housework — don’t ask for volunteers.
- Mindfully assign people to high-value projects.
- Acknowledge the importance of low-profile contributions.
https://t.co/6LVqDJuDQU
Thread: As a designer, I find it interesting (but not surprising) that my own pandemic “ah-hah” moments have been driven by data visualizations. First, this one: https://t.co/FKt3SxAzsf
Blink: Intent to Prototype and Ship: Implement ServiceWorkerContainer.onmessageerror and ServiceWorkerGlobalScope.onmessageerror https://t.co/8XP1LpNPXb