🤖 Surprise! Your AI Agent Isn’t Reliable
Your #AI agent worked perfectly… once?
➡️ That doesn’t mean it will work tomorrow.
@dot_treo shares one of the most overlooked realities of #deploying AI agents in real #workflows:
#reliability is fragile.
⚠️ Models change constantly — even without you noticing
🔄 Prompts that worked yesterday can silently break today
🧪 “It worked once” is not a production strategy
This is where most teams fail.
Because in real-world systems, small inconsistencies quickly turn into serious problems.
Want more real-world insights like this?
👉 https://t.co/fxlxQ0J1vc
#MLcon #AIAgents #MLOps #AIInfrastructure #MLEngineer
CRITICAL Security Advisory: Talk to your tech team right now. If you are using LiteLLM (a Python package for calling multiple AI vendors through one interface), you may be at risk. The last two versions on PyPI, 1.82.7 and 1.82.8, have been compromised.
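A quick way to check a given environment, as a minimal sketch: it only compares the installed `litellm` version against the two versions named above. The helper names (`installed_version`, `is_flagged`) and the `FLAGGED_VERSIONS` set are illustrative assumptions, and a clean result here is not a substitute for a proper review by your team.

```python
from importlib import metadata

# Versions named in the advisory above. Treat this set as an assumption
# and verify it against the advisory you are actually following.
FLAGGED_VERSIONS = frozenset({"1.82.7", "1.82.8"})


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_flagged(version, flagged=FLAGGED_VERSIONS):
    """True if `version` is one of the flagged version strings."""
    return version is not None and version.strip() in flagged


if __name__ == "__main__":
    found = installed_version("litellm")
    if found is None:
        print("litellm is not installed in this environment")
    elif is_flagged(found):
        print(f"litellm {found} is one of the flagged versions")
    else:
        print(f"litellm {found} is not one of the flagged versions")
```

Run it with the same interpreter your application uses, since each virtual environment has its own installed packages.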
@Zoom @nvidia Turns out it was only minor progress. Now it simply takes longer to crash, but it usually still does when screen sharing.
If I don't use the Nvidia Broadcast camera it seems to work. So something about the way Zoom uses it needs to be fixed.
Finally found out how to fix @Zoom crashing when it uses @nvidia Broadcast as the camera: use the 32-bit Zoom client. Unfortunately that version is discontinued as of December 2025.
Maybe someone could fix the issue for the 64-bit client now?
I had some great conversations after my talk at Entwickler Summit this year. A common theme was the double-edged sword of AI coding assistants.
If you're thinking about how to strategically use AI in your development workflow, you might find it useful.
https://t.co/pSDEBia3WA
Florence-2, the new vision foundation model by Microsoft, can now run 100% locally in your browser on WebGPU, thanks to Transformers.js! 🤗🤯
It supports tasks like image captioning, optical character recognition, object detection, and many more! 😍 WOW!
Demo (+ source code) 👇
Announcing the Build with Claude June 2024 contest.
We're giving out $30k in Anthropic API credits. All you need to do is build and share an app that uses Claude through the Anthropic API:
TLDR: Google is ALL IN on AI agents
AI agents are deployed across their whole product ecosystem.
8 wild demos from Google I/O today:
1. An email agent to continuously organise all receipts in your inbox into a spreadsheet
I am so excited that xLSTM is out. LSTM is close to my heart - for more than 30 years now. With xLSTM we close the gap to existing state-of-the-art LLMs. With NXAI we have started to build our own European LLMs. I am very proud of my team. https://t.co/IH7giCe3gd
One of the projects I worked on @ImperialDyson / @thrishlab is getting published! We designed a print-in-place compliant wheel that passively transforms when encountering obstacles.
Video with more details: https://t.co/xaIDdSNGLW
Early access paper: https://t.co/AQChRLvAzo
Apparently some people are surprised that you can run a 13-year-old self-contained JAR on the latest JVM. And thinking about other language ecosystems, it is actually pretty freaking amazing and not something that we talk about enough.
The software industry is rapidly converging on just three languages: Go, Rust, and JS.
It would be smart to learn one of those really well, and have at least a working acquaintance with the other two.
Had a lot of fun updating the visuals and making event-driven workflows a reality.
In fact, it is now possible to create things declaratively; not just event handlers, but routes for Flask components, or AI agents with the Agent Toolkit.
Xircuits 1.10 adds a ton of great features, making it much easier to build composable workflows and event-driven programs, and even to discover and install component libraries in the UI. Learn more about it on our blog: https://t.co/IquS0jC9Md
For the first time, we show that the Llama 7B LLM can be trained on a single consumer-grade GPU (RTX 4090) with only 24GB memory. This represents more than 82.5% reduction in memory for storing optimizer states during training.
Training LLMs from scratch currently requires huge computational resources with large-memory GPUs. While there has been significant progress in reducing memory requirements during fine-tuning (e.g., LoRA), those methods do not apply to pre-training LLMs. We design methods that overcome this obstacle and provide significant memory reduction throughout LLM training.
Training LLMs often requires the use of preconditioned optimization algorithms such as Adam to achieve rapid convergence. These algorithms accumulate extensive gradient statistics, proportional to the model's parameter size, making the storage of these optimizer states the primary memory constraint during training. Instead of focusing just on engineering and system efforts to reduce memory consumption, we went back to fundamentals.
We looked at the slow-changing low-rank structure of the gradient matrix during training. We introduce a novel approach that leverages the low-rank nature of gradients via Gradient Low-Rank Projection (GaLore). So instead of expressing the weight matrix as low rank, which leads to a big performance degradation during pretraining, we express the gradient of the weight matrix as low rank. This avoids the performance degradation while significantly reducing memory requirements.
@jiawzhao @BeidiChen @tydsh
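A rough way to see where the optimizer-state saving comes from, as a back-of-the-envelope sketch rather than the authors' implementation. It assumes Adam keeps two moment tensors per m x n weight matrix, and that the low-rank variant keeps those two moments in a projected r x n space plus one m x r projection matrix. The function names and the dimensions in the example are hypothetical, and the result is a per-matrix illustration that is not meant to reproduce the 82.5% figure quoted above.

```python
def adam_state_elements(m, n):
    """Elements stored by Adam for one m x n weight matrix.

    Assumption: two moment tensors (first and second), each m x n.
    """
    return 2 * m * n


def low_rank_state_elements(m, n, r):
    """Elements stored when gradients are projected to rank r.

    Assumption: two moment tensors of shape r x n in the projected
    space, plus one m x r projection matrix.
    """
    return 2 * r * n + m * r


def state_reduction(m, n, r):
    """Fraction of optimizer-state elements saved by the projection."""
    full = adam_state_elements(m, n)
    projected = low_rank_state_elements(m, n, r)
    return 1.0 - projected / full


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    m, n, r = 4096, 4096, 128
    print("full Adam states:   ", adam_state_elements(m, n))
    print("projected states:   ", low_rank_state_elements(m, n, r))
    print("fraction saved:     ", state_reduction(m, n, r))
```

The saving shrinks as the rank r approaches the matrix dimensions, so the overall figure for a full model depends on the rank chosen and on which layers are projected.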
🔧 Going LIVE at 15:15 GMT+2 on Twitch! 🚀
Dive in as we strategize, implement, and explore the new tool support in "Technologic"! Share your ideas and see them come to life in real-time on our ChatGPT client! 🛠️✨
Be part of the innovation journey!
🔗 https://t.co/627XZ61txD