Finished https://t.co/bTvsf6Ld8z and learned about the fundamentals of model inference, choosing the right model format, and how to measure and optimize performance. Looking forward to applying this knowledge to improve the user experience of some AI agents @googledevs #NVIDIAGTC
🚀 Gemini Headless is a game-changer for server-side AI automation at scale:
🔹 Built for headless server automation
🔹 Zero-deadlock background I/O piping
🔹 High-throughput parallel execution
🔹 Fully async & type-safe
https://t.co/vwzgIJ1wmG
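A rough sketch of the general idea behind deadlock-free piping and bounded parallel runs, using only Python's standard asyncio. This is not the Gemini Headless API: `run_headless`, `run_many`, and the `ECHO_UPPER` stand-in command are hypothetical names invented for illustration.

```python
import asyncio
import sys


async def run_headless(cmd: list[str], prompt: str) -> str:
    """Run one child process, feed it the prompt on stdin, return its stdout."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # communicate() writes stdin and drains stdout/stderr concurrently,
    # so a full pipe buffer on either side cannot stall the child.
    out, _err = await proc.communicate(prompt.encode())
    return out.decode()


async def run_many(cmd: list[str], prompts: list[str], limit: int = 4) -> list[str]:
    """Run several prompts in parallel, at most `limit` processes at a time."""
    sem = asyncio.Semaphore(limit)

    async def one(prompt: str) -> str:
        async with sem:
            return await run_headless(cmd, prompt)

    # gather() preserves the order of the input prompts.
    return await asyncio.gather(*(one(p) for p in prompts))


# Stand-in for a real headless CLI (assumption): a child Python process
# that upper-cases whatever it reads from stdin.
ECHO_UPPER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def demo() -> list[str]:
    return asyncio.run(run_many(ECHO_UPPER, ["a", "b", "c"]))
```

Swapping `ECHO_UPPER` for a real CLI command would be the obvious next step; the flags that CLI needs are outside what this sketch assumes.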
We can run #Claude Code with #Ollama now ❤️ Really cool! This is huge for those who can’t use cloud models but still want Claude Code running locally 🚀
🔗 https://t.co/PgBwKt00vS
any-llm @MozillaAI - Another alternative to the LiteLLM proxy: a more lightweight and cleaner way to communicate with an LLM provider through a single interface
👉 https://t.co/W3tQr1wV5n
Strix - Open-source AI agents for penetration testing. Makes it easy to run a penetration test in hours instead of weeks and get a compliance report for it. Plus, it can run as part of CI/CD https://t.co/4T9YUPs0zh
#PenTesting #AIAgent
I’m officially Ray Foundations Certified!
Excited to be part of the growing Ray community and to continue building distributed AI applications with Ray.
#RayFoundationsCertified #RaySummit2025
📷 Get certified now: https://t.co/qo7hS0nkgW
Today the MCP Steering Committee is soft-launching the official OSS MCP Registry. This is a build-in-public launch to start gathering feedback and contributions.
https://t.co/yft0NR4W7f
#mcp #llm
https://t.co/9yB7Rdm4Tq just dropped and it’s a big step for standardization: an open, clean format for coding agents! 🔥
A community effort led by OpenAI, with teams from Factory, Cursor, the Jules team from Google, Roo Code, and more.
🔗 https://t.co/7pe1dnaho2
#agentic #LLM
New lib for asynchronous and composable AI Pipelines
⛓️ Chaining (+) or Parallelizing (//) processors.
🔌 Integrates with Gemini and Gemini Live APIs.
🚀 Orchestrate concurrent tasks with asyncio.
🖼️ Diverse content: text, images, audio, and PDFs.
https://t.co/r6HNXlOXbz
#genai
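A toy sketch of how `+` chaining and `//` parallelizing can be built on asyncio. The `Processor` class here is a hypothetical stand-in, not the library's actual class, and it works on plain lists of strings rather than streamed multimodal content; the real signatures may differ.

```python
import asyncio
from typing import Awaitable, Callable, List

Parts = List[str]
AsyncFn = Callable[[Parts], Awaitable[Parts]]


class Processor:
    """Hypothetical composable processor wrapping an async function."""

    def __init__(self, fn: AsyncFn) -> None:
        self.fn = fn

    async def __call__(self, parts: Parts) -> Parts:
        return await self.fn(parts)

    def __add__(self, other: "Processor") -> "Processor":
        # Chaining: the output of self becomes the input of other.
        async def chained(parts: Parts) -> Parts:
            return await other(await self(parts))

        return Processor(chained)

    def __floordiv__(self, other: "Processor") -> "Processor":
        # Parallelizing: both run concurrently on the same input,
        # and their outputs are concatenated (left first).
        async def parallel(parts: Parts) -> Parts:
            left, right = await asyncio.gather(self(parts), other(parts))
            return left + right

        return Processor(parallel)


async def _upper(parts: Parts) -> Parts:
    return [p.upper() for p in parts]


async def _exclaim(parts: Parts) -> Parts:
    return [p + "!" for p in parts]


async def _reverse(parts: Parts) -> Parts:
    return [p[::-1] for p in parts]


upper, exclaim, reverse = Processor(_upper), Processor(_exclaim), Processor(_reverse)

# Upper-case first, then fan out to two branches that run concurrently.
pipeline = upper + (exclaim // reverse)


def run_pipeline(parts: Parts) -> Parts:
    return asyncio.run(pipeline(parts))
```

In Python `//` binds tighter than `+`, so the parentheses are optional here; they are kept for readability.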
For #MCP Builders: all the videos from the #MCPDevSummit in San Francisco held in May are now live https://t.co/T1fqpThIld
New #MCP features are being shipped each week—Auth, Elicitations, User interactions, UI functionalities, and more ⚡
Someone uploaded an old EXE file to Claude 3.7 and it recreated the game in Python https://t.co/Clpx0WiB3u
Here is the original chat with Claude https://t.co/kPxfkuhHhV
Starting this week, #deepseekai will be open-sourcing 5 repos – one daily drop – not because they have made grand claims, but simply as developers sharing their small-but-sincere progress
📢 https://t.co/CIhIf0XpLH
#llm #transparency
Microsoft just released BitNet, an inference framework for 1-bit #LLMs, so you can run a 100B BitNet b1.58 model on a single CPU
🤯 It achieves speeds comparable to human reading (5-7 tokens per second), which makes it perfect for local devices
👉 https://t.co/URiOqekXMp
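Back-of-the-envelope arithmetic for why a model this size could fit on one machine, assuming "b1.58" refers to ternary weights (each weight is -1, 0, or +1, so log2(3) ≈ 1.58 bits). The numbers are idealized: real packing formats, activations, and the KV cache add overhead, and `weight_memory_gb` is a helper made up for this sketch.

```python
import math

# Information content of one ternary weight (assumption: values in {-1, 0, +1}).
bits_per_weight = math.log2(3)


def weight_memory_gb(num_params: float, bits: float) -> float:
    """Idealized weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


NUM_PARAMS = 100e9  # the 100B model from the tweet

ternary_gb = weight_memory_gb(NUM_PARAMS, bits_per_weight)
fp16_gb = weight_memory_gb(NUM_PARAMS, 16)

print(f"bits per ternary weight: {bits_per_weight:.2f}")
print(f"ternary weights: ~{ternary_gb:.1f} GB vs FP16: ~{fp16_gb:.0f} GB")
```

Under these assumptions the weights shrink by roughly 10x compared with FP16.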
A hard dataset from Google for evaluating RAG applications:
🐇 evaluating a multi-hop RAG application
🔖 benchmarking a language model on reasoning and factuality
Baseline results range from 41% for basic prompting up to 66% for multi-step retrieval & reasoning.
https://t.co/t5Z5LebCd4
Practical RAG
(i) Use both semantic (aka vector database) search and keyword search
(ii) Situate the chunks of the doc in the greater context by summarizing the role they play
(iii) More chunks are usually better (e.g. 20 chunks)
(iv) Rerank the chunks.
https://t.co/Gke6a34Y5X
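A small sketch of steps (i) and (iii): merging a semantic ranking and a keyword ranking, then keeping the top 20 chunks. The merge uses reciprocal rank fusion, a swapped-in technique chosen for simplicity and not necessarily what the linked post uses; a real reranker for step (iv) would typically be a separate model. The chunk IDs and the function name are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: int = 60, top_n: int = 20
) -> List[str]:
    """Fuse several ranked lists of chunk IDs into one list.

    Each chunk scores 1 / (k + rank) per list it appears in; chunks that
    rank well in several lists float to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]


# Hypothetical results from a vector search and a keyword (e.g. BM25) search.
semantic_hits = ["c3", "c1", "c7"]
keyword_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Chunks found by both searches (`c1`, `c3`) end up ahead of those found by only one.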
OpenAI released an evaluation dataset
🌍 14 languages: Arabic, Spanish, …
🧠 57 subjects: elementary to advanced professional
🎓 translated by professional translators
🔬 Evaluates general knowledge across diverse cultures, used in openai/simple-evals
https://t.co/I0zgX4EaZy
The future is running your own AI cluster at home with your own everyday devices.
Check how to run a Llama 3.1 405B model on MacBooks with exo 🤯
https://t.co/F1ErkwKajO