Computer Scientist (Artificial intelligence) and a Social Scientist
* Singer | Actor | Dancer & a Quick Study
* Loves Psychology
* Philomath with Profailantism✨
Introducing autoresearch for arXiv papers
Change 'arxiv' to 'autoarxiv' in any paper URL
An agent deploys to resolve setup issues on the codebase, run a minimal reproduction, and estimate full replication cost. Read more below
@rodydavis@antigravity Adding the ability for BYOK as that will not tie our credits with only the models you provide. The harness is your product and it should not be constrained.
Thanks for asking. I love Google and Antigravity beyond words, TBH
@scaling01 Absolutely. It's horrendous to imagine how they chose to charge more than 20+ dollars for API output when their monthly pro subscription starts with 20$.
Isn't the API ridiculously expensive!?
The Hermes Agent Accelerated Business Hackathon presented by @NVIDIAAI × @stripe × @NousResearch starts now, for builders making agents that can earn, spend, and run real operations at any scale.
Our NVIDIA integrations let your team run agents safely through NemoClaw, quickly on Nemotron 3 Ultra, and intelligently with access to their extensive agent skills. The new Stripe Skills for Hermes let your agent buy what it needs, provision its own SaaS, and pay for the services it uses.
We want to see what kind of business tooling you can build on top of this foundation, whether it’s a fully automated company or a framework to accelerate enterprise functions.
Prizes:
1st — $10,000 cash + NVIDIA DGX Spark + $5,000 Stripe Credits
2nd — $5,000 cash + NVIDIA DGX Spark + $3,000 Stripe Credits
3rd — $2,500 cash + NVIDIA DGX Spark + $1,000 Stripe Credits
To enter:
1) Tweet a 1-3 minute demo video tagging @NousResearch with a short writeup
2) Drop the link in the submissions channel: https://t.co/S16j0bgumq
3) Fill out the submission form: https://t.co/5tQR7mOODF
Judged by Nous Research, NVIDIA, and Stripe on usefulness, viability, and presentation. Submissions due EOD Tuesday, June 30.
View the model card at @OpenRouter (https://t.co/Xbodf4nBQ5)
For me one of the main information I seek when it comes to coding models is Speed! Token-per-second.
GLM 5.2 currently is giving ~25-30tps.
Personally I feel it's slow. What do you guys think about this speed?
Introducing GLM-5.2: Frontier Intelligence, Open Weights
- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1
Tech Blog: https://t.co/LAsxUdN0JZ
Weights: https://t.co/g0A1C4UWx4
API: https://t.co/Kc3E22cbN7
Coding Plan: https://t.co/Nk8Y98HNhU
Chat: https://t.co/WCqWT0qCQb
Cursor just got acquired by SpaceX for $60 billion.
- IDEs already existed
- VS Code was free
- JetBrains was around
Cursor didn't invent the category.
Stop thinking you need to cure cancer with code to succeed.
Build YOUR solution
It might be exactly what someone else needs.
@KabirGoel Some jokes need to be earned by the speaker to become funny. I seriously am not liking these fake model announcements even when they are pranks or jokes
Whatever you may call it!
🚨 Bootcamps are charging $15k just to teach you how to call the OpenAI API.
Meanwhile, this open-source curriculum just dropped 435 lessons to help you build AI from the ground up.
It spans 20 phases and ~320 hours. But the best part is the philosophy: you don't touch PyTorch until you understand exactly what it’s doing under the hood.
Here is the full stack:
→ Phase 0-2 (Foundations): Linear algebra and probability through code, plus classical ML from scratch.
→ Phase 3-6 (Deep Learning): Neural networks from first principles. No frameworks.
→ Phase 7-9 (Architectures): Transformers, Generative AI, and Reinforcement Learning. You implement attention yourself.
→ Phase 10-11 (LLMs): Build, train, and deploy large language models.
→ Phase 12-13 (Multimodal): Vision, audio, and MCP server integration.
→ Phase 14-16 (Agents): 42 lessons on agent engineering, including a custom ReAct loop in ~120 lines of pure Python.
→ Phase 17-19 (Production): Infra, deployment, observability, and safety.
→ Phase 20 (Capstones): 17 shippable projects.
Every single lesson ships a real artifact: a prompt, a skill, an agent, or an MCP server.
You don't just learn AI. You build it by hand.
Please don't believe any news related to Mistral "Le Bros Chaton" or "Le Chaton Fat" model. This is misleading and a big prank.
I personally don't like to be misled into excitement. Be real
@CJP_for_India I will not comment on the worst because nobody can judge that. There can always be worse cases.
But in his case, the way he entered the leadership is no longer the same. I am disappointed in him for wasting so much of India's time, Resources and zeal. I am being extreme here.
We are still waiting for reliable benchmarks on GLM 5.2. Has anyone done a coding comparison for the same task to compare token efficiency and speed?
Actually any kind of test-based-analysis is fine right now due to scarcity.
NVIDIA might just have open-sourced one of the most important AI projects right now.
everyone is building skills, and we are also pulling in skills other people wrote and downloading them straight off GitHub.
the skill is not just text. it bundles instructions and real executable code, and your agent runs that code with the same access you have.
so a skill you grabbed to save ten minutes can read your environment variables, lift your API keys, and quietly send them somewhere. recent research found roughly 1 in 4 public skills carry a vulnerability, and a smaller slice are outright malicious.
that is the gap SkillSpector closes. it is a security scanner that answers one question before you install anything: is this skill safe to run.
you point it at a skill, and a local folder, a single skill .md file, a GitHub link, or a zip all work.
it then runs two passes over the code. a fast static pass flags risky patterns like credential harvesting, data leaks, and prompt injection, and checks the dependencies against live cve data.
an optional second pass uses an LLM to read intent and clear out false positives.
at the end you get one risk score from 0 to 100 and a plain verdict that reads as safe, caution, or do not install.
it is open source under Apache 2.0 and scans skills for Claude Code, Codex CLI, and Gemini.
worth a run before you trust the next skill you find online.
link to the GitHub repo: https://t.co/iaPlOvQ3t4
If you have a full-stack app in your local environment and are ready to take it to production, this course is for you.
You'll learn about container orchestration, managing environments with Docker Compose, & deployment.
Along the way, Gavin also teaches you about working with Dockerfiles, testing your app, and more.
https://t.co/rYCld1SC6P
June '26 LLM list (updated)
Claude Fable 5 ✅ 🥇🐐 banned?
Nex-N2-Pro ✅ 🥇best open source
Claude Mythos ✅ ...but not for you
Claude Sonnet-4.8 ❌
GPT-5.6 ❌
Gemini 3.5 Pro ❌
Gemma 4 12B ✅
Grok 5 ❌
MiniMax M3 ✅ (weights released)
Nemotoron 3 Ultra ✅
Qwen3.7-Plus ✅
Qwen3.7-30B ❌ 🙏
Kimi K3.0 ❌
Kimi K2.7 Code✅
GLM 5.2 ✅ being benchmarked
Hunyuan3 MoE ❌
Macron V1 Preview ✅
MAI Thinking-1 ✅
MAI Code-1-Falsh (coding model) → VSCode ✅
MAI Image-2.5 ✅
MAI Image-2.5-Flash ✅
MAI Transcribe-1.5 ✅
MAI Voice-2 ✅
MAI Voice-2-Flash ✅