New research from Microsoft Research
I see a lot of AI engineers handwriting agent skill docs and hoping they generalize.
Probably not optimal. This work shows why.
Instead, it treats the skill doc as trainable external state of a frozen agent.
It introduces SkillOpt, where an optimizer model makes validation-gated edits to the skill file. It adds, deletes, or replaces instructions, with a textual learning rate that controls how aggressively each round rewrites the doc. The agent itself never changes.
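A minimal sketch of what a validation-gated edit loop like this could look like. This is not the paper's implementation: `Edit`, `apply_edit`, `optimize_skill`, `propose_edits`, and `validate` are all hypothetical names, the optimizer model is stood in for by a plain callable, and the textual learning rate is assumed to be a string passed to that callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Edit:
    op: str                      # "add", "delete", or "replace"
    index: int                   # position in the list of instructions
    text: Optional[str] = None   # new instruction (unused for "delete")


def apply_edit(skill: List[str], edit: Edit) -> List[str]:
    """Return a copy of the skill doc with one edit applied."""
    new = list(skill)
    if edit.op == "add":
        new.insert(edit.index, edit.text)
    elif edit.op == "delete":
        del new[edit.index]
    elif edit.op == "replace":
        new[edit.index] = edit.text
    else:
        raise ValueError(f"unknown op: {edit.op}")
    return new


def optimize_skill(
    skill: List[str],
    propose_edits: Callable[[List[str], str], List[Edit]],  # stand-in for the optimizer model
    validate: Callable[[List[str]], float],                 # frozen agent's score on a validation set
    rounds: int,
    learning_rate: str,  # assumed: free text telling the proposer how aggressive to be
) -> Tuple[List[str], float]:
    best, best_score = list(skill), validate(skill)
    for _ in range(rounds):
        candidate = best
        # Edits are applied in order; each index refers to the doc after the previous edit.
        for edit in propose_edits(best, learning_rate):
            candidate = apply_edit(candidate, edit)
        score = validate(candidate)
        # Validation gate: keep the rewrite only if it strictly improves the score.
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Only the skill doc changes between rounds; the agent behind `validate` stays fixed, which is why a loop of this shape adds no inference-time cost once the doc is learned.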
SkillOpt is best or tied on all 52 (model, benchmark, harness) cells.
On GPT-5.5 it adds 23.5 points in direct chat, 24.8 with Codex, and 19.1 with Claude Code over no skill. It beats human-written skills, TextGrad, GEPA, and EvoSkill, carries zero extra inference-time cost, and the learned skills transfer across models and harnesses.
Paper: https://t.co/mNgTmmT32U
Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
⏰ We introduce Reinforcement Pre-Training (RPT🍒)
— reframing next-token prediction as a reasoning task using RLVR
✅ General-purpose reasoning
📑 Scalable RL on web corpus
📈 Stronger pre-training + RLVR results
🚀 Allows allocating more compute to specific tokens
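A toy sketch of how next-token prediction can be framed as a task with a verifiable reward: every position in a corpus yields a (context, target) pair, and a prediction can be checked against the target automatically. The function names and the exact-match reward are assumptions for illustration, not the reward defined in the paper.

```python
from typing import List, Tuple


def build_examples(tokens: List[str]) -> List[Tuple[List[str], str]]:
    """Turn a token sequence into (context, next-token) pairs, one per position."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def reward(prediction: str, target: str) -> float:
    """Verifiable reward (assumed exact match): 1.0 if the predicted token is the true one."""
    return 1.0 if prediction == target else 0.0
```

In a setup like this the model would reason before emitting `prediction`, and the reward needs no human labels because the corpus itself supplies the answer.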
DeepSeek just dropped the single best end-to-end paper on large model training.
It covers
— Software (MLA, training in FP8, DeepEP, LogFMT)
— Hardware (Multi-Rail Fat Tree, Ethernet RoCE switches)
— Mix (IBGDA, 3FS filesystem)
DeepSeek's engineering depth is insane. Must read.
Flow matching produces smooth, deterministic trajectories.
In contrast, the stochastic sampling process of a diffusion model follows noisy, jittery paths, resembling the random motion of gas particles.
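A one-dimensional toy illustration of the contrast, not a real generative model. It assumes a known target point `x1` and a hand-written velocity field that points straight at it; `sample_ode` integrates that field with Euler steps, while `sample_sde` adds Gaussian noise at every step as a stand-in for stochastic sampling. Both function names and the `sigma` parameter are made up for this sketch.

```python
import math
import random
from typing import List


def sample_ode(x0: float, x1: float, steps: int = 10) -> List[float]:
    """Deterministic Euler integration of dx/dt = (x1 - x) / (1 - t)."""
    x, path = x0, [x0]
    for k in range(steps):
        t = k / steps
        x += (x1 - x) / (1 - t) / steps
        path.append(x)
    return path


def sample_sde(x0: float, x1: float, steps: int = 10,
               sigma: float = 1.0, seed: int = 0) -> List[float]:
    """Same drift plus Gaussian noise at each step (Euler-Maruyama)."""
    rng = random.Random(seed)
    dt = 1 / steps
    x, path = x0, [x0]
    for k in range(steps):
        t = k * dt
        x += (x1 - x) / (1 - t) * dt + sigma * math.sqrt(dt) * rng.gauss(0, 1)
        path.append(x)
    return path
```

Calling `sample_ode` twice gives the same evenly spaced straight path, while `sample_sde` wanders differently for every seed even though both use the same drift.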
🤯Inspiring cross-domain insights for AI! 🌞Neural Thermodynamic Laws for Large Language Model Training
Specifically, it applies thermodynamic laws to design learning rate schedules, as the LLM loss landscape mirrors a river-valley terrain: a flat river at the base of steep valleys.
Ray Dalio sees the future.
He said, "The US is going into a death spiral of debt."
A few weeks later, Moody's downgraded the US credit rating, citing the mounting government debt.
What Ray Dalio sees coming next is disastrous: 🧵
I present to you my AI automation that converts your Google Drive folder into a RAG vector database. It also updates the RAG vector database whenever a file is added to or deleted from the Google Drive folder.
You can download the workflow for free (link in comment)
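A rough sketch of the sync step such an automation needs, assuming the folder is represented as a mapping from file ID to text and the vector database as a mapping from file ID to embedding. `sync_index` and `embed` are hypothetical; no Google Drive or vector database API is called here, and this is not the workflow from the post.

```python
from typing import Callable, Dict, List, Tuple


def sync_index(
    drive_files: Dict[str, str],            # file ID -> current text content
    index: Dict[str, List[float]],          # file ID -> stored embedding (mutated in place)
    embed: Callable[[str], List[float]],    # hypothetical embedding function
) -> Tuple[List[str], List[str]]:
    """Embed files that are new in the folder and drop entries for deleted ones."""
    added = [fid for fid in drive_files if fid not in index]
    removed = [fid for fid in index if fid not in drive_files]
    for fid in added:
        index[fid] = embed(drive_files[fid])
    for fid in removed:
        del index[fid]
    return added, removed
```

Running this on every folder change keeps the index equal to the current folder contents; files whose content changed without a new ID would need an extra check, such as comparing modification times.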
AI Agents vs. Agentic AI
Interesting paper summarizing distinctions between AI Agents and Agentic AI.
It also talks about the key ideas, solutions, and the future.
Here are my notes:
For the Taiwan Training Promo Ad on Twitter:
Ready to master #crypto compliance & investigations? Join our hands-on training in Taipei on October 26, 2024, led by world-class experts from TVA3 & Crystal Intelligence. Gain real-world insights into Taiwan’s regulatory landscape.
VIDEO NOW BLOWING UP THE INTERNET; Pfizer director on camera saying they are "mutating" COVID-19 Virus to increase infectiousness. UNREAL!
"That is Not What We Say to the Public" …
"People Won’t Like That" … "Don’t Tell Anyone"