Language Models Need Sleep
"Transformer-based large language models are increasingly used for long-horizon tasks; however, their attention mechanism scales poorly with context length. To handle this, we study a sleep-like consolidation mechanism in which a model periodically converts recent context into persistent fast weights before clearing its key-value cache."
"increasing sleep duration N for our models improves performance, with the largest gains on examples that require deeper reasoning."
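To make the idea in the quote concrete, here is a toy sketch of "consolidate recent context into persistent weights, then clear the cache". It is not the paper's method: the error-correcting (delta-rule) outer-product update is a classic fast-weight rule swapped in for illustration. The names `SleepyMemory`, `observe`, `sleep`, `recall`, `n_passes` and `lr` are all hypothetical, and `n_passes` is not claimed to match the paper's N.

```python
# Toy illustration only. The delta-rule update below is an assumed stand-in
# for whatever consolidation procedure the paper actually uses.

class SleepyMemory:
    def __init__(self, dim):
        self.dim = dim
        # Short-term store: grows with every observed (key, value) pair.
        self.kv_cache = []
        # Persistent dim x dim "fast weight" matrix, initially all zeros.
        self.fast_weights = [[0.0] * dim for _ in range(dim)]

    def observe(self, key, value):
        """Append one (key, value) pair to the short-term cache."""
        self.kv_cache.append((key, value))

    def recall(self, key):
        """Read from the fast weights: matrix-vector product W @ key."""
        return [sum(w * k for w, k in zip(row, key)) for row in self.fast_weights]

    def sleep(self, n_passes=1, lr=1.0):
        """Replay the cache n_passes times into the fast weights, then clear it."""
        for _ in range(n_passes):
            for key, value in self.kv_cache:
                predicted = self.recall(key)
                error = [v - p for v, p in zip(value, predicted)]
                # Delta rule: W += lr * outer(error, key)
                for i in range(self.dim):
                    for j in range(self.dim):
                        self.fast_weights[i][j] += lr * error[i] * key[j]
        self.kv_cache.clear()


memory = SleepyMemory(dim=2)
memory.observe([1.0, 0.0], [2.0, 3.0])
memory.observe([0.0, 1.0], [5.0, 7.0])
memory.sleep(n_passes=1, lr=1.0)
print(len(memory.kv_cache))       # cache is empty after sleep
print(memory.recall([1.0, 0.0]))  # value is now read from the fast weights
```

After `sleep`, the cache no longer grows with context length, and lookups go through the fixed-size weight matrix instead.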
Caveat: this works when I already have domain knowledge. For new territory, I flip the first step: AI maps what is possible, then I generate within it, then critique as usual.
If my goal is brain activation and learning, this is my flow: (1) think and write first, without any #AI augmentation; (2) use AI as an adversarial critic (prompt it to attack); (3) iterate; and (4) come back days later and reproduce the argument unaided.
Engineers and salespeople think in completely different ways.
- Engineers: "If you ask them a question, a hundred percent of them will try and think of what is the correct answer to that question."
- Salespeople: "If you're a salesperson, your first thought isn't: what's the answer? It's: why are you asking me that question?"
"And so if you have an engineer talking to a good sales guy, it's going to upset them. Because they're often not gonna answer the question."
"The guys who are good at the job get rejected, because you don't like them. And then the people who are terrible at it, those are the ones that ended up getting hired."
"These CEOs just wanna take a guy who failed the engineering test, put a clean shirt on him and make him the head of sales."
@bhorowitz with @bhalligan
We've been working on the Waxal dataset project since 2021, aiming to increase the amount of data available for African languages. This public speech dataset initially covers 27 Sub-Saharan African languages spoken by over 100 million speakers across more than 26 countries. 🌍
My PhD thesis is out 🥳🎓
How do LLMs, trained on trillions of tokens, reason?
Can they generalise beyond their training data or are they constrained by what they've seen before?
My takeaway: they can generalise beyond training in interesting ways, showing genuine reasoning
MCP connects agents to live systems: databases, APIs, external services. It's designed for runtime tool access. But the moment you need to teach your agent how to approach a problem domain, you need something else.
Skills aren't about (just) accessing data. They're about embedding knowledge into your agent's reasoning. When your agent needs to understand "here's the right sequence for debugging a data pipeline" or "this is how you validate and process complex documents," skills allow you to bake that knowledge into how the agent thinks.
There's also the whole matter of how they work fundamentally: MCP tools rely on an external connection and API calls. Skills are local.
The issue isn't choosing between them. It's understanding that they kiiinda serve different purposes. MCP extends your agent's capabilities at runtime. Skills shape how your agent reasons about problems.
I wrote all about this with @itsclelia in our latest blog: https://t.co/ve19aULZwL
.@Ahmad_Al_Dahle is joining as Airbnb's new CTO.
I’m often asked about our AI strategy. We believe pairing great design with frontier technology will help us improve the way people experience travel. Excited to build!
@maximelabonne Privacy indeed; for certain use cases, you don't want the data to leave the device. I expect next gen of OS to provide a local LLM that you can use.
Tomorrow #Nexthink is coming (back) to #EPFL!
We’ll be sharing how we’re building #AI agents to transform the IT world.
Seats are limited — register here
https://t.co/Ql9vrqurQD
@NexthinkNews @ICepfl @EPFL
🇪🇺 EU subsidies are such a massive waste
In Portugal, for example, I noticed that most construction companies (window frames, glass railings, doors, balcony doors, etc.) have a notice in the footer of their website saying they received EU funding
You can search their EU funding ID and the reason for funding is always some silly bs like
"Project title: Research on how to improve glass railings
Project description: Ensure the company’s competitiveness by strengthening its internal competencies to be competitive in the demanding external market, investing in innovation in management processes, distribution, logistics, and work organization practices, as well as in relationships with the external environment"
And they always get about €250,000 to €500,000
I even have family who received these subsidies. They got one because they were running a company in a "disadvantaged neighborhood": they had to write a 10-page document and instantly got €250,000. They laughed hysterically 'cause they definitely didn't need that money, but why not get it if you can, right? Free money!
The EU spends hundreds of billions of euros every year from European taxpayers' money on these subsidies that just go nowhere
Congratulations to @anniehartley_ on her promotion to Adjunct Professor! 🥳
📣We also look forward to welcoming Dr Samy Bengio, Director for AI Research at @Apple, joining us as Adjunct Professor in the School of Computer and Communication Sciences.
👉https://t.co/Tc5IiXfJ71
#OpenAI is opening a new office in Zurich, recruiting 3 engineers from #Google #deepmind, right in the city with Google’s 2nd-largest office (~5K employees).
The #AI talent war is heating up, but this is great news for the #AI community & #Switzerland’s ecosystem! 🚀