Lately, more and more headlines and videos have been popping up about artificial intelligence and its risks, even the risk of human extinction. In this thread, I'll try to explain that better 🧵
Really cool experiments showing that a biological neuron can do much more than an artificial neuron, thanks to its dendrites. This may mean that those comparisons between the number of synapses and the number of weights in a neural network are even more wrong.
What can a neuron compute?
Real biological neurons are complex, but how capable are they?
Using a new method, we found that a single cortical neuron can classify cats vs dogs, recognize spoken words, and solve 10-bit parity, all tasks thought to require entire networks. (1/15)
@avioesemusicas That paper is from 2024, and to this day the problem hasn't really materialized, because the paper tests an unrealistic version of training. There has been some criticism of this "model collapse" idea for being unrealistic/exaggerated. I recommend this thread/paper: https://t.co/GkNC3TaozJ
What is the future of web-scale synthetic data, and what harms might such data cause?
Delighted to announce our new position paper: Model Collapse Does Not Mean What You Think
https://t.co/qKA1uQseQ9
@JoshuaK92829 @AlvanArulandu @sanmikoyejo w/ 🙏 to @ang3linawang @walesalaudeen96
@WeiseFranklin Yes, and this "model collapse" idea has already been questioned quite a bit for being somewhat unrealistic/exaggerated. I recommend this thread/paper: https://t.co/GkNC3TaozJ
Everyone says the latest AI agents will be "job-ready" soon, especially after the release of Fable 5 this week. But is that really the case?
Over the past many months, my group and collaborators have been building Agents' Last Exam (ALE), a benchmark designed to test exactly that claim on real digital labor-market work.
My group and collaborators have previously created many of the benchmarks the field runs on, including MMLU, MATH, CyberGym, and ExploitGym. Today, I'm excited to share Agents' Last Exam (ALE): a rolling benchmark that measures whether AI agents can actually perform economically valuable work across a broad range of real-world domains.
With ALE, we evaluated Fable 5, GPT-5.5, Composer 2.5, and other frontier agent systems across more than 1,500 expert-sourced tasks spanning 55 occupations.
The result is both impressive and sobering.
Today's agents can solve a meaningful fraction of professional tasks. But when we look at the hardest tasks, the ones requiring sustained reasoning, deep domain expertise, and reliable execution over long horizons, they are still far from human-level performance.
On ALE's hardest tier, every frontier agent we tested, including Fable 5, achieved a 0% success rate.
The age of useful agents is here.
The age of truly job-ready agents is not.
We hope Agents' Last Exam (ALE) will serve as a new guidepost and north star for developing agents capable of reliably performing economically valuable work across a broad range of domains.
🧵
@coproduto Did you see this result, where they say their architecture uses 50-100x less compute/data? And it seems real: it's open-source and people have already reproduced it. I don't know why it didn't get more attention.
https://t.co/vGRUkCDSvI
Introducing HRM-Text.
An ultra-lean 1B-parameter reasoning language model designed to deliver strong general performance with a fraction of the data, compute, and infrastructure.
Trained on just 40B structured tokens, HRM-Text achieves competitive performance while using ~1/1000 of the training data of comparable models.
The kicker? The full model trains in roughly one day on a $1,000 budget.
This opens the door to a new generation of AI that is powerful, accessible, and radically easier to adapt. Theories and research concepts once deemed too expensive to test are officially back in the game.
Sapient Intelligence invites you to help us shape a new paradigm for general intelligence.
I'm seeing a lot of hate for Anthropic's decision to secretly nerf ai RnD capabilities.
But I haven't seen critics engage with the imo strongest defence of Anthropic:
1. By far the biggest risks are from superintelligent AI
2. To manage these risks the leading company will need to pause partway through the intelligence explosion.
(Pausing at this time allows them to a) generate the compelling empirical evidence of misalignment that will be needed to justify a longer global pause, AND b) use powerful ai to massively accelerate alignment progress. A pause today couldn't accomplish either.)
3. A pause is MUCH more likely if the leading company has a big lead. It's much less likely if multiple companies are neck and neck.
(More specifically, Anthropic had good reason to think OAI wouldn't pause. This makes it v hard for Anthropic to pause if they're neck and neck. Hopefully recent announcements build mutual trust that everyone will pause)
4. If lagging AI companies can use the leader's AI for ai RnD during an intelligence explosion, the leader *cannot* maintain their lead.
(This point is underappreciated. If you model out the intelligence explosion, you'll find that a laggard with equal access to the leading AI quickly catches up to the leader bc the leader faces big headwinds from having plucked low hanging fruit.)
5. So: sharing ai RnD access with competitors massively decreases the chance of a pause at the critical time, and massively increases the risk from superintelligent AI
6. Anthropic can't block competitors using Mythos without the silent sabotage. For the obvious reason: it's very hard for a frozen safeguard to block someone that can iterate against it. It sucks that this is the only way, but it is.
7. They've long had terms of service against competitors using Claude for AI RnD. They have a right to enforce their terms of service. This is the only way.
---
Overall, silent sabotage is a very spooky and scary precedent to be setting and imo the wrong call.
But still, the above is a strong argument for Anthropic's actions and I haven't seen it rebutted.
Today I'm publishing a new essay, Policy on the AI Exponential. AI is progressing extremely fast—much faster than the policy process was built to handle. The essay lays out where I think the technology is now, and the action needed to close the gap: https://t.co/Lh6PWae178
@LukeberryPi Look at the comments. A good share of people simply rule out entirely the possibility that someone is telling the truth. Not the slightest doubt.
Back in the days of the Manhattan Project, some of them would probably have said it was fearmongering by Einstein and Szilard.
Hold on, lol. I think it's just as bad when someone is categorical about saying the world is going to end or something similar. The difference is that I usually don't need to say anything, because someone else already goes and does that job for me.
Again, sorry for being annoying; it's just that some people use "always" literally, so I had no way of knowing.
Got it, I was just being annoying (as usual), because you were very categorical for a moment, but apparently you've already corrected that. Actually, yours is one of the profiles whose views and way of thinking I agree with most.
And I didn't know about those fights; it even surprises me that someone called you a denialist or something like that. From what I've followed, you've always been one of the most open to the idea of AGI in the dev bubble.
I agree that a good share of AI people don't sufficiently consider the non-technical factors in this discussion. I also agree that AGI wouldn't make everyone abandon work "suddenly". But between "suddenly" and "never" there are many other possibilities.
I also think AI (and AGI) is quite different from crypto (and from almost every other technology): it is far more impactful and could break several paradigms with that impact.
@coproduto @gustavo_pch @RafaelMorgan I agree that it will have to exist for a while. But again, I think "always" is too strong. Like laws, this will depend on the will of society and on several other factors that I find hard to predict with that much certainty.
@coproduto @gustavo_pch @RafaelMorgan I agree it will be like that for a while. But I find it very hard to assert that it will "always" be that way. Not least because the need for a "caretaker" would limit the AI's own ability to do things faster/more efficiently.
Once upon a time there was a Lead AI Developer whose AI was not getting impressive benchmark results. That evening, all of his neighbors came around to commiserate. They said, "We are so sorry to hear that deep learning is hitting a wall. This is most unfortunate." The Lead AI Developer said, "Maybe."
The next day the LLM came back bringing seven massive benchmark scores and even got 90% on the LSAT. In the evening everybody came back and said, "Oh, isn't that lucky. What a great turn of events. You now are really close to AGI!" The Lead AI Developer again said, "Maybe."
The following day his son tried to train the next successor model, and while training it, he found that 10x'ing pre-training compute wasn't giving results anymore. The neighbors then said, "Oh dear, that’s too bad. Deep learning is hitting a wall." and the Lead AI Developer responded, “Maybe.”
The day after, the Lead AI Developer announced they'd achieved breakthrough results by adding inference-time compute, RL scaling, and tool use. The neighbors came around and said, "Oh wow, AGI is soon!" The Lead AI Developer said, "Maybe."
Bill PL 3066/2025, approved last week in the Chamber of Deputies and sent on to the Senate, provides for prison time for anyone who develops or provides a VPN service. It wasn't for lack of warning from me.