After last Sunday's elections in #Hungary, all eyes have been on Budapest. A real hero story is happening in #Poland though. Here is their story. Meet IZG, who are inventing the future #economic #model of Poland. https://t.co/HTpL9znPuS
My thoughts on AI agents and what they mean for academia
I really struggled to write this. A lot of it is speculative, and I tried hard not to be preachy, but honestly, I couldn't help it. I feel the efficiency gains & impacts are too unsettling to ignore.
https://t.co/idC7ehafBB
This has immediate consequences for job crafting and for decisions about what humans should focus on versus agentic work, which is evolving fast. Let's map the evolution and define what remains in the end.
Claude Opus 4.6 takes the lead in GDPval-AA, surpassing GPT-5.2 in our benchmark of agentic real-world knowledge work tasks
We worked with @AnthropicAI to benchmark Claude Opus 4.6 ahead of launch - it reached an Elo of 1606 with adaptive thinking, nearly 150 points ahead of GPT-5.2 (xhigh). This implies a win rate of ~70% when compared head-to-head with OpenAI’s December 2025 flagship.
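The step from a ~150-point Elo gap to a ~70% win rate can be checked with a minimal sketch. It assumes the standard logistic Elo model on a 400-point scale; the post does not say how the benchmark fits its ratings, and the function name and the rounded gap of 150 are illustrative, not taken from the source.

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated side under the standard logistic Elo model.

    Assumption: a 400-point scale, as in conventional Elo. The benchmark's
    actual rating fit may differ.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# "Nearly 150 points" per the post; the exact gap is not stated, so 150 is a rounded stand-in.
gap = 150
print(f"{elo_expected_score(gap):.3f}")  # prints 0.703
```

Under these assumptions a 150-point lead gives an expected score of about 0.70, consistent with the ~70% head-to-head figure quoted above.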
Using ~160 million tokens in adaptive thinking mode to complete the 220 tasks in GDPval-AA, Claude Opus 4.6 used 30-60% more tokens than Opus 4.5, but still far fewer than GPT-5.2 (xhigh). This increased token use combined with its high per-token pricing ($5/$25 per million tokens, no change from Opus 4.5) makes Claude Opus 4.6 the most costly model we’ve tested on GDPval-AA so far.
See below for a breakdown of token use, turns and cost, along with example file outputs.
The full set of Artificial Analysis Intelligence Index benchmarks is in progress - we will share a full update on the performance of Opus 4.6 when complete.
GDPval-AA is our primary metric for general agentic performance, measuring the performance of models on knowledge work tasks from preparing presentations and data analysis through to video editing. Models use shell access and web browsing in an agentic loop through Stirrup, our open-source agentic reference harness.
The underlying GDPval dataset was released by @OpenAI in September 2025 to capture self-contained work tasks across 44 occupations in 9 different sectors. It offers insight into the types of tasks models can complete that are relevant to today’s workforce, and is highly realistic due to the OpenAI team’s expert filtering and curation.
Does AI already have human-level intelligence? The evidence is clear, according to this Nature article. Let's then focus on what's coming and how to prepare. https://t.co/pR3ALuIBID
Yet more proof of the importance of combining scientific and technical #expertise with #socialskills. How to AI-proof your job https://t.co/QLLyBchmsE via @ft
Goldman Sachs has released a shocking report summarizing the job types that AI will take over.
It states that AI has the potential to automate the work equivalent to 300 million people worldwide.
Those particularly affected are knowledge workers, with
・Administrative and support: 46%
・Legal: 44%
・Architecture and engineering: 37%
On the other hand, the impact on physical labor such as construction and maintenance is less than 6%. The result is that AI will first replace knowledge workers.
This is bad but somehow expected. Job growth will return when we hire for talent and not a precise fit on a precise role. That means entirely revising workforce planning and adapting to AI progress and the changing nature of work. #futureofwork
CNBC: "Very disappointing. Private sector jobs for September comes in at -32,000. Economists were looking for an increase of 51,000. August was also revised down sharply, the month now showing -3,000 compared to the initial read of +54,000."
#AI adoption in #Ireland by @ibec_irl shows strong progress in larger companies but a clear need for training and continuing skills adaptation. #jobcrafting
@MichaelAArouet Lack of trust in society, centuries of absolute regimes + economic activity in the hands of just a few in the South. Free city states, great trade routes and strong civil society in the North, creating a culture of innovation and entrepreneurship. Now, will the South bounce back?
Six Facts about the Recent Employment Effects of Artificial Intelligence, from the Stanford Digital Economy Lab. I wonder what will happen to entry-level graduates. Time to craft their job profiles and put job growth back on the rails. https://t.co/G4tuMc2BHE
Using AI to write leads to cognitive debt. Great term to show what ChatGPT may be doing to our skill sets. It coincides with the study showing doctors having worse diagnostic skills when they use AI. What's your takeaway, @NTFabiano? Thank you for sharing.
"a new study has found that doctors who regularly used AI actually became less skilled within months.
The study, which was published on Wednesday in the Lancet Gastroenterology and Hepatology journal, found that over the course of six months, clinicians became over-reliant on AI recommendations and became themselves “less motivated, less focused, and less responsible when making cognitive decisions without AI assistance.”"
This concerns cancer diagnosis.