Another year of amazing bioinformatics and biomedical AI and another Year-in-Review. Was a great time at AMIA's Amplify 2026 conference.
The slides are here for anyone that wants them, complete with accompanying Spotify playlist.
https://t.co/klUjROJ3u2
New @Nature publication from @GoogleResearch & @GoogleDeepMind: In this study, we advance AMIE, our research medical AI, from one-off diagnostic conversations toward treating & managing disease over time, using clinical guidelines & drug formularies. More: https://t.co/SN8jqdYy10
For medical information, general AI frontier models (Google, OpenAI, Anthropic) outperformed specialized @EvidenceOpen and @UpToDate as assessed by 12 US clinicians, randomized and blinded to which model and extensive testing/benchmarks. This was not anticipated. @NatureMedicine
https://t.co/KCH1ADfQWz
🚀 AI 3대 강국 도약을 위한 대한민국의 로드맵, 공개!
대한민국 인공지능 행동계획은
AI가 성장할 수 있는 환경을 만들고,
산업과 일상 곳곳에 AI를 확산하며,
국민 모두가 신뢰하고 함께 쓰는 AI 사회를 만들어가기 위한 국가 전략입니다.
📌 3대 정책방향
✅ 인공지능 혁신 생태계 조성
✅ 범국가 인공지능 기반 대전환
✅ 글로벌 인공지능 기본사회 기여
📌 12개 실천과제를 통해
대한민국 AI 경쟁력을 높이고,
국민 모두가 AI의 혜택을 누리는 미래를 준비합니다.
대한민국 AI의 다음 1년,
그리고 미래를 위한 첫걸음을 확인해보세요.
#대한민국인공지능행동계획 #국가인공지능전략위원회 #AI #인공지능 #AI정책 #AI혁신 #AI생태계 #AI전환 #디지털전환 #대한민국AI #AI3강 #국민이만든대전환의길 #AI기본사회 #KoreaAI
The Fitbit app is now the Google Health app & it’s available for all users!
The app brings together health data from wearable devices, Health Connect & Apple Health into one place.
Google Play Store ➡️ https://t.co/6uhtIoQBaz
App Store ➡️ https://t.co/zvtw2DDWtS
J. Craig Venter, the scientist whose relentless ambition helped turn genetics from an artisanal trade into an industrialized information machine, died at 79. https://t.co/OR3IOayxp1
New @ScienceMagazine
The o-1 reasoning model (text only, from @OpenAI, released, Sept 2024) exceeded performance cf GPT-4 and physicians for clinical vignette management reasoning and in a real-world emergency department assessment for initial triage @AdamRodmanMD@PeterBrodeurMD@arjunmanrai@jonc101x
https://t.co/tPZqZAE8cd
A decade ago in Korea, AlphaGo showed AI’s potential.
Together with the Korean government, we’re now looking at how this technology can help accelerate scientific discovery and create new opportunities for economic growth across the region. 🇰🇷
Find out more → https://t.co/OKiI9e65aC
A BMJ article has criticised data used to support claims about the impact of the NHS federated data platform (FDP) as “flawed”.
https://t.co/TDguk7a88y
Today @OpenAI introduced ChatGPT for Clinicians, provided free for credentialed HCPs, and HealthBench Professional for benchmarking LLM medical task performance (Figure)
https://t.co/NC3R5JDGCR
https://t.co/ayJLkRtZUt
Introducing the #AIIndex2026: Our most comprehensive, independently sourced data analysis of AI’s trajectory, with a clear-eyed assessment of the critical gaps that remain. As AI advances rapidly, can the systems built around it keep up? Explore the data: https://t.co/WqRGeRZIjA
🚨SHOCKING: Apple just proved that AI models cannot do math. Not advanced math. Grade school math. The kind a 10-year-old solves.
And the way they proved it is devastating.
Apple researchers took the most popular math benchmark in AI — GSM8K, a set of grade-school math problems — and made one change. They swapped the numbers. Same problem. Same logic. Same steps. Different numbers.
Every model's performance dropped. Every single one. 25 state-of-the-art models tested.
But that wasn't the real experiment.
The real experiment broke everything.
They added one sentence to a math problem. One sentence that is completely irrelevant to the answer. It has nothing to do with the math. A human would read it and ignore it instantly.
Here's the actual example from the paper:
"Oliver picks 44 kiwis on Friday. Then he picks 58 kiwis on Saturday. On Sunday, he picks double the number of kiwis he did on Friday, but five of them were a bit smaller than average. How many kiwis does Oliver have?"
The correct answer is 190. The size of the kiwis has nothing to do with the count.
A 10-year-old would ignore "five of them were a bit smaller" because it's obviously irrelevant. It doesn't change how many kiwis there are.
But o1-mini, OpenAI's reasoning model, subtracted 5. It got 185.
Llama did the same thing. Subtracted 5. Got 185.
They didn't reason through the problem. They saw the number 5, saw a sentence that sounded like it mattered, and blindly turned it into a subtraction.
The models do not understand what subtraction means. They see a pattern that looks like subtraction and apply it. That is all.
Apple tested this across all models. They call the dataset "GSM-NoOp" — as in, the added clause is a no-operation. It does nothing. It changes nothing.
The results are catastrophic.
Phi-3-mini dropped over 65%. More than half of its "math ability" vanished from one irrelevant sentence.
GPT-4o dropped from 94.9% to 63.1%.
o1-mini dropped from 94.5% to 66.0%.
o1-preview, OpenAI's most advanced reasoning model at the time, dropped from 92.7% to 77.4%.
Even giving the models 8 examples of the exact same question beforehand, with the correct solution shown each time, barely helped. The models still fell for the irrelevant clause.
This means it's not a prompting problem. It's not a context problem. It's structural.
The Apple researchers also found that models convert words into math operations without understanding what those words mean. They see the word "discount" and multiply. They see a number near the word "smaller" and subtract. Regardless of whether it makes any sense.
The paper's exact words: "current LLMs are not capable of genuine logical reasoning; instead, they attempt to replicate the reasoning steps observed in their training data."
And: "LLMs likely perform a form of probabilistic pattern-matching and searching to find closest seen data during training without proper understanding of concepts."
They also tested what happens when you increase the number of steps in a problem. Performance didn't just decrease. The rate of decrease accelerated. Adding two extra clauses to a problem dropped Gemma2-9b from 84.4% to 41.8%. Phi-3.5-mini from 87.6% to 44.8%. The more thinking required, the more the models collapse.
A real reasoner would slow down and work through it. These models don't slow down. They pattern-match. And when the pattern becomes complex enough, they crash.
This paper was published at ICLR 2025, one of the most prestigious AI conferences in the world.
You are using AI to help you make financial decisions. To check legal documents. To solve problems at work. To help your children with homework. And Apple just proved that the AI is not thinking about any of it. It is pattern matching. And the moment something unexpected shows up in your question, it breaks. It does not tell you it broke. It just quietly gives you the wrong answer with full confidence.
개발하다가 로스쿨 가신 분
법령 정보를 에이전트 친화적으로 제공하는 API가 필요해서 만드셨다고.
PostgreSQL + pgvector로 구성했고,
별도의 인증키 필요 없는 REST API로 제공됩니다.
로그인도 필요 없고, IP/쿼리등 로깅도 안하고,
무료에 수익화 계획도 없으시다고... 허허
NEW AI report from Google.
Every prior intelligence explosion in human history was social, not individual.
These authors make the case that the AI "singularity" framed as a single superintelligent mind bootstrapping to godlike intelligence is fundamentally wrong.
This is directly relevant to anyone designing multi-agent systems.
They observe that frontier reasoning models like DeepSeek-R1 spontaneously develop internal "societies of thought," multi-agent debates among cognitive perspectives, through RL alone.
The path forward is human-AI configurations and agent institutions, not bigger monolithic oracles.
This reframes AI scaling strategy from "build bigger models" to "compose richer social systems."
It argues governance of AI agents should follow institutional design principles, checks and balances, role protocols, rather than individual alignment.
Paper: https://t.co/bfwrnbkY2y
Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
Today, we’re announcing the winners of the MedGemma Impact Challenge, launched in collaboration with @Kaggle. 850+ participating teams showed us how AI can help bridge global healthcare gaps. Check out the winning innovations →https://t.co/oHd56zx28D
An AI-powered digital front door tool used in Portugal was associated with reduced patient uncertainty, meaningful shifts in care-seeking behavior, and a substantial improvement in the appropriateness of health care utilization. Full study results: https://t.co/pZxp0SE2xu
The Digital Omnibus is a timely and much-needed effort to harmonize the EU’s digital landscape. Read the full analysis from ACM's computing policy experts: https://t.co/YHZYFZNpTW
The hardest thing to sum up for #HIMSS26 is what was the second-most hot topic that we heard about while we were there. #AI was so scorching hot that almost nobody was hitting on any other topic.
Our new @NatureMedicine paper proposing a path for better, dynamic evaluation of large language models for patient care https://t.co/VHvyxDFKuc CES-clinical environment simulator
@pranavrajpurkar