AI will help discover new science, such as cures for diseases, which is perhaps the most important way to increase quality of life long-term.
AI will also present new threats to society that we have to address. No company can sufficiently mitigate these on their own; we will need a society-wide response to things like novel bio threats, a massive and fast change to the economy, extremely capable models causing complex emergent effects across society, and more.
These are the areas the OpenAI Foundation will initially focus on, and in my opinion are some of the most important ones for us to get right. The Foundation will spend at least $1 billion over the next year.
@woj_zaremba, co-founder of OpenAI, will transition to Head of AI Resilience. I believe that shifting how the world thinks about safety to include a Resilience-style approach is critical, and I am extremely grateful to Wojciech for taking on this role.
Wojciech has been my cofounder for the last decade; anyone who knows him will understand what I mean when I say he is one of a kind. He has a lot of ideas about how we build a new kind of AI safety.
@JacobTref is joining as Head of Life Sciences and Curing Diseases.
@annaadeola, our VP of Global Impact, will transition to Head of AI for Civil Society and Philanthropy.
@robert_kaiden is joining as Chief Financial Officer.
@jeffarnold is joining as Director of Operations.
How can you solve systems of linear equations efficiently and identify if they have a unique solution? In the Mathematics for Machine Learning and Data Science specialization, you'll learn algorithms that simplify these problems, building a strong mathematical foundation for AI and machine learning.
Join the specialization and deepen your understanding of the math behind the algorithms: https://t.co/JLvG4hZxEo
How can businesses go beyond using AI for incremental efficiency gains to create transformative impact? I write from the World Economic Forum (WEF) in Davos, Switzerland, where I’ve been speaking with many CEOs about how to use AI for growth. A recurring theme is that running many experimental, bottom-up AI projects — letting a thousand flowers bloom — has failed to lead to significant payoffs. Instead, bigger gains require workflow redesign: taking a broader, perhaps top-down view of the multiple steps in a process and changing how they work together from end to end.
Consider a bank issuing loans. The workflow consists of several discrete stages:
Marketing -> Application -> Preliminary Approval -> Final Review -> Execution
Suppose each step used to be manual. Preliminary Approval used to require an hour-long human review, but a new agentic system can do this automatically in 10 minutes. Swapping human review for AI review — but keeping everything else the same — gives a minor efficiency gain but isn’t transformative.
Here’s what would be transformative: Instead of applicants waiting a week for a human to review their application, they can get a decision in 10 minutes. When that happens, the loan becomes a more compelling product, and that better customer experience allows lenders to attract more applications and ultimately issue more loans.
However, making this change requires taking a broader business or product perspective, not just a technology perspective. Further, it changes the workflow of loan processing. Switching to offering a “10-minute loan” product would require changing how it is marketed. Applications would need to be digitized and routed more efficiently, and final review and execution would need to be redesigned to handle a larger volume.
Even though AI is applied only to one step, Preliminary Approval, we end up implementing not just a point solution but a broader workflow redesign that transforms the product offering.
At AI Aspire (an advisory firm I co-lead), here’s what we see: Bottom-up innovation matters because the people closest to problems often see solutions first. But scaling such ideas to create transformative impact often requires seeing how AI can transform entire workflows end to end, not just individual steps, and this is where top-down strategic direction and innovation can help.
This year's WEF meeting, as in previous years, has been an energizing event. Among technologists, frequent topics of discussion include Agentic AI (when I coined this term, I was not expecting to see it plastered on billboards and buildings!), Sovereign AI (how nations can control their own access to AI), Talent (the challenging job market for recent graduates, and how to upskill nations), and data-center infrastructure (how to address bottlenecks in energy, talent, GPU chips, and memory). I will address some of these topics in future posts.
Against the backdrop of geopolitical uncertainty, I hope all of us in AI will keep building bridges that connect nations, sharing through open source, and building to benefit all nations and all people.
[Original text: https://t.co/Ck52mNGX4a ]
High love and high structure.
I grew up in a home that was high structure but low love. My father's rules were absolute, enforced by the threat of conflict, but there wasn't warmth underneath. When I started building teams, I instinctively swung the other way: high love, low structure. I wanted everyone to feel supported. I avoided setting hard expectations because I didn't want to be my dad. It didn't work. People didn't know where they stood. Standards were fuzzy. The team drifted. What I eventually learned is that the optimal isn't one or the other — it's both. Genuine care for people combined with clear expectations and real accountability. Most people need structure to thrive. They want to know what's expected, how they're doing, and where the boundaries are. That's not mean. That's respectful. High love without high structure is actually a form of neglect.
Happy New Year!
In the New Year issue of The Batch, Andrew Ng introduces the Turing-AGI Test, a proposal to evaluate systems' capabilities for real, economically useful work, not hype.
And we bring perspectives from:
- IBM's David Cox: Open Source Wins
- Princeton's Adji Bousso Dieng: AI for Scientific Discovery
- Microsoft's Juan M. Lavista Ferres: Education That Works With — Not Against — AI
- Allen Institute's Tanmay Gupta: From Prediction to Action
- UC-San Diego's Pengtao Xie: Multimodal Models for Biomedicine
- AMD's Sharon Zhou: Chatbots That Build Community
Each presents a thoughtful look at where AI is headed, and how we should measure progress.
Read The Batch now: https://t.co/5xT9yoAZal
OpenReview is one of the most important pillars supporting AI research and knowledge sharing, through open peer review and publishing. But as a non-profit, it needs our community’s support. Please consider making a donation to this great institution!
https://t.co/TBYB1cCKyI
As amazing as LLMs are, improving their knowledge today involves a more piecemeal process than is widely appreciated. I’ve written before about how AI is amazing... but not that amazing. Well, it is also true that LLMs are general... but not that general. We shouldn’t buy into the inaccurate hype that LLMs are a path to AGI in just a few years, but we also shouldn’t buy into the opposite, also inaccurate hype that they are only demoware. Instead, I find it helpful to have a more precise understanding of the current path to building more intelligent models.
First, LLMs are indeed a more general form of intelligence than earlier generations of technology. This is why a single LLM can be applied to a wide range of tasks. The first wave of LLM technology accomplished this by training on the public web, which contains a lot of information about a wide range of topics. This made their knowledge far more general than earlier algorithms that were trained to carry out a single task such as predicting housing prices or playing a single game like chess or Go. However, they’re far less general than human abilities. For instance, after pretraining on the entire content of the public web, an LLM still struggles to adapt to write in certain styles that many editors would be able to, or use simple websites reliably.
After leveraging pretty much all the open information on the web, progress got harder. Today, if a frontier lab wants an LLM to do well on a specific task — such as code using a specific programming language, or say sensible things about a specific niche in, say, healthcare or finance — researchers might go through a laborious process of finding or generating lots of data for that domain and then preparing that data (cleaning low-quality text, deduplicating, paraphrasing, etc.) to create data to give an LLM that knowledge.
Or, to get a model to perform certain tasks, such as use a web browser, developers might go through an even more laborious process of creating many RL gyms (simulated environments) to let an algorithm repeatedly practice a narrow set of tasks.
A typical human, despite having seen vastly less text or practiced far less in computer-use training environments than today's frontier models, nonetheless can generalize to a far wider range of tasks than a frontier model. Humans might do this by taking advantage of continuous learning from feedback, or by having superior representations of non-text input (the way LLMs tokenize images still seems like a hack to me), and many other mechanisms that we do not yet understand.
Advancing frontier models today requires making a lot of manual decisions and taking a data-centric AI approach to engineering the data we use to train our models. Future breakthroughs might allow us to advance LLMs in a less piecemeal fashion than I describe here. But even if they don’t, the ongoing piecemeal improvements, coupled with the limited degree to which these models do generalize and exhibit “emergent behaviors,” will continue to drive rapid progress.
Either way, we should plan for many more years of hard work. A long, hard — and fun! — slog remains ahead to build more intelligent models.
[Original text: https://t.co/SHRN5JDvTW ]
Sharing a fun recipe for building a highly autonomous, moderately capable, and very UNreliable agent using the open source aisuite package that Rohit Prasad and I have been working on.
With a few lines of code, you can give a frontier LLM a tool (like disk access or web search), prompt it with a high-level task (such as creating a snake game and saving as an HTML file, or carrying out deep research), and let the LLM loose and see what it does. Example in image.
Caveat: This is not how practical agents are built today, since most need much more scaffolding (see my Agentic AI course to learn more), but is still interesting to experiment with.
Longer write-up here: https://t.co/BdS8tGhnIy