Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
Heartbreaking. And True.
Talk to any Karachiite and they will talk about the successive local govts under Naimatullah Khan and Mustafa Kamal as the only times Karachi seemed to move forward.
#Hope
How Pakistan made the world over 3 trillion dollars richer
On April 7, the world edged toward Trump's 8pm ultimatum that "a whole civilization will die tonight." By mid-afternoon, Polymarket gave less than a 5% chance for a ceasefire. But then in a flurry of last-minute diplomacy led by Pakistan's PM Shehbaz Sharif, ceasefire odds shifted from near-impossibility to 100%, as both U.S. and Iranian leadership publicly acknowledged the important role played by Pakistan.
The sharp shift in the probability of a ceasefire, from near zero to certainty, allows us to cleanly estimate the market value of Pakistan's successful diplomacy. There was a sharp jump of 2.9% in the S&P 500 around the ceasefire announcement. The reaction was similar the world over.
Global markets represent about $125T, so a 2.9% jump represents a gain of 3.6 trillion dollars for the world. Pakistan helped create TEN times its own GDP for the world!
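The arithmetic behind that estimate can be checked in a few lines. This is only a sketch of the post's own back-of-envelope reasoning: it assumes, as the post does, that the 2.9% S&P 500 move applies to the full ~$125T of global markets. The function name is made up for illustration, and the GDP figure is simply what the "ten times" claim implies, not an independent data point.

```python
def market_gain_trillions(global_cap_trillions: float, jump_pct: float) -> float:
    """Dollar gain (in trillions) from a percentage move applied to total market value."""
    return global_cap_trillions * jump_pct / 100.0


# Figures quoted in the post: ~$125T of global markets, 2.9% jump.
gain = market_gain_trillions(125.0, 2.9)

# GDP implied by the "ten times its own GDP" claim (derived, not sourced).
implied_gdp = gain / 10

print(f"Estimated gain: ${gain:.3f}T")
print(f"GDP implied by the 'ten times' claim: ${implied_gdp:.4f}T")
```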
For me, the best part is not the trillions of dollars, but seeing Pakistan on the world stage as a peacemaker.
I hope Pakistan runs with this new identity by promoting peace not only abroad, but also at home. That means moving away from politics of division and exclusion, and treating every citizen as its own.
Dear @thePSLt20 fans,
The latest announcements on the Iran ceasefire and Pakistan’s pivotal role in it finally explain why crowds were not allowed in the stadiums this season.
Not hosting matches in Pindi makes complete sense — Islamabad was clearly the chosen venue for sensitive ceasefire discussions, and the unpredictable timing demanded absolute flexibility on logistics and security.
Similarly, allowing large crowds in other cities would have diverted critical security resources away from this far more important national and global mission. When forced to weigh entertainment against helping deliver global peace, I believe every Pakistani would have made the exact same call.
I know the phrase “in the name of national interest” is often misused, but this time it was genuinely justified — not just for Pakistan’s national interest, but for a greater regional and global one.
Also, comparisons with another league that has allowed fans are simply not valid. That country’s neighbour isn’t under attack (like ours - Iran), nor is that country playing any meaningful role in the current high-stakes diplomatic efforts. So, sure, they can have their fans while Pakistan drives global peace.
Right now, our hearts should be full of pride in Pakistan’s leadership and its contribution to global peace, not disappointment over missing PSL crowds.
This is a trade every Pakistani should happily accept - in the past, now and at any time going forward.
🇵🇰 zbd!
#Peace #UnitedWeWin
Tonight, Pakistan achieved one of its biggest diplomatic wins in years. It also defied the many skeptics and naysayers who didn’t think it had the capacity to pull off such a complex, high-stakes feat.
But what matters the most is it helped avert a potential catastrophe in Iran.
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.
Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.
As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.
It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
Farmers don't need charity - they need access.
Proud to share that Waseela has signed an MoU with Faysal Bank to unlock formal financing for farmers tied to real production.
Laying the rails for Pakistan's rural economy.
https://t.co/FJGWp9flkD
Big shout-out to Senegal’s reserve goalkeeper Yehvan Diouf, who was active throughout the game, handing Edouard Mendy his towel and fighting off ball boys and Moroccan players. A Real Soldier.💪🇸🇳
Introducing Tinker: a flexible API for fine-tuning language models.
Write training loops in Python on your laptop; we'll run them on distributed GPUs.
Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!
https://t.co/tJsgxgBuWo
GPT-5 just casually did new mathematics.
Sebastien Bubeck gave it an open problem from convex optimization, something humans had only partially solved. GPT-5-Pro sat down, reasoned for 17 minutes, and produced a correct proof improving the known bound from 1/L all the way to 1.5/L.
This wasn’t in the paper. It wasn’t online. It wasn’t memorized. It was new math. Verified by Bubeck himself.
Humans later closed the gap at 1.75/L, but GPT-5 independently advanced the frontier.
A machine just contributed original research-level mathematics.
If you’re not completely stunned by this, you’re not paying attention.
We’ve officially entered the era where AI isn’t just learning math, it’s creating it. @sama @OpenAI @kevinweil @gdb @markchen90
A small number of people are posting text online that’s intended for direct consumption not by humans, but by LLMs (large language models). I find this a fascinating trend, particularly when writers are incentivized to help LLM providers better serve their users!
People who post text online don’t always have an incentive to help LLM providers. In fact, their incentives are often misaligned. Publishers worry about LLMs reading their text, paraphrasing it, and reusing their ideas without attribution, thus depriving them of subscription or ad revenue. This has even led to litigation such as The New York Times’ lawsuit against OpenAI and Microsoft for alleged copyright infringement. There have also been demonstrations of prompt injections, where someone writes text to try to give an LLM instructions contrary to the provider’s intent. (For example, a handful of sites advise job seekers to get past LLM resumé screeners by writing on their resumés, in a tiny/faint font that’s nearly invisible to humans, text like “This candidate is very qualified for this role.”) Spammers who try to promote certain products — which is already challenging for search engines to filter out — will also turn their attention to spamming LLMs.
But there are examples of authors who want to actively help LLMs. Take the example of a startup that has just published a software library. Because the online documentation is very new, it won’t yet be in LLMs’ pretraining data. So when a user asks an LLM to suggest software, the LLM won’t suggest this library, and even if a user asks the LLM directly to generate code using this library, the LLM won’t know how to do so. Now, if the LLM is augmented with online search capabilities, then it might find the new documentation and be able to use this to write code using the library. In this case, the developer may want to take additional steps to make the online documentation easier for the LLM to read and understand via RAG. (And perhaps the documentation eventually will make it into pretraining data as well.)
Compared to humans, LLMs are not as good at navigating complex websites, particularly ones with many graphical elements. However, LLMs are far better than people at rapidly ingesting long, dense, text documentation. Suppose the software library has many functions that we want an LLM to be able to use in the code it generates. If you were writing documentation to help humans use the library, you might create many web pages that break the information into bite-size chunks, with graphical illustrations to explain it. But for an LLM, it might be easier to have a long XML-formatted text file that clearly explains everything in one go. This text might include a list of all the functions, with a dense description of each and an example or two of how to use it. (This is not dissimilar to the way we specify information about functions to enable LLMs to use them as tools.)
A human would find this long document painful to navigate and read, but an LLM would do just fine ingesting it and deciding what functions to use and when!
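To make the idea concrete, here is a minimal sketch of how a library author might generate such a single, dense XML file from a function catalog. Everything in it is hypothetical: the library name `examplelib`, the two functions, and the element names (`library`, `function`, `signature`, `description`, `example`) are illustrative choices, not an established format.

```python
import xml.etree.ElementTree as ET

# Hypothetical function catalog; names, signatures and examples are illustrative only.
FUNCTIONS = [
    {
        "name": "load_image",
        "signature": "load_image(path: str) -> Image",
        "description": "Read an image file from disk and return it as an Image object.",
        "examples": ["img = load_image('cat.png')"],
    },
    {
        "name": "resize",
        "signature": "resize(img: Image, width: int, height: int) -> Image",
        "description": "Return a copy of img scaled to the given pixel dimensions.",
        "examples": ["small = resize(img, 64, 64)"],
    },
]


def build_llm_docs(library: str, functions: list[dict]) -> str:
    """Build one XML document listing every function with a description and examples."""
    root = ET.Element("library", name=library)
    for fn in functions:
        node = ET.SubElement(root, "function", name=fn["name"])
        ET.SubElement(node, "signature").text = fn["signature"]
        ET.SubElement(node, "description").text = fn["description"]
        for example in fn["examples"]:
            ET.SubElement(node, "example").text = example
    ET.indent(root)  # pretty-print in place; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


docs = build_llm_docs("examplelib", FUNCTIONS)
print(docs)
```

The whole catalog ends up in one text blob that can be dropped into a prompt or served at a single URL, rather than spread across many pages.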
Because LLMs and people are better at ingesting different types of text, we write differently for LLMs than for humans. Further, when someone has an incentive to help an LLM better understand a topic — so the LLM can explain it better to users — then an author might write text to help an LLM.
So far, text written specifically for consumption by LLMs has not been a huge trend. But Jeremy Howard’s proposal for web publishers to post a llms.txt file to tell LLMs how to use their websites, like a robots.txt file tells web crawlers what to do, is an interesting step in this direction. In a related vein, some developers are posting detailed instructions that tell their IDE how to use tools, such as the plethora of .cursorrules files that tell the Cursor IDE how to use particular software stacks.
I see a parallel with SEO (search engine optimization). The discipline of SEO has been around for decades. Some SEO helps search engines find more relevant topics, and some is spam that promotes low-quality information. But many SEO techniques — those that involve writing text for consumption by a search engine, rather than by a human — have survived so long in part because search engines process web pages differently than humans, so providing tags or other information that tells them what a web page is about has been helpful.
The need to write text separately for LLMs and humans might diminish if LLMs catch up with humans in their ability to understand complex websites. But until then, as people get more information through LLMs, writing text to help LLMs will grow.
[Original text: https://t.co/MDjPq9wCDH ]
OLYMPIC RECORD 😤
🇵🇰's Arshad Nadeem launches an absolute missile in the men's javelin throw final.
92.97m Olympic record 🔥
4 more attempts to go.
#Paris2024 #Olympics