Every try scored, every victory celebrated, and every dream pursued is possible because of the sacrifices of those who came before us.
Today, we salute Uganda's heroes past and present, whose courage continues to inspire generations.
🇺🇬 Happy Heroes Day
#HeroesDay2026 #HeroDay
To the family of Sydney Gongodyo, Black Pirates Rugby Club, the Uganda Rugby Cranes, and the wider Ugandan rugby community, my heart goes out to you, and I offer my deepest condolences on the passing of Sydney Gongodyo.
At times like these, beyond the competition, beyond the jerseys and the stadiums, we are first and foremost a family. Today, that family is grieving the loss of one of its own.
@UgandaRugby @RugbyAfrique #UgandaRugby
Today, we join the Uganda Rugby Community in mourning the loss of Sydney Gongodyo.
Our heartfelt condolences go out to his family, friends, teammates and fans at @piratesrugbyUG
Rest easy, Champ.🕊️
#KCBKOBs #PoetryInMotion #BlueArmy
Thinking of building an “Amazo-style” agent that:
1. Absorbs new skills dynamically
2. Learns workflows by observation
3. Combines abilities into higher-order capabilities
4. Evolves based on experience
Increasingly making use of a ‘skills distillation’ pattern where I have stronger LLMs write out the skills file and then reuse that with cheaper models.
Feels like an evolution of traditional knowledge distillation?
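A rough sketch of what that pattern could look like, assuming each model is just a function from prompt to text. The names `LLM`, `distill_skill`, and `run_with_skill`, and the prompt wording, are hypothetical and only there for illustration; swap in whatever client you actually use.

```python
from typing import Callable

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def distill_skill(strong_model: LLM, task_description: str, examples: list[str]) -> str:
    """Ask the stronger model to write a reusable skills file for a task."""
    prompt = (
        "Write a concise, step-by-step skills file for the task below.\n"
        f"Task: {task_description}\n"
        "Worked examples:\n" + "\n".join(f"- {e}" for e in examples)
    )
    return strong_model(prompt)


def run_with_skill(cheap_model: LLM, skill_text: str, user_input: str) -> str:
    """Reuse the stored skills file as context for the cheaper model."""
    prompt = f"Follow these instructions:\n{skill_text}\n\nInput: {user_input}"
    return cheap_model(prompt)
```

The skills file is written once by the expensive model and then reused on every cheap-model call, which is where the cost saving would come from.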
Just like we have code linters, we’ll soon have (or already have) ‘LLM linters’ running continuously across docs, policies, and datasets, flagging inconsistencies, drift, conflicting definitions, and broken reasoning before humans notice.
I’ve always believed the No.1 application of AI should be to improve human health.
That work started with AlphaFold, and now at @IsomorphicLabs with the mission to reimagine drug discovery and one day solve all disease!
We are turbocharging that goal with $2.1B in new funding.
New Anthropic research: Natural Language Autoencoders.
Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read.
Here, we train Claude to translate its activations into human-readable text.
Uganda Rugby Legends carry the casket bearing the body of the late Rugby Cranes player O’Brien Tindimbwebwa today at Christ the King Church.
O’Brien passed away on Friday, 1st May 2026, following an accident.
May his soul rest in eternal peace.
Want to transcribe Luganda speech with AI? This fine-tuned Whisper model does exactly that. It's a small, efficient model trained on 400 hours of Luganda audio. Perfect for developers and researchers working on African language tech.
Meet Gemma 4: our new family of open models you can run on your own hardware.
Built for advanced reasoning and agentic workflows, we’re releasing them under an Apache 2.0 license. Here’s what’s new 🧵
One common issue with personalization in all LLMs is how distracting memory seems to be for the models. A single question from 2 months ago about some topic can keep coming up as some kind of a deep interest of mine with undue mentions in perpetuity. Some kind of trying too hard.
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.
Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.
As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.
It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
I love the expression “food for thought” as a concrete, mysterious cognitive capability humans experience but LLMs have no equivalent for.
Definition: “something worth thinking about or considering, like a mental meal that nourishes your mind with ideas, insights, or issues that require deeper reflection. It's used for topics that challenge your perspective, offer new understanding, or make you ponder important questions, acting as intellectual stimulation.”
So in LLM speak it’s a sequence of tokens such that when used as prompt for chain of thought, the samples are rewarding to attend over, via some yet undiscovered intrinsic reward function. Obsessed with what form it takes. Food for thought.
Don't think of LLMs as entities but as simulators. For example, when exploring a topic, don't ask:
"What do you think about xyz"?
There is no "you". Next time try:
"What would be a good group of people to explore xyz? What would they say?"
The LLM can channel/simulate many perspectives but it hasn't "thought about" xyz for a while and over time and formed its own opinions in the way we're used to. If you force it via the use of "you", it will give you something by adopting a personality embedding vector implied by the statistics of its finetuning data and then simulate that. It's fine to do, but there is a lot less mystique to it than I find people naively attribute to "asking an AI".
Choosing between research ML and applied ML is purely a matter of personal interest.
I recommend exploring both. I did both, and for me, applied ML is a bit more interesting. If you're working on research, ML metrics are important.
For applied ML, business metrics are important, and it requires domain knowledge. For research people, the work revolves around benchmarks.
How to evaluate an ML/LLM model in 2025?
Precision, recall, and F1 score are crucial parts of ML evaluation. In the past, they were theoretically the only way to judge a model. Kaggle competitions later made them famous for earning medals, but in 2025, no app or algorithm is perfect, and ML evaluation has evolved.
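For reference, a minimal sketch of how those three numbers are computed for binary labels. The function name `precision_recall_f1` and the convention of returning 0.0 when a denominator is zero are assumptions made for this example.

```python
def precision_recall_f1(y_true: list[int], y_pred: list[int]) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# 3 true positives, 1 false positive, 1 false negative
print(precision_recall_f1([1, 1, 1, 0, 0, 1], [1, 0, 1, 1, 0, 1]))  # (0.75, 0.75, 0.75)
```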
Here’s the reality: data is growing insanely fast. You often need to retrain your model every couple of months because old patterns quickly become irrelevant. A single “best” model with 90% metrics can become outdated fast.
What should you do?
- If your model has a decent score (e.g., ~80% accuracy or precision/recall, which is actually great in production), push it to production.
- Deploy it and run A/B tests; this is the only real way to validate performance against live data.
- Retrain regularly based on new data and feedback.
- Evaluate the model on business impact.
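One way to read the result of the A/B step is a two-proportion z-test on a business metric such as conversion rate. This is only a sketch under assumptions: the function name `ab_test_z` and the example counts are made up, and it assumes both groups are large and the pooled rate is strictly between 0 and 1.

```python
import math


def ab_test_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Two-proportion z-test: returns (z statistic, two-sided p-value)."""
    p_a = conv_a / n_a  # conversion rate of the current model (A)
    p_b = conv_b / n_b  # conversion rate of the new model (B)
    pooled = (conv_a + conv_b) / (n_a + n_b)
    # Standard error under the null hypothesis that both rates are equal.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 100/1000 conversions for A, 130/1000 for B.
z, p = ab_test_z(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the two models is unlikely to be noise, but the decision should still rest on whether the lift matters for the business.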
In 2025, a robust ML pipeline matters far more than a single accuracy metric. Researchers are building strong foundational models for general tasks, but real-world success depends on your system design.
The ML pipeline and system design are now mandatory.
Keep learning ;)