Just published a new blog post on the @taresco_hq blog on different encoding schemes and how they impact tokenization for African languages. This is the first in a series of articles about encoding and tokenization.
Hope you enjoy reading it!
https://t.co/BJlwb1rvrr
BrowseComp-Plus (just accepted at #ACL2026 BTW 🎉 congrats @zijian42chen @xueguang_ma @ShengyaoZhuang et al.) can be described as a projection of "riddle-style" questions from BrowseComp on the live web to a custom-built static corpus. https://t.co/bjFeN8LuQg
We curated a list of high-quality multilingual LLM training datasets containing at least one African language to save you hours of searching! It includes pre-training and post-training data, reasoning, math, and more – all with sources, split sizes, and language breakdowns.
Excited to share that two papers led by Kosei Uemura have been accepted at #EACL2026! 🎉
MERLIN: Multi-Stage Curriculum Alignment for Multilingual Encoder-LLM Integration in Cross-Lingual Reasoning https://t.co/Tfbayyn8Da
AfriMTEB and AfriE5: https://t.co/LF65rtCt3l
Heading to AWS re:Invent in Las Vegas next week! ☁️
I’ll be leading and supporting a few sessions on optimizing open-weight models, automated reasoning, building trustworthy AI apps on Bedrock, and Amazon Nova.
Looking forward to catching up. Reach out if you're attending.
@taresco_hq But we didn't stop at numbers.
We show actual model outputs for Machine Translation and Mathematical Reasoning tasks, so you can see exactly where N-ATLaS succeeds and where it struggles, in each language.
4/
How Much Did N-ATLaS-LLM Move the Needle?
The African Research Collective @taresco_hq evaluated Nigeria's multilingual LLM (N-ATLaS) using AfroBench across Yoruba, Igbo, Hausa & English.
The results? More interesting than you'd expect
Full report: https://t.co/fgCayaVxgd
1/
LLMs are injective and invertible.
In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space.
(1/6)
How to become expert at thing:
1 iteratively take on concrete projects and accomplish them depth-wise, learning “on demand” (i.e. don’t learn bottom-up, breadth-wise)
2 teach/summarize everything you learn in your own words
3 only compare yourself to younger you, never to others