Principal Researcher at @tiiuae. Passionate about #AI and #NLProc. I make LLMs and I help machines to think 🧠, read 🤓 and talk 🗣️! #ML #GenAI #DL #ArabicNLP
Read the full paper to understand our quality-first curation methodology and what we found across existing Arabic benchmarks.
📄 https://t.co/QD4i1JX71w
If you work on Arabic NLP, submit your model, read the paper, and let us know what you think. 🙌
#ArabicNLP #LLM #AI #QIMMA
🚀 We are very happy to introduce QIMMA قمّة: the first quality-assured Arabic LLM leaderboard.
14 benchmarks. 52,000+ validated samples. 99% native Arabic content. Built from scratch because we weren't satisfied with what existed.
🏆 https://t.co/mQnYCkAOAV
Full reproducibility.
Every result comes with the exact run config: precision, model type, tensor parallelism settings.
All evaluation code is open source.
💻 https://t.co/R9p4gPdh4s
We are releasing Falcon Perception, an open-vocabulary referring expression segmentation model. Along with it comes a 0.3B OCR model that is on par with competitors 3-10x its size.
Current systems solve this task with complex pipelines (separate encoders, late fusion, matching algorithms). We developed a simpler "bitter" approach: one early-fusion Transformer (image + text from the first layer) with a shared parameter space, letting scale and training signal do the work. Please check out our work!
📄 Paper: https://t.co/dWvK5t7MIt
💻 Code: https://t.co/AJ65GbMrUY
🎮 Playground: https://t.co/BIgisZkeid
🤗 Blogpost: https://t.co/J2IjlBPywF
1990-1991 has become important in hindsight. Back then, my TUM research group published principles of (1) the P in ChatGPT, (2) the T in ChatGPT, (3) GANs (now used for deepfakes), (4) neural network distillation (key for DeepSeek and other LLMs), (5) artificial curiosity for improving neural World Models (now a hot topic), (6) LSTM (the most cited AI paper of the 20th century), (7) deep residual learning (basis of the most cited paper of the 21st century). As of 2025, the two most frequently cited scientific articles of all time (most citations within 3 years - manuals excluded) are both directly based on our 1991 work. See the revised 2026 report "Deep Learning: Our Miraculous Year 1990-1991" (52 pages, 300+ references, 38 illustrations): https://t.co/aocq01AToq
Can reasoning thrive in small models? Falcon-H1-Tiny-R (0.6B & 0.09B) excels on AIME24/25, LiveCodeBench & Math500 when trained on reasoning data.
Blog: https://t.co/r9LLmM0AOP
Models: https://t.co/5PmKHI648b
#TII #AI #FalconH1RTiny #Reasoning
Falcon H1 Arabic was built with one principle: Arabic must be understood, not approximated. Native data, dialects, and culture guided every step from training to alignment, creating AI that truly speaks the Arab world.
Try it: https://t.co/F7XvQ60x5l
#TII #Falcon #AI #Arabic
Ever wondered what the next frontier of Arabic AI looks like? Falcon-H1 Arabic delivers top-ranked performance, massive context, and unmatched Arabic content quality.
Explore more: https://t.co/F7XvQ5ZZfN
#FalconH1Arabic #TII #ArabicAI #AI #Innovation
When performance speaks, benchmarks follow. TII’s Falcon-H1 Arabic sets a new global standard for Arabic AI with top-ranked results across leading evaluations.
Explore more: https://t.co/F7XvQ60x5l
#FalconH1Arabic #ArabicAI #TII #AI #Innovation