Interesting read -> The Lottery Ticket Hypothesis
This paper shows that large networks contain smaller subnetworks that can train just as well on their own, provided you know which weights to keep and reset them to their original initialization.
https://t.co/JoPQPQyuYU
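A rough sketch of the "keep the right weights, reuse the original init" idea. This assumes one-shot magnitude pruning on a flat list of weights; the function name and the `keep_fraction` parameter are made up for illustration and are not taken from the paper.

```python
def find_winning_ticket(init_weights, trained_weights, keep_fraction):
    """Build a pruning mask from trained weights, then rewind survivors to init.

    Assumption: "which weights to keep" is decided by trained magnitude.
    """
    if len(init_weights) != len(trained_weights):
        raise ValueError("init and trained weights must have the same length")

    n_keep = max(1, int(keep_fraction * len(trained_weights)))

    # Rank weight positions by how large they became after training.
    ranked = sorted(
        range(len(trained_weights)),
        key=lambda i: abs(trained_weights[i]),
        reverse=True,
    )
    kept = set(ranked[:n_keep])

    # 1 = weight survives, 0 = weight is pruned.
    mask = [1 if i in kept else 0 for i in range(len(trained_weights))]

    # Survivors go back to their ORIGINAL initial values, not the trained ones.
    ticket = [w if m else 0.0 for w, m in zip(init_weights, mask)]
    return mask, ticket
```

The subnetwork in `ticket` is what you would then retrain from scratch, with the mask held fixed.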
Rating apps with stars is a bit weird, especially since we can no longer tell whether the rating control supports RTL in apps that support Arabic.
I just rated an app one star when I meant to give it 5 stars. As soon as I tapped a star, it closed the rating and sent it to the server as a 1 instead of a 5, and I can't edit it.
Bad UX 👎
I found this article interesting: the Android team has been rewriting some Android system components in Rust to reduce memory-safety bugs!
https://t.co/cxw9GIAItl
@ChShersh It’s not just juniors. Plenty of experienced engineers avoid taking initiative because they see it as "free labor" or "not my job". That mindset keeps them stuck in a loop where their growth is driven only by market demand, not by technical curiosity or self-chosen challenges.
🇨🇳 DeepSeek-R1 was published in Nature yesterday as the cover article, and the research is BRILLIANT.
They show that pure Reinforcement Learning with answer-only rewards can grow real reasoning skills, no human step-by-step traces required.
So you can skip human reasoning traces entirely and still get SOTA reasoning via pure RL.
It’s such a powerful revelation: instead of forcing the model to copy human reasoning steps, it only rewards getting the final answer right, which gives the model freedom to invent its own reasoning strategies that can actually go beyond human examples.
Earlier methods capped models at what humans could demonstrate, but this breaks that ceiling and lets reasoning emerge naturally.
Those skills include self-checking, verification, and changing strategy mid-solution, and they beat supervised baselines on tasks where answers can be checked.
Models trained this way also pass those patterns down to smaller models through distillation.
AIME 2024 pass@1 jumps from 15.6% to 77.9%, and hits 86.7% with self-consistency.
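To make the two numbers concrete, here is a minimal sketch of how pass@1 and a self-consistency score are commonly computed from several sampled answers per problem. It assumes answers are compared as exact strings and that self-consistency means a plain majority vote; the function names and toy data are illustrative, not from the paper.

```python
from collections import Counter


def pass_at_1(samples_per_problem, ground_truths):
    """Average, over problems, of the fraction of samples that are correct."""
    per_problem = [
        sum(answer == truth for answer in samples) / len(samples)
        for samples, truth in zip(samples_per_problem, ground_truths)
    ]
    return sum(per_problem) / len(per_problem)


def self_consistency_accuracy(samples_per_problem, ground_truths):
    """Fraction of problems where the most frequent sampled answer is correct.

    Ties go to whichever answer was sampled first (Counter ordering).
    """
    correct = 0
    for samples, truth in zip(samples_per_problem, ground_truths):
        majority_answer, _count = Counter(samples).most_common(1)[0]
        correct += majority_answer == truth
    return correct / len(ground_truths)


# Toy data: two problems, four sampled answers each.
samples = [["42", "42", "41", "42"], ["7", "8", "8", "9"]]
truths = ["42", "8"]
print(pass_at_1(samples, truths))                  # 0.625
print(self_consistency_accuracy(samples, truths))  # 1.0
```

Voting helps whenever the correct answer is the most common one, even if individual samples are often wrong.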
⚙️ The Core Concepts
The paper replaces human-labelled reasoning traces with answer-graded RL, so the model only gets a reward when its final answer matches ground truth, which frees it to search its own reasoning style.
The result is longer thoughts with built-in reflection, verification, and falling back to alternative approaches when stuck, which are exactly the skills needed for math, coding, and STEM problems where correctness is checkable.
This matters because supervised traces cap the model at human patterns, while answer-graded RL lets it discover non-human routes that still land on correct answers.
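A minimal sketch of what an answer-graded reward could look like. It assumes the completion ends with a line like "Answer: <value>" and that correctness is an exact string match; both the marker and the function names are hypothetical, and the paper's actual grading setup may differ.

```python
import re

# Hypothetical output convention: the final answer sits on a line "Answer: <value>".
_ANSWER_RE = re.compile(r"Answer:[ \t]*(.+?)[ \t]*$", re.IGNORECASE | re.MULTILINE)


def extract_final_answer(completion):
    """Return the last 'Answer: ...' value in the completion, or None if absent."""
    matches = _ANSWER_RE.findall(completion)
    return matches[-1] if matches else None


def answer_only_reward(completion, ground_truth):
    """1.0 if the final answer matches ground truth, else 0.0.

    The reasoning text before the answer is never inspected, so the model
    gets no credit or penalty for how it reasons.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if answer == ground_truth.strip() else 0.0
```

Because only the last answer line is graded, a completion that first writes a wrong answer, catches the mistake, and then corrects itself still earns the full reward.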
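O Allah, send prayers upon Muhammad and upon the family of Muhammad, as You sent prayers upon Ibrahim and upon the family of Ibrahim; You are indeed Praiseworthy, Glorious. O Allah, bless Muhammad and the family of Muhammad, as You blessed Ibrahim and the family of Ibrahim; You are indeed Praiseworthy, Glorious.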