🚀 Looking for a senior DS: AI agents + classic ML for the HR unit, to make the employee journey better and more tech-driven:
Apply on hh: https://t.co/Os5YoY9hSN
More info: https://t.co/3THXyCmfaD
I updated my tests (1st table, https://t.co/IUbrqCCjs6), and RealMLP actually beats boosting. It's fast and works out of the box, with no hyperparameter tuning needed. Worth trying on real tasks.
TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling
A new SOTA (or is it?) from Yandex for tabular tasks, TabM (https://t.co/R0On1v44SG), is an MLP architecture that imitates an ensemble of k networks (k = 32 in the paper).
But there’s a surprise: RealMLP (https://t.co/PoqYjYLf4Z) outperforms both. It’s an MLP with smart defaults, a quantile feature transform, smooth clipping, a scaling layer, etc., and it works even better in an ensemble with boosting.
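To make "an MLP that imitates an ensemble of k networks" more concrete, here is a minimal sketch of a BatchEnsemble-style linear layer, which is the general idea TabM builds on as far as I understand the paper. It is not TabM's actual code: the function names, shapes, and toy numbers are my own assumptions, and it uses k = 2 members instead of 32 to stay readable.

```python
def batch_ensemble_linear(x, W, R, S, B):
    """One linear layer shared by k ensemble members (illustrative sketch).

    x: input vector of length d_in
    W: shared weight matrix, d_in rows by d_out columns
    R: k per-member input scaling vectors (each of length d_in)
    S: k per-member output scaling vectors (each of length d_out)
    B: k per-member bias vectors (each of length d_out)
    Returns a list of k output vectors, one per member.
    """
    d_in = len(x)
    outputs = []
    for r, s, b in zip(R, S, B):
        # Each member rescales the input with its own cheap vector...
        scaled = [xi * ri for xi, ri in zip(x, r)]
        # ...then reuses the same shared weight matrix W...
        hidden = [sum(scaled[i] * W[i][j] for i in range(d_in))
                  for j in range(len(s))]
        # ...and rescales/shifts the output with its own vectors.
        outputs.append([h * sj + bj for h, sj, bj in zip(hidden, s, b)])
    return outputs


def ensemble_mean(outputs):
    """Average the k member outputs into a single prediction."""
    k = len(outputs)
    return [sum(member[j] for member in outputs) / k
            for j in range(len(outputs[0]))]


# Toy example with made-up numbers: d_in = d_out = 2, k = 2 members.
x = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]   # shared by all members
R = [[1.0, 1.0], [2.0, 2.0]]   # per-member input scales
S = [[1.0, 1.0], [1.0, 1.0]]   # per-member output scales
B = [[0.0, 0.0], [0.0, 0.0]]   # per-member biases

member_outputs = batch_ensemble_linear(x, W, R, S, B)
prediction = ensemble_mean(member_outputs)
```

The point of this design is parameter efficiency: the big matrix W is stored once, and each extra member costs only a few small vectors.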
@rohanpaul_ai Funny thing: if you feed it historical prices, what it does is not rocket science. It's usually linear regression (like dragging the values down in Excel) or exponential smoothing, like the code below.
P.S. On Nvidia stock, the error was 40% over 3 months (ChatGPT 4.1).
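For reference, a minimal sketch of the two baselines mentioned above: a linear trend fitted by least squares and simple exponential smoothing. This is my own illustration, not the code the model produced; the function names, the `alpha` value, and the price series are made up.

```python
def linear_trend_forecast(prices, horizon):
    """Fit y = intercept + slope * t by ordinary least squares and
    extrapolate `horizon` steps ahead (needs at least 2 points)."""
    n = len(prices)
    t_mean = (n - 1) / 2
    y_mean = sum(prices) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(prices))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


def exp_smoothing_forecast(prices, horizon, alpha=0.5):
    """Simple exponential smoothing: the forecast is flat at the last
    smoothed level, where level = alpha * y + (1 - alpha) * level."""
    level = prices[0]
    for y in prices[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


# Made-up price history, purely for illustration.
prices = [100.0, 102.0, 104.0, 106.0, 108.0]
trend = linear_trend_forecast(prices, horizon=3)
flat = exp_smoothing_forecast(prices, horizon=3, alpha=0.5)
```

On a perfectly linear series the trend model just continues the line, while exponential smoothing lags behind it and forecasts a constant. Neither knows anything about the company behind the ticker.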
Book Review: ML System Design by Alex Xu
An excellent introductory book on designing machine learning systems. This book is highly recommended, especially for managers tasked with designing new systems they haven't tackled before.
For the Russian version: https://t.co/eMr68hfCqm
The project was one of the winners of the Yandex Open Source Grant Program (https://t.co/ymAw5fupFZ). For a deeper understanding, you can watch my talk about the project: https://t.co/w0bAQts1Pt
New release of my library for generating tabular data: tabgan (500+⭐, 40k downloads). It builds on recent tabular generation research, integrating an LLM-based generator (from the GReaT library), Forest Diffusion, and a tabular GAN. To try it: "pip install tabgan"
https://t.co/SDqDjmIEz8