Assistant Professor of math and data science @NYU_Courant and @NYUDataScience | Former PhD @stanford_ee, postdoc @SimonsInstitute @mit | views are my own
CDS Asst. Prof. Yanjun Han (@yanjun_han) and colleagues at NYU and MIT explains why transformers trained on synthetic data excel at empirical Bayes (EB) problems.
By using universal priors, these models adapt to new data through posterior contraction.
https://t.co/qpsp65Z9Wf
Just posted a preprint on arXiv (with P. Durcik, J. Roos, X. Xie) settling the Kahn–Park conjecture on the Hamming cube:
https://t.co/cKYXvZc2lu
I first learned about the problem through Gil Kalai’s (@GilKalaiblog) blog:
https://t.co/L6Og6nYt3f
(I should also add that I was asking about related question "Sharp L1 Poincare inequality for boolean functions" 8-9 years ago
https://t.co/e359tw5x70 and now the preprint solves both of them though these two problems do not imply each other but they are connected).
In addition, it confirms the low-noise limit for balanced functions predicted by the Hellinger conjecture on noisy Boolean channels in information theory.
The paper shows C_{11b}=0.5 in our GitHub repo: https://t.co/xF9TsWmkf9
and answers my recent experimental AI challenge:
https://t.co/SfBpybe57q
(For the record: I received many interesting submissions -- all incorrect, often due to surprisingly simple mistakes. I still don’t know whether there is a proof avoiding the argument in our paper.) But AI did correctly identify the bottleneck that one needs to show C_{11b}=0.5 which is (now was) an open problem.
Finally, many thanks to @AIMathematics -- this project began at our SQuaREs program in San Jose supported by AIM. Without AIM, this would not have happened 🙏
New paper alert: https://t.co/WrwNSmdZSp
Why do pretrained transformers succeed at empirical Bayes? Rather than analyzing architecture or training dynamics, we ask a statistical question: how can a fixed training prior perform well under arbitrary test distributions? [1/2]
Our answer: Universal priors exist, just because of the classical phenomenon of posterior contraction! A pretrained estimator under such priors adapt to different test distributions, and generalize to different lengths.
Comments welcome! [2/2]
A key technical challenge is to show a quantitative mean-field approximation of the best permutation-invariant decision rule by a simple rule. We managed to apply the tools we developed last year https://t.co/n6cYf5vhLo to give tight results in Gaussian and Poisson models! [3/3]
New preprint out: https://t.co/YGGrdf1uTd!
For the good old problem of distribution estimation, we use empirical Bayes and nonparametric MLE, two cornerstones of 20th century statistics, to propose a new, efficient, parameter-free, and competitively optimal estimator. [1/3]
Technically, we resolved a decade-old competitive gap in https://t.co/PqGswPis7F, an award-winning paper in NeurIPS 2015. Yihong Wu brought this question to me years ago, and we are so happy to solve it along with two amazing collaborators Jon Niles-Weed and Yandi Shen! [2/3]

The Fall 2025 CDS PhD Program application is now open! Apply now: https://t.co/68c3Dzlyv2
Information on our Fall 2025 PhD Admissions Information Sessions is coming soon!
#datascience#ai#artificialintelligence#machinelearning
there are surprisingly many open problems when it comes to theory/methods in causal inference
check out this talk by Siva Balakrishnan for an excellent & comprehensive summary of the state of the art
https://t.co/BixhF4jImP
https://t.co/pVIslnRatf
I haven’t enjoyed the mathematics in a paper this much in a long time: https://t.co/n6cYf5vhLo
Summary: an example of performing method-of-moment type analysis for high-dimensional mixtures.
Joint work with my amazing colleague Jonathan Niles-Weed.
Yanjun Han (@yanjun_han) will be joining CDS this fall as an Assistant Professor of Mathematics and Data Science. Read about Yanjun and his work on the mathematics of data science on the CDS blog!
https://t.co/sO94YltryB
Applications for the NYU Data Science PhD program are now open! To apply/find more information, please visit our PhD Admissions page: https://t.co/W6tktEoICE. We're excited to welcome the next cohort of leading researchers in data science! #datascience
The Nobel prize to Giorgio Parisi is such a joy and and multiply well deserved recognition:
In random order:
(1) Stochastic quantization;
(2) The KPZ equation;
(3) Matrix models;
(4) Mean field spin glasses (!);
(5) Random constraint satisfaction problems. 1/2