We know optimism is provably efficient for online RL. What about offline RL? It turns out that simply flipping the sign of the bonus is minimax optimal! Given a fixed dataset, pessimism is the best we can do.
https://t.co/zO6cxKY7bb
Just leave pessimism to 2020. Happy new year~!
Struggling with fine-tuning BERT models? Overfitting your tasks again?
Check out our recent work at ACL 2020, "SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization": https://t.co/JO2pxb8SiD
Asst Professor @lu_junwei is organizing an online seminar on Mathematical Foundations of Data Science w/ @EthanFangXY, Niao He, @zhaoran_wang, Zhuoran Yang, and @tourzhao. Next speaker: @UMengineering Prof Alfred Hero, May 12 @ 3pm EST via Zoom. More info: https://t.co/TsBgDDPLPN