Congratulations to Prof. Weijie Su (@weijie444) from our Statistics and Data Science Department on being named the recipient of this year's Committee of Presidents of Statistical Societies (@COPSSNews) Presidents' Award: https://t.co/xBXjzUndEo
The honor is given annually to a young member of the statistical community in recognition of outstanding contributions to the profession of statistics.
It's jointly sponsored by five statistical societies: @AmstatNews, @ENAR_ibs, @InstMathStat, @SSC_stat, and @WNAR_ibs.
[1/n] New work [JSKZ25] w/ @JikaiJin2002, @syrgkanis, @ShamKakade6.
We introduce new formulations and tools for evaluating language model capabilities. They help explain recent observations about the post-training behavior of Qwen-series models: there is a sensitive causal link from instruction-following capabilities to math reasoning capabilities, which can lead to performance inflation.
See more details 🧵: https://t.co/vP7m6rOhvR
Large Concept Models: Language Modeling in a Sentence Representation Space
This new paper from Meta introduces a very interesting, novel approach to language modeling.
Rather than predicting the next subword token, the model generates the next concept, which is represented by a sentence.
Effectively, the model operates over sentence embeddings. Here, the pretrained SONAR sentence embeddings are used. Text is encoded with SONAR and passed into the model, which generates the next SONAR sentence embeddings; these are then decoded back to text.
Several variants of the model itself are explored, including a standard decoder-only autoregressive Transformer, different diffusion model archs, and autoregressive quantized archs. The diffusion model archs seem to perform best; they were scaled up to 7B and were comparable to Llama-3.1-8B and other ~7B LLMs on summarization and summary expansion tasks.
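To make the encode, predict, decode loop concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration: `toy_encode`, `toy_predict_next`, and `toy_decode` are hypothetical stand-ins, not the SONAR or Large Concept Model APIs, and the "model" is just an average of the context embeddings rather than a Transformer or diffusion model.

```python
import math
import re

# Hypothetical stand-ins only: none of these functions come from SONAR or the paper.
DIM = 16  # toy embedding size, chosen arbitrarily


def split_sentences(text):
    """Naive splitter: treat each sentence as one 'concept'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_encode(sentence):
    """Stand-in for a sentence encoder: L2-normalized bag-of-characters vector."""
    vec = [0.0] * DIM
    for ch in sentence.lower():
        vec[ord(ch) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def toy_predict_next(context_embeddings):
    """Stand-in for the concept model: average the context embeddings."""
    n = len(context_embeddings)
    return [sum(e[i] for e in context_embeddings) / n for i in range(DIM)]


def toy_decode(embedding, candidates):
    """Stand-in for a decoder: pick the candidate sentence closest to the embedding."""
    def score(sentence):
        return sum(a * b for a, b in zip(embedding, toy_encode(sentence)))
    return max(candidates, key=score)


def generate_next_sentence(text, candidates):
    """Encode each sentence, predict the next embedding, decode it back to text."""
    context = [toy_encode(s) for s in split_sentences(text)]
    predicted = toy_predict_next(context)
    return toy_decode(predicted, candidates)
```

The point of the sketch is only the data flow: the model never sees tokens, just one vector per sentence, and text reappears only at the decoding step.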
I had the privilege of being interviewed by the marvelous and unique Steve Strogatz on the podcast "The Joy of Why." We talked about how AI is changing the science of prediction. Links are below:
https://t.co/L0xcbVIOdq
https://t.co/hTuu3TNQby
https://t.co/7ys84IcPfh
📢 We are excited to announce our 7th *virtual* mentoring workshop, which will be held on November 13-14, 2024.
Registration (https://t.co/Lo1lYowqXH) is open and free for all.
See the full schedule at https://t.co/9qa12pqpWF.
#MLTheory [1/4]
We created SynthID, a robust digital watermarking technology to tag & identify AI-generated content. Now we’re open-sourcing SynthID-Text so developers can use it to embed & detect watermarks in text outputs from their own LLMs. Published today in @Nature https://t.co/0gKcjoNHqS