Huge upvote to this substack. I’ve known Lara for years, and she’s really committed to understanding things deeply and communicating them well. She’s been posting these weekly newsletters for a while, go check it out!
Our latest work, AlphaProof, building on AlphaZero, LLMs and the @leanprover theorem prover, combined with AlphaGeometry 2, managed to solve 4 IMO problems and achieve silver-medalist level! 🚀
More at https://t.co/YMAp8uUYSY
Introducing Sora, our text-to-video model.
Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions.
https://t.co/YYpOAcrXQ3
Prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”
Introducing FunSearch in @Nature: a method using large language models to search for new solutions in mathematics & computer science. 🔍
It pairs the creativity of an LLM with an automated evaluator to guard against hallucinations and incorrect ideas. 🧵 https://t.co/MC5ttgvZeM
It has been a great experience co-authoring my first paper with this group of incredibly inspiring and talented researchers. And I’m really excited about our outcome!
Our paper, “VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution” has been accepted into #NEURIPS2023. VisoGender is an evaluative tool designed to flag a model’s bias prior to deployment.
https://t.co/0lYbSEMg2f
https://t.co/BVJKMKokzs 1/n
New benchmark for @NeurIPSConf! Bias testing of multimodal models was lacking. So, in VisoGender, we combine Winograd-style schemas from NLP w/ visual-linguistic stress-testing.
✨This is my first last-author paper accepted at NeurIPS😁 v. proud of all the students involved!