In this report, we describe the 2025 Apple Foundation Models ("AFM"). We also introduce the new Foundation Models framework, which gives app developers direct access to the on-device AFM model
https://t.co/tUGXvgJAzd
through architectural innovations such as KV-cache sharing and 2-bit quantization-aware training;
and (ii) a scalable server model built on a novel Parallel-Track Mixture-of-Experts (PT-MoE) transformer that combines track parallelism, mixture-of-experts sparse computation,
Not that many systems handle 6B QPS. 😀
"Bigtable has been in continuous production use at Google for more than 15 years now, processing over 6 billion requests per second at peak and with over 10 exabytes of data under management."
https://t.co/g7bx2PaDEL
I view my talk as a chance to exchange ideas w/ students. Some points I raised:
- Be interdisciplinary, à la A. van Leeuwenhoek
- Read papers older than you
- On fast-evolving topics, my students outpace my knowledge
- My last serious coding session? During Obama's 1st term 😑
https://t.co/n1uCdhQVZl GSPMD aka "Sharding is all you need". Foundational work for Giant Models by Yuanzhong Xu et al! *generalized from
GShard backend system
This week we will be presenting three papers at #ICLR2021 each exploring a different aspect of multi-task/multilingual models at scale: (1) modeling (2) optimization and (3) large scale systems.
Tragic: a mother lost **both** her sister and her 13-year-old son Peyton to #COVID19. The bereaved mom took the bold step of publicly sharing the chilling images of her son’s blood-spattered hospital room in a bid to urge Americans to take COVID seriously.
https://t.co/rZbTMqS7Cs
A statement signed by 150 people incl. Bill T. Jones, Wynton Marsalis, Jennifer Finney Boylan, Noam Chomsky, J.K. Rowling, Margaret Atwood, and Salman Rushdie expresses concern over the illiberal trend intensified by our national reckoning.
https://t.co/4zPjuPNXBu
@timnitGebru @kat_heller I absolutely don't condone anyone being personally attacked on social media. I just haven't seen this in my feed here.
If anyone reading this is part of attacking you or anyone else, I say this to you:
Please stop. Personal attacks have no place in scientific discourse.
Great work by @GoogleAI researchers @lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen.
13.5 BLEU point gain is really significant!