We just released our paper https://t.co/2IutqxDhrR on how Meta scaled GPU communication to 100k+ GPUs. OSS is a part of the torchcomms repo we announced last week https://t.co/zInf6FHBWi including our implementations of various concepts like DQPLB, AllToAllvDynamic and more
Meta has open sourced their CTran library that natively works with AMD & NVIDIA GPUs 🚀. Previously, if you wanted multiple NVIDIA GPUs to work together on a workload, you had to use the NVIDIA NCCL library. Although NCCL's source code is public, it does not have an open governance model, does not have open CI, employs a "code dump" update model, is not GitHub first, and rarely accepts external contributions. Previously, if you wanted multiple AMD GPUs to work together on a workload, you had to use AMD's RCCL library, which is a delayed fork of NVIDIA's NCCL. With CTran, there is 1 unified library, and it allows for adding new algorithms like Bruck's in a way such that the code can be shared between different AI GPU types.
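For anyone who hasn't seen Bruck's algorithm: here is a minimal single-process sketch of the textbook Bruck allgather, which finishes in ceil(log2(p)) rounds for p ranks. This is NOT CTran or NCCLX code; the function name `bruck_allgather` and the list-of-lists "ranks" are made up for illustration, and a real implementation would move GPU buffers over the network instead of Python lists.

```python
def bruck_allgather(blocks):
    """Simulate the textbook Bruck allgather on p ranks in one process.

    blocks[r] is the single block initially owned by rank r.
    Returns (per-rank results, number of communication rounds).
    Illustrative only: real libraries exchange device buffers, not lists.
    """
    p = len(blocks)
    # Each simulated rank starts with only its own block.
    bufs = [[b] for b in blocks]
    rounds = 0
    d = 1
    while d < p:
        # In the round with distance d, each rank already holds d blocks and
        # needs at most p - d more, so it receives min(d, p - d) blocks.
        count = min(d, p - d)
        # Rank r receives from rank (r + d) mod p. Snapshot all messages
        # first so every rank "sends" its pre-round state simultaneously.
        incoming = [bufs[(r + d) % p][:count] for r in range(p)]
        for r in range(p):
            bufs[r].extend(incoming[r])
        d *= 2
        rounds += 1
    # Rank r now holds blocks r, r+1, ..., r+p-1 (mod p) in that order.
    # A final local rotation puts them in canonical rank order.
    results = [[bufs[r][(j - r) % p] for j in range(p)] for r in range(p)]
    return results, rounds


if __name__ == "__main__":
    gathered, rounds = bruck_allgather(["a", "b", "c", "d", "e"])
    print(rounds, gathered[0])
```

The appeal is the logarithmic round count, which helps most with small messages where latency dominates.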
Furthermore, Meta has open sourced NCCLX (NCCL extended) which is their production-tested collective library that powered all Llama training and uses the unified CTran library. Meta is the creator & main maintainer of PyTorch and is well trusted in the open source community.
NVIDIA continues to be the leader in collective libraries, but Jensen must not take it for granted given the heavily increased competition in the open source collective communication space. Just like how TRTLLM moved to a GitHub-first development model when facing heavy competition from SGLang/vLLM, Jensen should seriously consider moving NCCL to a GitHub-first open development model due to the competition on the collective front too. To draw a parallel to the inference engine world, Collective Communication Libraries are moving from the 2021 "FasterTransformer" era to the 2025 "SGLang/vLLM/TRTLLM" era.
The main competitors in the collective library space include China's DeepEP library, AMD's new MORI, AMD's upcoming MORI-CCL, Meta's CTran & NCCLX, NVIDIA's NCCL (which has released their new NCCL Device API, NCCL's new GPU-Initiated Networking, etc). Competition breeds innovation! 🚀
Super excited to OSS torchcomms https://t.co/HuPRMXYAb2 , a lightweight collective communication library for PyTorch we’ve been working on for AI workloads, along with our custom low-level comms libraries NCCLX and CTran.
Proud of the work done at WA (with my team's help ;)) to be a leader in privacy-by-design at scale. First E2EE backup and now Key Transparency that lets you verify your encryption keys are authentic with little friction: https://t.co/6KrluAQTOT @_klewi @shiggschili
@_klewi from Meta's Applied Privacy Technology team recently open sourced the library behind WhatsApp's key transparency: https://t.co/U6ytrGBR4x, https://t.co/ypPLnKZGMU
We’ve built a better way for clients to authenticate in a de-identified manner ➡️ Meta has open-sourced Anonymous Credential Service (ACS), a highly available multitenant service. Here's how we developed ACS & how you can use it. #opensource
https://t.co/Q3l9cn7vZO
I've been fascinated by managed charging and using cars as a battery to store solar electricity. I put together a fun project that allows you to sync your solar generation with your Tesla so that the charge rate varies based on solar generation: https://t.co/U7EPClmlxT
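A rough sketch of the core idea, not the linked project's actual code: turn the current solar output into a charging current and clamp it to what the charger accepts. The function name `charge_amps`, the 240 V circuit, the 5 A floor, the 32 A ceiling, and the optional house-load term are all assumptions for illustration.

```python
def charge_amps(solar_watts, house_watts=0.0, volts=240.0,
                min_amps=5, max_amps=32):
    """Pick a charging current (whole amps) from the current solar surplus.

    All defaults are illustrative assumptions, not values from the project:
    a 240 V circuit, a 5 A minimum useful rate, and a 32 A charger limit.
    Returns 0 when the surplus is too small to charge at min_amps.
    """
    # Power left over after the (assumed) household load is served.
    surplus_watts = solar_watts - house_watts
    # Round down to whole amps so we never draw more than the surplus.
    amps = int(surplus_watts // volts)
    if amps < min_amps:
        return 0  # not enough sun: pause charging instead of pulling from grid
    return min(amps, max_amps)


if __name__ == "__main__":
    for solar in (500, 3000, 6000, 12000):
        print(solar, "W ->", charge_amps(solar, house_watts=1200), "A")
```

You'd call this on a timer with fresh inverter readings and push the result to the car through whatever API you use for setting the charge rate.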
For the past few months, a few of us at @Meta have been working with Martin Thompson and @ekr____ on a proposal for a new way of privately counting how many conversions an ad campaign drove. Martin authored this post on the @mozilla blog about it: https://t.co/VDd2vS1tyM
Great new work by my colleagues at Meta https://t.co/wc4Y3VVtki. It shows how you can combine k-anonymization and Gaussian DP mechanisms to avoid full Cartesian expansion (very computationally intensive) when making a dataset DP.
@ekr____ It may still be useful in closed-environment settings, for example in organizations that could prove that the passport came from a trusted company device, like a provisioned laptop or phone, rather than from a proxy server. Maybe giving people vaccination TPMs helps :)