Clément Chadebec @CChadebec - Twitter Profile

Pinned Tweet

almost 4 years ago

After 8 months of long coding nights ☕️ we finally officially release Pythae 🥳, a python library unifying generative autoencoder implementations including vaegan🥗, vqvae or RAEs. 🖥️ github repo: https://t.co/570oxztyn3 👉paper: https://t.co/Nh5BgRWtU7

12

898

141

263

0

CChadebec retweeted

clem 🤗

@ClementDelangue

3 days ago

Feels quite magical to be able to clone a 68 TB dataset to my private HF training bucket while I only have a 4TB local disk, all of that in less than a minute thanks to HF infra optimizations & xet dedup!

14

133

9

23

14K

Clément Chadebec

@CChadebec

6 days ago

@cyrildiagne Thank you Cyril!

0

2

0

26

CChadebec retweeted

Cyril Diagne

@cyrildiagne

6 days ago

very nice dataset drop from the clipdrop team 🙌

1

8

1

0

720

CChadebec retweeted

Julien Chaumond

@julien_c

7 days ago

We are starting to be quite bullish about getting in the data infrastructure business. I just cloned 68 TB (while I only have a 4TB local disk) to my @huggingface training bucket in 1 minute 55 seconds, thanks to Xet deduplication and all our infra optimizations. You can host your data processing pipelines on HF and leverage those insane optimizations 🔥

16

176

33

76

22K

CChadebec retweeted

Pedro Cuenca

@pcuenq

7 days ago

wow 🤯 68 TB, high-quality, high-resolution, Apache 2. Thank you! 🙌🫡

2

51

3

47

8K

Clément Chadebec

@CChadebec

7 days ago

@JulienBlanchon @heyjasper @huggingface This may be an idea for future work indeed!

0

1

0

8

Clément Chadebec

@CChadebec

7 days ago

📢 New @heyjasper release! 📢
MONET 🌸: An Apache 2.0 deduped and recaptioned dataset of 105M samples unlocking reproducible text-to-image research.
Nano T2I 🖌️: A codebase to train your own T2I model.
🤗 @huggingface: https://t.co/x6gEhQIaFV
💻: https://t.co/K6VIU2wjtW
Very excited about this new release, pushing the boundaries of open and reproducible T2I research. Congrats to the team! Benjamin Aubin, Gonzalo Quintana, @onurxtasar @UlaLaParis @_jeev2 @dh7net @clipdropapp @heyjasperai

9

117

33

90

45K

Clément Chadebec

@CChadebec

7 days ago

@AmelieTabatta @heyjasperai Thank you for the support, means a lot! I hope it will be useful in your research

0

2

0

27

CChadebec retweeted

Amélie Chatelain

@AmelieTabatta

7 days ago

Ah! this is insane, I've been complaining for weeks about the lack of open text-to-image data! Congrats on the release @heyjasperai

2

11

2

602

Clément Chadebec

@CChadebec

7 days ago

@JulienBlanchon @heyjasper @huggingface Yes indeed! We hope this dataset can be useful for other tasks as well.

1

0

20

Clément Chadebec

@CChadebec

7 days ago

@JulienBlanchon @heyjasper @huggingface Thank you!

1

2

0

171

CChadebec retweeted

Wauplin @Wauplin

7 days ago

Huge open release from @heyjasperai : MONET 105M curated image-text pairs, Apache 2.0, with embeddings, VAE latents, multi-VLM captions, and a companion training repo (nano-t2i) to train a T2I model end-to-end on one H200 for <$300. Congrats @CChadebec & co 👏

1

4

2

3

586

Clément Chadebec

@CChadebec

7 days ago

@julien_c @huggingface @heyjasperai Thank you for the support @julien_c!

0

2

0

42

Clément Chadebec

@CChadebec

7 days ago

@labaubine_ @heyjasper @huggingface Congrats to you for this huge piece of work!

0

1

0

4

Clément Chadebec

@CChadebec

7 days ago

@UlaLaParis @heyjasper @huggingface Thanks for trying it out!

0

1

0

14

CChadebec retweeted

Julien Chaumond

@julien_c

7 days ago

With 104M image-text pairs, this is one of the largest, if not the largest, openly licensed image datasets. And it's on @huggingface!! Kudos @heyjasperai

4

81

11

42

20K

Clément Chadebec

@CChadebec

7 days ago

@lhoestq @heyjasper @huggingface Thank you for the support!

0

2

1

0

172

Clément Chadebec

@CChadebec

7 days ago

We put in place a rigorous and meticulous filtering, deduplicating, and re-captioning pipeline to create MONET:

⛽ Sourced from 2.9B images from open datasets (LAION, COYO, etc.)
✅ Filtered for high-res, aesthetics & strict safety/NSFW standards
👬 Deduplicated & stripped of stock/watermarked images
💬 Re-captioned using 4 top VLMs for rich, diverse text descriptions
🕹️ Augmented with safe, permissive synthetic data

1

10

2

291

Clément Chadebec

@CChadebec

7 days ago

Using the MONET dataset exclusively, we trained a 4B T2I model from scratch. Built on an MMDiT-inspired architecture and trained via latent flow matching with a deep compression VAE, the model can generate images up to 2048x2048 resolution. 📜 : https://t.co/Kf6zDtNHTD 💻: https://t.co/K6VIU2wjtW

1

9

3

2

407
