Join us this afternoon to learn how you can build an open-source AI stack with Modin, @streamlit, and #Arctic to power your LLM app! 🧑‍💻💡
🗓️ Thursday, June 6 ⏰ 2:30 PM - 2:50 PM PDT
📍 Builders Hub Theater (Basecamp, South)
#SnowflakeSummit #OpenSource #AI #LLM
📣 This year's @SnowflakeDB #DataCloudSummit features #DevDay on June 6, open to all developers. Get ready for an exciting lineup of speakers, from Andrew Ng (Founder @DeepLearningAI) to Lukas Biewald (CEO/Co-founder @weights_biases), and more!
I'll be giving a Dev Day luminary talk on how you can build an entirely open source #AI stack, including @streamlit, @modin_project, and Arctic!
Learn how #oneAPI tools, including the Intel Distribution of @modin_project and @XGBoostProject optimizations from Intel, are used to process data on freshwater supplies and determine whether the water is safe to drink: https://t.co/Mkfo9CToat
Lots of great stuff in the most recent Modin release (0.29.0), including two functions that make it easier to work w/ @dask_dev dataframes: to_dask and from_dask.
Check out the to_dask / from_dask code: https://t.co/NzSP85gJo2
And the release notes: https://t.co/lpcrC0GqKx
💥Performance Improvement for .merge in Modin!💥
The result? Total execution time on the H2O benchmark (500 MB) dropped by about 70% (28.64s -> 8.78s)
What changed?
Before: Gather the right df into one partition
After: Repartition so we can broadcast only row partitions of the right df
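To make the idea concrete, here is a rough single-process sketch in plain pandas. It is not Modin's implementation: the helper name chunked_merge and the sample frames are made up for illustration. It only shows the general pattern of merging each row chunk of the left frame against the right frame independently, which is the kind of work a parallel engine can spread across partitions.

```python
import numpy as np
import pandas as pd


def chunked_merge(left, right, on, n_chunks=4):
    """Hypothetical helper: merge `left` with `right` one row chunk at a time."""
    # Split the left frame into contiguous row chunks (stand-ins for row partitions).
    bounds = np.linspace(0, len(left), n_chunks + 1, dtype=int)
    chunks = [left.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]
    # Each chunk is merged against the right frame on its own; a parallel
    # engine could run these merges concurrently.
    pieces = [chunk.merge(right, on=on, how="inner") for chunk in chunks]
    return pd.concat(pieces, ignore_index=True)


# Made-up sample data for illustration only.
left = pd.DataFrame({"key": [3, 1, 2, 1, 5, 2, 3, 4, 1, 2], "x": range(10)})
right = pd.DataFrame({"key": [1, 2, 3, 4], "y": [10, 20, 30, 40]})

chunked = chunked_merge(left, right, on="key")
direct = left.merge(right, on="key", how="inner")
```

Both frames hold the same rows; the chunked version just gets there through several smaller merges.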
Dmitry Chigarev at @IntelDevTools wrote a post about Modin: https://t.co/1cEFVCXRNK
He takes a 4-minute pandas workload & says:
"Modin could be a solution here, offering a drop-in replacement for Pandas and efficient parallel implementations of its API."
With Modin? 55 seconds
If you're backed into a dark alley, facing a pack of ravenous NaN dogs, you'll want a sidekick like pandas dropna.
Keep only rows with at least n non-missing values using thresh=n.
Drop entirely empty rows with how='all'.
Learn more from @__mharrison__: https://t.co/cUxVKDc6ec
#pandas #Python
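A minimal sketch of the two dropna options mentioned above, using a small made-up frame. Note that thresh counts non-missing values: thresh=2 keeps rows that have at least two of them.

```python
import numpy as np
import pandas as pd

# Made-up sample data: row 2 is entirely empty, row 3 has one value.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [1.0, 2.0, np.nan, np.nan],
    "c": [1.0, 2.0, np.nan, np.nan],
})

# thresh=2 keeps only rows with at least 2 non-missing values.
at_least_two = df.dropna(thresh=2)

# how="all" drops only rows where every value is missing.
not_all_empty = df.dropna(how="all")
```

Here at_least_two keeps rows 0 and 1, while not_all_empty also keeps row 3, since it still has one value.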
If you want to automatically parallelize your pandas code, you should check out open-source Modin: https://t.co/uN0uRLwUWv
@modin_project uses the pandas API, so all you need to do is change "import pandas as pd" to "import modin.pandas as pd"
#pandas #Python #opensource
The Python Dataframe Interchange Protocol is a quiet hero 🦸‍♀️.
It makes it easy for libraries to accept many dataframes (Modin, Polars, Ibis, cuDF, Dask).
Read our encomium: https://t.co/LWOnZBI5IL
Thanks to Marco Gorelli & @ralfgommers for your work!
#python #dataframes
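As a rough illustration of what "accept many dataframes" looks like in practice: a library function can take any object that implements __dataframe__ and convert it with pandas' from_dataframe. ThirdPartyFrame and column_means below are hypothetical names invented for this sketch; ThirdPartyFrame stands in for a non-pandas dataframe and simply delegates to pandas internally.

```python
import pandas as pd
from pandas.api.interchange import from_dataframe


class ThirdPartyFrame:
    """Hypothetical stand-in for a dataframe from another library."""

    def __init__(self, data):
        self._data = pd.DataFrame(data)

    def __dataframe__(self, nan_as_null=False, allow_copy=True):
        # Delegate to pandas' own interchange object to keep the sketch short.
        return self._data.__dataframe__(allow_copy=allow_copy)


def column_means(df):
    """Hypothetical library function: accepts any interchange-capable frame."""
    pdf = from_dataframe(df)  # converts via the interchange protocol
    return pdf.mean(numeric_only=True).to_dict()


result = column_means(ThirdPartyFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]}))
```

The function never checks which library the frame came from; it only relies on the protocol.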
Want to learn 🧠 about @PyTorch DataLoaders 🏗️ ?
In this article, we:
- Present some background on DataLoaders
- Discuss how @modin_project can now do batch data loading through the new ModinDataLoader
- Share benchmarks
Check it out: https://t.co/8RdQlUsQ0Y
#pytorch #python
Check out this new @_odsc blog post on scaling #pandas! https://t.co/MCkToL4tlt
Written by @ponderdata CEO @dorisjlee, it discusses the @modin_project + Ponder revolution, where you can speed up / scale your pandas workflows with almost no changes to your scripts / notebooks.
Modin's been downloaded 10 million times.
Along the way, it's been incorporated into major projects by Intel, AWS, and Ponder.
Learn more about Modin's growth story in this @intel post: https://t.co/UB7fqHLK7g
#pandas #python #datascience @IntelDevTools
MODIN HIT 10 MILLION DOWNLOADS! 🔥
https://t.co/heLck276Tq
That's lots of "import modin.pandas as pd," lots of large dataframes processed, lots of time saved -- time NOT spent laboriously parallelizing code.
Congratulations, and thanks, to all!
#python #pandas #datascience