Vaex version 4.0 is out! 🚀 with core @ApacheArrow integration
Blog: A Hybrid Apache Arrow/Numpy DataFrame with Vaex version 4.0 https://t.co/OHHn2TGHcp
#Python #vaex #datascience
Introducing Reacton 🥳 - a pure Python port of React for ipywidgets! With the proven success of ReactJS, we've implemented a similar API in Python to help create larger applications using ipywidgets. Check out my article at:
https://t.co/qw80Hulvi8
#React #Python #Jupyter
If you are in a hurry, read this thread, which summarizes @JovanVaex's latest Vaex article.
If you want to read the full article:
https://t.co/1UndLJ6a2F
Learn 8 powerful Vaex DataFrame features you might not have known about.
Read @JovanVaex's article
https://t.co/WbVPhfEQ6I
Or get the short version in this thread
👇🏻
Did you know that @vaex_io's documentation contains a section with guides?
The guides help you better understand certain Vaex features, especially those that are uncommon in other DataFrame libraries.
https://t.co/6Q8g1OhZ7H
1/4
Vaex
Vaex is a high-performance Python library for lazy Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets.
https://t.co/Vm9sIvG8pm
College completely failed to teach me data analysis.
So I spent over 10,000 hours learning Python.
Then, I picked the 13 best libraries for machine learning and data analysis.
But unlike college, these won't cost you $120,000.
Here they are for free:
Wondering how to make simple and fast @streamlit apps using large datasets? In our latest article we combine @vaex_io and @streamlit to do just that. We cover:
- caching for efficient computations
- optimized delayed evaluations
- progress bars
https://t.co/xc7ZVcXu8c
📣 New blog post: "Out-of-core processing with Vaex", the second part of our series "The great Python dataframe showdown"
https://t.co/6jHxR4CONT
Ever heard of @vaex_io but haven't tried it yet? Read on 👇
RADIS 0.12.1 was released today:
- more accurate for very thin lineshapes
- @vaex_io is the new default HDF5 library for all platforms (faster!)
- use arbitrary @astropy units when rescaling spectra
- and many docs improvements
But mainly: we welcome 5 new contributors!
Jovan and Maarten will showcase @vaex_io, an open-source DataFrame library in Python, tailor-made to allow fast, interactive workflows with datasets that are too large to fit in RAM on a single node.
Join us live here https://t.co/5HGLXyBPtB on Wednesday at 9am PST / 5pm UK!
Is your data in the Cloud? No problem: Vaex can open and stream your data directly from your favorite cloud storage provider. Even better - it will only download the parts of the data that you view or use!
Learn more about reading data from the cloud at: https://t.co/YuDexBTwJD
It would be interesting to add support for the SIMD instruction sets of various POWER architectures in #xsimd, which underlies SIMD acceleration in Scipy (through Pythran), Apache Arrow, Xtensor, and more.
Would folks at @ibm be interested in funding such work?
Vaex 4.8.0 is out! It includes various performance improvements, improved asyncio support, fancy progress bars and more!
See the full changelog at https://t.co/VqxbVLg4wm
Turn caching on to speed up computations in Vaex! Especially useful for dashboard backends and interactive EDA.
In this example we see that the groupby operation of 1 billion rows is sped up significantly by reusing a previous result from the unique calculation.
Did you know that Vaex can do multiple computations with a single pass over the data?
Set “delay=True” to schedule operations to be executed in parallel. The Rich progress bars nicely indicate this with [1] and [2], showing how we aggregate the 1.1 billion rows on a single node!
Watch this new #GOTOcph talk where Jovan and @maartenbreddels show how you can access and stream your data directly from the #Cloud — perfect for building cloud services.
All of that using just a single machine!
https://t.co/irEOWAAGxb
Wordify 2.0 is out! 🚀
It is now faster, more interactive, and supports more languages. These improvements were made possible by great libraries in the ecosystem, respectively: @vaex_io, @streamlit, and @spacy_io.
Thanks to @federicobianchy, @dirk_hovy, and the @MilaNLProc lab!
Together with the #Rich library, Vaex can show detailed information about how it executes a computation. Next to the time spent on each step, we also see how many passes over the data it takes (in square brackets). Great for understanding and optimizing your processing pipelines!