collapse

@collapse_R

A C/C++ based package for advanced data transformation and statisical computing in R. Account managed by the author. #rcollapse

CRAN and Github

Joined June 2020

29 Following

954 Followers

196 Posts

Pinned Tweet

collapse @collapse_R

3 days ago

The article on "collapse: Advanced and Fast Statistical Computing and Data Transformation in R" has now been published (open-access) in the Journal of Statistical Software: https://t.co/0jt6PMEO5t It offers a concise yet thorough introduction to the package. #rstats #DataScience

289

collapse @collapse_R

3 months ago

Recording of my talk on {collapse} and the {fastverse} at the Bank of Portugal‘s workshop on „Speeding up Empirical Research: Tools and Techniques for Fast Computing“ in December is now online: https://t.co/nhDYym94Pz #Rstats #DataScience

319

collapse @collapse_R

3 months ago

Updated windows benchmarks for in-memory database-like operations by Adrian Antico show that {collapse} still leads on lagging and casting benchmarks (not covered in DuckDB benchmarks) and remains overall very competitive: https://t.co/JuSLclANwS #Rstats #DataScience

342

collapse @collapse_R

4 months ago

I've released a new package {flownet} for efficient transport modeling and graph manipulation: https://t.co/0LOeuXDsNT. It builds on 7 {fastverse} libraries, most notably {collapse}, from which it imports 60 functions. Thus, another great learning resource for developers #rstats

166

Who to follow

Tim Tiefenbach

@TimTeaFan

CX Data Scientist 📊 by day ☀️ Indie hacker by night 🌙 Web & App Dev. All things AI 🤖 Also 🇯🇵 👨‍👩‍👧‍👦☕🍵🚀

Bruno Rodrigues

@brodriguesco

Sworn in Data Janitor blog: https://t.co/id384GfiNt youtube: https://t.co/zKHU0Ygj28 mastodon: https://t.co/Dooa731Hln bsky: https://t.co/LKIAD3Ssd4 'nix solves this'

Gábor Csárdi @[email protected]

@GaborCsardi

Software Engineer at @posit_pbc Opinions are my own Also at @[email protected]

collapse @collapse_R

4 months ago

I'm excited to share the release and rOpenSci publication of dfms 1.0 (https://t.co/3Evhp1o6kz), a high-performance, {collapse} and {RcppArmadillo}-based, and feature-rich, implementation of Dynamics Factor Models in R. More in the release blog post at: https://t.co/bsBXqhSGvd

152

collapse @collapse_R

5 months ago

A few updates: (1) new fastverse domain at https://t.co/Wxvg9HcxEf (2) the collapse (kit) repos moved to https://t.co/eN6eqM7oUT (/kit) and site to https://t.co/Udx09wcVp7 (/kit) (3) A group of maintainers has been given access (4) collapse has a DeepWiki https://t.co/wdZim3mafW

362

collapse @collapse_R

about 1 year ago

{collapse} 2.1.0 is out! It introduces a new fslice() function (https://t.co/p25YbZujp5), a new theory-consistent weighted quantile algorithm (https://t.co/NAJD2y7zpk) with excellent properties. And some convenience features such as join requirements: #rstats #DataScience

$collapse_R's tweet photo. {collapse} 2.1.0 is out! It introduces a new fslice() function (https://t.co/p25YbZujp5), a new theory-consistent weighted quantile algorithm (https://t.co/NAJD2y7zpk) with excellent properties. And some convenience features such as join requirements: #rstats #DataScience https://t.co/9GHtI2ll0o$

collapse @collapse_R

over 1 year ago

@JosiahParry It’s a never ending hackathlon, unfortunately. But still good to have rigorous C/C++ checks done for free on many platforms. Creates more robust software.

128

collapse @collapse_R

over 1 year ago

The {collapse} arXiv paper has just been updated - following extensive revision: https://t.co/Ft1vDdCGfQ. I believe it is a great resource for anyone doing scientific computing with #rstats.

collapse @collapse_R

over 1 year ago

There is now a #fastverse benchmark wiki (https://t.co/mwx7SR8rvT) where users can freely contribute benchmarks. If you have benchmarks involving {fastverse} packages ({collapse}, {data.table}, etc., including extensions) please contribute them (takes 1 min) #rstats #DataScience

525

collapse @collapse_R

over 1 year ago

I just improved the vignette a bit further, adding some detailed benchmarks and a section on Global Options. I needed to correct myself: it is not true that {collapse} global options should never be invoked in packages - they just need to be reversed like #rstats global options.

220

collapse @collapse_R

over 1 year ago

It's nice to see an increasing number of #rstats packages using {collapse}. A developer focused vignette was long planned and now it is here - with modest advice on writing efficient R package code in general and using {collapse} in particular: https://t.co/5j7EiFk4dk

collapse @collapse_R

over 1 year ago

{collapse} is now also on BlueSky (https://t.co/ODTcOh0GjA) and I am also there (https://t.co/yQnDqHAANa) [and on Mastodon: https://t.co/kKomhUJbOp]. I will repost {collapse} posts but also share about research/data. This X account will remain active. #rstats

498

collapse @collapse_R

over 1 year ago

@MislavSagovac @r_data_table Of course not. But {collapse} can do many things, particularly complex statistical things, that cannot be done with {data.table} or are more difficult to do or do efficiently with {data.table} - as shown in this post and further in the docs/resources: https://t.co/pAm5Hb7NyC

243

collapse_R retweeted

Data Table @r_data_table

over 1 year ago

Check out the latest package to be granted the Seal of Approval: {collapse} by Sebastian Krantz! {collapse} is a partner package, that implements various data transformation and statistical analysis tasks using ultra fast C/C++ implementations. https://t.co/fJ0QdElEVX

collapse @collapse_R

almost 2 years ago

{collapse} v2.0.15, with fast aggregation pivots, has just reached CRAN. A minor but neat feature worth pointing out in this release is enhanced join verbosity. In addition to the join success rates, the join relationship is now determined and reported - at no extra cost #rstats

$collapse_R's tweet photo. {collapse} v2.0.15, with fast aggregation pivots, has just reached CRAN. A minor but neat feature worth pointing out in this release is enhanced join verbosity. In addition to the join success rates, the join relationship is now determined and reported - at no extra cost #rstats https://t.co/kCZYFESNun$

collapse @collapse_R

almost 2 years ago

@statquant @JosiahParry Agreed, it’s not more apples to apples, but equally valid. DuckDB benchmarks max out all frameworks on a large linux cloud server. This one compares performance on a local windows system through IDE‘s and is thus closer to many users. For the 100M data they are quite similar…

105

collapse @collapse_R

almost 2 years ago

New independent benchmark by Adrian Antico: https://t.co/prVkzllrtD Setup: - large local Windows machine - real data - broad range of tasks - scripts executed inside Rstudio and VScode -> shows that {collapse} is an absolute top performer in this setting #rstats #DataScience

collapse @collapse_R

almost 2 years ago

@Yann_the3rd @JosiahParry Thanks, but I doubt that {collapse} is buggy. Many people are using it and most issues I get are feature requests. It simply does not support tidyselect syntax. across() inside fmutate() works fine. Please carefully read the docs and report any issues that you find on GitHub.

collapse @collapse_R

almost 2 years ago

@JosiahParry No, not at the moment, but it uses pointers to access their contents.

242

collapse

@collapse_R

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users