Where I hang to be part of the data science community, especially R. Retired CEO, OppenheimerFunds. Former Chair, Nat. Museum of Mathematics. Also @apsteinmetz
@kyle_e_walker @duckdb Interesting. With duckplyr, group_by(col) |> summarize(...) forces a fallback to dplyr. I have to use the equivalent, summarize(.by = col, ...).
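A minimal sketch of the two spellings mentioned in that tweet, using plain dplyr so it runs without duckplyr installed. The data frame `sales` and its columns `region` and `amount` are hypothetical names made up for illustration; whether the grouped form falls back under duckplyr depends on the installed version, which this sketch does not check.

```r
library(dplyr)

# Hypothetical toy data, for illustration only.
sales <- tibble(
  region = c("east", "west", "east", "west", "east"),
  amount = c(10, 20, 30, 40, 50)
)

# Persistent grouping: the form the tweet reports as triggering a fallback.
by_group <- sales |>
  group_by(region) |>
  summarize(total = sum(amount))

# Per-operation grouping with .by (requires dplyr >= 1.1.0).
by_arg <- sales |>
  summarize(total = sum(amount), .by = region)

# Compare after sorting: group_by() sorts the group keys,
# while .by keeps groups in order of first appearance.
all.equal(arrange(by_group, region), arrange(by_arg, region))
```

Both forms compute the same per-group totals; only the row order and the grouping metadata of the result can differ.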
@kyle_e_walker @ipums Arrow for R is cool because it uses tidy vernacular out of the box. I find that duckDB is even faster, reads the Arrow file format and, with the duckplyr package, is nearly seamless with tidy verbs.
@kyle_e_walker @OvertureMaps Wow. Thanks for sharing this. I've been struggling, on and off, for years to get a good data set of NYC building polygons. The NYT has no trouble but I have. I didn't know about Overture. Separately, the duckplyr package makes duckDB as "tidy" as Arrow...and faster.
@JosiahParry No.
Start every R big DB session with
library(tidyverse)
library(duckplyr)
methods_overwrite()
Watch everything run superfast without using any SQL. That's nearly all you need to know about duckDB.
@rbhar90 You get the answer to the question you ask. "Random" isn't in the prompt. This is the answer to the question "What number between 1 and 100 appears most in your training set?"
Some context: We don't invite speakers, they respond to the call for presenters. We usually get 50+ submissions. We have 25 right now. There are usually some women in that group. Right now there are 0. That's why I specifically mentioned the need for women speakers.
@vsbuffalo I've been toying with migrating from #rstats to Python and quickly went down the rabbit hole of so. many. environments. WTH? It's a S*show. In R there is one ring to rule them all.
So many great #rstats database engines: data.table, Arrow, duckDB and polars. There are #tidyverse wrappers for all of them, but should you use them? https://t.co/eDgwizQNzt
@MrLarrieu @duckdb Thanks for sharing. I am going to try this in R with my SunPower system. I also have Tesla batteries so I'm not sure how to interpret the data.
@RitchieVink Hopefully this is not a loaded question, but how do you generally feel about the tidy wrappers (R/Python) around Polars? Too much overhead, lacking functionality? Or, full-featured and efficient?
@EmilyRiederer Old dog. No new tricks. I have wrestled with a few Python scripts but managing the environment is alien. VS Code is for coders, not data scientists. Let's see what Posit comes up with as they become more language agnostic.
@macro_synergy "Kelly Betting" in investment management is a feeling, not a measurement. Can I quantify my "edge?" Can I quantify my payoff in "winning?"
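For reference, the textbook Kelly fraction for a simple binary bet shows why both quantities in that tweet have to be measured. This is the standard single-bet form, not anything specific to the thread; the symbols $p$ (probability of winning), $q = 1 - p$, and $b$ (net payoff per unit staked on a win) are notation introduced here.

$$
f^{*} = p - \frac{q}{b} = \frac{bp - q}{b}
$$

The numerator $bp - q$ is the "edge" and $b$ is the payoff in "winning". As a worked case, an even-money bet ($b = 1$) with $p = 0.55$ gives $f^{*} = 0.55 - 0.45 = 0.10$, so a small error in the estimate of $p$ changes the recommended stake substantially.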