The third iteration of NYC Privacy Day is going to happen this October. Register and learn about the cutting-edge work done in security and privacy in the last few months :) https://t.co/vUNSAI402H
@aryehazan The history of the original k-means is interesting, but won't fit in the margins of this tweet. However, k-means++, now the standard way to initialize k-means, did come from theory.
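For anyone who hasn't seen it, here is a minimal sketch of the k-means++ seeding step (D^2 sampling), in plain Python. The function names `sq_dist` and `kmeans_pp_init` are made up for this illustration, points are assumed to be tuples of floats, and this covers only the initialization, not the Lloyd iterations that follow.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_init(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling (k-means++ seeding)."""
    rng = rng or random.Random()
    # First center: chosen uniformly at random.
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point already coincides with a center; fall back to uniform.
            centers.append(rng.choice(points))
            continue
        # Next center: sampled with probability proportional to d2.
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        # Update each point's distance to its nearest center.
        d2 = [min(old, sq_dist(p, points[idx])) for old, p in zip(d2, points)]
    return centers
```

Points far from every current center are more likely to be picked next, which is what spreads the initial centers out.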
Another new paper on the arxiv to talk about: https://t.co/TigTtrooAF . This paper is my first foray into differential privacy, which was fun and forced me to learn a lot.
New paper just hit the arxiv, and it was one of the most fun and interesting research projects that I've ever worked on: https://t.co/nnCfhcsizL . Long story short, we found some super interesting and surprising behavior in one of the most well-studied online problems: ski rental!
Training ML models with differential privacy can be challenging.
To aid practitioners, we wrote a detailed survey with known best practices of DP-training of ML models: https://t.co/RtW5VyAdRh
Super excited about a new preprint, "Faster Matchings via Learned Duals", with Sungjin Im, Thomas Lavastida, Ben Moseley, and @vsergei . Long story short: we can use ML to massively speed up min-cost perfect matching computations! https://t.co/KYRlIWPHle
@Aaroth A nice simple exercise. Suppose you are estimating the mean of a Gaussian distribution from iid samples. Compare the DP error to the finite sample error. TL;DR: with DP you need O(\sqrt{\log n}) more samples to get parity.
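A rough way to play with this exercise, as a sketch under assumptions that are mine and not part of the tweet: known sigma, true mean near 0, pure eps-DP via the Laplace mechanism on a clipped average, and a clipping radius of sigma * sqrt(2 log n). The names `dp_gaussian_mean` and `error_scales` are hypothetical. In this particular setup the clipping radius is where a sqrt(log n) factor shows up in the privacy noise.

```python
import math
import random

def dp_gaussian_mean(samples, eps, clip, rng=None):
    """eps-DP mean estimate: clip to [-clip, clip], average, add Laplace noise."""
    rng = rng or random.Random()
    n = len(samples)
    clipped = [max(-clip, min(clip, x)) for x in samples]
    # Changing one sample moves the clipped average by at most 2 * clip / n.
    scale = 2.0 * clip / (n * eps)
    # Laplace(0, scale) noise as the difference of two exponentials.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / n + noise

def error_scales(n, eps, sigma=1.0):
    """Return (sampling error scale, DP noise scale) for n samples."""
    # Assumed clipping radius: sigma * sqrt(2 log n), with the mean near 0.
    clip = sigma * math.sqrt(2.0 * math.log(n))
    sampling = sigma / math.sqrt(n)    # std. dev. of the plain sample mean
    privacy = 2.0 * clip / (n * eps)   # scale of the added Laplace noise
    return sampling, privacy
```

Calling `error_scales(n, eps)` for a few values of n and eps shows how the two terms compare as the sample size grows.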
Coming out of twitter hibernation to say that part 1 of the clustering book with @geomblog is available at https://t.co/jLwrKXiobf !
As we say in the intro: Clustering is more than just a collection of tools...
it is a systematic way to think about how data should be organized.