(1/6) Recommender systems shape our digital experiences, filtering much of what we see online. But how should we measure their influence? We provide a unified causal framework to think through this question and develop metrics to audit recommender systems.
https://t.co/Yq9XvLSO2g
Time to enjoy yet another way GenAI is more of a headache than enterprises realized:
Data deletion takes on a whole new meaning when it’s possible to recover data a company didn’t know it had.
I explain in the blog 👇
https://t.co/lS5sT2RPte
Time to enjoy yet another way GenAI is more of a headache than enterprises realized:
Data deletion takes on a whole new meaning when it’s possible to recover data a company didn’t know it had.
I explain in the blog 👇
https://t.co/lS5sT2RPte
It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; It's just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something.
They don't care if the tokens happen to represent little text chunks. It could just as well be little image patches, audio chunks, action choices, molecules, or whatever. If you can reduce your problem to that of modeling token streams (for any arbitrary vocabulary of some set of discrete tokens), you can "throw an LLM at it".
Actually, as the LLM stack becomes more and more mature, we may see a convergence of a large number of problems into this modeling paradigm. That is, the problem is fixed at that of "next token prediction" with an LLM, it's just the usage/meaning of the tokens that changes per domain.
If that is the case, it's also possible that deep learning frameworks (e.g. PyTorch and friends) are way too general for what most problems want to look like over time. What's up with thousands of ops and layers that you can reconfigure arbitrarily if 80% of problems just want to use an LLM?
I don't think this is true but I think it's half true.
🎉Congrats to @OpenAI for releasing o1:
- Economics: @tylercowen asked o1 basically to write a college essay
- Genetics: @catbrownstein asked o1 to help her reason through "n of 1" cases - medical cases that nobody has ever seen
- Physics: @mariokrenn6240 used o1 to draft and reason through complex quantum physics equations
- Code: @ren_hongyu prompted a full snake game and it was generated zero shot, working perfectly, and obeyed instructions to add obstacles
.@SimonPrinceAI finished 68 (!!!) notebooks that go with his book "Understanding Deep Learning".
These are *excellent* for learning/teaching deep learning.
Notebooks and book pdf at: https://t.co/0gLIK79j3y
#100DaysOfMLCode
This is not good, "Surprisingly, we observe a significant decline in LLMs’ reasoning abilities under format restrictions."
Link: https://t.co/SDhgjHsEMS
It’s like fractals. The 1% is insanely better at math than the 99%. The 0.1% is yet again insanely better. And yet again the 0.01%. The 0.0001% is a whole different species. I have friends in the 0.00001% and they scare the hell out of me. And then there’s Terry Tao.
Building a platform for generative AI applications
https://t.co/G9CFC6HLxt
After studying how companies deploy generative AI applications, I noticed many similarities in their platforms. This post outlines these common components, what they do, and implementation considerations.
This post starts from the simplest architecture and progressively adds more components.
1. Enhance context input into a model by giving the model access to external data sources and tools for information gathering.
2. Put in guardrails to protect your system and your users.
3. Add model router and gateway to support complex pipelines and add more security.
4. Optimize for latency and costs with cache.
5. Add complex logic and write actions to maximize your system’s capabilities.
I try my best to keep the architecture general, but certain applications might deviate. As always, feedback is appreciated!
Breakthrough paper, showing that optimal learners in stochastic optimization memorize a constant fraction of their training data.
Mahdi will be talking on Thursday at ICML, poster immediately after.
In talking to policy makers and AI researchers, I realised there's a fact agreed upon by all researchers, but understood by almost no policy makers.
This uncomfortable fact is why AI policy is hard.
a gem of a paper with real-world impact. audit of an algorithm that the Danish child protective services wants to deploy shows the algorithm is not fit for purpose and should not be deployed. read and amplify