We've all become experts at clicking "I agree" without a second thought. In my latest blog post, available on @huggingface, I explore why these traditional consent models are increasingly problematic in the age of generative AI.
The article examines three fundamental challenges:
- Scope problem: how can you possibly know what you're agreeing to when AI could use your data in a million different ways?
- Temporality problem: once an AI system learns from your data, good luck trying to make it "unlearn" it. Your "no" comes way too late.
- Autonomy trap: the data you share today could create systems that pigeonhole you tomorrow.
The current model places an unrealistic burden on individuals while tech giants accumulate unprecedented power. In my blog post, I propose several approaches to rebalance this dynamic -- from collective advocacy and stronger technological safeguards to establishing "data fiduciaries" with a legal duty to protect our digital interests.
This post is a sneak peek at research I'm working on with my colleague @TrevelinBruna for an upcoming chapter in a Cambridge University Press book.
Great read on how the debate on AI’s technical nature and its impact on human life raises key questions we should also be asking about AI’s role in the legal system. Check out @Thom_Wolf's insightful post https://t.co/vMBhQIznq1 and https://t.co/bd1mnmGhFV
...to immediate large-scale risks and to collaborative solutions supported by evidence helps - as long as developers disclose enough about their design choices.
full blog here: https://t.co/4uFJD49d8o
based on our submitted response with @frimelle and @TrevelinBruna 🤗
2/2
Our Open Source Developers Guide to the EU AI Act is now live! Check it out for an introduction to the AI Act and useful tools that may help prepare for compliance, with a focus on open source. Amazing to work with @frimelle and @YJernite on this! https://t.co/75oR1PBpdS
Glad to see the @OpenSourceOrg release their OSAI definition after an extensive collaborative process, and especially happy to see the role of training data enshrined!
Head over to the OSI HF org page if you want to discuss the definition on @huggingface 🤗
1/2🧵
China is advancing rapidly in AI technology while maintaining a strong focus on governance 🇨🇳📑
We've collected key AI governance documents released since 2017 and will continue updating them in the China LLMs space on @huggingface
👉 https://t.co/dz6Nx3fpXS
Any feedback is welcome🤗
Open video datasets are badly missing, and that gap is slowing down the development of open-source video AI. This is why we're excited to introduce 🎥 FineVideo!
43k+ videos/3.4k hours annotated with rich descriptions, narrative details, scene splits, and QA pairs.
https://t.co/OR0W05AAIN
🚨 [AI RESEARCH] "The Environmental Impacts of AI - Primer," by @SashaMTL, @TrevelinBruna & @mmitchell_ai, is a MUST-READ for everyone in AI. Quotes:
"It can be hard to understand the extent of AI’s impacts on the environment given the separation between where you interact with an AI system, and how that interaction has come to be – most AI models run on data centers that are physically located far away from their users, who only interact with their outputs. But the reality is that AI’s impressive capabilities come with a substantial cost in terms of natural resources, including energy, water and minerals, and non-negligible quantities of greenhouse gas emissions."
-
"For a full picture of AI’s environmental impact, we need both consensus on what to consider as part of “AI”, and much more transparency and disclosures from the companies involved in creating it. AI refers to a broad set of techniques, including machine learning, but also rule-based systems. A common point of contention is the scoping of what constitutes AI and what to include when estimating its environmental impacts. Core to this challenge is the fact that AI is often a part of, as opposed to the entirety of, any given system – e.g. smart devices, autonomous vehicles, recommender systems, Web search, etc. How to delineate and quantify the environmental impacts of AI as a field is therefore a topic of much debate, and there is currently no agreed-upon definition of the scope of AI."
-
"Environmental protection is also stated as being one of the core values put forward by the EU AI Act, and appears several times in its text. As provided in the AI Act, the energy consumption of AI models is at the core of this topic, and is stated as one of the criteria that must be taken into consideration when training and deploying them. The AI Act stipulates that the providers of general-purpose AI models (GPAIs) specifically should share the known or estimated energy consumption of their models. It also provides that high-risk AI systems should report on resource performance, such as consumption of energy and of 'other resources' during the AI systems’ life cycle, which could include water and minerals depending on the level of detail of the standards that will guide compliance to this reporting obligation."
👉 Read the full paper below.
🔥 To stay up to date with the latest developments in AI policy, compliance & regulation, including excellent research, join 34,000+ people who subscribe to my weekly newsletter (link below).
We passed 5 million users.
🥳That's 5 million of you who have signed up on the Hub 🚀 thank you for contributing to the ecosystem and making open Machine Learning happen!
We're just getting started 🤗
Check out the new feature @huggingface is experimenting with to help address concerns related to personal information in datasets: https://t.co/ixu46PHOvu @qlhoest @mmitchell_ai
No, AI will not singlehandedly "solve" climate change, nor is it the only technology requiring natural resources like energy and water.
We need transparency about AI's environmental impacts in order to make informed decisions around its deployment.
More EU data transparency 🇪🇺
The "sufficiently detailed summary" for GPAI is one of the most exciting parts of the AI Act, but the AI Office has their work cut out writing a template. A recent proposal led by @ZWarso provides a strong starting point,
https://t.co/ZQQgi7i6kK
1/2
Community-centric and awesome: @huggingface and @Wikimedia 🤗 I wrote an article on how we can advance ML with diverse datasets from @Wikipedia, why and how to create more Wikimedia datasets on Hugging Face, and community consent. Have a look! https://t.co/ydmo3QT0hN
My opinion piece about the Scarlett Johansson situation is live!
In it, I show that this situation is a symptom of the broader issue of objectifying women in AI - and what we can do about it:
https://t.co/ssSsTftIUL
Looking forward to hearing people's thoughts!
For the first time, I'll be presenting my ongoing research project on the values conveyed by open- and closed-source LLMs across multiple languages!
Online only, times in CEST - registration here: https://t.co/z7dNqe308j
Interesting legislation: Generative AI Copyright Disclosure Act, recently introduced by California Democratic congressman Adam Schiff.
H/T @TrevelinBruna who helped me understand some key points. 🧵 1/
https://t.co/l6K6vRizZu