The year is 2022. Life, universe, and everything remain as wondrous as they’ve ever been, and as wondrous as they shall ever be. There is a enough for everyone to learn, invent, cherish and heal. Engage yourself in this wonder with wisdom and kindness✨
To me founding a company is the same as creating culture. This has inherent pushback. Ppl are used to their ways. It requires a mindset that accepts change. It requires a place that accepts change. Norway isn’t it. Yet.
🚨New paper on AI and copyright
Several authors have sued LLM companies for allegedly using their books without permission for model training.
👩⚖️Courts, however, require empirical evidence of harm (e.g., market dilution). Our new pre-registered study addresses exactly this gap.
Joint work with Profs @dhillon_p (@umsi) & Jane Ginsburg ( @ColumbiaLaw)
(1/n)🧵
This is big: 10 authors have filed a class action lawsuit against Microsoft for allegedly training AI models on pirated books.
They cite evidence Microsoft trained on Books3, a 200,000-book dataset known to derive from pirated sources.
Full complaint: https://t.co/rMmjs0YIMB
🚨 Another AI fair use ruling today, and this one is *much* better for creators. 🚨
tl;dr: The judge said "In many circumstances it will be illegal to copy copyright-protected works to train generative AI models without permission."
Authors sued Meta for training on their books; Meta claimed fair use. Judge Chhabria actually ruled it was fair use. *But* he was clear: he only ruled this because he felt the authors argued the case badly.
He went further. To the question of whether unlicensed training is illegal, he said "in most cases the answer will be yes".
He said generative AI can flood the market, undermining the market for the originals that are copied, disincentivizing creation.
Generative AI has "the ability to severely harm the market for the works being copied, and thus severely undermine the incentive for human beings to create."
He went further. He called out Judge Alsup's ruling yesterday in the Anthropic case, which went against authors on fair use, as being based on an "inapt analogy" (likening AI training to human learning), and accused him of "blowing off the most important factor in the fair use analysis" - the market effect on the work that is copied.
Tech lobbyists will frame the headline as "Meta wins on fair use", to try to convince people things are going tech companies' way. They are not.
Judge Chhabria could not be more clear. "The upshot is that in many circumstances it will be illegal to copy copyright-protected works to train generative AI models without permission. Which means that the companies, to avoid liability for copyright infringement, will generally need to pay copyright holders for the right to use their materials."
This is a much more thoughtful interpretation of copyright law than yesterday's decision, and I suspect time will show it is the correct one.
@ednewtonrex Did anthropomorphising AI lead to the bad? Authors argued training AI is like training a human, thus not transformative use, thus cannot be fair use. Judge argued a human (AI) can be asked to pay once, but not for every re-read, recall, writing stuff from learnings etc., so fair.
Pay attention to who is celebrating today's fair use judgement.
Despite authors' big piracy win, verdict's narrow scope, & likely appeal on fair use, these people celebrate a verdict that favors AI companies over creators defending their rights.
Tells you all you need to know.
The fair use ruling feels pretty unfair. Amongst other things, the end user is using these tools to quickly access raw info written by some human. Surely, it impacts the market for work in question.
Today's ruling in the authors vs. Anthropic copyright lawsuit is a mixed bag. It's not the win for AI companies some headlines suggest - there are good and bad parts.
In short, the judge said Anthropic's use of pirated books was infringing, but said its training on non-pirated work was fair use.
The pirated books part could lead to huge damages for Anthropic (amounts to be determined at trial). This is a massive win for the authors, particularly given how many AI companies + lobbyists were arguing using pirated books was acceptable. Lots of other AI companies train on pirated work, and it looks like all will be guilty of copyright infringement.
The fair use ruling is much more favorable to Anthropic. The judge based it on an assertion that training on authors' works doesn't disincentivize authorship. I think this is demonstrably false, and I suspect it will be contested.
A couple of important things to bear in mind on the fair use ruling:
1. This is not a blanket ruling that says all generative AI training is fair use. Other cases may go the other way, as the facts are different. The Copyright Office has already pointed out that some AI models are more transformative than others - for instance, they singled out AI music models as less transformative. Lobbyists will say this decision confirms that generative AI training is fair use - that's not true.
2. It will likely be appealed, and will probably go to higher courts. I think the judge mischaracterizes the effect of the copying on the market for & value of the original, and I suspect this will be the subject of more debate. This decision is unlikely to be the end of the story.
So there are good aspects - in particular, it looks like many AI companies will be determined to be infringing copyright on a massive scale due to their use of masses of pirated works.
But the fair use decision, while it isn't broad and will be appealed, is a blow for creators. It's a decision that will be celebrated by the people who have stakes in, or who are paid by, AI companies, and that tells you all you need to know.
You would not believe the quotations we have from inside TikTok. They know what they are doing to children including the addiction ("compulsive use") and attention fragmentation. You've got to read the quotations yourself:
https://t.co/Rau9NZ2lvI
AI, in its present form, can be characterized as a technology that erodes intellectual capital, which has been adopted en masse in a way that erodes social capital. This is not how I want my field to evolve. This is not the AI I want my kids to encounter, nor should you.
The in-person generative AI protests have begun.
Authors today protesting the unauthorized use of their books for AI training at Meta’s UK offices.
Expect lots more protests like this.
Brua @brua_io — a medium through which trustworthy information flows reliably from one human to another, in a way that is fair and respectful towards those who form the pillars of knowledge that (often unknowingly) have shaped modern AI.
https://t.co/t42E7axsIh