‘Neo-Nazi madness’: Meta’s top AI lawyer explains why he fired the company


The only exception to this is UMG v. Anthropic, because, at least initially, earlier versions of Anthropic’s model generated song lyrics in its output. That’s a problem. The current status of that case is that Anthropic has safeguards in place to try to prevent this from happening, and the parties have essentially agreed that, pending resolution of the case, those safeguards are sufficient, so the plaintiffs are no longer seeking a preliminary injunction.

Ultimately, the hardest question for AI companies is not “Is it legal to train on this material?” It’s “What do you do when your AI generates output that is too similar to a particular work?”

Do you think the majority of these cases will go to trial, or do you see settlements on the horizon?

There may well be settlements. Where I expect to see settlements is with large players who either own large swaths of content or particularly valuable content. The New York Times could reach a settlement and licensing agreement, perhaps one where OpenAI pays money to use New York Times content.

There’s enough money at stake that we’ll probably get at least some judgments that set the parameters. I have the impression that the class action plaintiffs have stars in their eyes. There are a lot of class action lawsuits, and I suspect the defendants will resist them and hope to win on summary judgment. It’s not clear that any of these cases will reach a jury. The Supreme Court, in Google v. Oracle, strongly pushed fair use law toward being resolved on summary judgment rather than before a jury. I think the AI companies are going to do their best to have these cases decided on summary judgment.

Why would it be better for them to win on summary judgment rather than a jury verdict?

It’s quicker and cheaper than going to trial. And AI companies are worried that they won’t be popular with a jury, that many people will think, “Oh, you made a copy of the work; that should be illegal,” without digging into the details of the fair use doctrine.

There have been many agreements between AI companies and media companies, content providers, and other rights holders. Most of the time, these deals seem to be more about search than foundation models, or at least that’s how they were described to me. In your opinion, is licensing content for AI search engines (where answers come from retrieval-augmented generation, or RAG) legally required? Why are they doing it this way?

If you use RAG on targeted, specific content, your fair use argument becomes more difficult. AI-generated search is much more likely to reproduce text directly from a particular source in its output, and that is much less likely to be fair use. I mean, it could be, but the risk is that it’s much more likely to compete with the original source material. If, instead of directing people to a New York Times article, my AI answers their prompt by using RAG to pull text directly from that New York Times article, that looks like a substitution that could harm the New York Times. The legal risk is greater for the AI company.

What do you want people to know about copyright struggles related to generative AI that they may not already know, or may have been misinformed about?

The thing I hear most often that is technically wrong is the idea that these are just plagiarism machines, that all they do is take my stuff and regurgitate it in their responses. I hear a lot of artists say that, and I hear a lot of lay people say that, and it’s just not technically correct. You can decide whether generative AI is good or bad. You can decide it’s legal or illegal. But it really is a fundamentally new thing we’ve never experienced before. Just because it needs to train on a bunch of content to understand how sentences work, how arguments work, and to learn various facts about the world doesn’t mean it’s just copying and pasting things or creating a collage. It really is generating things that no one could have expected or predicted, and it’s giving us a lot of new content. I think that’s important and valuable.