AI models do not need data from publishers

by SuperiorInvest

Sam Altman, CEO of OpenAI, attends the 54th annual meeting of the World Economic Forum in Davos, Switzerland, on January 18, 2024.

Denis Balibouse | Reuters

DAVOS, Switzerland – Sam Altman said he was “surprised” by the New York Times’ lawsuit against his company, OpenAI, saying its artificial intelligence models did not need to be trained on the publisher’s data.

Describing the legal action as “somewhat strange,” Altman said OpenAI had been in “productive negotiations” with the Times before news of the lawsuit broke. According to Altman, OpenAI wanted to pay the outlet “a lot of money to show its content” on ChatGPT, the company’s popular AI chatbot.

“We were as surprised as anyone to read that we were being sued in The New York Times. That was kind of strange,” the OpenAI leader said on stage at the World Economic Forum in Davos on Thursday.

He added that he is not that concerned about the Times lawsuit and that a resolution with the publisher is not a priority for OpenAI.

“We are open to training [AI] on The New York Times, but it’s not our priority,” Altman said in front of a packed crowd in Davos.

“We don’t actually need to train on their data,” he added. “I think this is something people don’t understand. Any particular training source doesn’t move us much.”

The Times sued both Microsoft and OpenAI late last year, accusing the companies of copyright infringement for using the publisher’s articles as training data for their AI models.

The media outlet seeks to hold Microsoft and OpenAI liable for “billions of dollars in legal and actual damages” related to the “illegal copying and use of the Times’ exceptionally valuable works.”

In the lawsuit, the Times showed examples in which ChatGPT returned nearly identical versions of the publisher’s stories. OpenAI has disputed the Times’ allegations.

Ian Crosby, a partner at Susman Godfrey who represents The New York Times as lead counsel, said in a statement that Altman’s comments on the lawsuit show that OpenAI admits to having used copyrighted content to train its models and to effectively take advantage of the newspaper’s investments in journalism.

“OpenAI acknowledges that they have trained their models on the Times’ copyrighted works in the past and admits that they will continue to copy those works when they access the Internet to train models in the future,” Crosby said in a statement emailed to CNBC on Thursday.

He called that practice “the opposite of fair use.”

The legal action has raised concerns that more media publishers could go after OpenAI with similar claims. Other outlets are looking to partner with the company and license their content, rather than fight in court. Axel Springer, for example, has an agreement with the company to license its content.

OpenAI responded to the Times’ lawsuit earlier this year, saying in a statement that “regurgitation,” or spitting out entire “memorized” portions of specific content or articles, “is a rare error that we are working to drive to zero.”

In that same statement, the AI developer said it is working to collaborate with news organizations and create new revenue and monetization opportunities for the industry. “Training is fair use, but we offer the option to opt out because it is the right thing to do,” the company said.

Altman’s remarks echo comments the AI leader made at a Bloomberg-hosted event in Davos earlier this week. There, Altman said he wasn’t all that concerned about the Times lawsuit, disputed the publisher’s allegations and said there would be many ways to monetize news content in the future.

“There are all the negatives of these people saying…don’t do this, but the positive is that I think there will be great new ways to consume and monetize news and other published content,” Altman said.

“And for every New York Times situation, we have a lot more super productive stuff about people who are excited about building the future and not doing theater.”

Altman added that there were ways OpenAI could modify its GPT models so that they don’t regurgitate stories or features published online verbatim.

“We don’t want to regurgitate someone else’s content,” he said. “But the problem is not as easy as it seems in a vacuum. I think we can get that number lower and lower, pretty low. And that seems like a super reasonable thing to evaluate ourselves on.”

