NYT Forces OpenAI To Retain Chat Data In Court

A federal court has delivered a significant blow to OpenAI, upholding a sweeping order that compels the AI giant to indefinitely retain all ChatGPT user logs, including those previously deleted, as part of its ongoing copyright infringement lawsuit with The New York Times and other news organizations. The decision has ignited a fierce debate over user privacy, setting a potentially alarming precedent for the rapidly evolving field of artificial intelligence.

Last week, OpenAI’s legal team contested the order in court, arguing that it fundamentally undermines the company’s “long-standing privacy norms” and violates the privacy protections promised to users in its terms of service. However, US District Judge Sidney Stein was unconvinced, promptly denying the company’s objections. Judge Stein pointed to OpenAI’s own user agreement, suggesting it already specifies that user data could be retained for legal processes, which he affirmed is precisely the current situation.

The controversial order was initially issued by Magistrate Judge Ona Wang at the urgent request of the news plaintiffs, led by The New York Times, who argued that preserving the chat logs was critical to securing potential evidence. The news organizations allege that ChatGPT users may have used the chatbot to bypass their paywalls and access copyrighted news content, then deleted the incriminating chats.

An OpenAI spokesperson confirmed to Ars Technica that the company intends to “keep fighting” the ruling, though its legal avenues appear to be narrowing. The firm could petition the Second Circuit Court of Appeals for an emergency stay, but such interventions are rare and would require demonstrating an extraordinary abuse of discretion by the lower court. OpenAI has not confirmed if it will pursue this high-stakes legal maneuver.

In the interim, OpenAI finds itself in a precarious position, forced to negotiate with the news plaintiffs on a process for searching the vast trove of retained data. The company faces a difficult trade-off: it can cooperate to expedite the search process and hope for a quicker deletion of the sensitive data, or it can prolong the legal battle over the order, which risks exposing even more user conversations to potential scrutiny or a data breach.

While the prospect of The New York Times combing through every user’s chat history is unlikely, the agreed-upon process will involve searching a sample of the data based on specific keywords. This search will reportedly occur on OpenAI’s servers with anonymized data, which is not expected to be handed over directly to the plaintiffs.

For the news organizations, access to these logs is not necessarily a linchpin for their core copyright case but could provide crucial evidence of market dilution. They aim to show that ChatGPT’s ability to generate content similar to their own articles harms their business, a factor that could weigh heavily against OpenAI’s “fair use” defense.

The ruling has drawn sharp criticism from privacy advocates. Jay Edelson, a prominent consumer privacy lawyer, expressed deep concern, telling Ars Technica that the potential evidence within the logs may not significantly advance the plaintiffs’ case while drastically altering a product used daily by millions. Edelson warned of the security risks, noting that while OpenAI’s security may be robust, “lawyers have notoriously been pretty bad about securing data.” The idea of law firms handling “some of the most sensitive data on the planet,” he argued, “should make everyone uneasy.”

Edelson suggested the order could have a chilling effect, pushing users to rival AI services and improperly influencing market dynamics. He posited a “cynical” view that the news plaintiffs might be leveraging the privacy concerns to pressure OpenAI into a settlement.

Critics also highlight the “bonkers” nature of the order, which notably excludes enterprise customers, a move Edelson believes has “no logic.” He argued this exemption protects powerful businesses while leaving “the common people” and their personal data exposed. This, he said, “is really offensive,” particularly as a request by two individual ChatGPT users to intervene in the case was denied.

“We are talking about billions of chats that are now going to be preserved,” Edelson stated, emphasizing the deeply personal nature of user interactions with ChatGPT, which can range from medical queries to marital advice. The primary risk is a data breach, but the prolonged retention also exposes user data to future legal requests from law enforcement or other private litigants.

Even OpenAI CEO Sam Altman’s recent public statements championing user privacy have been met with skepticism. Edelson characterized Altman as trying to “protect OpenAI” rather than genuinely caring for consumer privacy rights, suggesting the company’s financial motivations to resolve the case might not align with protecting its users.

“What’s really most appalling to me is the people who are being affected have had no voice in it,” Edelson concluded, criticizing the judges for dismissing user concerns and setting a precedent that could see more AI-generated data frozen in future litigation.
