100,000 ChatGPT Chats Leaked Via Google Search

A new report confirms the number of indexed chats is significantly higher than first thought, including confidential business data and intimate discussions.

The scope of the recently discovered privacy issue involving shared ChatGPT conversations being indexed by Google is vastly larger than initially reported. A new investigation by 404 Media reveals that a researcher has scraped a dataset of nearly 100,000 publicly shared chats, exposing a trove of sensitive information ranging from confidential business contracts to deeply personal relationship advice.

This development follows our July 31 report, based on a Fast Company article, that thousands of private user conversations were appearing in Google search results. At the time, a specific site search revealed approximately 4,500 indexed chats, raising initial alarms about user privacy. The new figure of nearly 100,000 is roughly 22 times that count, giving a much clearer picture of the scale of the data exposure.

According to the 404 Media report published on August 5, an anonymous researcher compiled the massive dataset, which contains a wide array of user interactions with the AI chatbot. The exposed information includes:

  • A user uploading what they claimed was a copy of OpenAI’s own non-disclosure agreement.
  • Drafts of confidential business contracts for named companies.
  • Intimate conversations, including a user seeking advice on whether to contact an ex-partner and asking ChatGPT to draft the message.
  • Chats containing enough personal details, such as names, to potentially identify the individuals involved.

In response to these findings, OpenAI has now removed the feature responsible for the leak. In a statement provided to 404 Media, OpenAI’s Chief Information Security Officer (CISO), Dane Stuckey, described it as a “short-lived experiment.”

“We just removed a feature from [ChatGPT] that allowed users to make their conversations discoverable by search engines, such as Google,” Stuckey stated. “This feature required users to opt-in, first by picking a chat to share, then by clicking a checkbox for it to be shared with search engines.”

The core issue stemmed from this opt-in “share with search engines” function, which many users, as noted in our previous coverage, may not have fully understood. Stuckey acknowledged the design flaw, adding, “Ultimately we think this feature introduced too many opportunities for folks to accidentally share things they didn’t intend to, so we’re removing the option.”

Despite OpenAI’s move to retract the feature and its stated efforts to remove the content from search engines, the 404 Media report makes one point clear: the data has already been captured. Because a researcher was able to scrape and archive nearly 100,000 of these conversations, the information now exists outside OpenAI’s and Google’s platforms, where neither company can control or retract it.

