ChatGPT is full of sensitive private information and spits out verbatim text from CNN, Goodreads, WordPress blogs, fandom wikis, Terms of Service agreements, Stack Overflow source code, Wikipedia pages, news blogs, random internet comments, and much more....
It's not hard to understand. People already trust the output of LLMs way too much because it sounds reasonable. On further inspection often it turns out to be bullshit. So LLMs increase the level of bullshit compared to the input data. Repeat a few times and the problem becomes more and more obvious.
Google Researchers’ Attack Prompts ChatGPT to Reveal Its Training Data (www.404media.co)
ChatGPT is full of sensitive private information and spits out verbatim text from CNN, Goodreads, WordPress blogs, fandom wikis, Terms of Service agreements, Stack Overflow source code, Wikipedia pages, news blogs, random internet comments, and much more....