As publishers increasingly sign content licensing deals with OpenAI, a new study by Columbia Journalism School’s Tow Center for Digital Journalism sheds light on ChatGPT’s citation practices, revealing significant issues with accuracy and transparency.

Findings Highlight Misinformation Risks

The research evaluated ChatGPT’s ability to correctly cite the sources of 200 quotations from 20 publishers, including The New York Times, The Washington Post, and The Financial Times. The results were concerning:

  • Frequent Inaccuracies: ChatGPT gave partially or entirely incorrect citations for 153 of the 200 quotes analyzed (76.5%). It acknowledged uncertainty in only seven responses, using phrases like “it’s possible” or “I couldn’t locate the exact article.”
  • Confident Misrepresentation: In most of the incorrect responses, ChatGPT stated the wrong attribution with full confidence, giving users no signal that the answer might be unreliable.
  • Rewarding Plagiarism: In some cases, ChatGPT cited websites that had plagiarized the content instead of the original sources, such as The New York Times.

No Guarantees for Publishers

The study found that even publishers with licensing deals, like The Financial Times, are not spared inaccurate citations. Publishers that block OpenAI’s crawlers are not protected either: when ChatGPT cannot access the source, it resorts to “confabulation” to fill the gap and sometimes attributes content incorrectly.

For publishers without deals, the findings are grim: allowing OpenAI’s crawlers doesn’t ensure accurate citations or increased visibility, while blocking them doesn’t remove the reputational risk of being misattributed.

Systemic Issues with Generative AI

Tow researchers Klaudia Jaźwińska and Aisvarya Chandrasekar argue that OpenAI’s models treat journalism as “decontextualized content,” neglecting the circumstances of its original production. ChatGPT’s responses are also inconsistent: identical queries often yield different answers, which undermines trust.

OpenAI Responds

OpenAI called the study an “atypical test of our product” and emphasized its efforts to improve in-line citation accuracy and respect publisher preferences. The company said it supports publishers by connecting millions of ChatGPT users with quality content.

Takeaways for Publishers

The study underscores how little control publishers have over how generative AI tools like ChatGPT use and attribute their content. Whether or not a publisher has a licensing agreement, the findings suggest significant improvements are needed to ensure accurate citations and fair representation of journalistic work.
