As more publishers cut content licensing deals with ChatGPT-maker OpenAI, a study put out this week by the Tow Center for Digital Journalism — looking at how the AI chatbot produces citations (i.e. sources) for publishers’ content — makes for interesting, or, well, concerning reading.
In a nutshell, the findings suggest publishers remain at the mercy of the generative AI tool’s tendency to invent or otherwise misrepresent information, regardless of whether or not they’re allowing OpenAI to crawl their content.
The research, conducted at Columbia Journalism School, examined citations produced by ChatGPT after it was asked to identify the source of sample quotations plucked from a mix of publishers — some of which had inked deals with OpenAI and some of which had not.
The Center took block quotes from 10 stories apiece produced by a total of 20 randomly selected publishers (so 200 different quotes in all) — including content from The New York Times (which is currently suing OpenAI in a copyright claim); The Washington Post (which is unaffiliated with the ChatGPT maker); The Financial Times (which has inked a licensing deal); and others.
“We chose quotes that, if pasted into Google or Bing, would return the source article among the top three results and evaluated whether OpenAI’s new search tool would correctly identify the article that was the source of each quote,” wrote Tow researchers Klaudia Jaźwińska and Aisvarya Chandrasekar in a blog post explaining their approach and summarizing their findings.
“What we found was not promising for news publishers,” they go on. “Though OpenAI emphasizes its ability to provide users ‘timely answers with links to relevant web sources,’ the company makes no explicit commitment to ensuring the accuracy of those citations. This is a notable omission for publishers who expect their content to be referenced and represented faithfully.”
“Our tests found that no publisher — regardless of degree of affiliation with OpenAI — was spared inaccurate representations of its content in ChatGPT,” they added.
Unreliable sourcing
The researchers say they found “numerous” instances where publishers’ content was inaccurately cited by ChatGPT — also finding what they dub “a spectrum of accuracy in the responses”. So while they found “some” entirely correct citations (i.e. meaning ChatGPT accurately returned the publisher, date, and …