It’s “We the people,” not we the AI.
Computer language models are under the impression that the US Constitution, written in 1787, long before the Internet, is an AI-generated document.
The explanation is simple, according to Edward Tian, creator of AI writing catcher GPTZero, a service that also found the biblical book of Genesis to be 88% computerized.
“The US Constitution is a text fed repeatedly into the training data of many large language models,” he told Ars Technica.
“As a result, many of these large language models are trained to generate similar text to the Constitution and other frequently used training texts. GPTZero predicts text likely to be generated by large language models, and thus this fascinating phenomenon occurs.”
But this points to a larger issue than just James Madison rolling over in his grave. It signifies a major problem: AI’s inability to tell computer-generated text from human writing at a time when college professors are quaking in fear over digital plagiarism.
Last spring at Texas A&M, a professor reportedly flunked an entire class after ChatGPT claimed to be responsible for students’ assignments — despite their pleas of innocence.
In the case of GPTZero — which goes off “a large, diverse corpus of human-written and AI-generated text, with a focus on English prose” — the AI seeks out “perplexity” as a sign of human touch in writing.
“Perplexity is a function of ‘how surprising is this language based on what I’ve seen?’ ” Margaret Mitchell, of the AI company Hugging Face, told the outlet.
So, when a paper is turned in with much of its language consistent with the training data, aka famous documents, manifestos and proper writing, its perplexity score comes out quite low and trips AI sensors.
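To make the idea concrete, here is a minimal sketch of a perplexity score, assuming a toy word-frequency model rather than a real large language model. The function names and sample sentences are hypothetical and are not taken from GPTZero, whose internals the article does not describe.

```python
import math
from collections import Counter

def train_unigram(tokens):
    """Build a word-probability function from training tokens (add-one smoothing)."""
    counts = Counter(tokens)
    total = len(tokens)
    vocab = len(counts) + 1  # one extra slot for words never seen in training

    def prob(token):
        return (counts[token] + 1) / (total + vocab)

    return prob

def perplexity(tokens, prob):
    """Exponential of the average negative log-probability per token."""
    log_sum = sum(math.log(prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))

# Hypothetical "training data": a line the model has seen before.
training = "we the people of the united states in order to form a more perfect union".split()
prob = train_unigram(training)

familiar = "we the people of the united states".split()
novel = "purple zebras juggle quantum marmalade at midnight".split()

print(perplexity(familiar, prob))  # lower: the model has seen these words
print(perplexity(novel, prob))     # higher: every word is a surprise
```

Text that echoes the training data scores low, which is the effect that can make a heavily reproduced document such as the Constitution look machine-written to a detector built on this signal.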
Burstiness, a measure of how evenly or unevenly words and phrases appear across a writing sample, is also used as a detection signal.
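As a rough illustration only, burstiness can be approximated by how much sentence length varies across a sample. This is an assumed stand-in, not GPTZero's actual formula, which the article does not give; the function names and sample texts below are hypothetical.

```python
import re
import statistics

def sentence_lengths(text):
    """Split on sentence-ending punctuation and count the words in each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """Coefficient of variation of sentence length: 0.0 means perfectly uniform."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)

uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = ("Stop. The storm rolled in over the hills before anyone "
          "had time to close the windows. We ran.")

print(burstiness(uniform))  # every sentence has the same length
print(burstiness(varied))   # short and long sentences mixed together
```

Under this assumption, evenly paced text scores near zero, and a detector would treat that uniformity as a hint of machine authorship.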
However, recent in-depth, human-engineered research from the University of Maryland concluded that these kinds of methods are “not reliable in practical scenarios” and don’t deserve an A for effort.
The technological shortcoming has inspired some professors, including Wharton’s Ethan Mollick, to embrace AI in education rather than shun it.
“There is no tool that can reliably detect ChatGPT-4/ Bing/ Bard writing. The existing tools are trained on GPT-3.5, they have high false positive rates (10%+), and they are incredibly easy to defeat,” he tweeted in May.
The AI sites aren’t ignorant of all of this, either. In Tian’s case, he’s already modifying GPTZero to wean it off plagiarism hunting. The Constitution bug has been fixed since it went viral in April.
“Compared to other detectors, like Turnitin, we’re pivoting away from building detectors to catch students,” he said. “Instead, the next version of GPTZero will not be detecting AI but highlighting what’s most human, and helping teachers and students navigate together the level of AI involvement in education.”