The rising popularity of generative language models has fueled a proliferation of AI-generated content, prompting universities to craft policies on the responsible use of artificial intelligence. Aware that students may be tempted to cheat, teachers also use AI detection tools to catch instances of plagiarism. However, these tools raise concerns about accuracy and bias, particularly against non-native English speakers.

A team of researchers from Stanford University, led by Weixin Liang, recently published a study in which they evaluated seven GPT detectors and found that the detectors incorrectly marked essays written for the Test of English as a Foreign Language (TOEFL) as AI-generated.

Across 91 TOEFL essays obtained from a Chinese forum, the detectors had an average false-positive rate of 61.3 percent, meaning that human-written essays were flagged as AI-generated. By contrast, the detectors correctly classified 88 sample essays written by US eighth-grade students.


The researchers believe that the bias exhibited by the GPT detectors is due to text perplexity, a measure of how difficult it is for a language model to predict the next word in a text. Non-native writers, with their relatively limited vocabulary, tend to produce text with low perplexity, which GPT detectors treat as a sign that the text is AI-generated.
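To make the idea of perplexity concrete, here is a minimal Python sketch. It is only an illustration, not the detectors' actual method: the detectors in the study rely on large language models, while this toy version assumes a simple word-frequency model built from a made-up reference text. The function names and sample sentences are hypothetical.

```python
import math
from collections import Counter


def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


def unigram_probs(tokens, counts, vocab_size):
    """Add-one smoothed word probabilities (toy stand-in for a real language model)."""
    total = sum(counts.values())
    return [(counts[t] + 1) / (total + vocab_size) for t in tokens]


# Hypothetical reference text that the toy model "learns" word frequencies from.
reference = "the test is good the test is hard the essay is good".split()
counts = Counter(reference)

# Two made-up sentences: one built from common words, one from rarer words.
plain = "the test is good".split()
varied = "the assessment proves formidable".split()
vocab_size = len(set(reference) | set(plain) | set(varied))

ppl_plain = perplexity(unigram_probs(plain, counts, vocab_size))
ppl_varied = perplexity(unigram_probs(varied, counts, vocab_size))

print(f"plain wording:  perplexity {ppl_plain:.1f}")
print(f"varied wording: perplexity {ppl_varied:.1f}")
```

In this sketch, the sentence made of common words gets the lower perplexity because the model finds each word easy to predict. A detector that treats low perplexity as a sign of machine writing would therefore be more likely to flag it.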

When the researchers used ChatGPT to enrich the word choices in the TOEFL essays, thereby raising their perplexity, the false-positive rate dropped significantly.

Based on their findings, the researchers emphasize the “potential for unwarranted consequences” when using GPT detectors. They also recommend defining appropriate applications of GPT models across different contexts, particularly in academia and the workplace.
