AI-generated code could be a disaster for the software supply chain. Here’s why.

In AI, hallucinations occur when an LLM produces outputs that are factually incorrect, nonsensical, or completely unrelated to the task it was assigned. Hallucinations have long dogged LLMs because they degrade the models’ usefulness and trustworthiness, and they have proven vexingly difficult to predict and remedy. In a paper scheduled to be presented at the 2025 USENIX Security Symposium, the researchers behind the study have dubbed the phenomenon “package hallucination.”

For the study, the researchers ran 30 tests, 16 in Python and 14 in JavaScript, each generating 19,200 code samples, for a total of 576,000 code samples. Of the 2.23 million package references contained in those samples, 440,445, or 19.7 percent, pointed to packages that didn’t exist. Those 440,445 package hallucinations comprised 205,474 unique package names.
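To make the core check concrete, here is a minimal sketch (not a reproduction of the paper’s methodology) that asks PyPI’s public JSON endpoint whether a package name pulled from generated code is actually registered; a 404 response marks the name as a candidate hallucination. It assumes the third-party requests library is installed, and the example package names are made up for illustration.

```python
import requests

# PyPI's public JSON endpoint returns 200 for registered packages and 404 otherwise.
PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def package_exists_on_pypi(name: str) -> bool:
    """Return True if `name` is a registered PyPI package."""
    resp = requests.get(PYPI_JSON_URL.format(name=name), timeout=10)
    return resp.status_code == 200


if __name__ == "__main__":
    # Hypothetical names pulled from generated code samples.
    referenced = ["requests", "numpy", "definitely-not-a-real-package-xyz"]
    for pkg in referenced:
        status = "exists" if package_exists_on_pypi(pkg) else "NOT FOUND (possible hallucination)"
        print(f"{pkg}: {status}")
```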

One of the things that makes package hallucinations potentially useful in supply-chain attacks is that 43 percent of them were repeated across 10 runs of the same query. “In addition,” the researchers wrote, “58 percent of the time, a hallucinated package is repeated more than once in 10 iterations, which shows that the majority of hallucinations are not simply random errors, but a repeatable phenomenon that persists across multiple iterations. This is significant because a persistent hallucination is more valuable for malicious actors looking to exploit this vulnerability and makes the hallucination attack vector a more viable threat.”

In other words, many package hallucinations aren’t random one-off errors. Rather, specific names of non-existent packages are repeated over and over. Attackers could seize on the pattern by identifying nonexistent packages that are repeatedly hallucinated. The attackers would then publish malware using those names and wait for them to be accessed by large numbers of developers.
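As a rough sketch of how that pattern could be watched for, the code below counts which suggested package names recur when the same prompt is run repeatedly, so that recurring names can then be checked against the registry. The query_model helper in the usage comment is hypothetical, standing in for whatever LLM API is being tested, and the pip-install regex is only an illustrative heuristic, not the paper’s extraction method.

```python
from collections import Counter
import re


def extract_pip_packages(code_sample: str) -> set[str]:
    """Very rough heuristic: pull package names out of `pip install` lines.

    Real tooling would also parse requirements files and import statements;
    this is only meant to illustrate the idea.
    """
    names = set()
    for line in code_sample.splitlines():
        match = re.search(r"pip install\s+([A-Za-z0-9_.\-]+)", line)
        if match:
            names.add(match.group(1).lower())
    return names


def recurring_package_names(samples: list[str], min_repeats: int = 2) -> Counter:
    """Count how often each package name recurs across repeated generations."""
    counts = Counter()
    for sample in samples:
        counts.update(extract_pip_packages(sample))
    return Counter({name: n for name, n in counts.items() if n >= min_repeats})


# Hypothetical usage, with query_model standing in for an LLM API call:
# samples = [query_model("Write Python code that scrapes a website") for _ in range(10)]
# for name, n in recurring_package_names(samples).items():
#     if not package_exists_on_pypi(name):  # existence check from the earlier sketch
#         print(f"{name}: hallucinated in {n}/10 runs -- unregistered, squattable name")
```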

The study uncovered disparities in the LLMs and programming languages that produced the most package hallucinations. The average percentage of package hallucinations produced by open source LLMs such as CodeLlama and DeepSeek was nearly 22 percent, compared with a little more than 5 percent by commercial models. Code written in Python resulted in fewer hallucinations than JavaScript code, with an average of almost 16 percent compared with a little over 21 percent for JavaScript. Asked what caused the differences, Spracklen wrote:
