Researchers: Bugs in open source software are waning
Developers of the Linux OS, Apache Web server, and about 250 other open source projects have removed more than 8,500 individual bugs from their code over the past two years, according to a study released this week.
The developers, according to the Scan Report on Open Source Software 2008, accomplished this feat using a scanning Web site developed by Coverity, Inc. with support from the US Dept. of Homeland Security. In this expansive study, researchers reported a 16% reduction in static analysis defect density since 2006, among many other findings.
This level of reduction might not sound like much, acknowledged David Maxwell, Coverity's open source strategist, in an interview with BetaNews.
"But you have to consider that we're dealing with more than 55 million lines of code on a recurring basis. [Sixteen percent] of 55 million lines of code amounts to a lot of code," Maxwell told BetaNews.
In other results from the report, the average rate of false positives identified on the Scan site turned out to be less than 14%.
"This is really important. Developers prefer to work on bug fixes when the numbers are 'real.' That way, it's easier to fix things. Otherwise, it can be too frustrating for them," Maxwell said.
As previously reported, the Coverity Scan site was originally developed as part of the US federal government's Open Source Hardening Project. Source code analysis from the Scan site is available free of charge to all qualified open source projects at http://www.coverity.com.
Other open source projects using the Scan site include BSD, the Firefox Web browser, and Samba, an open source implementation of SMB -- a protocol used by Microsoft Windows for file and print services.
The exhaustive report released this week is based on 14,238 project runs over the past two years, for a grand total of almost 10 billion lines of code. Most of the code analyzed was written in C, with some in C++ or Java. "But we didn't split things out by programming language," Maxwell told BetaNews.
In one particularly intriguing finding, data from the report contradicts common wisdom by indicating that projects with long average function lengths are no more prone to high defect densities than projects with shorter functions.
Also from the study, "NULL pointer dereference" turned out to be the most frequently occurring defect across the scan database, whereas "Use before test of negative values" was the least frequent, according to Maxwell.
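For readers unfamiliar with the two defect classes Maxwell names, the following C sketch shows roughly what each one looks like. It is an illustration only, not code from the report or from any scanned project: the `struct user` type and the `name_length` and `lookup` functions are hypothetical names invented here. The defective patterns appear only in comments, and the compiled code is the corrected form in which each value is tested before it is used.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct user {
    const char *name;
};

/*
 * NULL pointer dereference, defective pattern (kept in a comment so
 * this file stays safe to compile and run):
 *
 *     size_t n = strlen(u->name);   // u is dereferenced here...
 *     if (u == NULL) return 0;      // ...before it is tested
 *
 * Corrected version: test the pointers first, then use them.
 */
size_t name_length(const struct user *u)
{
    if (u == NULL || u->name == NULL) {
        return 0;
    }
    return strlen(u->name);
}

/*
 * Use before test of a negative value, defective pattern:
 *
 *     int v = table[idx];           // idx is used as an index here...
 *     if (idx < 0) return fallback; // ...before it is range-checked
 *
 * Corrected version: validate the index first, then use it.
 */
int lookup(const int *table, size_t len, int idx, int fallback)
{
    if (table == NULL || idx < 0 || (size_t)idx >= len) {
        return fallback;
    }
    return table[idx];
}
```

In both cases the fix is the same small reordering: the check moves ahead of the first use. A test placed after the use is what a static analyzer can flag, since the later check suggests the programmer believed the value could be invalid.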