
German cybersecurity researchers have found that many website security vulnerabilities may stem from developers learning to code from popular online tutorials that are riddled with mistakes.

Computer scientists from the Technical University of Berlin, Saarland University, the Technical University of Braunschweig and cybersecurity firm Trend Micro analysed thousands of PHP projects on the code-hosting platform GitHub and cross-referenced their code against a handful of popular coding tutorials that rank at the top of Google's search results.

Many of the most popular tutorials online focus on teaching new coders how to perform a particular task, such as how to create a search form in PHP, how to accept a user's input from a message box or HTML form and output it in JavaScript, or how to start using the open source database management system MySQL.

The tutorials offer code examples that people can copy, but the examples often contain mistakes that later allow attackers to carry out cross-site scripting (XSS) and SQL injection attacks on websites built with that code, and to steal sensitive data from their users.
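To illustrate the two classes of mistake, here is a minimal sketch in Python rather than PHP. It is not taken from the researchers' paper or from any of the tutorials they examined; the `users` table, the function names and the attacker inputs are all assumptions made for the example. It contrasts a query built by string concatenation with a parameterised one, and unescaped HTML output with escaped output.

```python
import html
import sqlite3


def build_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def search_vulnerable(conn, term):
    # Mistake: user input is pasted straight into the SQL text,
    # so the input can change the structure of the query.
    query = "SELECT name FROM users WHERE name = '" + term + "'"
    return [row[0] for row in conn.execute(query)]


def search_safe(conn, term):
    # Fix: a placeholder keeps the input as data, never as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (term,))]


def render_vulnerable(term):
    # Mistake: user input is echoed into the page without escaping (XSS).
    return "<p>You searched for: " + term + "</p>"


def render_safe(term):
    # Fix: escape HTML metacharacters before output.
    return "<p>You searched for: " + html.escape(term) + "</p>"


if __name__ == "__main__":
    conn = build_db()
    sql_payload = "' OR '1'='1"                  # hypothetical attacker input
    xss_payload = "<script>alert(1)</script>"    # hypothetical attacker input
    print(search_vulnerable(conn, sql_payload))  # every user is returned
    print(search_safe(conn, sql_payload))        # no user has that name
    print(render_vulnerable(xss_payload))
    print(render_safe(xss_payload))
```

In PHP the equivalent fixes would typically be prepared statements and `htmlspecialchars()`; the principle is the same in either language.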

The researchers checked the top five search results for each of the three tutorial topics above and found that nine of the 15 results contained vulnerable code. They then loaded 64,415 PHP codebases from GitHub into a database and ran queries on a regular desktop PC to locate code resembling the tutorial snippets.

After manually analysing the matches that the computer produced, they confirmed 117 vulnerabilities in code that closely resembled the snippets featured in the tutorials.
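The article does not describe how the matching was done, so the following is only a rough sketch of one way to find code that is "very similar" to a known snippet, and should not be read as the researchers' method. It replaces variable names and string literals with placeholders and then compares the resulting token sequences using Python's standard `difflib` module. Both PHP fragments, the regular expression and the function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Very rough PHP-like tokenizer: variables, quoted strings, words, punctuation.
TOKEN_RE = re.compile(r"\$\w+|'[^']*'|\"[^\"]*\"|\w+|[^\s\w]")


def normalize(php_code):
    """Turn code into tokens, hiding variable names and string contents."""
    tokens = []
    for tok in TOKEN_RE.findall(php_code):
        if tok.startswith("$"):
            tokens.append("$VAR")   # any variable name
        elif tok[0] in "'\"":
            tokens.append("STR")    # any string literal
        else:
            tokens.append(tok)
    return tokens


def similarity(snippet_a, snippet_b):
    """Return a 0..1 similarity score between two normalized snippets."""
    a, b = normalize(snippet_a), normalize(snippet_b)
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Hypothetical tutorial snippet with a SQL injection flaw.
TUTORIAL = """
$q = $_GET['q'];
$result = mysql_query("SELECT * FROM posts WHERE title LIKE '%$q%'");
"""

# Hypothetical project code: same structure, different names.
PROJECT = """
$term = $_GET['search'];
$rows = mysql_query("SELECT * FROM articles WHERE name LIKE '%$term%'");
"""

# Hypothetical unrelated code.
UNRELATED = "echo htmlspecialchars($title);"

if __name__ == "__main__":
    print(similarity(TUTORIAL, PROJECT))
    print(similarity(TUTORIAL, UNRELATED))
```

A score near 1 only flags a candidate. As in the study, each match would still need manual review to confirm that the code is actually vulnerable.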

"Developers routinely consult programming resources as software is written. Although formal documentation such as language and API reference manuals provide detailed guidance, tutorials on the Web are as easily available and are more succinct. The lure of quick actionable advice makes tutorials an appealing reference for developers," the researchers conclude in their paper.

"In our large-scale case study, we find over 100 vulnerable code snippets in application code that are syntactically similar, and in eight instances identical, to tutorial code. These findings corroborate our hypothesis that vulnerable tutorials can be used to seed large-scale vulnerability discovery. They also suggest that there is a pressing need for code audit of widely consumed tutorials, perhaps with as much rigour as for production code."

The paper, entitled "Leveraging Flawed Tutorials for Seeding Large-Scale Web Vulnerability Discovery", is published on Cornell University Library's open-access repository, arXiv.