News

Inputs are dangerous

Well, it's "Patch Day Tuesday" again, and Microsoft just released another raft of security bulletins. We got ten bulletins this time, but that amounts to something upwind of 20 separate vulnerabilities patched; Microsoft still likes to group multiple vulnerabilities together to keep the shock factor down. Even so, that's a lot of patches. What can developers learn from this month's mess?

The first thing to note is that Microsoft's publicly announced steps for ending this madness are not working all that well. You may recall that, a while back, the folks in Redmond were crowing about retraining all their developers and doing a full security review of the Windows source code, even though that meant slipping schedules. It didn't work. Not only do plenty of these patches apply to Windows 2003, but at least one (the SMTP vulnerability) appears to have been introduced with the Windows 2003 code. Well, that just proves something we already knew: writing secure code is hard.

But beyond picking on Microsoft, I want to point out a pattern that we're starting to see in more and more of these vulnerabilities: maliciously crafted input that exploits the code that has to parse it.

This month the standout example involves Excel. Of course last month we had the fuss over JPEG files, and before that we've seen that bad things can lurk in Word files that someone has maliciously altered. The lesson for developers is simple: if it comes from outside of your own code, you can't trust it.

Think about the Excel problem for a moment. Without seeing the code involved, my bet is that some developer working on the parsing routine for a newly opened Excel file made the natural assumption that he knew what the file structure would be. After all, Excel files come from Excel. But somewhere there's a loop or data-copying routine that crashes and burns when faced with a file that isn't really an Excel file, but just looks like one long enough to reach the danger point. All the testing in the world won't find that flaw if the testing is performed only with valid Excel files.
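To make that concrete, here's a minimal C sketch of the kind of loop-or-copy routine I have in mind. The record layout is invented for illustration (it is not the real Excel file format); the point is the difference between trusting a length field read from the file and validating it first:

```c
#include <stdio.h>
#include <stdint.h>

/* A made-up record header for a binary file format -- NOT the real
   Excel layout, just an illustration of the pattern. */
struct record_header {
    uint16_t type;
    uint16_t length;   /* attacker-controlled if the file is hostile */
};

#define MAX_RECORD 1024

/* Unsafe: trusts the length field because "these files always come from
   our own application." A crafted file with length > MAX_RECORD walks
   right off the end of the caller's buffer. */
int read_record_unsafe(FILE *f, unsigned char buf[MAX_RECORD])
{
    struct record_header hdr;
    if (fread(&hdr, sizeof hdr, 1, f) != 1)
        return -1;
    /* BUG: nothing stops hdr.length from being larger than MAX_RECORD */
    if (fread(buf, 1, hdr.length, f) != hdr.length)
        return -1;
    return hdr.length;
}

/* Safer: treat every field read from the file as hostile until it has
   been checked against what the buffer can actually hold. */
int read_record_safe(FILE *f, unsigned char buf[MAX_RECORD])
{
    struct record_header hdr;
    if (fread(&hdr, sizeof hdr, 1, f) != 1)
        return -1;
    if (hdr.length > MAX_RECORD)   /* reject anything out of range */
        return -1;
    if (fread(buf, 1, hdr.length, f) != hdr.length)
        return -1;
    return hdr.length;
}
```

The unsafe version is exactly the sort of code that sails through testing with legitimate files and falls over the first time someone hands it a file built to lie about its own length.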

In today's well-connected world, you need to treat every bit of data that you accept from outside your own code as a potential attack vector. If you're opening a file, someone else could have put whatever they like in that file. If you're reading settings from an XML configuration file or a registry key, how do you know that someone else didn't tamper with those settings just to do bad things to you? If you look up data on the Internet, consider what could happen if someone decided to send back more data than you're prepared to handle.
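Here's another small C sketch along the same lines (the setting name and the limits are invented for illustration): a configuration value gets syntax-checked and range-checked before anything downstream is allowed to act on it, no matter which of those channels it arrived through.

```c
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

/* Hypothetical "cache size in KB" setting that arrived from an XML config
   file, a registry key, or a web lookup -- the source doesn't matter.
   Validate before use instead of assuming the value is sane. */
#define MIN_CACHE_KB 16L
#define MAX_CACHE_KB (64L * 1024L)

static int parse_cache_size(const char *setting, long *out_kb)
{
    char *end = NULL;
    errno = 0;
    long value = strtol(setting, &end, 10);

    if (end == setting || *end != '\0')   /* not purely a number */
        return -1;
    if (errno == ERANGE)                  /* overflowed a long */
        return -1;
    if (value < MIN_CACHE_KB || value > MAX_CACHE_KB)
        return -1;                        /* outside the supported range */

    *out_kb = value;
    return 0;
}

int main(void)
{
    /* A hostile source could hand us an absurd size or trailing junk;
       both get rejected here instead of flowing into the rest of the program. */
    const char *examples[] = { "1024", "999999999999", "64; do-something-evil" };
    for (int i = 0; i < 3; i++) {
        long kb;
        if (parse_cache_size(examples[i], &kb) == 0)
            printf("accepted: %ld KB\n", kb);
        else
            printf("rejected: \"%s\"\n", examples[i]);
    }
    return 0;
}
```

The design choice is the same as in the file-parsing example: decide up front what your code is prepared to handle, and reject everything else at the boundary rather than hoping the rest of the program copes.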

Thinking about inputs is part of the larger process of threat modeling, in which you systematically try to figure out how your code might be vulnerable to the Bad People. If you don't know much about threat modeling, it's worth digging in. A good place to start (though it makes you wonder about the left hand knowing what the right hand is doing in large organizations) is the MSDN Threat Modeling page.

About the Author

Mike Gunderloy has been developing software for a quarter-century now, and writing about it for nearly as long. He walked away from a .NET development career in 2006 and has been a happy Rails user ever since. Mike blogs at A Fresh Cup.