MD5 and SHA-1 Destined for Skid Row

Okay, I admit it: sometimes I’ve been guilty of using the MD5 hashing algorithm in my code simply because it’s easily available in Java, it’s convenient and familiar, and I implicitly trust it. (Jeez, I mean, like, relax already. It’s not like I said “I ‘encrypt’ users’ passwords using Base64 encoding and then place them in a .txt file on a public web folder” or something like that.) Luckily, none of those times when I “blindly” used MD5 were for out-and-out security-critical functions. But in each case, uniqueness of the hash was kind of the main reason for using it.

Anyway, it’s been known for a while that MD5 isn’t perfect... that it’s possible to create collisions, where two different documents result in the same hash. In August last year, Chinese researchers proved [pdf] that it’s actually a lot easier to create collisions than was previously thought; and in fact, given a document format that can contain hidden text (Word, HTML, etc.), it’s downright easy to make two arbitrary documents’ MD5 hashes collide.

The US Government’s “boy wonder” algorithm SHA-1, considered more secure than MD5, fell prey to similar repeated prodding on 16 December 2004.

And even more recently (as reported by web security expert Bruce Schneier), two researchers from the Institute for Cryptology and IT-Security generated PostScript files with identical MD5 sums but different content. The issue was also brought to the attention of the “public at large” in a recent legal trial in Australia, where a Sydney magistrate dismissed the charge against an alleged road speedster because the prosecutors couldn’t prove that their vital photographic evidence was cryptographically secure. And it was again commented on by the appropriately named Bruce.

Luckily for Java users, the JDK contains nice implementations of the SHA-256 and SHA-512 hash algorithms, which are altogether more secure than MD5. So (note to self): there’s no excuse for using MD5 or SHA-1 in my code from now on. Must write out 100 times...
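
For the record, here’s roughly what the swap looks like. This is only a minimal sketch using the JDK’s standard MessageDigest class; the class name Sha256Example and the helper sha256Hex are made up for illustration, and I’m assuming the input is text that should be hashed as UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256Example {

    // Hypothetical helper: returns the SHA-256 digest of the given text
    // as a lowercase hex string (64 characters).
    public static String sha256Hex(String text) {
        try {
            // Ask the JDK for SHA-256 instead of "MD5" or "SHA-1".
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));

            // Convert each byte to two hex digits.
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256,
            // so this should not happen in practice.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
    }
}
```

Changing the algorithm name to "SHA-512" should be the only edit needed to get the longer digest, which is about as painless as a migration gets.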

About the Author

Matt Stephens is a senior architect, programmer and project leader based in Central London. He co-wrote Agile Development with ICONIX Process, Extreme Programming Refactored, and Use Case Driven Object Modeling with UML - Theory and Practice.