Many of you will know by now of the serious security problems revealed yesterday by Debian, the Linux distribution. We use Debian exclusively for our server platform, so we had to react very quickly to this issue, and make sure our systems were secure again. I think we’ve made all of the necessary changes now to ensure we’re safe from this particular problem.
I have also made some attempt to get to the bottom of what actually went on, and I’ll record it here for posterity. If any of the below is wrong, please let me know!
The story, basically, is this. In April 2006 bug #363516 was raised, suggesting that openssl wasn’t clean for valgrind. Valgrind is a tool that detects memory errors, such as use of uninitialised memory, in running programs, and is widely used to help ensure software is correct. Valgrind reported some errors with openssl, and the reporter wanted to be able to use valgrind with openssl.
At that bug url a change is discussed to the openssl codebase. The general feeling from the bug discussion is that making this change isn’t a good idea, but then a patch was applied on 4th May 2006. There are two instances of the specific issue in the bug, one in ssleay_rand_add and one in ssleay_rand_bytes.
In the meantime, a discussion took place on the openssl-dev list. This mentions the same two lines, and on the 1st May email@example.com says he is in favour of removing them.
The patch amends the two lines suggested.
The problem, as I understand it, was a misunderstanding by the Debian Developer who made the change. The change to ssleay_rand_bytes was harmless: the line it removed mixed some uninitialised memory into the entropy pool, but the software doesn’t rely on that for security, so removing it is fine.
But the other change, in ssleay_rand_add, is a complete disaster. It alters how the random number generator is seeded, and key generation depends on that generator, so this is a serious flaw.
This reduces the keyspace to a few hundred thousand possible keys. It’s possible to generate all these keys in a few hours, and brute force a machine that’s using public key authentication with a compromised key in a few minutes, potentially. This is a security disaster of the first water, considering the number of organisations (such as ours) that rely on public key authentication for a lot of our inter-machine security. This also affected key generation for self signed email, web certificates, private networks, anonymous proxy networks and all sorts of other things. The cleaning up is going to take some time, and cost an awful lot. Some people are going to be compromised by this, and a lot of machines may be broken into.
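To make the scale of that concrete, here is a toy sketch in Python. It is not OpenSSL's code. It assumes, as was widely reported at the time, that the process ID ended up as effectively the only varying input to the generator, and that PIDs are limited to the Linux default range. The names `weakened_keygen`, `build_key_table` and `MAX_PID` are made up for illustration.

```python
import hashlib
from typing import Dict, Optional

# Assumption: Linux default pid_max, so PIDs range over 1..32767.
MAX_PID = 32768


def weakened_keygen(pid: int, seed_material: bytes) -> bytes:
    """Toy key generator; a hypothetical stand-in, not OpenSSL code.

    seed_material is accepted but never mixed in, mimicking the removed
    line, so the result depends on the PID alone.
    """
    return hashlib.sha256(b"toy-key:" + str(pid).encode()).digest()


def build_key_table() -> Dict[bytes, int]:
    """Precompute every key the weakened generator can ever produce."""
    return {weakened_keygen(pid, b""): pid for pid in range(1, MAX_PID)}


def recover_pid(observed_key: bytes, table: Dict[bytes, int]) -> Optional[int]:
    """Look an observed key up in the precomputed table."""
    return table.get(observed_key)


table = build_key_table()

# The victim believes they seeded the generator with good randomness.
victim_key = weakened_keygen(12345, b"lots of carefully gathered entropy")

print(len(table))                      # how many keys are possible at all
print(recover_pid(victim_key, table))  # the attacker finds the match
```

Real keys also vary with key type, key size and machine architecture, so the real table is a small multiple of this one, but the principle is the same: once every possible key can be listed in advance, checking a given key against the list is just a lookup.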
Some background on how distributions work
Debian has a vast number of packages under management. They produce these packages by taking source packages from “upstream” (the people who maintain the software) and modifying them to fit the rules and goals of the distribution.
Some of these changes are for compatibility – for example, using standard file locations or configuration systems. Some of them are mechanical changes to do with integration with the build process. Quite a few changes are bug fixes.
It’s recommended that bug fixes be coordinated with upstream – send patches back to them, so everyone in the community can benefit from the changes.
Whose Fault Was It
After going through the above, the fault pretty clearly lies with the DD (Debian Developer) in question. Although he proposed the change on the openssl-dev list and got an affirmative from someone on the project, the response clearly meant “if this helps with debugging it’s a good idea”, not “I’ve closely read the code in question, and I agree”.
The DD should have also submitted a patch back to the openssl guys. They’d have spotted the error and screamed blue murder. He was a bit lazy and thoughtless here, and I imagine right now he wishes he could crawl into a hole and die.
What to do about it
Debian are getting badly slammed for this, but it is worth keeping some perspective. We, and many others, use Debian because of its long history of excellent package quality. This is a result both of their culture (which is aggressively perfectionist) and their selection criteria for developers, which weed out many dodgy ones. We are proud to use Debian, and will continue to do so.
DDs are generally conscientious, knowledgeable and dedicated to their work. I have no reason to believe this DD was any different. Even conscientious, knowledgeable and dedicated people make mistakes. This is what process is for: to help mitigate human error. I think there was clearly a lack of process.
Two things would have really helped: code review within Debian and code review by upstream. I don’t think it’s unreasonable for Debian to require both for non-critical changes to security-critical packages. Even critical changes should be reviewed as soon as possible.
Internal code review is impractical for every package, since it requires a good understanding of the code in question, and would impose a huge workload – but for critical packages I think it’s a necessity.
Upstream review is potentially tricky too. Some upstreams don’t have the time or inclination to participate. There is also often a lot of friction between distributions and upstream, since they have very different goals. This isn’t a problem that can be easily resolved – these groups really do have different goals and values, and sometimes irreconcilable differences arise. But for the good of their eventual users they need to work together to help stop this sort of problem occurring.