Debian’s OpenSSL Disaster

Many of you will know by now of the serious security problems revealed yesterday by Debian, the Linux distribution. We use Debian exclusively for our server platform, so we had to react very quickly to this issue, and make sure our systems were secure again. I think we’ve made all of the necessary changes now to ensure we’re safe from this particular problem.

I have also made some attempt to get to the bottom of what actually went on, and I’ll record it here for posterity. If any of the below is wrong, please let me know!

What Happened

The story, basically, is this. In April 2006 bug #363516 was raised, suggesting that openssl wasn’t clean for valgrind. Valgrind is a package that detects problems in C code, and is widely used to help ensure software is correct. Valgrind reported some errors with openssl, and the reporter wanted to be able to use valgrind with openssl.

In that bug report a change to the openssl codebase is discussed. The general feeling from the discussion is that making this change isn’t a good idea, but a patch was nevertheless applied on 4th May 2006. The bug identifies two instances of the specific issue, one in ssleay_rand_add and one in ssleay_rand_bytes.

In the meantime, a discussion took place on the openssl-dev list. This mentions the same two lines, and on the 1st May ulf@openssl.org says he is in favour of removing them.

The patch amends the two lines suggested.

The problem, as I understand it, was a misunderstanding by the Debian Developer who made the change. The change to ssleay_rand_bytes was fine. The original line added some uninitialised memory into the entropy pool, which is harmless, but the software doesn’t rely on it for security, so removing it is fine too.

But the other change, in ssleay_rand_add, is a complete disaster. That line is what mixes the supplied seed data into the entropy pool, so removing it cripples the seeding of the random number generator used in key generation.
This reduces the keyspace to a few hundred thousand possible keys. It’s possible to generate all these keys in a few hours, and brute force a machine that’s using public key authentication with a compromised key in a few minutes, potentially. This is a security disaster of the first water, considering the number of organisations (such as ours) that rely on public key authentication for a lot of our inter-machine security. This also affected key generation for self signed email, web certificates, private networks, anonymous proxy networks and all sorts of other things. The cleaning up is going to take some time, and cost an awful lot. Some people are going to be compromised by this, and a lot of machines may be broken into.
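
To get a feel for why such a small keyspace is fatal, here is a minimal Python sketch. It is a toy model and not OpenSSL’s code: it assumes the only input that varies between runs is the process ID, and that PIDs run from 1 to 32767 (the usual Linux default, which is my assumption rather than something taken from the bug report). The names toy_keygen, all_possible_keys and recover_pid are hypothetical.

```python
import hashlib
from typing import Dict, Optional

# Assumption: default Linux PID limit, so valid PIDs are 1..32767.
MAX_PID = 32768


def toy_keygen(pid: int, key_bytes: int = 16) -> bytes:
    """Toy stand-in for key generation where the PID is the only varying input.

    Not OpenSSL's algorithm; it only models "a deterministic function of pid".
    """
    digest = hashlib.sha256(b"toy-pool" + pid.to_bytes(4, "big")).digest()
    return digest[:key_bytes]


def all_possible_keys() -> Dict[bytes, int]:
    """Enumerate every key the toy generator can ever produce."""
    return {toy_keygen(pid): pid for pid in range(1, MAX_PID)}


def recover_pid(key: bytes, table: Dict[bytes, int]) -> Optional[int]:
    """Look a key up in the precomputed table; None means it is not a weak key."""
    return table.get(key)


table = all_possible_keys()
victim_key = toy_keygen(4242)          # a key "generated" on some victim machine
print(len(table))                      # size of the whole keyspace
print(recover_pid(victim_key, table))  # the attacker finds it by simple lookup
```

The whole table builds in well under a second here. Real key generation is far slower per key, which is why enumerating the real keys takes hours rather than moments, but the principle is the same: once every possible key can be listed in advance, breaking any one of them is just a lookup.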

Some background on how distributions work

Debian has a vast number of packages under management. They produce these packages by taking source packages from “upstream” (the people who maintain the software) and modifying them to fit the rules and goals of the distribution.

Some of these changes are for compatibility – for example, using standard file locations or configuration systems. Some of them are mechanical changes to do with integration with the build process. Quite a few changes are bug fixes.
It’s recommended that bug fixes be coordinated with upstream – send patches back to them, so everyone in the community can benefit from the changes.

Whose Fault Was It

After going through the above, it’s pretty clearly the fault of the DD (Debian Developer) in question. Although he suggested making changes on the openssl-dev list, and got an affirmative from someone on the project, it was pretty clear in the response that this was “if this helps with debugging it’s a good idea” not “I’ve closely read the code in question, and I agree”.

The DD should have also submitted a patch back to the openssl guys. They’d have spotted the error and screamed blue murder. He was a bit lazy and thoughtless here, and I imagine right now he wishes he could crawl into a hole and die.

What to do about it

Debian are getting badly slammed for this, but it is worth keeping some perspective. We, and many others, use Debian because of its long history of excellent package quality. This is a result both of their culture (which is aggressively perfectionist) and their selection criteria for developers, which weed out many dodgy ones. We are proud to use Debian, and will continue to do so.

DDs are generally conscientious, knowledgeable and dedicated to their work. I have no reason to believe this DD was any different. Even conscientious, knowledgeable and dedicated people make mistakes. This is what process is for: to help mitigate human error. I think there was clearly a lack of process.
Two things would have really helped: code review internal to Debian, and code review by upstream. I don’t think it’s unreasonable that, for security-critical packages, Debian should require both for non-critical changes. Even critical changes should be reviewed as soon as possible.
Internal code review is impractical for every package, since it requires a good understanding of the code in question, and would impose a huge workload – but for critical packages I think it’s a necessity.

Upstream review is potentially tricky too. Some upstreams don’t have the time or inclination to participate. There is also often a lot of friction between distributions and upstream, since they have very different goals. This isn’t a problem that can be easily resolved – these groups really do have different goals and values, and sometimes irreconcilable differences arise. But for the good of their eventual users they need to work together to help stop this sort of problem occurring.


3 thoughts on “Debian’s OpenSSL Disaster”

  1. dazza

    This is all a very unfortunate incident. I learned this lesson a long time ago. When it comes to security code you just can’t be too careful. As clever as I think I am, as well documented as the code may be and as well read as my co-workers are, I still don’t trust myself to change this type of code.
    It’s a sad fact that ALL code has assumptions that are not always apparent to the casual observer.
    I have a lot of respect for the OpenSSL guys since they deal with a very hot potato on a daily basis. They must struggle with every line of code they write :)
    I hope the Debian project doesn’t suffer too much from this mistake.

  2. Theodore Tso

    As an upstream maintainer, I make it a practice to periodically download distribution source packages and see what patches enterprising young developers have done to my code before packaging it. I have sometimes been horrified by what I find. In other cases, I’ll find useful bug fixes, grump a bit about why they didn’t bother to tell me (the upstream maintainer), and integrate them into the code.
    In other cases, where they did something horrifically wrong from a UI perspective, I’ll add the feature the right way, and then let them sort out the mess when they added something the wrong way and now need to maintain backwards compatibility with their users.
    I can’t blame the OpenSSL developers for not doing this, but it’s a really good thing to do. Many users don’t separate the reputation of the upstream maintainer from what distributions do to the package before they package it. I started doing this many years ago when Debian tried to “helpfully” add support for filesystems bigger than 4 gigabytes (2**32 bytes), but they screwed it up, and really angry users e-mailed *me* complaining that I had trashed their data. Well, the reason why I hadn’t added llseek() support was because it was complicated, and glibc made life difficult, and the obvious fix didn’t work. In the end, I solved that problem by taking over e2fsprogs maintenance for Debian, so for Debian the maintainer *IS* the upstream.

  3. David Schwartz

    This patch clearly never went through any kind of audit or approval process. For one thing, half the patch only affects compilation with ‘PURIFY’ defined, and with ‘PURIFY’ defined, it won’t even compile. (C-style comments don’t nest.)
    This is simply inexcusable for a major distribution and a security-critical package.
    This should be a wake-up call to all distributions and vendors that they need to audit all changes to security-critical code.
