Category Archives: System Engineering

Buildout Basics Part 1

Introduction to the series

This is the first in a three-part series of tutorials about creating, configuring and maintaining buildout configuration files, and about how buildout can be used to deploy and configure both python-based and other software.
During the course of this series, I will cover buildout configuration files, some buildout recipes and a simple overview of the structure of a buildout recipe. I will not cover creating a recipe, or developing buildout itself.

For a very good guide to the python packaging techniques that we will be relying on, see: http://guide.python-distribute.org

All code samples will be python 2.4+ compatible; system command lines will be debian/ubuntu specific, but simple enough to generalise to most systems (OSX and Windows included).
Where a sample project or code is required, I’ve used Django as it’s what I’m most familiar with, but this series is all about the techniques and configuration in buildout itself. It is not Django specific, so don’t be scared off if you happen to be using something else.

Buildout Basics

So, what’s this buildout thing anyway?

If you’re a python developer, or even just a python user, you will probably have come across either easy_install or pip (or both). These are pretty much two methods of achieving the same thing, namely installing python software:

$> sudo easy_install Django

This is a fairly simple command: it will install Django onto the system path, so from anywhere you can do:

>>> import django
>>>

While this is handy, it’s not ideal for production deployment. System-installing a package will lead to maintenance problems, and will probably also lead to version conflicts, particularly if you have multiple sites or environments deployed on the same server. One environment may need Django 1.1, while the newest may need 1.3. There are significant differences in the framework from one major version to another, and a clean upgrade may not be possible. So, system-installing things is generally considered to be a bad idea, unless it’s guaranteed to be a dedicated machine.

So what do you do about it?
One answer is to use a virtualenv. This is a python package that will create a ‘clean’ python environment in a particular directory:

$> virtualenv deployment --no-site-packages

This will create a directory called ‘deployment’, containing a clean python interpreter with only a local path. This environment will ignore any system-installed packages (--no-site-packages), and give you a fresh, self-contained python environment.

Once you have activated this environment, you can then easy_install or pip install the packages you require, and then use them as you would normally, safe in the knowledge that anything you install in this environment is not going to affect the mission-critical sales (or pie-ordering website) process that’s running in the directory next door.
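
For example, activating the environment and then installing into it looks something like this (assuming a bash-style shell; installing Django here is just an illustration):

$> cd deployment
$> source bin/activate
(deployment)$> pip install Django
(deployment)$> python -c "import django; print django.get_version()"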

So, if virtualenv can solve this problem for us, why do we need something else, something more complex to solve essentially the same problem?

The answer is that buildout doesn’t just solve this problem, it solves a whole lot of other problems too, particularly when it comes to ‘how do I release this code to production, yet make sure I can still work on it, without breaking the release?’

Building something with buildout

The intent is to show you the parts of a buildout config, then show how it all fits together. If you’d rather see the final product first and then disassemble it to find the overall picture, scroll to the end of this, have a look, then come back. Go on, it’s digital, this will still be here when you come back…

Config Files

Pretty much everything that happens with buildout is controlled by its config file (this isn’t quite true, but hey, ‘basics’). A config file is a simple ini-style (ConfigParser) text file that defines some sections, some options for those sections, and some values for those options.

In this case, the sections of a buildout config file (henceforth referred to as buildout.cfg) are generally referred to as parts. The most important of these parts is the buildout part itself, which controls the options for the buildout process.

An absolute minimum buildout part looks something like this:

[buildout]
parts = noseinstall

While this is not a complete buildout.cfg, it is the minimum that is required in the buildout part itself. All it is doing is listing the other parts that buildout will use to actually do something; in this case, it is looking for a single part named noseinstall. As this part doesn’t exist yet, it won’t actually work. So, let’s add the part, and in the next section, see what it does:

[buildout]
parts = noseinstall

[noseinstall]
recipe = zc.recipe.egg
eggs = Nose

An aside about bootstrap.py

We now have a config file that we’re reasonably sure will do something, if we’re really lucky, it’ll do something that we actually want it to do. But how do we run it? We will need buildout itself, but we don’t have that yet. At this point, there are two ways to proceed.

  1. sudo apt-get install python-zc.buildout
  2. wget http://python-distribute.org/bootstrap.py && python bootstrap.py

For various reasons, unless you need a very specific version of buildout, it is best to use the bootstrap.py file. This is a simple file that contains enough of buildout to install buildout itself inside your environment. As it’s cleanly installed, it can be upgraded, version pinned and generally used in the same manner as in a virtualenv style build. If you system-install buildout, you will not be able to easily upgrade the buildout instance, and may run into version conflicts if a project specifies a version newer than the one you have. Both approaches have their advantages; I prefer the second as it is slightly more self contained. Mixing the approaches (using bootstrap.py with a system install) is possible, but can expose some bugs in the buildout install procedure.
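
As a quick preview of the version pinning mentioned above (we will come back to it in a later article), it is done with a versions section in buildout.cfg; the version numbers below are purely illustrative:

[buildout]
parts = noseinstall
versions = versions

[versions]
zc.buildout = 1.4.4
zc.recipe.egg = 1.3.2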

The rest of this document is going to assume that you have used bootstrap.py to install buildout.

Running some buildout

Now that we have a method of running buildout, it’s time to do it in the directory where we left the buildout.cfg file we created earlier:

$> bin/buildout

At this point, buildout will output something along the lines of:

Getting distribution for 'zc.recipe.egg'.
Got zc.recipe.egg 1.3.2.
Installing noseinstall.
Getting distribution for 'Nose'.
no previously-included directories found matching 'doc/.build'
Got nose 1.0.0.
Generated script '/home/tomwardill/tmp/buildoutwriteup/bin/nosetests-2.6'.
Generated script '/home/tomwardill/tmp/buildoutwriteup/bin/nosetests'.

Your output may not be exactly the same, but it should broadly contain those lines.

The simple sample here is using the zc.recipe.egg recipe. This is probably the most common of all buildout recipes, as it is the one that does the heavy work of downloading an egg, analysing its setup.py for dependencies (and installing them if required), and then finally installing the egg into the buildout path for use. Recipes are just python eggs that contain code that buildout will run. The easiest way to think of this is that while a recipe is an egg, the recipe contains instructions for the buildout process itself, and therefore will not be available to your code at the end.

An analysis of the buildout output shows exactly what it has done. It has downloaded an egg for zc.recipe.egg and run the noseinstall part. Let’s take a closer look at that noseinstall part from before:

[noseinstall]
recipe = zc.recipe.egg
eggs = Nose

So, we can see why buildout has installed zc.recipe.egg, it is specified in the recipe option of this part, so buildout will download it, install it and then run it. We will take a closer look at the construction of a recipe in a later article, but for now, assume that buildout has executed a bunch of python code in the recipe, and we’ll carry on.
The python code in this case will look at the part that it is in, and look for an option called eggs. As we have specified this option, it will treat it as a list and install all the eggs that we have listed; in this case, just the one, the unittest test runner Nose.
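
As an aside, the eggs option is a newline-separated list, so pulling in more than one package from the same part would look something like this (the second egg here is just an example):

[noseinstall]
recipe = zc.recipe.egg
eggs =
    Nose
    coverage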
As you can see from the bottom of the buildout output, the recipe has downloaded Nose, extracted it and created two files: bin/nosetests and bin/nosetests-2.6. Running one of those files like so:

$> bin/nosetests

----------------------------------------------------------------------
Ran 0 tests in 0.002s

OK
$>

We can see that this is nose, as we expect it to be. Two files have been generated because that is what the setup.py for Nose defines: a base nosetests executable, and one for the specific python version that we have used (python 2.6 in my case). How these scripts are specified in the setup.py that makes up the nose egg will be covered in a later article.
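
As a brief preview of that, console scripts like these typically come from a console_scripts entry point declared in setup.py. The snippet below is a minimal, illustrative setup.py (not Nose’s actual one), assuming a module called example that provides a main() function:

from setuptools import setup

setup(
    name='example',
    version='1.0',
    py_modules=['example'],  # assumes an example.py module alongside setup.py
    entry_points={
        # each entry here becomes a script in bin/ when the egg is installed
        'console_scripts': [
            'example-tool = example:main',
        ],
    },
)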

Conclusion

We can install buildout into a development environment, and use a simple config file to install a python egg. The next article will cover a development example for use with Django, and some niceties such as version pinning and running python in our buildouted environment.

Announcing FuncBrows – A browser testing tool

For the short version: System web browser testing abstraction layer. Get it here

For the last month or so, I have been heavily investigating various functional browser testing tools, with the aim of adding them to our Continuous Integration build.

The History

I settled on using zc.testbrowser for writing quick, functional tests that can be added to the end of a unit test run. As testbrowser isn’t a full browser in itself, it’s fast to run tests with, and can easily be integrated with the Zope and Plone functional testing tools.
However, for a full suite of integration tests, testbrowser isn’t a great fit. It doesn’t support javascript, and won’t tell you that ‘The login button is missing in IE6’, or that ‘Firefox 3 can’t access the profile changer’. For this type of test, the current leader is Selenium, which can instrument and drive real browsers (IE, Firefox, Chrome, Safari), in order to perform more accurate real world tests.
Selenium is currently undergoing a massive revamp in order to add better functionality and clean up the API (including a mode that will work similarly to testbrowser); however, this means that we are currently stuck with the older, stable version, as it has python bindings and more documentation.

So, given these two tools, I wrote a simple suite of tests for a project. They registered a user, logged in, logged out, and changed a user profile. Not massively complex, but it’s often surprising how many times such simple processes can be broken without anyone noticing, as you tend to always use the same test user details.
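
To give a flavour of what such a testbrowser test looks like, here is a rough sketch; the URL, link text and form field names are illustrative only, and the API shown is the zope.testbrowser style that zc.testbrowser shares:

from zope.testbrowser.browser import Browser

browser = Browser()
browser.open('http://localhost:8000/')

# Register and log in with a known test user
browser.getLink('Register').click()
browser.getControl(name='username').value = 'testuser'
browser.getControl(name='password').value = 'secret'
browser.getControl('Sign up').click()
assert 'Welcome, testuser' in browser.contents

# Log out again, and check we really are logged out
browser.getLink('Log out').click()
assert 'Log in' in browser.contents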

This was all well and good, and everyone was happy and it was sunny.

The Problem

The problem then became that although developers can write testbrowser scripts fairly easily and run them quickly, selenium is a lot more heavyweight, requiring selenium installs, multiple browsers and virtual machines to run a full test.

Fundamentally, the selenium API is very different from testbrowser, and asking people to write both was never going to happen.

This meant that selenium was added to the CI build, but the developers would never usually run the tests themselves, creating a disconnect between the current tests that the developers would run as part of their unit test suite, and what would be tested more heavily with the CI build.

The Solution

I (with a lot of help from other people, some ‘creative language’, and some frustration) created FuncBrows. This is a simple, lightweight abstraction tool over both testbrowser and selenium, which can easily be extended to add more tools when required (selenium v2 and twill are on the target list).
It can easily be included and configured in one line, with a base set of tests that can then be run by every tool, as required.

This means that the developers can write fast tests for their own use, and the exact same code can then be reused for more complete browser testing later in the system tests – a quick, simple way to smoke test for browser issues.

There is a samples directory in the github repository, with a simple example of how to set up the tests so that they can be run with either the python standard test runner, or nosetests.
It’s fairly simple and can’t do any advanced stuff; it has only progressed to the stage where we could dogfood our existing tests, and I expect the API to grow slightly as we need more functionality from it.
Patches and issues gratefully accepted at the github page.

Get the bits here: Isotoma Github
Or ‘easy_install FuncBrows’

Big thanks to Rapidswitch

Our server ISP is RapidSwitch. It’s unfortunate, but most ISPs are pretty poor – there aren’t enough good people to go around, margins are very tight and it’s just the kind of work where it’s hard to keep good staff. Last night RapidSwitch showed that at least not all ISPs are poor.

Because of the OpenSSL issues yesterday, we chose to reboot all of our servers last night, to ensure every service was using the new SSL code. A new kernel image also came down yesterday, and a number of our machines had the updated kernel.

We rebooted a number of machines on one cluster simultaneously… and they didn’t come back. We requested a KVM session, but in the meantime one of RapidSwitch’s engineers had noticed 4 of our machines were down simultaneously, so he went and took a look. Proactivity!

He worked out what had happened, and raised a ticket for us, telling us that the new Debian kernel was incompatible with our Network Cards. We asked him to manually boot the machines into the previous kernel, and they came back up without a hitch. Clue!

He then said RapidSwitch were aware of this issue and they were offering a free PCI network card to anyone who needed them. Planning!

Frankly this is unheard of in my experience. Massively well done guys – that’s what I call service.

Debian’s OpenSSL Disaster

Many of you will know by now of the serious security problems revealed yesterday by Debian, the Linux distribution. We use Debian exclusively for our server platform, so we had to react very quickly to this issue, and make sure our systems were secure again. I think we’ve made all of the necessary changes now to ensure we’re safe from this particular problem.

I have also made some attempt to get to the bottom of what actually went on, and I’ll record it here for posterity. If any of the below is wrong, please let me know!

What Happened

The story, basically, is this. In April 2006 bug #363516 was raised, suggesting that openssl wasn’t clean for valgrind. Valgrind is a package that detects problems in C code, and is widely used to help ensure software is correct. Valgrind reported some errors with openssl, and the reporter wanted to be able to use valgrind with openssl.

At that bug URL, a change to the openssl codebase is discussed. The general feeling from the bug discussion is that making this change isn’t a good idea, but a patch was applied anyway on 4th May 2006. There are two instances of the specific issue in the bug: one in ssleay_rand_add and one in ssleay_rand_bytes.

In the meantime, a discussion took place on the openssl-dev list. This mentions the same two lines, and on the 1st May ulf@openssl.org says he is in favour of removing them.

The patch amends the two lines suggested.

The problem, as I understand it, was a misunderstanding by the Debian Developer who made the change. The change to ssleay_rand_bytes was harmless – the line removed there merely added some uninitialised memory into the entropy pool, and since the software doesn’t rely on that memory for security, removing it is fine.

But the other change, in ssleay_rand_add, is a complete disaster. ssleay_rand_add is the function that mixes new entropy into the pool, so removing that line meant that almost no real entropy was added at all; effectively, only the process ID was left to seed the random number generator used in key generation – a serious flaw.
This reduces the keyspace to a few hundred thousand possible keys. It’s possible to generate all of these keys in a few hours, and then potentially brute force a machine that’s using public key authentication with a compromised key in a few minutes. This is a security disaster of the first water, considering the number of organisations (such as ours) that rely on public key authentication for a lot of their inter-machine security. This also affected key generation for self-signed email and web certificates, private networks, anonymous proxy networks and all sorts of other things. The cleaning up is going to take some time, and cost an awful lot. Some people are going to be compromised by this, and a lot of machines may be broken into.

Some background on how distributions work

Debian has a vast number of packages under management. They produce these packages by taking source packages from “upstream” (the people who maintain the software) and modifying them to fit the rules and goals of the distribution.

Some of these changes are for compatibility – for example, using standard file locations or configuration systems. Some of them are mechanical changes to do with integration with the build process. Quite a few changes are bug fixes.
It’s recommended that bug fixes be coordinated with upstream – send patches back to them, so everyone in the community can benefit from the changes.

Whose Fault Was It?

After going through the above, it’s pretty clearly the fault of the DD (Debian Developer) in question. Although he suggested making changes on the openssl-dev list, and got an affirmative from someone on the project, it was pretty clear in the response that this was “if this helps with debugging it’s a good idea”, not “I’ve closely read the code in question, and I agree”.

The DD should have also submitted a patch back to the openssl guys. They’d have spotted the error and screamed blue murder. He was a bit lazy and thoughtless here, and I imagine right now he wishes he could crawl into a hole and die.

What to do about it

Debian are getting badly slammed for this, but it is worth keeping some perspective. We, and many others, use Debian because of its long history of excellent package quality. This is a result both of their culture (which is aggressively perfectionist) and their selection criteria for developers, which weed out many dodgy ones. We are proud to use Debian, and will continue to do so.

DDs are generally conscientious, knowledgeable and dedicated to their work. I have no reason to believe this DD was any different. Even conscientious, knowledgeable and dedicated people make mistakes. This is what process is for: to help mitigate human error. I think there was clearly a lack of process.
Two things would have really helped: code review internal to Debian, and code review by upstream. I don’t think it’s unreasonable that Debian should require both for non-critical changes to security-critical packages, and even critical changes should be reviewed as soon as possible.
Internal code review is impractical for every package, since it requires a good understanding of the code in question and would impose a huge workload – but for critical packages I think it’s a necessity.

Upstream review is potentially tricky too. Some upstreams don’t have the time or inclination to participate. There is also often a lot of friction between distributions and upstream, since they have very different goals. This isn’t a problem that can be easily resolved – these groups really do have different goals and values, and sometimes irreconcilable differences arise. But for the good of their eventual users, they need to work together to help stop this sort of problem occurring.