
Buildout Basics Part 1

Introduction to the series

This is the first in a 3 part series of tutorials about creating, configuring and maintaining buildout configuration files, and how buildout can be used to deploy and configure both python-based and other software.
During the course of this series, I will cover buildout configuration files, some buildout recipes and a simple overview structure of a buildout recipe. I will not cover creating a recipe, or developing buildout itself.

For a very good guide on the python packaging techniques that we will be relying on, see this guide: http://guide.python-distribute.org

All code samples will be python 2.4+ compatible; system command lines will be debian/ubuntu specific, but simple enough to generalise to most systems (OSX and Windows included).
Where a sample project or code is required, I've used Django, as it's what I'm most familiar with, but this series is all about the techniques and configuration in buildout itself, not Django specifically, so don't be scared off if you happen to be using something else.

Buildout Basics

So, what’s this buildout thing anyway?

If you're a python developer, or even just a python user, you will probably have come across either easy_install or pip (or both). These are pretty much two methods of achieving the same thing; namely, installing python software:

$> sudo easy_install Django

This is a fairly simple command: it will install Django onto the system path, so from anywhere you can do

>>> import django
>>>

While this is handy, it's not ideal for production deployment. System-installing a package will lead to maintenance problems, and will probably also lead to version conflicts, particularly if you have multiple sites or environments deployed on the same server. One environment may need Django 1.1, the newest may need 1.3. There are significant differences in the framework from one major version to another, and a clean upgrade may not be possible. So system-installing things is generally considered to be a bad idea, unless the machine is guaranteed to be dedicated.

So what do you do about it?
One answer is to use a virtualenv. This is a python package that will create a ‘clean’ python environment in a particular directory:

$> virtualenv deployment --no-site-packages

This will create a directory called 'deployment', containing a clean python interpreter with only a local path. This environment will ignore any system-installed packages (--no-site-packages), and give you a fresh, self-contained python environment.

Once you have activated this environment, you can then easy_install or pip install the packages you require, and then use them as you would normally, safe in the knowledge that anything you install in this environment is not going to affect the mission-critical sales (or pie-ordering website) process that’s running in the directory next door.
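As a quick sketch of that workflow (assuming a bash-style shell, the 'deployment' directory created above, and a virtualenv recent enough to bundle pip; Django is just an example package here):

$> source deployment/bin/activate
(deployment)$> pip install Django
(deployment)$> python -c "import django; print django.get_version()"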

So, if virtualenv can solve this problem for us, why do we need something else, something more complex to solve essentially the same problem?

The answer is that buildout doesn't just solve this problem; it solves a whole lot more problems, particularly when it comes to 'how do I release this code to production, yet make sure I can still work on it, without breaking the release?'

Building something with buildout

The intent is to show you the parts of a buildout config, then show how they all fit together. If you want to see the final product first and then disassemble it to find the overall picture, scroll to the end of this, have a look, then come back. Go on, it's digital, this will still be here when you come back….

Config Files

Pretty much everything that happens with buildout is controlled by its config file (this isn't quite true, but hey, 'basics'). A config file is a simple ini-style (ConfigParser) text file that defines some sections, some options for those sections, and values for those options.

In this case, the sections of a buildout config file (henceforth referred to as buildout.cfg) are generally referred to as parts. The most important of these parts is the buildout part itself, which controls the options for the buildout process.

An absolute minimum buildout part looks something like this:

[buildout]
parts = noseinstall

While this is not a complete buildout.cfg, it is the minimum that is required in the buildout part itself. All it is doing is listing the other parts that buildout will use to actually do something; in this case, it is looking for a single part named noseinstall. As this part doesn't exist yet, it won't actually work. So, let's add the part and, in the next section, see what it does:

[buildout]
parts = noseinstall

[noseinstall]
recipe = zc.recipe.egg
eggs = Nose

An aside about bootstrap.py

We now have a config file that we’re reasonably sure will do something, if we’re really lucky, it’ll do something that we actually want it to do. But how do we run it? We will need buildout itself, but we don’t have that yet. At this point, there are two ways to proceed.

  1. sudo apt-get install python-zc.buildout
  2. wget http://python-distribute.org/bootstrap.py && python bootstrap.py

For various reasons, unless you need a very specific version of buildout, it is best to use the bootstrap.py file. This is a simple file that contains enough of buildout to install buildout itself inside your environment. As it's cleanly installed, it can be upgraded, version pinned and generally used in the same manner as in a virtualenv-style build. If you system-install buildout, you will not be able to upgrade that buildout instance easily, and you may run into version conflicts if a project specifies a version newer than the one you have. Both approaches have their advantages; I prefer the second, as it is slightly more self-contained. Mixing the approaches (using bootstrap.py with a system install) is possible, but can expose some bugs in the buildout install procedure.

The rest of this document is going to assume that you have used bootstrap.py to install buildout.

Running some buildout

Now that we have a method of running buildout, it's time to do so in the directory where we left the buildout.cfg file created earlier:

$> bin/buildout

At this point, buildout will output something along the lines of:

Getting distribution for 'zc.recipe.egg'.
Got zc.recipe.egg 1.3.2.
Installing noseinstall.
Getting distribution for 'Nose'.
no previously-included directories found matching 'doc/.build'
Got nose 1.0.0.
Generated script '/home/tomwardill/tmp/buildoutwriteup/bin/nosetests-2.6'.
Generated script '/home/tomwardill/tmp/buildoutwriteup/bin/nosetests'.

Your output may not be exactly the same, but it should contain broadly the same lines.

The simple sample here is using the zc.recipe.egg recipe. This is probably the most common of all buildout recipes, as it is the one that will do the heavy work of downloading an egg, analysing its setup.py for dependencies (and installing them if required), and then finally installing the egg into the buildout path for use. Recipes are just python eggs that contain code that buildout will run. The easiest way to think of this is that while a recipe is an egg, it contains instructions for the buildout process itself, and therefore will not be available to your code once the buildout has finished.
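Recipe authoring is out of scope for this series, but purely for orientation, the code inside a recipe egg boils down to a class along these lines (a simplified sketch, not zc.recipe.egg itself):

class ExampleRecipe(object):
    """Minimal shape of a buildout recipe: buildout creates one instance
    per part, passing in that part's options, then calls install()."""

    def __init__(self, buildout, name, options):
        self.buildout = buildout
        self.name = name
        self.options = options

    def install(self):
        # Do the real work here (fetch eggs, write scripts or config files)
        # and return the paths created, so buildout can clean them up later.
        return []

    def update(self):
        # Called on subsequent runs when the part's options are unchanged.
        pass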

An analysis of the buildout output shows exactly what it has done. It has downloaded an egg for zc.recipe.egg and run the noseinstall part. Let’s take a closer look at that noseinstall part from before:

[noseinstall]
recipe = zc.recipe.egg
eggs = Nose

So, we can see why buildout has installed zc.recipe.egg, it is specified in the recipe option of this part, so buildout will download it, install it and then run it. We will take a closer look at the construction of a recipe in a later article, but for now, assume that buildout has executed a bunch of python code in the recipe, and we’ll carry on.
The python code in this case will look at the part that it is in, and look for an option called eggs. As we have specified this option, it will then treat it as a list, and install all the eggs that we have listed; in this case just the one, the unittest test runner Nose.
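If we wanted more than one egg, we would simply add to that list, one per line. For example (coverage here is just an illustrative extra package, not something this part needs):

[noseinstall]
recipe = zc.recipe.egg
eggs =
    Nose
    coverage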
As you can see from the bottom of the buildout output, the recipe has downloaded Nose, extracted it and created two files: bin/nosetests and bin/nosetests-2.6. Running one of those files like so:

$> bin/nosetests

----------------------------------------------------------------------
Ran 0 tests in 0.002s

OK
$>

We can see that this is nose, as we expect it to be. Two files have been generated because that is what the setup.py for Nose defines: a base nosetests executable, and one for the specific python version that we have used (python 2.6 in my case). These are specified in the setup.py that makes up the Nose egg, which will be covered in a later article.

Conclusion

We can install buildout into a development environment, and use a simple config file to install a python egg. The next article will cover a development example for use with Django, and some niceties such as version pinning and running python in our buildouted environment.

Announcing FuncBrows – A browser testing tool

For the short version: System web browser testing abstraction layer. Get it here

For the last month or so, I have been heavily investigating various functional browser testing tools, with the aim of adding them to our Continuous Integration build.

The History

I settled on using zc.testbrowser for writing quick, functional tests that can be added to the end of a unit test run. As testbrowser isn’t a full browser in itself, it’s fast to run tests with, and can easily be integrated with the Zope and Plone functional tools.
However, for a full suite of integration tests, testbrowser isn’t a great fit. It doesn’t support javascript, and won’t tell you that ‘The login button is missing in IE6’, or that ‘Firefox 3 can’t access the profile changer’. For this type of test, the current leader is Selenium, which can instrument and drive real browsers (IE, Firefox, Chrome, Safari), in order to perform more accurate real world tests.
Selenium is currently undergoing a massive revamp in order to add better functionality and clean up the API (including a mode that will work similar to testbrowser), however this means that we are currently stuck with the older, stable version, as it has python bindings and more documentation.

So, given these two tools, I wrote a simple suite of tests for a project. They registered a user, logged in, logged out, and changed a user profile. Not massively complex, but it's often surprising how many times such simple processes can be broken, and you won't notice as you tend to always use the same test user details.
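To give a flavour of the testbrowser side, a login smoke test looks roughly like this (a sketch in the zope.testbrowser flavour of the API, which zc.testbrowser closely mirrors; treat the exact import as an assumption, and the URL and form field names are made up for illustration):

from zope.testbrowser.browser import Browser

browser = Browser()
browser.open('http://localhost:8080/')                    # hypothetical site
browser.getLink('Log in').click()
browser.getControl(name='username').value = 'testuser'    # made-up field names
browser.getControl(name='password').value = 'secret'
browser.getControl('Log in').click()
assert 'Logged in as testuser' in browser.contents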

This was all well and good, and everyone was happy and it was sunny.

The Problem

The problem then became that although developers can write testbrowser scripts fairly easily and run them quickly, selenium is a lot more heavyweight, requiring selenium installs, multiple browsers and virtual machines to run a full test.

Fundamentally, the selenium API is very different from testbrowser, and asking people to write both was never going to happen.

This meant that selenium was added to the CI build, but the developers would never usually run the tests themselves, creating a disconnect between the current tests that the developers would run as part of their unit test suite, and what would be tested more heavily with the CI build.

The Solution

I (with a lot of help from other people, some 'creative language', and some frustration) created FuncBrows. This is a simple, lightweight abstraction tool over both testbrowser and selenium that can easily be extended to add more tools when required (selenium v2 and twill are on the target list).
It can easily be included and configured in one line, with a base set of tests that can then be run by every tool, as required.

This means that the developers can write fast tests for their own use, and the exact same code can then be reused for more complete browser testing later in the system tests: a quick, simple way to smoke test for browser issues.

There is a samples directory in the github repository, with a simple example of how to set up the tests so that they can be run with either the python standard test runner, or nosetests.
It's fairly simple and can't do any advanced stuff; it only progressed to the stage where we could dogfood our existing tests. I expect the API to grow slightly as we need more functionality from it.
Patches and issues gratefully accepted at the github page.

Get the bits here: Isotoma Github
Or ‘easy_install FuncBrows’

Installing Postgis on Ubuntu Karmic Koala (9.10)

Karmic has Postgres 8.4 as its default version, and 8.3 can prove tricky to install.
Unfortunately, not all of the contrib packages were upgraded to run on the new version in time.

One of these is postgis, the geodatabase extensions. You have two options: build it from source, or install the Lucid (10.04) package.
Obviously, Lucid isn’t released yet, but I have had success with just installing the package. Your Mileage May Vary.

i386 Package
AMD Package
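Once you've downloaded one of the packages above, installing it is just a dpkg call (a sketch; substitute the filename of whichever package you grabbed):

$> sudo dpkg -i [DOWNLOADED_PACKAGE].deb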

Hope this helps someone!
–t

Dedicated Server Configuration in L4D2 (or how isotoma play zombies)

Here at isotoma, we’re partial to a few evenings spent giggling like maniacs and coming up with inventive ways to kill as many zombies as possible.
Nothing quite like arguing with a coworker about whether an aerial kill with a frying pan is better than a survivor steered into a convenient pool of spit.

Naturally, our game of choice is the rather good Left 4 Dead 2.

However, our experience of the L4D2 lobby system has been variable, with some servers being slow, or configured weirdly. The usual answer for this is to run a dedicated server, as has been done since time untold (Quake).

In L4D2, the tools to do this aren’t immediately obvious, and have a nasty habit of changing from patch to patch.
This is a quick note to show the current method that we have found works.
I won't cover how to install and set up the server; there are plenty of guides out there for that. This is how to configure it, and how to start a game on that server.

Server Configuration

In server.cfg, do this (replace things in [] with a value, including the brackets themselves):

hostname "[SERVERNAME]"
sv_region 3
sv_search_key "[AUNIQUEKEY]"
rcon_password "rcon"
sv_lan 0
sv_allow_lobby_connect_only 0

The important part of this is the sv_search_key setting. This should be a unique key (a steam account name is a reasonable bet), and something that you can type into the game console. Do not attempt to set the Steam group settings that you can find documented elsewhere. This will break the search key, and mean you then cannot reliably connect to this exact server.

Run the server with a line similar to:

./srcds_run l4d2 -autoupdate +ip [SERVERIP] +hostport [PORT] +exec server.cfg

SERVERIP should be the public ip of the server machine
PORT should be the port to run the server on (27203 is our favourite)

Once this is done, you should see some nice console lines scrolling by and your server is now up. The search_key setting means that only people with that key will be able to find and connect to the server using the lobby system. You will be able to connect directly using the server ip and port if necessary.

Connecting

In the game, go to Settings > Keyboard Settings and enable the ‘Developer Console’

To start a lobby game:
From main game screen
Bring down console (~ key, usually next to 1 on a keyboard)
Type: sv_search_key [AUNIQUEKEY]
(This should be the sv_search_key you set above)
Create a game with friends
Do not change the server choice (dedicated, best available etc), leave it as it appears.
You can change game settings, maps etc.
Play

To connect directly:
Bring down console (~ key, usually next to 1 on a keyboard)
Type: connect [server ip]:[port]
This will drop you straight into the server, from where you can invite people to join you.

This is all a bit fiddly, but it works for us. I hope it helps someone who was as lost as I was in getting this going.

–t

Textual Log Analysis using Python

Here at isotoma, we have a company irc channel that is used for general communication, chattering and link sharing.
Everyone joins it at the start of the day, and keeps up to date with what’s going on, and who’s talking about what.
lolcats are occasionally mentioned.

Now, having logs of the channel reaching many megabytes, I was curious about the text statistics produced by this channel: who has what reading age, and how much each person has talked in comparison to the others.
While I won’t release the actual statistics I’ve gathered for the channel, I did think it’d be cool to release the script I wrote to do the analysis itself.

It uses the Natural Language Toolkit (NLTK), and the readability contrib module for it. It’s not particularly nice code (inline html generation and other nastiness), but it does work. I’ll attempt to release a cleaned up version when I get some more time to work on it.

Currently, it expects a log in the format from znc, the irc bouncer software that I use, although it can be modified easily by altering the timestamp_count to the correct number to skip the timestamp. It also expects nicks to be surrounded in ‘<’ and ‘>’. I _did_ say it wasn’t particularly nice code.
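For anyone wanting to adapt it, the parsing side boils down to something like this sketch (not the attached script itself; the timestamp column count, the log filename and the word tally are illustrative, and the NLTK/readability analysis would sit on top of the per-nick text gathered here):

TIMESTAMP_COUNT = 1  # whitespace-separated timestamp columns to skip; adjust for your log format

def parse_log(path):
    """Return a mapping of nick -> list of words they said."""
    words_by_nick = {}
    for line in open(path):
        parts = line.split()[TIMESTAMP_COUNT:]   # drop the timestamp columns
        if not parts or not parts[0].startswith('<'):
            continue  # skip joins, parts, topic changes and so on
        nick = parts[0].strip('<>')
        words_by_nick.setdefault(nick, []).extend(parts[1:])
    return words_by_nick

if __name__ == '__main__':
    for nick, words in parse_log('channel.log').items():
        print '%s: %d words' % (nick, len(words))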

However, code style issues aside, it is a demonstration and example of using NLTK and the readability module on real world data, and the output is kind of cool. Especially when you find out that the ircbot has a higher reading age than you.

Find the source attached.

log-analyser

Uniquifying Delicious Links

We use delicious a lot here at isotoma, but given that we’re all likely to mostly bookmark the same links, we thought it’d be cool to filter the stream for a single copy of each link.

Click here

This site will pull the streams for the entered users and give you all the links that were posted over the last given number of days (or the last 100 items, whichever is smaller).

I hope it’s of use.

Formatting timedelta objects in python

Unlike datetime and the other date-related objects in python, timedelta does not have any useful string formatting options.

It also has no external method that can format it into something that you might want to present to an end user. As such, say you wanted Hours:Minutes:Seconds, you’d have to do some fancy things with the .seconds attribute on the timedelta.

To save time, the code for ‘fancy things’ is given below:

# duration_time_delta is a datetime.timedelta; .seconds only covers the
# time-of-day part, so add .days back in if the delta can span whole days
total_seconds = duration_time_delta.days * 86400 + duration_time_delta.seconds
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

duration_formatted = '%02d:%02d:%02d' % (hours, minutes, seconds)  # zero-padded H:M:S
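As a quick end-to-end example (the dates here are arbitrary), the timedelta comes straight from subtracting two datetimes:

from datetime import datetime

start = datetime(2011, 5, 20, 9, 30, 0)
end = datetime(2011, 5, 20, 17, 45, 12)
duration_time_delta = end - start

total_seconds = duration_time_delta.days * 86400 + duration_time_delta.seconds
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)
print '%02d:%02d:%02d' % (hours, minutes, seconds)   # prints 08:15:12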

Debugging django unit tests with WING IDE

Wing is a nice IDE: it does autocompletion, project management and other fancy stuff like SVN integration.

There is a page here on how to set up Wing so that it'll debug within a django environment, including how to make it work with the auto-reloading feature; however, this will not help if you wish to debug within unit tests, which can be extremely useful, particularly if you are following TDD or test-first philosophies.

The answer to this:

  1. Set the manage.py of your django project to be the ‘Main Debug File’. Do this by finding manage.py in the Project View and right-clicking on it. The option is about halfway down the drop-down menu.
  2. Set the environment variables required in the Project settings.
  3. Right click the manage.py again and go to “File Properties”
  4. In the Debug tab of File Properties, set the ‘Run Arguments’ to ‘test’ (without the quotes)
  5. Add a breakpoint somewhere in your tests
  6. Press Debug
  7. Marvel.

Hope this helps someone. I've just spent time getting it working, only to blast my .wpr file and have to remember how to do it again, so I thought I'd write it down.