Category Archives: System Engineering

The problem with Backing Stores, or what is NoSQL and why would you use it anyway

Durability is something that you normally want somewhere in a system: the data should survive reboots, crashes, and the other sorts of failures that routinely happen to real-world systems.

Over the many years that I have worked in system design, there has been a recurring thorny problem of how to handle this durable data.  What this means in practice when building a new system is the question “what should we use as our backing store?”.  Backing stores are often called “databases”, but everyone has a different view of what database means, so I’ll try and avoid it for now.

In a perfect world a backing store would be:

  • Correct
  • Quick
  • Always available
  • Geographically distributed
  • Highly scalable

While we can do these things quite easily these days with the stateless parts of an application, doing them with durable data is non-trivial. In fact, in the general case, it’s impossible to do all of these things at once (The CAP theorem describes this quite well).

This has always been a challenge, but as applications move onto the Internet, and as businesses become more geographically distributed, the problem has become more acute.

Relational databases (RDBMSes) have been around a very long time, but they’re not the only kind of database you can use. There have always been other kinds of store around, but the so-called NoSQL Movement has had particular prominence recently. This champions the use of new backing stores not based on the relational design, and not using SQL as a language. Many of these have radically different designs from the sort of RDBMS system that has been widely used for the last 30 years.

When and how to use NoSQL systems is a fascinating question, and I'll put forward our thinking on this here. As always, it's kind of complicated. It certainly isn't the case that throwing out an RDBMS and sticking in Mongo will make your application awesome.

Although they are lumped together as “NoSQL”, this is not actually a useful definition, because there is very little that all of these have in common. Instead I suggest that there are these types of NoSQL backing store available to us right now:

  • Document stores – MongoDB, XML databases, ZODB
  • Graph databases – Neo4j
  • Key/value stores – Dynamo, BigTable, Cassandra, Redis, Riak, Couch

These are so different from each other that lumping them into the same category is really quite unhelpful.

Graph databases

Graph databases have some very specific use cases, for which they are excellent, and they probably have a lot of utility elsewhere. However, for our purposes they’re not something we’d consider generally, and I’ll not say any more about them here.

Document stores

I am pretty firmly in the camp that Document stores, such as MongoDB, should never be used generally either (for which I will undoubtedly catch some flak). I have a lot of experience with document databases, particularly ZODB and dbxml, and I know whereof I speak.

These databases store “documents” as schema-less objects. What we mean by a “document” here is something that is:

  • self-contained
  • always required in its entirety
  • more valuable than the links between documents or its metadata.

My experience is that although you may often think you have documents in your system, in practice this is rarely the case, and it certainly won’t continue to be the case. Often you start with documents, but over time you gain more and more references between documents, and then you gain records and all sorts of other things.

Document stores are poor at handling references, and because of the requirement to retrieve things in their entirety you denormalise a lot. The end result of this is loss of consistency, and eventually doom with no way of recovering consistency.
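
That drift is easy to demonstrate. Here is a minimal sketch (in Python, with hypothetical data, using plain dicts to stand in for stored documents) of how copying data into each document loses consistency:

```python
# Each article carries its own copy of the author's details, because the
# store retrieves documents whole and handles references poorly.
articles = {
    "a1": {"title": "First post", "author": {"id": 7, "name": "Alice"}},
    "a2": {"title": "Second post", "author": {"id": 7, "name": "Alice"}},
}

# Renaming the author means finding and rewriting every copy...
articles["a1"]["author"]["name"] = "Alice Smith"
# ...and if any update is missed, there is no single authoritative
# record left to recover the right answer from.
names = {doc["author"]["name"] for doc in articles.values()}
print(names)  # two different names for author 7: the copies disagree
```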

We do not recommend document stores in the general case.

Key/value stores

These are the really interesting kind of NoSQL database, and I think these have a real general potential when held up against the RDBMS options.  However, there is no magic bullet and you need to choose when to use them carefully.

You have to be careful when deciding to build something without an RDBMS. An RDBMS delivers a huge amount of value in a number of areas, and for all sorts of reasons. Many of the reasons are not because the RDBMS architecture is necessarily better but because they are old, well-supported and well-understood.

For example, PostgreSQL (our RDBMS of choice):

  • has mature software libraries for all platforms
  • has well-understood semantics for backup and restore, which work reliably
  • has mature online backup options
  • has had decades of performance engineering
  • has well understood load and performance characteristics
  • has good operational tooling
  • is well understood by many developers

These are significant advantages over newer stores, even if they might technically be better in specific use cases.

All that said, there are some definite reasons you might consider using a key/value store instead of an RDBMS.

Reason 1: Performance

Key/value stores often naively appear more performant than RDBMS products, and you can see some spectacular performance figures in direct comparisons. However, none of them really provide magic performance increases over RDBMS systems, what they do is provide different tradeoffs. You need to decide where your performance tradeoffs lie for your particular system.

In practice what key/value stores mostly do is provide some form of precomputed cache of your data, by making it easy (or even mandatory) to denormalize your data, and by providing the performance characteristics to make pre-computation reasonable.

If you have a key/value store that has high write throughput characteristics, and you write denormalized data into it in a read-friendly manner then what you are actually doing is precomputing values. This is basically Just A Cache. Although it’s a pattern that is often facilitated by various NoSQL solutions, it doesn’t depend on them.

RDBMS products are optimised for correctness and query performance; write performance takes second place to these. This means they are often not a good place to implement a pre-computed cache (where you often write values you never read).

It’s not insane to combine an RDBMS as your master source of data with something like Redis as an intermediate cache.  This can give you most of the advantages of a completely NoSQL solution, without throwing out all of the advantages of the RDBMS backing store, and it’s something we do a lot.
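
A minimal sketch of that combination (assuming a hypothetical users table; sqlite stands in for the RDBMS, and a plain dict stands in for Redis, so the example is self-contained):

```python
import sqlite3

cache = {}  # stand-in for Redis; in production use a redis client

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")

def get_user_name(user_id):
    """Read through the cache; the RDBMS remains the master copy."""
    key = "user:%d" % user_id
    if key in cache:  # cache hit: no database round trip
        return cache[key]
    row = db.execute("SELECT name FROM users WHERE id = ?",
                     (user_id,)).fetchone()
    name = row[0] if row else None
    cache[key] = name  # populate the cache for subsequent reads
    return name

def set_user_name(user_id, name):
    """Write to the master, then invalidate the cached copy."""
    db.execute("UPDATE users SET name = ? WHERE id = ?", (name, user_id))
    cache.pop("user:%d" % user_id, None)

print(get_user_name(1))  # 'alice', fetched from the database and cached
print(get_user_name(1))  # 'alice', served from the cache
```

Invalidate-on-write (rather than updating the cache in place) keeps the RDBMS as the single source of truth: a bug in the cache can cost you a round trip, but never your data.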

Reason 2: Distributed datastores

If you need your data to be highly available and distributed (particularly geographically) then an RDBMS is probably a poor choice. It’s just very difficult to do this reliably and you often have to make some very painful and hard-to-predict tradeoffs in application design, user interface and operational procedures.

Some of these key/value stores (particularly Riak) can really deliver in this environment, but there are a few things you need to consider before throwing out the RDBMS completely.

Availability is often a tradeoff one can sensibly make.  When you understand quite what this means in terms of cost, both in design and operational support (all of these vary depending on the choices you make), it is often the right tradeoff to tolerate some downtime occasionally.  In practice a system that works brilliantly almost all of the time, but goes down in exceptional circumstances, is generally better than one that is in some ways worse all of the time.

If you really do need high availability though, it is still worth considering a single RDBMS in one physical location with distributed caches (just as with the performance option above).  Distribute your caches geographically, offload work to them and use queue-based fanout on write. This gives you eventual consistency, whilst still having an RDBMS at the core.

This can make sense if your application has relatively low write throughput, because all writes can be sent to the single location RDBMS, but be prepared for read-after-write race conditions. Solutions to this tend to be pretty crufty.
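
The shape of that design can be sketched like this (hypothetical region names; dicts stand in for the RDBMS and the regional caches, and a queue.Queue for the fanout), including the read-after-write race described above:

```python
import queue

master = {}                      # the single-location RDBMS
caches = {"eu": {}, "us": {}}    # geographically distributed read caches
replication = queue.Queue()      # queue-based fanout on write

def write(key, value):
    master[key] = value
    for region in caches:        # fan out one message per cache
        replication.put((region, key, value))

def read(region, key):
    return caches[region].get(key)  # local cache only: may be stale

def drain():
    # In production a worker consumes the queue asynchronously;
    # here we drain it inline to show eventual consistency.
    while not replication.empty():
        region, key, value = replication.get()
        caches[region][key] = value

write("greeting", "hello")
print(read("eu", "greeting"))  # None: the read-after-write race
drain()
print(read("eu", "greeting"))  # now consistent: 'hello'
```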

Reason 3: Application semantics vs SQL

NoSQL databases tend not to have an abstraction like SQL. SQL is decent in its core areas, but it is often really hard to encapsulate some important application semantics in SQL.

A good example of this is asynchronous access to data as part of calculations. It’s not uncommon to need to query external services, but SQL really isn’t set up for this. Although there are some hacky workarounds, if you have a microservice architecture you may find SQL really doesn’t do what you need.

Another example is staleness policies.  These are particularly problematic when you have distributed systems with parts implemented in other languages such as Javascript, for example if your client is a browser or a mobile application and it encapsulates some business logic.

Endpoint caches in browsers and mobile apps need to honour the same staleness policies you might have in your backing store, so you end up implementing the same staleness policies in Javascript and then again in SQL, and maintaining both. These are hard to maintain and test at the best of times. If you can implement them in fewer places, or fewer languages, that is a significant advantage.
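
One way to reduce the duplication is to express the policy as plain data that every tier can share, with only a thin check in each language. A sketch (the resource names and max-ages here are hypothetical):

```python
import time

# Staleness policy as data: the same table can be shipped to the
# browser or mobile client rather than re-implemented in Javascript.
STALENESS_POLICY = {
    "user_profile": 300,   # seconds a cached copy stays fresh
    "product_price": 30,
}

def is_stale(resource, fetched_at, now=None):
    """Return True if a cached copy of `resource` is past its max age."""
    now = time.time() if now is None else now
    max_age = STALENESS_POLICY.get(resource, 0)  # unknown: always stale
    return (now - fetched_at) > max_age

print(is_stale("user_profile", fetched_at=1000, now=1200))   # False
print(is_stale("product_price", fetched_at=1000, now=1200))  # True
```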

In addition, it is a practical case that we’re not all SQL gurus. Having something that is suboptimal in some cases but where we are practically able to exploit it more cheaply is a rational economic tradeoff.  It may make sense to use a key/value store just because of the different semantics it provides – but be aware of how much you are losing without including an RDBMS, and don’t be surprised if you end up reintroducing one later as a platform for analysis of your key/value data.

Reason 4: Load patterns

NoSQL systems can exhibit very different performance characteristics from SQL systems under real loads. Having some choice in where load falls in a system is sometimes useful.

For example, if you have something that scales front-end webservers horizontally easily, but you only have one datastore, it can be really useful to have the load occur on the application servers rather than the datastore – because then you can distribute load much more easily.

Although this is potentially less efficient, it’s much easier and often cheaper to spin up more application servers at times of high load than it is to scale a database server on the fly.

Also, SQL databases tend to have far better read performance than write performance, so fan-out on write (where you might have 90% writes to 10% reads as a typical load pattern) is probably better implemented using a different backing store that has different read/write performance characteristics.

Which backing store to use, and how to use it, is the kind of decision that can have huge ramifications for every part of a system.  This post has only had an opportunity to scratch the surface of this subject and I know I’ve not given some parts of it the justice they deserve – but hopefully it’s clear that every decision has tradeoffs and there is no right answer for every system.

About us: Isotoma is a bespoke software development company based in York and London specialising in web apps, mobile apps and product design. If you’d like to know more you can review our work or get in touch.

Buildout Basics Part 1

Introduction to the series

This is the first in a 3 part series of tutorials about creating, configuring and maintaining buildout configuration files, and how buildout can be used to deploy and configure both python-based and other software.
During the course of this series, I will cover buildout configuration files, some buildout recipes and a simple overview structure of a buildout recipe. I will not cover creating a recipe, or developing buildout itself.

For a very good guide on the python packaging techniques that we will be relying on, see this guide:

All code samples will be python 2.4+ compatible, system command lines will be debian/ubuntu specific, but simple enough to generalise out to most systems (OSX and Windows included).
Where a sample project or code is required, I’ve used Django, as it’s what I’m most familiar with; but this series is all about the techniques and configuration of buildout itself, not Django, so don’t be scared off if you happen to be using something else.

Buildout Basics

So, what’s this buildout thing anyway?

If you’re a python developer, or even just a python user, you will probably have come across either easy_install or pip (or both). These are pretty much two methods of achieving the same thing; namely, installing python software:

$> sudo easy_install Django

This is a fairly simple command; it will install Django onto the system path, so from anywhere you can do

>>> import django

While this is handy, it’s not ideal for production deployment. System-installing a package will lead to problems with maintenance, and probably also to version conflicts, particularly if you have multiple sites or environments deployed on the same server. One environment may need Django 1.1, the newest may need 1.3. There are significant differences in the framework from one major version to another, and a clean upgrade may not be possible. So, system-installing things is generally considered to be a bad idea, unless it’s guaranteed to be a dedicated machine.

So what do you do about it?
One answer is to use a virtualenv. This is a python package that will create a ‘clean’ python environment in a particular directory:

$> virtualenv deployment --no-site-packages

This will create a directory called ‘deployment’, in which is a clean python interpreter with only a local path. This environment will ignore any system-installed packages (--no-site-packages), and give you a fresh, self-contained python environment.

Once you have activated this environment, you can then easy_install or pip install the packages you require, and then use them as you would normally, safe in the knowledge that anything you install in this environment is not going to affect the mission-critical sales (or pie-ordering website) process that’s running in the directory next door.

So, if virtualenv can solve this problem for us, why do we need something else, something more complex to solve essentially the same problem?

The answer is that buildout doesn’t just solve this problem, it solves a whole lot more, particularly when it comes to ‘how do I release this code to production, yet make sure I can still work on it, without breaking the release?’

Building something with buildout

The intent is to show you the parts of a buildout config, then show how it all fits together. If you want to see the final product and then disassemble it to find the overall picture, scroll to the end of this, have a look, then come back. Go on, it’s digital, this will still be here when you come back…

Config Files

Pretty much everything that happens with buildout is controlled by its config file (this isn’t quite true, but hey, ‘basics’). A config file is a simple ini-style (ConfigParser) text file that defines some sections, some options within those sections, and some values for those options.

In this case, the sections of a buildout config file (henceforth referred to as buildout.cfg) are generally referred to as parts. The most important of these parts is the buildout part itself, which controls the options for the buildout process.

An absolute minimum buildout part looks something like this:

[buildout]
parts = noseinstall

While this is not a complete buildout.cfg, it is the minimum that is required in the buildout part itself. All it is doing is listing the other parts that buildout will use to actually do something; in this case, it is looking for a single part named noseinstall. As this part doesn’t exist yet, it won’t actually work. So, let’s add the part, and in the next section, see what it does:

[buildout]
parts = noseinstall

[noseinstall]
recipe = zc.recipe.egg
eggs = Nose

An aside about bootstrap.py

We now have a config file that we’re reasonably sure will do something, if we’re really lucky, it’ll do something that we actually want it to do. But how do we run it? We will need buildout itself, but we don’t have that yet. At this point, there are two ways to proceed.

  1. sudo apt-get install python-zc.buildout
  2. Download bootstrap.py and run it: python bootstrap.py

For various reasons, unless you need a very specific version of buildout, it is best to use the bootstrap.py file. This is a simple file that contains enough of buildout to install buildout itself inside your environment. As it’s cleanly installed, it can be upgraded, version pinned and generally used in the same manner as in a virtualenv style build. If you system-install buildout, you will not be able to easily upgrade the buildout instance, and may run into version conflicts if a project specifies a version newer than the one you have. Both approaches have their advantages; I prefer the second as it is slightly more self contained. Mixing the approaches (using bootstrap.py with a system install) is possible, but can expose some bugs in the buildout install procedure.

The rest of this document is going to assume that you have used bootstrap.py to install buildout.

Running some buildout

Now we have a method of running buildout, it’s time to do it in the directory where we left the buildout.cfg file created earlier:

$> bin/buildout

At this point, buildout will output something along the lines of:

Getting distribution for 'zc.recipe.egg'.
Got zc.recipe.egg 1.3.2.
Installing noseinstall.
Getting distribution for 'Nose'.
no previously-included directories found matching 'doc/.build'
Got nose 1.0.0.
Generated script '/home/tomwardill/tmp/buildoutwriteup/bin/nosetests-2.6'.
Generated script '/home/tomwardill/tmp/buildoutwriteup/bin/nosetests'.

Your output may not be exactly the same, but should contain broadly those lines.

The simple sample here is using the zc.recipe.egg recipe. This is probably the most common of all buildout recipes, as it is the one that will do the heavy work of downloading an egg, analysing its setup.py for dependencies (and installing them if required), and then finally installing the egg into the buildout path for use. Recipes are just python eggs that contain code that buildout will run. The easiest way to think of this is that while a recipe is an egg, it contains instructions for the buildout process itself, and will therefore not be available to your code at the end.

An analysis of the buildout output shows exactly what it has done. It has downloaded an egg for zc.recipe.egg and run the noseinstall part. Let’s take a closer look at that noseinstall part from before:

recipe = zc.recipe.egg
eggs = Nose

So, we can see why buildout has installed zc.recipe.egg, it is specified in the recipe option of this part, so buildout will download it, install it and then run it. We will take a closer look at the construction of a recipe in a later article, but for now, assume that buildout has executed a bunch of python code in the recipe, and we’ll carry on.
The python code in this case will look at the part that it is in, and look for an option called eggs. As we have specified this option, it will then look at this as a list, and install all the eggs that we have listed; in this case, just the one, the unittest test runner Nose.
As you can see from the bottom of the buildout output, the recipe has downloaded Nose, extracted it and created two files: bin/nosetests and bin/nosetests-2.6. Running one of those files like so:

$> bin/nosetests

----------------------------------------------------------------------
Ran 0 tests in 0.002s

OK

We can see that this is nose, as we expect it to be. Two files have been generated because that is what the setup.py for Nose defines: a base nosetests executable, and one for the specific python version that we have used (python 2.6 in my case). These are specified in the setup.py that makes up the nose egg, which will be covered in a later article.
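
For reference, here is the complete buildout.cfg assembled over the course of this article:

```ini
[buildout]
parts = noseinstall

[noseinstall]
recipe = zc.recipe.egg
eggs = Nose
```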


To summarise: we can install buildout into a development environment, and use a simple config file to install a python egg. The next article will cover a development example of using buildout with django, and some niceties such as version pinning and running python in our buildouted environment.

Announcing FuncBrows – A browser testing tool

For the short version: System web browser testing abstraction layer. Get it here

For the last month or so, I have been heavily investigating various functional browser testing tools, with the aim of adding them to our Continuous Integration build.

The History

I settled on using zc.testbrowser for writing quick, functional tests that can be added to the end of a unit test run. As testbrowser isn’t a full browser in itself, it’s fast to run tests with, and can easily be integrated with the Zope and Plone functional tools.
However, for a full suite of integration tests, testbrowser isn’t a great fit. It doesn’t support javascript, and won’t tell you that ‘The login button is missing in IE6’, or that ‘Firefox 3 can’t access the profile changer’. For this type of test, the current leader is Selenium, which can instrument and drive real browsers (IE, Firefox, Chrome, Safari), in order to perform more accurate real world tests.
Selenium is currently undergoing a massive revamp in order to add better functionality and clean up the API (including a mode that will work similarly to testbrowser). However, this means that we are currently stuck with the older, stable version, as it has python bindings and more documentation.

So, given these two tools, I wrote a simple suite of tests for a project. They registered a user, logged in, logged out, and changed a user profile. Not massively complex, but it’s often surprising how many times such simple processes can be broken, and you won’t notice because you tend to always use the same test user details.

This was all well and good, and everyone was happy and it was sunny.

The Problem

The problem then became that, although developers can write testbrowser scripts fairly easily and run them quickly, selenium is a lot more heavyweight, requiring selenium installs, multiple browsers and virtual machines to run a full test.

Fundamentally, the selenium API is very different from testbrowser, and asking people to write both was never going to happen.

This meant that selenium was added to the CI build, but the developers would never usually run the tests themselves, creating a disconnect between the current tests that the developers would run as part of their unit test suite, and what would be tested more heavily with the CI build.

The Solution

I (with a lot of help from other people, some ‘creative language’, and some frustration) created FuncBrows. This is a simple, lightweight abstraction tool over both testbrowser and selenium, that can easily be extended to add more tools to it, when required (selenium v2 and twill are on the target list).
It can easily be included, and configured in one line, with a base set of tests that can then be run by every tool, as required.

This means that the developers can write fast tests for their own use, and the exact same code can then be reused for more complete browser testing later in the system tests: a quick, simple way to smoke test for browser issues.

There is a samples directory in the github repository, with a simple example of how to set up the tests so that they can be run with either the python standard test runner, or nosetests.
It’s fairly simple and can’t do any advanced stuff; it has only progressed to the stage where we could dogfood our existing tests, so I expect the API to grow slightly as we need more functionality from it.
Patches and issues gratefully accepted at the github page.

Get the bits here: Isotoma Github
Or ‘easy_install FuncBrows’

Big thanks to Rapidswitch

Our server ISP is RapidSwitch. It’s unfortunate, but most ISPs are pretty poor – there aren’t enough good people to go around, margins are very tight and it’s just the kind of work where it’s hard to keep good people. Last night RapidSwitch showed that at least not all ISPs are poor.

Because of the OpenSSL issues yesterday, we chose to reboot all of our servers last night, to ensure every service was using the new SSL code. A new kernel image came down yesterday too, and a number of our machines had the updated kernel.

We rebooted a number of machines on one cluster simultaneously… and they didn’t come back. We requested a KVM session, but in the meantime one of RapidSwitch’s engineers had noticed 4 of our machines were down simultaneously, so he went and took a look. Proactivity!

He worked out what had happened, and raised a ticket for us, telling us that the new Debian kernel was incompatible with our Network Cards. We asked him to manually boot the machines into the previous kernel, and they came back up without a hitch. Clue!

He then said RapidSwitch were aware of this issue and they were offering a free PCI network card to anyone who needed them. Planning!

Frankly this is unheard of in my experience. Massively well done guys – that’s what I call service.

Debian’s OpenSSL Disaster

Many of you will know by now of the serious security problems revealed yesterday by Debian, the Linux distribution. We use Debian exclusively for our server platform, so we had to react very quickly to this issue, and make sure our systems were secure again. I think we’ve made all of the necessary changes now to ensure we’re safe from this particular problem.

I have also made some attempt to get to the bottom of what actually went on, and I’ll record it here for posterity. If any of the below is wrong, please let me know!

What Happened

The story, basically, is this. In April 2006 bug #363516 was raised, suggesting that openssl wasn’t clean for valgrind. Valgrind is a package that detects problems in C code, and is widely used to help ensure software is correct. Valgrind reported some errors with openssl, and the reporter wanted to be able to use valgrind with openssl.

At that bug url a change is discussed to the openssl codebase. The general feeling from the bug discussion is that making this change isn’t a good idea, but then a patch was applied on 4th May 2006. There are two instances of the specific issue in the bug, one in ssleay_rand_add and one in ssleay_rand_bytes.

In the meantime, a discussion took place on the openssl-dev list. This mentions the same two lines, and on the 1st May one of the openssl developers says he is in favour of removing them.

The patch amends the two lines suggested.

The problem, as I understand it, was a misunderstanding by the Debian Developer who made the change. The change to ssleay_rand_bytes was fine – the line removed there had added some uninitialised memory into the entropy pool, but the software doesn’t rely on it for security, so removing it was harmless.

But the other change, in ssleay_rand_add, is a complete disaster. It alters the seeding for the random number generator in key generation, a serious flaw.
This reduces the keyspace to a few hundred thousand possible keys. It’s possible to generate all these keys in a few hours, and brute force a machine that’s using public key authentication with a compromised key in a few minutes, potentially. This is a security disaster of the first water, considering the number of organisations (such as ours) that rely on public key authentication for a lot of our inter-machine security. This also affected key generation for self signed email, web certificates, private networks, anonymous proxy networks and all sorts of other things. The cleaning up is going to take some time, and cost an awful lot. Some people are going to be compromised by this, and a lot of machines may be broken into.
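
To see why the keyspace collapses, consider that once the real entropy mixing is gone, the only varying input to the generator is the process ID. A rough illustration (using Python’s random module as a stand-in for the OpenSSL PRNG, so the numbers are illustrative only):

```python
import random

def weak_keygen(pid):
    # With the entropy mixing removed, the PRNG stream is fully
    # determined by the process ID that seeds it.
    rng = random.Random(pid)
    return rng.getrandbits(128)

# Linux PIDs are bounded (32768 by default), so an attacker can simply
# enumerate every possible "key" in advance.
all_keys = {weak_keygen(pid) for pid in range(1, 32768)}
print(len(all_keys))  # at most 32767 distinct keys in the whole keyspace
```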

Some background on how distributions work

Debian has a vast number of packages under management. They produce these packages by taking source packages from “upstream” (the people who maintain the software) and modifying them to fit the rules and goals of the distribution.

Some of these changes are for compatibility – for example, using standard file locations or configuration systems. Some of them are mechanical changes to do with integration with the build process. Quite a few changes are bug fixes.
It’s recommended that bug fixes be coordinated with upstream – send patches back to them, so everyone in the community can benefit from the changes.

Whose Fault Was It

After going through the above, it’s pretty clearly the fault of the DD (Debian Developer) in question. Although he suggested making changes on the openssl-dev list, and got an affirmative from someone on the project, it was pretty clear from the response that this was “if this helps with debugging it’s a good idea”, not “I’ve closely read the code in question, and I agree”.

The DD should have also submitted a patch back to the openssl guys. They’d have spotted the error and screamed blue murder. He was a bit lazy and thoughtless here, and I imagine right now he wishes he could crawl into a hole and die.

What to do about it

Debian are getting badly slammed for this, but it is worth keeping some perspective. We, and many others, use Debian because of its long history of excellent package quality. This is a result both of their culture (which is aggressively perfectionist) and their selection criteria for developers, which weeds out many dodgy ones. We are proud to use Debian, and will continue to do so.

DD’s are generally conscientious, knowledgeable and dedicated to their work. I have no reason to believe this DD was any different. Even conscientious, knowledgeable and dedicated people make mistakes. This is what process is for, to help mitigate human error. I think there was clearly a lack of process.
Two things would have really helped: code review internal to Debian, and code review by upstream. I don’t think it’s unreasonable that for security-critical packages Debian should require both for non-critical changes. Even critical changes should be reviewed as soon as possible.
Internal code review is impractical for every package, since it requires a good understanding of the code in question, and would impose a huge workload – but for critical packages I think it’s a necessity.

Upstream review is potentially tricky too. Some upstreams don’t have the time or inclination to participate. There is also often a lot of friction between distributions and upstream, since they have very different goals. This isn’t a problem that can be easily resolved – these groups really do have different goals and values, and sometimes irreconcilable differences arise. But for the good of their eventual users they need to work together to help stop this sort of problem occurring.
