QCon London 2010

A couple of us went to QCon London last week, which as usual had some excellent speakers and some cutting-edge material.  QCon bills itself as an “enterprise software development conference designed for team leads, architects and project management”, but it has a reputation for being an awful lot more interesting than that.  In particular it covers a lot of new work in architecture.

Scale, scale, scale

What that means in 2010 is scale, scale, scale: how do you service a bazillion people?  In summary, nobody really has a clue.  There were presentations from Facebook, Skype, BBC, Sky and others on how they’ve scaled out, as well as presentations on various architectural patterns that lend themselves to scale.

Everyone has done it differently using solutions tailored to their specific problem-space, pretty much all using Open Source technology but generally building something in-house to help them manage scale.  This is unfortunate – it would be lovely to have a silver bullet for the scale problem.

Functional languages

From the academics there is a strong consensus that functional languages are the way forward, with loads of people championing Erlang.  I’m a big fan of Erlang myself, and we’ve got a few Erlang coders here at Isotoma.

There was also some interesting stuff on other functional approaches to concurrency, in Haskell specifically and in general.  One of the great benefits of functional languages is their ability to defer execution through lazy evaluation, which showed some remarkable performance benefits compared with more traditional data synchronisation approaches.  I’d have to wave my hands to explain it better, sorry.

Real-world solutions

Erlang is also being used in production in some big scale-outs: the BBC are using CouchDB (which is written in Erlang), and gave it a glowing report.

Skype are using Postgres (our preferred RDBMS here) and achieving remarkable scale using pretty simple technologies like pgbouncer.  The architect speaking for Skype said one of their databases had 60 billion rows, spread over 64 servers, and that it was performing fine.  That’s a level of scale that’s outside what you’d normally consider sane.

They did need a dedicated team of seriously clever people though – and that’s one of the themes from all the really big shops who talked, that they needed large, dedicated teams of very highly-paid engineers.  Serious scale right now is not an off-the-shelf option.

NoSQL

Erlang starred in one of the other big themes being discussed, NoSQL databases.  We’ve had our own experience with these here, specifically using Oracle’s dbXML, with not fantastic results.  XML is really not suited to large scale performance unfortunately.  Some of the other databases being talked about now, though, are Cassandra from Facebook, CouchDB, and Voldemort from LinkedIn.

None of these are silver bullets either though.  Many of them do very little heavy lifting for you: often your application needs custom consistency or transaction handling, or you get unpredictable caching (i.e. “eventual consistency”).  You need to architect around your users’ actual requirements; you can’t use an off-the-shelf architecture and deploy it for everyone.

The need to design around your users was put very eloquently by Udi Dahan in his Command-Query Responsibility Segregation talk.  This was excellent, and it was pleasant to discover that an architecture we’d already derived ourselves from first principles (which I can’t talk about yet) had an actual name and everything!  In particular he concentrated on divining User Intent rather than throwing in your normal GUI toolkit for building UIs – he took data grids to pieces, and championed the use of asynchronous notification.  The idea of a notification stream as part of a call-centre automation system, rather than hitting F5 to reload repeatedly, was particularly well told.
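To make the pattern a little more concrete, here is a minimal sketch of command-query separation with a notification stream. It is not code from the talk and not our own architecture; the ticket domain and every name in it (ReadModel, CommandHandler, open_ticket and so on) are invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReadModel:
    """Denormalised view that queries read from (hypothetical example)."""
    open_tickets: dict = field(default_factory=dict)
    notifications: list = field(default_factory=list)

    def apply(self, event):
        """Update the view from an event published by the command side."""
        kind, ticket_id, payload = event
        if kind == "ticket_opened":
            self.open_tickets[ticket_id] = payload
        elif kind == "ticket_closed":
            self.open_tickets.pop(ticket_id, None)
        # Every event also lands on the notification stream, so a client
        # can be pushed updates instead of reloading the page.
        self.notifications.append(event)


class CommandHandler:
    """Accepts commands that state user intent and publishes events."""

    def __init__(self, subscribers):
        self.subscribers = list(subscribers)
        self.known_tickets = set()

    def open_ticket(self, ticket_id, summary):
        if ticket_id in self.known_tickets:
            raise ValueError("ticket %r already exists" % (ticket_id,))
        self.known_tickets.add(ticket_id)
        self._publish(("ticket_opened", ticket_id, summary))

    def close_ticket(self, ticket_id):
        if ticket_id not in self.known_tickets:
            raise KeyError(ticket_id)
        self.known_tickets.remove(ticket_id)
        self._publish(("ticket_closed", ticket_id, None))

    def _publish(self, event):
        # In a real system this would be asynchronous; here it is a plain call.
        for subscriber in self.subscribers:
            subscriber(event)
```

The point of the split is that commands never return data and queries never change state: the UI sends an intent such as close_ticket, and learns about the outcome from the read model or the notification stream.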

DevOps, Agile and Kanban

Some of the other tracks were particularly relevant to us.  The DevOps movement attempts to make it easier for development and operations teams to work closely together.  For anyone who has worked in this industry this will be a familiar issue – development and ops have different definitions of success, and different expectations from their customers.  When these come into conflict, everyone gets hurt.

There was a great presentation from Simon Stewart of WebDriver fame about his role as a Software Engineer in Test at Google, where they have around one SET to 7 or 8 developers to help productionise the software, provide a proper test plan and generally improve the productivity and quality of code by applying ops and automated testing principles to development.

One of the things we’ve experienced a lot here over the last year, as we’ve grown, is that there are a lot of bottlenecks, pinch points and pain in areas outside development too.  Agile addresses a lot of the issues in a development team, but doesn’t address any of the rest of the process of going from nothing to running software in production.  We’ve experienced this with pain in QA, productionisation, documentation, project management, specification – in fact every area outside actual coding!

Lean Kanban attempts to address this, with methods adopted from heavy industry. I’m not going to talk about it here, but there’s definitely a role for this kind of process management, if you can get your customer on-side.

Training and Software Craftsmanship

Finally, in what I think was the most interesting talk of the conference, and one directly relevant to my current work, Jason Gorman described a training scheme he is running with the BBC to improve software craftsmanship using peer-review.  I’ll be trying this out at Isotoma, and I’ll blog about it too!

On Ubuntu Python, Exceptions and unnecessary imports

A few days ago Alexander Limi (one of Plone’s founders) tweeted the following:

Ubuntu Python: Raise an exception, import 190 modules: http://bit.ly/bCxlhC – this is why you don’t want to use the system Python.

Now this gets my goat on a few points. First up, why the hell would I not want to use the system python? If I’m using any sane distribution I’ll have package management and security updates, and any flaw in Python will be patched, packaged and tested by people that are far smarter than me. Upgrading the Python that ships with the Plone Unified Installer just isn’t going to be as easy, however you play it. And that’s without the risk of the Plone community moving on to more exciting things, leaving their version of Python unsupported.

Secondly, there’s a fatal flaw in the original blog post to which Limi refers. Yes, on the desktop, Ubuntu imports 190 modules when an exception is raised. As the author explains, this is to enable Apport to provide as much information as possible to the Ubuntu devs about application failures. What the author does not mention is that this doesn’t happen on the Server edition of Ubuntu. Why would it? Apport is designed to handle desktop application failures and to improve the end user experience. It isn’t installed by default on the server edition, because it isn’t needed.

On my Karmic desktop:

Python 2.6.4 (r264:75706, Dec  7 2009, 18:43:55)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> len(sys.modules)
35
>>> raise KeyError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError
>>> len(sys.modules)
225

On my Karmic server:

Python 2.6.4 (r264:75706, Dec  7 2009, 18:43:55)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> len(sys.modules)
32
>>> raise KeyError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError
>>> len(sys.modules)
32
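
If you’d rather not type this into the interpreter by hand, the same check can be scripted. This is only a sketch: the helper names are made up, and it hands the exception to whatever sys.excepthook is installed, which is how the interpreter reports an uncaught exception (and the hook Apport replaces on the desktop).

```python
import sys


def modules_imported_by(action):
    """Return the sorted names of modules that calling `action` added to sys.modules."""
    before = set(sys.modules)
    action()
    return sorted(set(sys.modules) - before)


def report_fake_exception():
    """Hand a KeyError to the installed sys.excepthook, as the interpreter would."""
    try:
        raise KeyError("demo")
    except KeyError:
        sys.excepthook(*sys.exc_info())


if __name__ == "__main__":
    new_modules = modules_imported_by(report_fake_exception)
    print("%d modules imported while reporting the exception" % len(new_modules))
```

Run it on both machines and compare the counts; the traceback it prints is expected.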

Being quick to condemn Ubuntu, and their packaging of Python, doesn’t do anyone any good. Think before you tweet. And don’t go live on a desktop distro.

Location of Zope/ZEO transient caches

When Zope connects to ZEO it is configured with a database cache, where objects are cached by Zope so that it doesn’t have to constantly return to ZEO. These caches can be transient (they don’t persist past a Zope restart) or persistent (where they do).

If all you do is uncomment the example ZEO client configuration in your zope.conf you’ll have a 20MB transient cache. If you follow one of the many buildout recipes on the web you’ll likely have a transient cache of anywhere from 30MB (plone.recipe.zope2instance) up to 300MB (from this howto on plone.org).

Zope uses the Python tempfile module to decide where to put these transient caches, which on Debian defaults to /tmp. Our default production server installation has /tmp as a subdirectory of /, with the root partition being only 1GB. This means we have a maximum of about 400MB for temporary files on our servers.

With the default 20MB or 30MB this is fine, even with multiple Zope/ZEO installations on the same box. However, when we recently tried to up the size of the caches on a system that needed it we quickly found ourselves running out of disk space in the root partition, which in turn caused some very strange behaviour, ultimately resulting in the site being unavailable.

The trick is to move these transient files to a more sensible location (like /var/tmp). You can do this (with thanks to mauro on zodb-dev) by setting an environment variable in your zope.conf, either manually:

<environment>
  TMPDIR /var/tmp
</environment>

Or in your buildout:

[instance]
recipe = plone.recipe.zope2instance
...
environment-vars =
     TMPDIR /var/tmp
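
To check which directory the tempfile module will actually pick for a given TMPDIR, something like the following can be run with the same Python. It is a sketch only: tempdir_with is a made-up helper, and it relies on tempfile caching its answer in tempfile.tempdir, so that cache is cleared before asking again.

```python
import os
import tempfile


def tempdir_with(tmpdir_value):
    """Return the directory tempfile would choose if TMPDIR were set to tmpdir_value."""
    saved_env = os.environ.get("TMPDIR")
    saved_cache = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = tmpdir_value
        tempfile.tempdir = None  # clear the cached answer so TMPDIR is re-read
        return tempfile.gettempdir()
    finally:
        # Put the environment and the cache back exactly as they were.
        if saved_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = saved_env
        tempfile.tempdir = saved_cache


if __name__ == "__main__":
    print("default:", tempfile.gettempdir())
    print("with TMPDIR=/var/tmp:", tempdir_with("/var/tmp"))
```

If the directory named in TMPDIR does not exist or is not writable, tempfile quietly falls back to its other candidates, so it is worth confirming the result before resizing the caches.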

Compiling Python2.3 on Karmic

We still have some (now very old) Plone 2.1 sites running.  Up until Ubuntu 8.10 python2.3 was still available from package management (albeit with limited support), but as of Ubuntu 9.04 python2.3 was removed entirely.

To recreate a development environment for these older sites means downloading Python 2.3 and compiling it by hand.  Unfortunately it’s not a default build any more – as it says on the download page:

Since the release candidate, we received various reports that the this release may fail to build on current operating systems, in particular on OS X. We have made no attempt to fix these problems, as the release is targeted for systems that were current at the time Python 2.3 was originally released. For more recent systems, you might have to come up with work-arounds.

For Ubuntu 9.10 (Karmic Koala) to compile python2.3 you’ll need to disable buffer overflow checking by doing the following:

wget http://www.python.org/ftp/python/2.3.7/Python-2.3.7.tgz
tar zxvf Python-2.3.7.tgz
cd Python-2.3.7
./configure --prefix=/usr/local BASECFLAGS=-U_FORTIFY_SOURCE
make
sudo make install

Javascript localization within Plone

For a while now, I’ve been trying to think of the best way to get localized strings into Javascript under Plone. Many existing packages and parts of Plone use one Javascript file per language, containing all translations. Whilst this is effective, it doesn’t “feel right” having all your translations distributed to the user on each page load, much less having them drawn from two different sources. Worse still (in my mind) is having translations dumped out into hidden HTML elements and then recalled by Javascript using getElementById or some such.

The solution I’ve eventually settled on is one that combines AJAX and client-side caching in cookies to minimize page load time. The implementation consists of a single view, a Javascript file and a bunch of XML & ZCML to tie it all together.

The powers that be here at Isotoma have kindly agreed to let me open source this under the Apache 2.0 license.

Full documentation for the code can be found in the README.txt; however, I’ve included a quickstart guide below.


Beginning development with Plone 4 & Dexterity

Over the past few days, I’ve been tinkering with the latest alphas of Plone 4, particularly with an eye to trying out Dexterity on the latest version.

I started out, as many people will, by downloading the unified installer which will install Python 2.6, Zope 2.12 and the Plone 4.0 alpha for you. After a few teething problems with multiple versions of Python on my Hardy host, I had my Plone install up and running.

The first impression my colleagues and I had here at Isotoma was that it is a heck of a lot faster than its predecessor. In fact, John Stahl recently blogged that Plone 4 is potentially three times faster than Drupal, Joomla and Wordpress. The other marked difference was the default theme, which is a lot slicker, though in my own opinion, with its blocks of bright colours and rounded corners, a little too overtly “Web 2.0” (insert air-quotes here).

My next stop was Martin Aspeli’s Dexterity developer manual which, whilst up-to-date for the current stable release of Plone, required some tweaking to get going with Plone 4.

The unified installer, by default, makes use of several config files for buildout, which keeps a lot of the core settings in separate files (base.cfg & versions.cfg). I hear that roadrunner is almost ready for Plone 4, but it’ll be a little while before we’re getting it without checking out the source so that had to be chopped. The extends entry for Dexterity also required updating to the latest alpha.

Otherwise, things went very straightforwardly. My buildout.cfg for use with the unified installer can be found below the fold.

Break browser finger-printing by making it more unique?

The EFF are worried about an alternative way of tracking your web activity without the need for cookies, and have set up a site (Panopticlick) that checks your browser for how trackable it is.

If you’re worried about the potential loss of privacy, they have a few suggestions for how to make your fingerprint less unique.

Having tried out the check myself, my configuration was recognisably unique out of nearly 200,000 checked so far. I have to admit it surprised me.
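
To put a number on that: being unique among N browsers means the fingerprint carries at least log2(N) bits of identifying information. The quick calculation below is my own arithmetic, using 200,000 as a round figure for the sample size, so treat the result as approximate.

```python
import math


def identifying_bits(population):
    """Bits of information needed to single out one browser among `population`."""
    return math.log2(population)


print(round(identifying_bits(200000), 1))  # prints 17.6
```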

The suggestions from the EFF will help you blend in with the crowd and make you harder to track, but generally at the expense of turning off large chunks of functionality in your browser.

There are a couple of possible alternatives:

  1. Have some agreed ‘sets’ of plugins and fonts, something like, “bare-bones”, “simple”, “moderate”, “everything”; or “basic user”, “user”, “web developer” – just some way to minimise the number of possible configurations.
  2. Add some entropy into the finger-print so it changes a little or a lot every few pages. Return slightly different versions of the browser, report a few extra plugins and fonts.

The first would be very hard to implement. How do you decide which plugins should be in the sets? How do you administer it? What happens when a new plugin is released? You’re going to get people wanting a plugin that’s not in any of the sets for whatever reason, not to mention that it will effectively kill a lot of development work if the developer doesn’t think it’ll get into a set – maybe the plugin is useful, but to too limited an audience.

For the second, if you want the remote site to display properly in your browser you can’t lie to it too much – a different browser (IE/Firefox) may get different css; if you don’t tell it you have the flash plugin it’s not likely to serve you flash until you ‘install’ it. You could however tell it you have plugins and fonts that you don’t really. Of course this could result in receiving the wrong css or bad data just the same as lying about not having a plugin. So you’d have to make sure the plugins you lie about aren’t depended on server-side (i.e. it should make no difference to what gets served). You could even generate semi-random plugin and font names. It’s feasible that a remote site could weed out all but a known list of plugins and fonts, but it would add overhead to the check and would rely on their data being valid and up to date. Also, if they track by your IP or some other means they could work out which were the ‘core’ plugins/fonts that were always on your list and ignore the rest – again, a big overhead and not 100% reliable, especially over time with a random IP every time you connect to your DSL.

I think the standardised sets idea would work but would stifle the internet.

The randomised finger-prints offer a far greater chance of evading tracking. You could even do it through a browser plugin – which of course would never advertise itself.
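
As a rough illustration of the randomised approach, the sketch below hashes a reported configuration and shows how a handful of made-up plugin names changes the result. It is not how Panopticlick computes its finger-print; the hashing scheme and both function names are assumptions for the sake of the example.

```python
import hashlib
import random
import string


def fingerprint(user_agent, plugins, fonts):
    """Hash the reported browser details into a single hex identifier."""
    parts = [user_agent] + sorted(plugins) + sorted(fonts)
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def with_decoys(names, count, rng=random):
    """Return names plus `count` random entries that no site should depend on."""
    decoys = [
        "".join(rng.choice(string.ascii_lowercase) for _ in range(12))
        for _ in range(count)
    ]
    return list(names) + decoys


if __name__ == "__main__":
    plugins = ["flash", "java"]
    fonts = ["Arial", "Verdana"]
    print(fingerprint("Firefox/3.6", plugins, fonts))
    # A few decoy plugins give a different identifier on each run.
    print(fingerprint("Firefox/3.6", with_decoys(plugins, 3), fonts))
```

The real plugins are still reported, so pages keep working; only the decoys change from one report to the next.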

Of course all of this could be a moot point if a major corporation were to get enough market share to force a standardised browsing experience on the world, but that has much bigger downsides than the original problem and I’m not even touching that topic.

Addendum: I’m touching that topic, briefly.

Regarding ‘standardised browser experience’, the recent release of the iPad is exactly what I’m talking about.

It’s the most closed system currently in existence, and as a result very unlikely to support a wide range of plugins in the browser, let alone random user-created ones that are updated regularly (the finger-prints include the version numbers, many small version increments increase finger-print space and uniqueness).

Returning an actual proper real life HTTP code from a Django error page

Go to a non existent page on a Django site and you will (hopefully) be met with a friendly error page telling you not to panic, everything is OK and all you’ve done is mistyped the URL or something.

If it’s your thing, you may be interested enough to look see what the actual HTTP code for the page is in the header; chances are that it will be a 200 rather than a 404 as the default handler just passes the dealings onto the HttpResponse class.

Generally speaking this is fine, but there are situations where an accurate code would be very handy, as I found out the other day when I was trying to detect whether a file had been uploaded to a remote server. Scraping the resultant HTML for “Page not found” is not my idea of a robust solution.

So, instead, pass the error page’s HTML into the respective class by putting something like this in urls.py:


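from django.http import HttpResponseNotFound, HttpResponseServerError
from django.template.loader import render_to_string

handler404 = 'urls.return_404'
handler500 = 'urls.return_500'

def return_404(request, exception=None):
    # Newer Django versions also pass the exception; older ones pass only the request.
    return HttpResponseNotFound(
        render_to_string("errors/404.html"))

def return_500(request):
    return HttpResponseServerError(
        render_to_string("errors/500.html"))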

Fullest of props to PiotrLegnica at Stack Overflow for this most elegant of solutions.

Edit: After further examination (see the comments) the default handlers do act as expected, but you’re still restricted to where you put your error templates, i.e. the root of the templates directory.
To my mind, it’s neater if you can specify a dedicated location.
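
With accurate status codes in place, the check that started all this (has the file made it to the remote server?) no longer needs to scrape HTML. Here is a minimal sketch using only the standard library; remote_file_exists is a made-up name, and it assumes the remote server answers HEAD requests.

```python
import urllib.error
import urllib.request


def remote_file_exists(url, timeout=10):
    """Return True if a HEAD request for url succeeds, False if the server answers 404."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        # Anything else (403, 500, ...) tells us nothing either way.
        raise
```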

Useful Plone template debugging functions

As Plone developers, a lot of the problems we have when writing code and templates are only revealed, if at all, by cryptic and sometimes misleading error messages from somewhere way down the stack from their underlying cause. When an error is raised during template rendering, Zope does provide some useful traceback information specifying the template with line numbers and expressions and whatnot. But why shouldn’t we be able to access this information without raising an error? For example, to diagnose security or redirection problems that aren’t necessarily obvious even with extra logging & verbose security enabled.

The functions provided below allow the developer to gather this kind of feedback and output wherever he or she wishes, without having to provide any arguments that might not always be easy to get from the current part of source code. They work under Zope 2.7 but are untested under other versions. If you do try them out on other versions, please report back in the comments if they do work!

import sys
from Products.CMFCore.utils import getToolByName
from zope.tales.tales import TALESTracebackSupplement

def get_current_template_position():
    """ If called from a stack frame which has been called from template evaluation,
        returns a tuple of context, template filename, line number, column number and
        TALES expression closest in the stack to the caller. Otherwise, returns None.
    """
    i = 0
    curframe = sys._getframe(i)
    while True:
        frame_locals = curframe.f_locals
        frame_globals = curframe.f_globals
        if '__traceback_supplement__' in frame_locals:
            # Use the supplement defined in the function.
            tbs = frame_locals.get('__traceback_supplement__')
        elif '__traceback_supplement__' in frame_globals:
            # Use the supplement defined in the module.
            # This is used by Scripts (Python).
            tbs = frame_globals.get('__traceback_supplement__')
        else:
            tbs = None
        supp = None
        if tbs is not None:
            factory = tbs[0]
            args = tbs[1:]
            try:
                supp = factory(*args)
            except Exception:
                # Ignore a supplement that cannot be built and move on to the
                # next frame; a bare 'continue' here would loop forever.
                supp = None
        if type(supp) is TALESTracebackSupplement:
            return (supp.context, supp.source_url, supp.line, supp.column, supp.expression)

        # Step one frame further up the stack.
        i = i + 1
        try:
            curframe = sys._getframe(i)
        except ValueError:
            # Raised when i is deeper than the call stack.
            return None

def dump_current_template_position(context=None, return_string=False):
    """ When called, attempts to print to the console the URL of the current request, the
        authenticated user, the currently executing template file, the line and column
        currently being evaluated in the file and the expression being evaluated. 

        Will only print if called from a stack frame which has been called from template
        evaluation. May not print if called from a .cpy or .vpy file, depending on
        permissions to 'print'.

        Wherever possible, this function should be called with the 'context' argument
        specified.

        If the optional argument 'return_string' is set to True, the function returns the
        message that would be output, rather than printing.
    """
    tpos = get_current_template_position()
    if tpos is not None:
        (ctx, template, line, col, expr) = tpos
        url = 'Unknown'
        if context is not None:
            try:
                request = hasattr(context, 'request') and context.request or context.REQUEST
                url = request.get('ACTUAL_URL')
            except AttributeError:
                pass
        if url == 'Unknown':
            try:
                request = hasattr(ctx, 'request') and ctx.request or ctx.REQUEST
                url = request.get('ACTUAL_URL')
            except AttributeError:
                pass

        member = 'Unknown'
        if context is not None:
            try:
                mtool = getToolByName(context, 'portal_membership')
                member = mtool.getAuthenticatedMember()
            except AttributeError:
                pass
        if member == 'Unknown':
            try:
                mtool = getToolByName(ctx, 'portal_membership')
                member = mtool.getAuthenticatedMember()
            except AttributeError:
                pass

        output = "\tURL: %s\n\tAuth'd as: %s\n\tFile: %s\n\tLine: %s\n\tColumn: %s\n\tExpression: %s" % (url, member, template, line, col, expr)
        if return_string:
            return output
        print output

This may also be called from templates, provided the template has sufficient permissions to call the module it lies in:

<tal:block tal:define="dummy python:modules['myproject.app.utils'].dump_current_template_position(context)" />

Or, as it is mostly used, from code called by templates, simply by importing the function(s) as necessary and calling them with options of your choice. If calling from .cpy or .vpy files, the print command may not work properly, so the dump_current_template_position function may be called with the optional argument return_string set to True and then the result may be logged or printed using alternate methods.
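
For anyone who wants to see the underlying trick without a Zope instance to hand, here is a stripped-down sketch of the frame walking. It only looks for a local or global called __traceback_supplement__ in each calling frame and returns whatever it finds; find_supplement, render and helper are made-up names, and the tuple stored in render is a stand-in for the real supplement.

```python
import sys


def find_supplement(marker="__traceback_supplement__"):
    """Return the nearest `marker` value found in the callers' frames, or None."""
    frame = sys._getframe(1)  # start with whoever called us
    while frame is not None:
        if marker in frame.f_locals:
            return frame.f_locals[marker]
        if marker in frame.f_globals:
            return frame.f_globals[marker]
        frame = frame.f_back  # step one frame up the stack
    return None


def render():
    """Stand-in for template evaluation, which leaves a supplement in its frame."""
    __traceback_supplement__ = ("template.pt", 12)
    return helper()


def helper():
    """Stand-in for code called from the template."""
    return find_supplement()


if __name__ == "__main__":
    print(helper())   # no supplement anywhere on the stack
    print(render())   # found in render's frame
```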

Mixed metaphors and malapropisms from the mire of many meetings

Most of us find ourselves stuck in long boring meetings or conference calls more often than we care to remember. For a while now, I’ve found some respite in my habit of collecting some of the more hilarious manglings of the English language you find in such situations. I particularly love it when phrases end up meaning the opposite of what the speaker thinks they do. Here’s a selection from the past year:

  • Getting “buy-off” on specifications (probably thinking of buy-in or sign-off)
  • An item being “delegated to the bottom” of a menu (meaning relegated)
  • “visa versa”, used to mean something like “for example” (confused between vice versa and vis-à-vis)
  • “The train is already out of the tracks” (I think you meant to say station)
  • Extolling the close, mutually beneficial relationship between their organisation and ours as “incestuous and symbiotic”
  • “Begs the question” used synonymously with “Raises the question”

And some from emails:

  • “By all intensive purposes, I think I have the account setup and everything ready to go.”
  • “These are often required and might shoot you into a foot”
  • “I wonder if he’s been unendated with calls or e mails?”
  • (From a Kenyan security newsletter) “Wait until the crowd has disbursed”

Feel free to add more in the comments!