Category Archives: Uncategorized

Hacking together a standing desk

We’ve had a few people in the office transition to standing desks recently and the process by which everyone achieved their goal has been quite interesting. Or at least interesting to me.

doug's deskDoug was the first to try it and ended up, as a lot of people do, taking the old ‘big pile of books’ approach. It’s cheap, quick and,so long as you’re using a solid and non-tottering pile of books, probably pretty safe. Luckily the Isotoma office has a pretty extensive library of books–which in Doug’s case have mostly been memorised.

(I wouldn’t let him tidy his desk before I took this photo. He wanted me to tell you.)

A different approach that David introduced (as far as I know) and which I’ve adopted is the music stand approach. It’s a wee bit more expensive and depends on you having the kind of job that doesn’t really involve paper or double monitors – but I love it. The bonus of this approach is you get to feel a little bit like a member of a nineties synth-rock band. Always a bonus.

But the canonical Isotoma standing desk was “Ikea-hacked” together by Tom. He went to town on it to such an extent that we made him do an intranet post about it:

I’ve had a couple of enquiries about the parts for my standing desk, so rather than send sporadic emails, I thought I’d just put it here.


I built my standing desk based on this article, which is great, and has assembly instructions and everything (although it’s pretty trivial to put together). The guy bought all the bits from IKEA for $22. But obviously we don’t live in America, and also I needed to double up on some of the parts due to my dual monitor setup, so here is the full parts list, with links to the UK IKEA website:

Also, you will need to find 4 screws to screw the brackets into the table legs because, annoyingly, the brackets have holes drilled but no screws to go in said holes.

Grand total: £29 (including £10 postage)

Oh and don’t forget, you’ll probably want some kind of chair to give your legs a rest every now and then (yes it’s hilarious that Tom still has a chair etc, but it does come in handy, even if it’s just for 2×30 minute spells during the day). I got a bar stool with a back from Barnitts :)

This concludes my blog post about standing desks. Please add photos of your sweet, customised desks below in the comments.

Just kidding.

About us: Isotoma is a bespoke software development company based in York and London specialising in web apps, mobile apps and product design. If you’d like to know more you can review our work or get in touch.

 

What Heartbleed means for you

On the 8th April a group of security researchers published information about a newly discovered exploit in a popular encryption library. With some marketing panache, they called this exploit “Heartbleed”.

A huge number of Internet services were vulnerable to this exploit, and although many of them have now been patched, many remain vulnerable. In particular, the flaw was in an Open Source library, and so many of the very largest and most popular sites were directly affected.

Attention has so far focused on the possible use of the exploit to obtain “key material” from affected sites, but there are some more immediate ramifications, and you need to act to protect yourself.

Unfortunately the attack will also reveal other random bits of the webserver’s memory, which can include usernames, passwords and cookies. Obtaining this information will allow attackers to log into these services as you, and then conduct the more usual kinds of fraud and identity theft.

Once the dust has settled (so later today on the 9th, or tomorrow on the 10th) you should go and change every single one of your passwords. Start with the passwords you’ve used recently and those for high-value services.

It’s probably a good idea to clear all your cookies too once you’ve done this, to force you to re-login to every service with your new password.

You should also log out of every single service on your phone, and then log back in, to get new session cookies. If you are particularly paranoid, wipe your phone and reinstall. Mobile app session cookies are likely to be a very popular vector for this attack.

This is an enormous amount of work, but you can use it as an opportunity to set some decent random passwords for every service and adopt a tool like LastPass, 1Password or KeePass while you are at it.

Most people are hugely vulnerable to password disclosure because they share passwords between accounts, and the entire world of black-hats is out there right now slurping passwords off every webserver they can get them from. There is going to be a huge spike in fraud and identity theft soon, and you want to make sure you are not a victim of it.

The Man-In-The-Middle Attack

In simple terms, stolen key material would allow an attacker to impersonate an affected site, presenting its genuine SSL certificate and showing the padlock icon that demonstrates a secure connection, even though they control your connection.

They can only do this if they also manage to somehow make your browser connect to their computers for the request. This can normally only be done by either controlling part of your connection directly (hacking your router maybe), or by “poisoning” your access to the Domain Name Service with which you find out how to reach a site (there are many ways to do this, but none of them are trivial).

You can expect Internet security types to be fretting about this one for a long time to come, and there are likely to be some horrific exploits against some high-profile sites executed by some of the world’s most skilled hackers. If they do it well enough, we may never hear of it.

The impact of this exploit is going to have huge ramifications for server operators and system designers, but in practical terms there is very little that most people can do to mitigate this risk for their own browsing.

About us: Isotoma is a bespoke software development company based in York and London specialising in web apps, mobile apps and product design. If you’d like to know more you can review our work or get in touch.

Reviewing Django REST Framework

Recently, we used Django REST Framework to build the backend for an API-first web application. Here I’ll attempt to explain why we chose REST Framework and how successfully it helped us build our software.

Why Use Django REST Framework?

RFC-compliant HTTP Response Codes

Clients (JavaScript and rich desktop/mobile/tablet applications) will more than likely expect your REST service endpoint to return status codes as specified in the HTTP/1.1 spec. Returning a 200 response containing {‘status’: ‘error’} goes against the principles of HTTP and you’ll find that HTTP-compliant JavaScript libraries will get their knickers in a twist. In our backend code, we ideally want to raise native exceptions and return native objects; status codes and content should be inferred and serialised as required.

If authentication fails, REST Framework serves a 401 response. Raise a PermissionDenied and you automatically get a 403 response. Raise a ValidationError when examining the submitted data and you get a 400 response. POST successfully and get a 201, PATCH and get a 200. And so on.
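To illustrate (a minimal sketch with a hypothetical Job endpoint and serializer, assuming a recent version of the framework – not code from our project):

from rest_framework import status, viewsets
from rest_framework.exceptions import PermissionDenied, ValidationError
from rest_framework.response import Response

from .serializers import JobSerializer  # hypothetical serializer


class JobViewSet(viewsets.ViewSet):

    def create(self, request):
        if not request.user.has_perm('jobs.add_job'):
            raise PermissionDenied()  # rendered for us as a 403 response
        serializer = JobSerializer(data=request.data)
        if not serializer.is_valid():
            raise ValidationError(serializer.errors)  # rendered as a 400
        serializer.save()
        # A successful POST comes back as a 201 Created
        return Response(serializer.data, status=status.HTTP_201_CREATED)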

Methods

You could PATCH an existing user profile with just the field that was changed in your UI, DELETE a comment, PUT a new shopping basket, and so on. HTTP methods exist so that you don’t have to encode the nature of your request within the body of your request. REST Framework has support for these methods natively in its base ViewSet class which is used to build each of your endpoints; verbs are mapped to methods on your view class which, by default, are implemented to do everything you’d expect (create, update, delete).
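For example, a sketch with a hypothetical Basket model (ModelViewSet and DefaultRouter are the framework’s own classes):

from rest_framework import routers, viewsets

from .models import Basket                  # hypothetical model
from .serializers import BasketSerializer   # hypothetical serializer


class BasketViewSet(viewsets.ModelViewSet):
    # GET and POST on /baskets/, plus GET, PUT, PATCH and DELETE on
    # /baskets/<pk>/, all map to sensible default implementations.
    queryset = Basket.objects.all()
    serializer_class = BasketSerializer


router = routers.DefaultRouter()
router.register(r'baskets', BasketViewSet)
# then, in urls.py: urlpatterns = router.urls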

Accepts

The base ViewSet class looks for the Accept header and encodes the response accordingly. You need only specify which formats you wish to support in your settings.py.
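Something like this (a sketch of the relevant settings.py section; the JSON-only choice here is an assumption for the example):

REST_FRAMEWORK = {
    # Served formats are chosen by content negotiation against the
    # Accept header (or an explicit ?format= override).
    'DEFAULT_RENDERER_CLASSES': [
        'rest_framework.renderers.JSONRenderer',
    ],
    'DEFAULT_PARSER_CLASSES': [
        'rest_framework.parsers.JSONParser',
    ],
}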

Serializers are not Forms

Django Forms do not provide a sufficient abstraction to handle object PATCHing (only PUT) and cannot encode more complex, nested data structures. The latter limitation lies with HTTP, not with Django Forms; HTTP forms cannot natively encode nested data structures (both application/x-www-form-urlencoded and multipart/form-data rely on flat key-value formats). Therefore, if you want to declaratively define a schema for the data submitted by your users, you’ll find life a lot easier if you discard Django Forms and use REST Framework’s Serializer class instead.

If the consumers of your API wish to use PATCH rather than PUT, and chances are they will, you’ll need to account for that in your validation. The REST Framework ModelSerializer class adds fields that map automatically to Model Field types, in much the same way that Django’s ModelForm does. Serializers also allow nesting of other Serializers for representing fields from related resources, providing an alternative to referencing them with a unique identifier or hyperlink.
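A sketch of both points, with hypothetical models (a nested representation, and partial=True for PATCH):

from rest_framework import serializers

from .models import Job  # hypothetical model


class ApplicationSerializer(serializers.Serializer):
    applicant_name = serializers.CharField()
    cover_letter = serializers.CharField(required=False)


class JobSerializer(serializers.ModelSerializer):
    # The related applications are embedded as a nested structure –
    # something a flat HTML-form encoding could not express.
    applications = ApplicationSerializer(many=True, read_only=True)

    class Meta:
        model = Job
        fields = ('id', 'title', 'description', 'applications')


# For PATCH, partial=True relaxes required-field validation:
# serializer = JobSerializer(job, data={'title': 'New title'}, partial=True)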

More OPTIONS

Should you choose to go beyond an AJAX-enabled site and implement a fully-documented, public API then best practice and an RFC or two suggest that you make your API discoverable by allowing OPTIONS requests. REST Framework allows an OPTIONS request to be made on every endpoint, for which it examines request.user and returns the HTTP methods available to that user, and the schema required for making requests with each one.
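For example, probing an endpoint from Python (a sketch against a hypothetical URL, using the requests library):

import requests

# What the authenticated user is allowed to do, straight from the API.
response = requests.options('https://api.example.com/jobs/',
                            headers={'Accept': 'application/json'})
print(response.headers.get('Allow'))   # e.g. GET, POST, HEAD, OPTIONS
print(response.json().get('actions'))  # per-method request schemas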

OAuth2

Support for OAuth 1 and 2 is available out of the box and OAuth permissions, should you choose to use them, can be configured as a permissions backend.

Browsable

REST framework provides a browsable HTTP interface that presents your API as a series of forms that you can submit to. We found it incredibly useful for development but found it a bit too rough around the edges to offer as an aid for third parties wishing to explore the API. We therefore used the following snippet in our settings.py file to make the browsable API available only when DEBUG is set to True:

if DEBUG:
    REST_FRAMEWORK['DEFAULT_RENDERER_CLASSES'].append(
        'rest_framework.renderers.BrowsableAPIRenderer'
    )

Testing

REST Framework gives you an APITestCase class which comes with a modified test client. You give this client a dictionary and an encoding and it will serialise the request and deserialise the response. You only ever deal in Python dictionaries and your tests will never need to contain a single instance of json.loads.
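A sketch of a typical test (hypothetical endpoint and fields):

from rest_framework import status
from rest_framework.test import APITestCase


class JobAPITests(APITestCase):

    def test_create_job(self):
        # The client serialises the dict into the request body...
        response = self.client.post(
            '/jobs/', {'title': 'Technical editor'}, format='json')
        self.assertEqual(response.status_code, status.HTTP_201_CREATED)
        # ...and response.data is already a deserialised dictionary.
        self.assertEqual(response.data['title'], 'Technical editor')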

Documentation

The documentation is of a high quality. Because it copies the Django project’s three-pronged approach to documentation – tutorial, topics, and API reference – Django buffs will find it familiar and easy to parse. The tutorial quickly gives readers the feeling of accomplishment, the high-level topic-driven core of the documentation allows readers to quickly get a solid understanding of how the framework should be used, and the method-by-method API documentation is very detailed, frequently offering examples of how to override existing functionality.

Project Status

At the time of writing the project remains under active development. The roadmap is fairly clear and the chap in charge has a solid grasp of the state of affairs. Test coverage is good. There’s promising evidence in the issue history that creators of useful but non-essential components are encouraged to publish their work as new, separate projects, which are then linked to from the REST Framework documentation.

Criticisms

Permissions

We found that writing permissions was messy and we had to work hard to avoid breaking DRY. An example is required. Let’s define a ViewSet representing both a resource collection and any document from that collection:

views.py:

class JobViewSet(ViewSet):
    """
    Handles both URLs:
    /jobs
    /jobs/(?P<id>\d+)/$
    """
    serializer_class = JobSerializer
    permission_classes = (IsAuthenticated, JobPermission)

    def get_queryset(self):
        if self.request.user.is_superuser:
            return Job.objects.all()

        return Job.objects.filter(
            Q(applications__user=self.request.user) |
            Q(reviewers__user=self.request.user)
        )

If the Job collection is requested, the queryset from get_queryset() will be run through the serializer_class and returned as an HTTP response in the requested encoding.

If a Job item is requested and it is in the queryset from get_queryset(), it is run through the serializer_class and served. If a Job item is requested and is not in the queryset, the view returns a 404 status code. But we want a 403.

So if we define that JobPermission class, we can fail the object permission test, resulting in a 403 status code:

permissions.py:

class JobPermission(permissions.BasePermission):
    def has_object_permission(self, request, view, obj):
        if obj in Job.objects.filter(
                Q(applications__user=request.user) |
                Q(reviewers__user=request.user)):
            return True
        return False

So we have duplicated the logic from the view method get_queryset() (we could admittedly reuse view.get_queryset(), but the method and its underlying query would still be executed twice) – and if we don’t duplicate it, the client is sent a completely misleading response code.

The neatest way to solve this issue seems to be to use the DjangoObjectPermissionsFilter together with the django-guardian package. Not only will this allow you to define object permissions independently of your views, it’ll also allow you filter querysets using the same logic. Disclaimer: I’ve not tried this solution, so it might be a terrible thing to do.
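For what it’s worth, a sketch of what that configuration might look like (untried, as per the disclaimer above, and assuming django-guardian is installed as the object-permissions backend):

from rest_framework import filters, permissions, viewsets

from .models import Job
from .serializers import JobSerializer


class JobViewSet(viewsets.ModelViewSet):
    queryset = Job.objects.all()
    serializer_class = JobSerializer
    # Object permissions are checked against django-guardian's backend...
    permission_classes = (permissions.DjangoObjectPermissions,)
    # ...and the same per-object permissions filter the collection view,
    # so the 403-versus-404 logic lives in one place.
    filter_backends = (filters.DjangoObjectPermissionsFilter,)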

Nested Resources

REST Framework is not built to support nested resources of the form /baskets/15/items. It requires that you keep your API flat, of the form /baskets/15 and /items/?basket=15.

We did eventually choose to implement some parts of our API using nested URLs; however, it was hard work, and we had to alter public method signatures and the data types of public attributes within our subclasses. We required heavily modified Router, Serializer, and ViewSet classes. It is worth noting that REST Framework deserves praise for making each of these components so pluggable.

Very specifically, the biggest issue preventing us pushing our nested resources components upstream was REST Framework’s decision to make lookup_field on the HyperlinkedIdentityField and HyperlinkedRelatedField a single string value (e.g. “baskets”). To support any number of parent collections, we had to create a NestedHyperlinkedIdentityField with a new lookup_fields list attribute, e.g. ["baskets", "items"].
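To give a flavour, a much-simplified sketch of the idea (not our actual implementation, and the HyperlinkedIdentityField internals vary between framework versions):

from rest_framework.relations import HyperlinkedIdentityField
from rest_framework.reverse import reverse


class NestedHyperlinkedIdentityField(HyperlinkedIdentityField):
    """Builds URLs like /baskets/15/items/3/ from several lookup fields."""

    def __init__(self, *args, **kwargs):
        # e.g. lookup_fields=['basket_id', 'pk'], matching the URL kwargs
        self.lookup_fields = kwargs.pop('lookup_fields', ['pk'])
        super(NestedHyperlinkedIdentityField, self).__init__(*args, **kwargs)

    def get_url(self, obj, view_name, request, format):
        kwargs = dict(
            (field, getattr(obj, field)) for field in self.lookup_fields)
        return reverse(
            view_name, kwargs=kwargs, request=request, format=format)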

Conclusions

REST Framework is great. It has flaws but continues to mature as an increasingly popular open source project. I’d whole-heartedly recommend that you use it for creating full, public APIs, and also for creating a handful of endpoints for the bits of your site that need to be AJAX-enabled. It’s as lightweight as you need it to be and most of what it does, it does extremely well.

About us: Isotoma is a bespoke software development company based in York and London specialising in web apps, mobile apps and product design. If you’d like to know more you can review our work or get in touch.

Ballet Phase 3 PPR

A strange one this. The project was delivered under budget and there were no significant quibbles from the customer about how quickly we turned it around. However, by the end of it, Jo and I in particular felt once again like actually sacking them as a customer. Our main point of contact is still Digital Manager David Watson (lol), who has a mercurial nature: capable of being extremely charming and reasonable, then suddenly changing everything, demanding the moon on a stick and threatening to get rid of us as an agency. Not the best.

We were also a bit twitchy going into this one due to the very unsuccessful nature of phase 2 that we delivered in September 2013.

So phase 1 was stressful and odd – but we delivered in spades. Phase 2 went very badly wrong (and was where we learned of David’s tendencies to be a tough customer). For phase 3 we actually learned from our mistakes in the previous releases:

1) David needed to understand why the site is the way it is. The timeline for the original build was *ridiculous*, and decisions that were practical then seem odd if you don’t have the context of the original build.

2) David has a *very specific* image of what he wants that he is often unsuccessful at communicating. This is made to seem as if it’s our fault. Budgets need to be set accordingly. Contingency must be added; scope must be policed.

Although David did manage to extract a couple of change requests from us for free at the end of the project, we were largely successful in this – so nice work Tom and Antony.

I think the single most important step in this was reviewing the requirements in detail with him with a developer on the call. We could describe what we were intending to do and he could say, no, I want it covered in diamonds instead. We headed off at least 3 bad assumptions that would have killed us this way.

Learnings

Requirements are badly captured in designs – this customer particularly needs a written scope and needs to be grabbed by the shoulders and made to think about consequences.

Metrics
We invoiced £4,720 and timesheeted £3,520 — so our achieved day rate was an excellent £697.27. However, it’s worth noting that the actual cost of the project was 4-5 days higher with the couple of CRs he managed to wangle. This was captured in support.

Shortlister PPR

So after this blogpost about what Shortlister is, I thought I’d put together some stats on the project from the PPR. We’re still finding a format for these, so while this is mostly as per Fig’s AuthorDirect post the other day, I’ve also felt free to freestyle. Booya!

See the original post over here

Profitability and the Achieved Day Rate (ADR)

For the initial phase of the project we ended up invoicing £70,160. The value of the actual time that we recorded on the project was £88,266. This gave us an achieved day rate of £490. The sweet spot for us from a profitability point of view is £500.

Sadly, over the next month or so, we ended up timesheeting a significant amount more on fiddling, faffing, documenting and supporting – not to mention post-QA bugs – and so the final timesheeted amount was a wee bit higher.

Error Rates and QA

240 tickets were opened by Alex, Ben and Francois during the build. There were a total of 61 QA defects raised on this – most of them in the final month of the project.

What did the customer think?

David swings between being delighted with the product (it is undeniably of high quality) and being frustrated by the process of software development. In a review meeting I said that the success of the initial phase of the project has set the bar very high for him. This was his first experience of the process and, in a way, his expectations of the smooth running of the process have been set weirdly high. What he conveniently forgets is the 6 months of wireframing, meetings and arguing that we went through at the start of the process.

What else did we learn?

Some thoughts from Alex, Ben, Francois, Jo and myself:

Guesswork

The project was a bit vaguely defined at the start, and FJ was doing a lot of guesswork. But we had to make do with what we got from David. Although this frustrated FJ, it actually worked out pretty well from my (and I think Andy’s) point of view, because we could steer the project where David was too inexperienced to give us an absolute brief.

FJ often felt like he was guessing at how things should work, but I found that it was better to show David something that he could react to rather than endlessly talk about what-ifs.

Annoying customers

“David also had an annoying habit of contacting me directly via email, Skype or phone.”

This came up for Ben and Francois. A nice problem to have in some ways but a distraction for Ben particularly. Our learning here is to limit access to devs – not only because it’s a timesink but also because it can lead to pretty severe mission creep.

From the point of view of developers, it’s always useful to speak up about this kind of thing. We have some pretty robust ways to stop it – but in a few instances early on in the project, it was actually hard both for me to spot that it was happening and for Ben to say ‘this dude is a problem’.

So, yes, “Ben is nice” is actually a learning.

The case for prototyping and user testing

Francois: “I really feel that a serious missing ingredient was user testing with an early prototype. I’m sure it would’ve improved the product greatly. I really still don’t know how well this application will perform in practice.”

I think this is a valid concern, but the budget was not there and, by the time it was, there was no time. My learning from this was to make sure that the client understands at all times where our responsibilities as an agency end. We underlined with David that we were delivering what was in the stories/wireframes, not any particular user response to it.

Webfonts – how do they work?

Francois: “We clearly still have a lot to learn about web fonts. Discovered unexpected problems, platform inconsistencies. Had to do lots of last-minute research, and I’m still not sure whether we arrived at the best possible result.”

(My notes from this part of the meeting just say “Fucking web fonts”. Does that count as a learning?)

Wireframing on the hoof

“I feel I still have more to learn about putting video containers in web pages. I didn’t have dummy content I could test with, and didn’t know what the final markup was going to be. Bit of a black box.”

This vagueness in the wireframes actually helped the devs, because there was no definite method of implementation up until the last minute – though Jo points out that this makes it a bit of a nightmare for QA.

The main thing to keep in mind for this is that it’s the communication between IA, Dev and QA that makes it work – not the individual documents themselves.

Bootstrap = Not worse than Hitler

Adding final design as a “skin” on top of a wireframe-based design + Bootstrap base worked surprisingly well. We were able to skin the entire thing in just a few days, following high-level brand guidelines.

Bootstrap worked very well on this project, basically because we only had a wireframe aesthetic to follow. Design could then be added via a CSS skin, affecting surface appearance only, not layout.

 

…and that’s about it. Happy to discuss more in the comments.

 

Shortlister is live – GIF-fest

Shortlister is live at app.shortlister.com – the explanatory website, which we did not build, is at www.shortlister.com.

Shipping! Kaboom, right? The temptation to talk up these things is always a little alarming because we all know how it can go:
[GIF]

Having said that though, the Shortlister project has gone actually really well. The project is, as we say, Very Isotoma. It’s business critical, it’s weird, it’s hard and it’s low on marketing puffery.

They’re an upstart-start-up who came to us nearly 2 years ago with some fairly disruptive ideas for the recruitment world.

At that time we talked. A lot. About all aspects of the application that they felt they wanted. We helped them define what it was they wanted and, in Christmas 2012, we’d designed the dream application in wireframe. All whistles and bells. We all agreed:

But it was expensive!

And they were like “can you make it cheaper?” And we were like
[GIF]

So then they were like
[GIF]

To which we were like: I know! We’ll do an MVP.

In early 2013 we defined what it was that was fundamental to the application, stripped back the functionality to something lean and deliverable that would work and went for it. Agile-style.

So. Agile:

We managed to maintain an unusually high level of Actual Proper Agile Development on this project for a few reasons, but I think chief amongst them is that our customer had a really high level of buy-in to the process. He understood that there was no flexibility in his budget and so there had to be flexibility in the dates or the scope (or both).

So we took Francois’ excellent wireframes and we cut them back and back and back to something that could be delivered within the budget and it was painful but we got there.

We then built it. And from my point of view, that was probably the most uneventful and relaxing part of any build I’ve been on. Ben, Alex and Francois rocked the shit out of it. I popped in once or twice a week to say
[GIF]

And Ben and Alex were all
[GIF]

Joking aside though, there is some *cutting edge* crap that this site does for video that’s worth getting into in the comments. Also, it’s API-first. We should probably talk about that.

I’m just going to stop now because oh god.

Oh god we’re having a party

You might have heard me mentioning something called The Happening if you spend any time near any of the orifices on my body that produce sounds. This is the top secret code name for a thing we’re doing. It’s a party to promote Isotoma and a couple of other York agencies.

So we’re not a very gregarious bunch – and we haven’t ever done a lot of self promotion, so this is a very new thing for us. I thought I’d note down a few lines about what it is and what’s involved. You can ask questions in the comments or just grab me about it if you want more info.

So it’s a party?
The idea came out of conversations with York-based design-hipsters The Beautiful Meme, with whom we’ve done a few projects in the past. They thought it’d be interesting to organise an event for our friends in the industry to promote our agencies and get our names out there a bit.

Ah. So self-aggrandisement then?
Weeeell kind of. It doesn’t hurt for people to know who we are (and for people who do know who we are to be reminded) – particularly given the fact that most of our best work is often quite invisible to both the general public and press.

What’s the twist?
Glad you asked, made up voice in my head. The twist is that instead of just getting people into a venue with a bar and pouring drinks down them, we’re going to introduce an economy of information to the evening. In order to get a drink, you’ll need to answer questions. These questions will start out relatively innocuous and become increasingly invasive and provocative (Sorry. Awful word.)

The info will be gathered by staff at the venue, who’ll have iPads (or similar) and who’ll swap the information gathered for beer tokens that can be exchanged at the bar. There are also some vague plans that you’ll be able to buy or win other things with these tokens. Cuddly toys, fitted kitchens, that kind of thing.

There’s some other stuff around changing the value of the information over the course of the evening (so 2 easy questions get you 1 drink at the start, whereas by 11pm you have to answer 4 very personal ones). The details of this are still up in the air (of which, more later).

Yes. We’re making a point about Things. Please feel free to draw your own allusions to your Thing of choice.

So what are we doing?
John’s already almost finished building the question-asking software. It’s a pretty simple Django app when all’s said and done. You enter a ticket number (all guests will get one, so that there’s no link at all between your real ID and the answers you give) and then it asks you some questions and dumps you back at the beginning again.
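For the curious, the data model is roughly this shape (a sketch of the idea, not John’s actual code):

from django.db import models


class Guest(models.Model):
    # The only identifier stored: the ticket number handed out on the
    # door, so answers can't be tied back to a real identity.
    ticket_number = models.PositiveIntegerField(unique=True)


class Question(models.Model):
    text = models.CharField(max_length=255)
    # Harder questions are worth more beer tokens as the night goes on.
    token_value = models.PositiveIntegerField(default=1)


class Answer(models.Model):
    guest = models.ForeignKey(Guest, on_delete=models.CASCADE)
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    text = models.TextField()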

I expect it’ll want some tarting up, but that’s basically it – oh, except for the backend.

The backend?
So throughout the course of the night, data that we gather through this system will be exported (via CSV, or ‘live’ with Tableau talking to Postgres) to the good people at York Metrics, who’ll do analysis and fun stuff with the data. This analysis will be used in the venue throughout the night – in projections of data points, generalisations, statistics and more arty stuff. There’s also a plan to get some digital artists involved on the night, but we’re not mega sure how that’s going to work yet.

Sounds marvellous. Anything else?
Yes! Because we’re gathering people’s information, we’re going to be making the point that none of it will be kept and it will all be ‘disposed of’ after the party. To hammer this point home, the application will only be running locally from a laptop we provide for the occasion. Once the party ends, the laptop will be brought out and ceremonially smashed to pieces and/or burned. FUN.

(And so will the machine that crunches the data. Probably. Details are hazy but the theatrical intent is there.)

Testing
So the basic idea is pretty simple but the devil is going to be in making sure the questions are interesting enough to make good data while not being controversial or invasive enough that people just flat out won’t answer them. I also don’t want them to be crass or cheap – but I may be alone in this.

So once we’ve got a candidate deck of questions, we’re going to organise a dummy run in York somewhere. More details of this when we know them. The long and short of it though is that Friday beers might get a bit interesting in January.

When is it?
Currently we’re aiming for 21st February at the Design Museum in London. (Yeah. London. We have to take the mountain to Mohammed.)

Can I come?
Short answer: No. Slightly longer answer: No, because we’re aiming this at press and potential customers on a next-to-nothing budget. If you come, it’ll mean that we have to turn Charles Arthur away at the door, and you know how he gets.

Demo
John’s work in progress demo is at http://floating-lake-4071.herokuapp.com/ – enter any number between 1 and 100 to see the demo (DEMO!) questions.

If you’ve got any thoughts – particularly around fun questions to ask people – I’d be really interested in hearing them.

 

 

PloneConf2010 – and there’s more

I’m here at PloneConf2010 too, and I’ve been to mostly different talks from Mitch, so here’s my write-up too.  It’s taken a bit of time to find the time to write this up properly!

Calvin Hendryx-Parker: Enterprise Search in Plone using Solr

We’ve been using Solr for a few years here at Isotoma, but so far we’ve not integrated it with Plone.  Plone’s built-in Catalog is actually pretty good, however one thing it doesn’t do fantastically well is full-text search.  It is passable in English, but has very limited stemming support – which makes it terrible in other languages.

Calvin presented their experience of using Solr with Plone. They developed their own software to integrate the Plone catalog with Solr, instead of using collective.solr, which up till then was the canonical way of connecting them. Their new product alm.solrindex sounds significantly better than collective.solr.  Based on what I’ve heard here, you should definitely use alm.solrindex.

To summarise how this all hangs together: you need an instance of Solr installed somewhere that you can use.  You can deploy a Solr instance specifically for each site, in which case you can deploy it through buildout.  Solr is Java, and runs inside various Java application servers.

You can also run a single Solr server for multiple Plone sites – in which case you partition the Solr database.

You then configure Solr, telling it how to index and parse the fields in your content. No configuration of this is required within Plone.  In particular you configure the indexes in Solr not in Plone.

Then install alm.solrindex in your Plone site and delete all the indexes that you wish to use with Solr. alm.solrindex will create new indexes by inspecting Solr.

Then reindex your site, and you’re done!  It supports a lot of more complex use cases, but in this basic case you get top-end full text indexing at quite low cost.

Dylan Jay, PretaWeb: FunnelWeb

Funnelweb sounds invaluable if you want to convert an existing non-Plone site into a Plone site, with the minimum effort.

Funnelweb is a tool based on transmogrifier. Transmogrifier provides a “pipeline” concept for transforming content. Pipeline stages can be inserted into a pipeline, and these stages then have the ability to change the content in various ways.

Dylan wrote funnelweb to use transmogrifier and provide a harness for running it in a managed way over existing websites.  The goal is to create a new Plone site, using the content from existing websites.

Funnelweb uploads remotely to Plone over XML-RPC, which means none of transmogrifier needs to be installed in a Plone site, which is a significant advantage.  It is designed to be deployed using buildout, so a script will be provided in your build that executes the import.

A bunch of pipeline steps are provided to simplify the process of importing entire sites.  In particular funnelweb has a clustering algorithm that attempts to identify which parts of pages are content and which are templates.  This can be configured by providing xpath expressions to identify page sections, and then extract content from them for specific content fields.

It supports the concept of ordering and sorts, so that Ordered Folder types are created correctly.  It supports transmogrify.siteanalyser.attach to put attachments closer to pages and transmogrify.siteanalyser.defaultpage to detect index pages in collections and to make them folder indexes in the created sites.

Finally it supports relinking, so that pages get sane urls and all links to those pages are correctly referenced.

Richard Newbury: The State of Plone Caching

The existing caching solution for Plone 3 is CacheFu.  I can remember being introduced to CacheFu by Geoff Davis at the Archipelago Sprint in 2006, where it was a huge improvement on the (virtually non-existent) support for HTTP caching in Plone.

It’s now looking pretty long in the tooth, and contains a bunch of design decisions that have proved problematic over time, particularly the heavy use of monkeypatching.

This talk was about the new canonical caching package for Plone, plone.app.caching. It was built by Richard Newbury, based on an architecture from the inimitable Martin Aspeli.

This package is already being used on high-volume sites with good results, and from what I saw here the architecture looks excellent.  It should be easy to configure for the general cases and allows sane extension of software to provide special-purpose caching configuration (which is quite a common requirement).

It provides a basic knob to control caching, where you can select strong, moderate or weak caching.

It can provide support for the two biggest issues in cache engineering: composite views (where a page contains content from multiple sources with different potential caching strategies) and split views (where one page can be seen by varying user groups who cannot be identified entirely from a tuple of URL and headers listed in Vary).

It provides support for nginx, apache, squid and varnish.  Richard recommends you do not use buildout recipes for Varnish, but I think our recipe isotoma.recipe.varnish would be OK, because it is sufficiently configurable.  We have yet to review the default config with plone.app.caching though.

Richard recommended some tools as well:

  • funkload for load testing
  • browsermob for real browsers
  • HttpFox instead of LiveHttpHeaders
  • Firebug, natch
  • remember that hitting refresh and shift-refresh force caches to refresh.  Do not use them while testing!

Jens Klein: Plone is so semantic, isn’t it?

Jens introduced a project he’s been working on called Interactive Knowledge Stack (IKS), funded by the EU.  This project provides an open source Java component for Content Management Systems in Europe, to help the adoption of Semantic concepts online.  The tool they have produced is called FISE. The name is pronounced like an Aussie would say “phase” ;)

FISE provides a RESTful interface to allow a CMS to associate semantic statements with content.  This allows us to say, for example, that item X is in Paris, and in addition we can state that Paris is in France.  We can now query for “content in France” and it will know that this content is in France.

They provide a generic Python interface to FISE which is usable from within Plone.  In addition they provide a special index type that integrates with the Plone Catalog, allowing the FISE triple store to be updated with the information found in the content.  It can provide triples based on hierarchical relationships found in the Plone database (page X is-a-child-of folder Y).

Jens would like someone to integrate the Aloha editor into Plone, which would allow much easier control by editors of semantic statements made about the content they are editing.

Querying Webtrends ODBC from the command line with WebtrendsQT

As I alluded to yesterday, and in my post about SQLAWebtrends, I’ve recently been doing a lot of work with the Webtrends analytics service, concerned mostly with getting data out of it via the old Windows ODBC drivers.

While the turnaround on new data from reports could cause Methuselah to yawn, it was still exceedingly time-consuming to load up a spreadsheet app, define queries in an ODBC query builder, and wait for data to populate sheets; at best I could write several Python functions to query the latest data, but I would still have to spend tedious amounts of time tweaking and re-tweaking queries for different reports and/or datasets.

This led me to make WebtrendsQT, a psql/mysql-like command line query tool for Webtrends using pyODBC.

WebtrendsQT is mostly just the ODBC extra tool provided by pyODBC, with some WT-specific changes. Namely the introduction of a “\p” command, which issues the {Call wtGetProfileList()} stored procedure against the WTSystem schema (via the system_cursor property), returning a list of profiles.
Similarly, do_l (the handler for “\l”), instead of listing real schemas, lists the Webtrends ODBC equivalent: templates.

do_c (“\c”) will work as you’d expect, taking a “schema” (e.g. template) and changing the cursor to point to it, but it also takes a profile GUID as an optional first argument, to switch both profile and template (profiles define the data source and which report templates are available).
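In outline, the profile-listing handler looks something like this (a simplified sketch; the exact columns returned by the stored procedure are assumed here):

def do_p(self, arg):
    """Handler for \\p: list the profiles available on this instance."""
    # Profiles live in the WTSystem schema, reached via a separate cursor.
    cursor = self.system_cursor.execute('{Call wtGetProfileList()}')
    for row in cursor.fetchall():
        # Each row describes one profile (GUID, name, ...).
        print('\t'.join(str(col) for col in row))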

It took me some time to figure out that PyODBC‘s lovely columns() method wouldn’t work with the Webtrends driver, as some metadata isn’t provided by the driver and causes a segfault. Instead my hack is to use the DB API Cursor.description to get name and type details for columns on a table, unfortunately in order to get this information I need a cursor that specifically targets the table in question; and to get around this I make a simple query against the table that won’t return any information, but will still return a cursor:

@memoized()
def get_columns(self, name):
    # pyODBC's columns() metadata call segfaults against the Webtrends
    # driver, so issue a query that returns no rows and read the column
    # metadata off the resulting cursor instead.
    columns = [['Column name', 'Type', 'Size']]
    cursor = self.cursor.execute(
        'SELECT * FROM %s LIMIT 0' % (name,)
    )
    for r in cursor.description:
        # Each entry is (name, type_code, display_size, internal_size, ...)
        columns.append(
            [r[0],
             self.db_types[r[1]],
             r[3]]
        )
    return columns

cursor.description is the DB API metadata sequence on the cursor itself; pyODBC also exposes the same data on each Row object as cursor_description, its special “always available even after the query-set has been closed” reference to the cursor.description instance.

Unlike pyDBCLI.extras.odbc, WebtrendsQT takes a set of arguments rather than a single DSN string, due to the ODBC driver requiring a specific set of details to connect.

You most likely just want to install and run the tool under Windows, which, if you have any experience with Python on Windows, should be easy enough using easy_install or the included setup.py. If you don’t have any Python-on-Windows experience and just want to get up and running with WebtrendsQT, the FAQ has a simple 5-step guide, including a pre-rolled pair of Windows scripts, that will install everything and create a batch script with all the Python paths set up.
Once installed, just type wtqt in the cmd.exe window provided by the batch script, and away you go.

C:\Users\test\Desktop> wtqt

ERROR: Must have a profile GUID, -p

Usage: wtqt.py [-u <user>] [-k <pass>] -d <system DSN> -h <host> [-P <port>] -t <template> -p <profile>

Options:
  -d, --systemdsn: Predefined system DSN
  -p, --profile : Webtrends profile GUID
  -t, --template : Template/schema
  -h, --host : Webtrends web instance
  -P, --port : Optional server port (default: 80)
  -u, --username: Optional username
  -k, --password: Optional password

Installing Postgis on Ubuntu Karmic Koala (9.10)

Karmic has Postgres 8.4 as its default version, and 8.3 can prove tricky to install.
Unfortunately, not all of the contrib packages were upgraded to run on the new version in time.

One of these is postgis, the geodatabase extension. You have two options: build it from source, or install the Lucid (10.04) package.
Obviously, Lucid isn’t released yet, but I have had success with just installing the package. Your Mileage May Vary.

i386 Package
AMD Package

Hope this helps someone!
–t