Monthly Archive for July, 2010

Querying Webtrends ODBC from the command line with WebtrendsQT

As I alluded to yesterday, and in my post about SQLAWebtrends, I’ve recently been doing a lot of work with the Webtrends analytics service, concerned mostly with getting data out of it via the old Windows ODBC drivers.

While turn around on new data available from reports could cause Methuselah to yawn, it could still be exceedingly time consuming loading up a spreadsheet app, defining queries in an ODBC query builder, and waiting for data to populate sheets; or at best writing several Python functions to query the last data; I would still have to spend tedious amounts of time tweaking and re-tweaking queries for different reports and/or datasets.

This lead me to make WebtrendsQT, a psql/mysql-like command line query tool for Webtrends using pyODBC.

WebtrendsQT is mostly just the ODBC extra tool provided by pyODBC, with some WT-specific changes. Namely the introduction of a “\p” command, which issues the {Call wtGetProfileList()} stored procedure against the WTSystem schema (via the system_cursor property), returning a list of profiles.
Similarly do_l (the handler for “\l”) instead of listing real schemas, lists the Webtrends ODBC equivalent templates.

do_c (“\c”) will work as you’d expect, taking a “schema” (e.g. template), and changing cursor to point to it, but also takes profile GUID as an optional first option to switch both profile and template (profiles define the data source and which report templates are available).

It took me some time to figure out that PyODBC’s lovely columns() method wouldn’t work with the Webtrends driver, as some metadata isn’t provided by the driver and causes a segfault. Instead my hack is to use the DB API Cursor.description to get name and type details for columns on a table, unfortunately in order to get this information I need a cursor that specifically targets the table in question; and to get around this I make a simple query against the table that won’t return any information, but will still return a cursor:

@memoized()
def get_columns(self, name):
    columns = [['Column name', 'Type', 'Size',]]
    row = self.cursor.execute(
        'SELECT * FROM %s LIMIT 0' % (name,)
    ).fetchone()
    for r in row.cursor_description:
        columns.append(
            [r[0],
            self.db_types[r[1]],
            r[3],]
        )
    return columns

cursor_description is PyODBC’s special “always available even after query-set has been closed” reference to the cursor.description instance.

Unlike pyDBCLI.extras.odbc, WebtrendsQT takes a set of arguments rather than a single DSN string, due to the ODBC driver requiring a specific set of details to connect.

You most likely just want to install and run the tool under Windows, which if you have any experience with Python on Windows should be easy enough using easy_install or the included setup.py; if however you don’t have any Python-Windows experience and just want to get up and running with WebtrendsQT, the FAQ has a 5 step simple guide, including a pre-rolled pair of Windows scripts, that will install everything and create a batch script with all the Python paths set up to use.
When installed just type wtqt in the cmd.exe Window, provided by the batch script, and away you go.

C:\Users\test\Desktop> wtqt

ERROR: Must have a profile GUID, -p

Usage: wtqt.py [-u <user>] [-p <pass>] -d <system DSN> -h <host> [-P <port>] -t <template> -p <profile>

Options:
  -d, --systemdsn: Predefined system DSN
  -p, --profile : Webtrends profile GUID
  -t, --template : Template/schema
  -h, --host : Webtrends web instance
  -P, --port : Optional server port (default: 80)
  -u, --username: Optional username
  -k, --password: Optional password

Writing interactive command line DB query tools with pyDBCLI

While some people can’t navigate a relational database without reaching for a GUI, the vast majority of us spend a good proportion of our lives inside interactive command line interfaces, such as psql or mysql.

What do you do, however, if there isn’t a CLI query tool available for the DB you’re working with?
I had this exact same problem recently with a project, the main reason for the lack of any such tooling is because I wasn’t actually dealing with a real DB at all, but an ODBC interface to a web service that exposed reporting data as if it were tables in a DB. Rather than spend the rest of my life messing around with queries in a hooked up spreadsheet, or repeatedly writing one off Python snippets, I decided to write my own psql-like tool.

Thus the first version of pyDBCLI was born; well not really, my first tool really only handled querying Webtrends (I’ll save that for another post), while pyDBCLI is a base class for making such tools as long as you have a DB API compatible cursor to query.
pyDBCLI is based on the fantastic cmd.Cmd, so you can extend pretty much exactly as you would would Cmd, except with some extra properties such as cursor and multi_prompt are provided.

In order to make a Cmd based tool behave more like psql I ended up overriding the parseline method with regular expressions to handle escaped commands such as “\d” and “\c”, fuzzing them if not escaped, or un-escaping them if escaped so that Cmd’s unmodified parseline method can handle parsing and dispatching to defined do_* methods.

The other main change was modifying the default method to dispatch command lines to the a query method for querying against the DB API cursor; as well as this default and several other commands will detect an unfinished SQL query and wait for a finishing character (“;” by default) before sending it (or doing anything) else. This is what the multi_prompt property is for, the prompt property is replaced with multi_prompt when a query spans more than one line, and is set back again when the query is finished and executed.

2 example tools are bundled with pyDBCLI, in the extras package:

  • odbc – a tool to query an ODBC exposed data source, using PyODBC; takes PyODBC compatible DSN strings.
  • litecli – a tool to query a SQLite database; SQLite has it’s own CLI tool to do this, which is very well rounded and much better than litecli, but this is provided as a fairly functional example tool.

Tomorrow I’ll discuss the original reason I whipped up pyDBCLI: creating a tool for querying Webtrends, via ODBC, quickly and with more ease.

Squid and Facebook

Lenny (Debian 5.0) and Lucid (Ubuntu 10.04 LTS) both ship with Squid 2.7 as their default Squid version.  If you plan on deploying squid as a web proxy in a modern mixed browser environment you should seriously consider Squid 3.0.  If you can’t (or, like me, you’d already spent some time getting Squid 2.7 set up just right) you are likely to find users reporting that Facebook is returning blank pages or that links within Facebook don’t work.  Looking at Facebook in a browser with debugging tools installed you’ll see lots of cross domain Javascript errors.

At this point you should either upgrade to Squid 3 (which doesn’t exhibit this behaviour) or make the following change to your squid.conf:

  • Add server_http11 on

Facebook uses different methods of shipping the Javascript depending on whether the client has made an HTTP/1.0 or an HTTP/1.1 request.  By default Squid 2.7 alters the request to be HTTP/1.0, so Facebook sends it’s fallback methods to the modern browser, which then proceeds to break.  The change above stops Squid rewriting that part of the request, ensuring that Facebook ships the right content to your users.

Hacking Trac 0.12

We use Trac at Isotoma. Actually that’s a bit of an understatement, we don’t just use it, it’s integral to our project development process.

Trac offers a wiki, ticketing system and svn browser out of the box, but it’s the increasingly featureful plugins API that allows you to adapt, hack and install to fit your purpose.

Home grown tweaks

We’re hopping a couple of versions and upgrading to 0.12 in the next few weeks and with it being a system that everyone uses daily, there’s been more than a few enhancement requests for the new installation.

Here are a couple of the plugins that made it through the requirements gathering process:

Isotoma Theme

Of course natively you can tweak the static content and templates to your hearts content. The TracThemeEngine plugin wraps this up into a neater package so you can customise your install to add extra markup, follow your own style, or even make it look like Alta Vista circa 1997. We plumped for following our own style.

HotKeys Plugin

My muscle memory often causes an involuntary “ctrl+s” when editing wiki entries. One too many save dialogs prompted this plugin which maps scripts to keyboard shortcuts using the JQuery HotKeys plugin (the version of jquery.hotkeys is from John Boxall’s fork on github, which allows inputs and textareas to be bound).

QuickTicket Plugin

RTM like input for quickly adding tickets.

As a side note, this plugin also hijacks a keyboard shortcut (“ctrl+q”  - used by Firefox, amongst others, for “quit”). An abominable accessibility crime to be sure, but we’re in the rare position here that we are our target audience.

WikiTable Macro

Adapted from work by optilude on Trac-Hacks, this macro allows SQL to be included in wiki pages to show query results in a table. We added DSN mappings that you add to trac.ini allowing you to query databases other than the current instance’s.

Third party plugins from Trac-Hacks

The invaluable Trac-Hacks should be the first port of call for looking into expanding on Trac’s functionality. For reference, here are a few of the plugins we’ll be implementing come the roll out:

Batch Modify Plugin

Allows users to modify several tickets together in one shot.

Timing and Estimation Plugin

For estimation and time tracking. We’re using the permissions branch.

Git Plugin

Essential. Enables Git support for the versioning system backend.

Further reading

The documentation on the Edgewall’s Trac site is pretty comprehensive:
http://trac.edgewall.org/wiki/TracDev/PluginDevelopment

For help on the Genshi template language and content parser, again, you could do worse than looking at the documentation:
http://genshi.edgewall.org/

Join the mailing list here:
http://groups.google.com/group/trac-dev/

And finally, 9 times out of 10, someone will have written a plugin for it and submitted it to Trac-Hacks:
http://trac-hacks.org/