Tag Archives: sql

The Third Manifesto Implementers’ Workshop

Earlier this month I went to the Third Manifesto Implementers’ Workshop at Northumbria University in Newcastle. A group of us discussed recent developments in implementing the relational data model.

The relational data model was proposed by E. F. Codd in 1969 in response to the complex, hierarchical, data storage solutions of the time which required programs to be written and compiled for each database query. It was a powerful abstraction, but unfortunately SQL and its implementations missed out on important features, and broke it in fundamental ways. In response to this problem, and the industry’s approach towards object-databases, Chris Date and Hugh Darwen wrote “The Third Manifesto” (TTM) to put forward their ideas on how future database systems should work. I urge you to read their books (even if you’re not interested in the subject) – the language is amazing: precise, concise, comprehensive and easy to read – other technical authors don’t come close.

The relational model treats data as sets of logical propositions and allows them to be queried, manipulated and constrained declaratively. It abstracts away from physical storage and access issues which is why it will still be used a hundred years from now (and why NoSQL discussions like these http://wiki.apache.org/couchdb/EntityRelationship http://www.cmlenz.net/archives/2007/10/couchdb-joins are retrograde). If you’re writing loops to query your data, or having to navigate prescribed connection paths, then your abstractions are feeble and limited.

At the workshop, I talked about my project, Dee, which implements the relational ideas from TTM in Python. You can see my slides here (or here if you have an older browser).

Erwin Smout gave a couple of talks about implementing transition constraints and dispensing with data definition language.

David Livingstone walked us through the RAQUEL architecture – a layered approach along the lines of the OSI network model.

Hugh Darwen discussed the features and implementation of IBM’s Business System 12, one of the first ever relational database systems which had some surprisingly dynamic features, including key inferencing, so that view definitions could keep tabs on their underlying constraints.

Chris Date took us through his latest thoughts on how to update relational views in a generic way. The aim is for database users to be able to treat views and base tables in the same way, for both reading and writing. Lots to think about here, and my to-do list for Dee has grown another section.

Adrian Hudnott discussed a couple of research projects around optimising multiple relational assignments and tracking the source of updates so that transition constraints could be more effective.

Renaud de Landtsheer gave an insight into the work he’s been doing implementing first-order-logic constraints within Oracle databases.

I sat next to Toon Koppelaars (whose name went down well with the Geordies) and then I realised I had his book (Applied Mathematics for Database Professionals) waiting to be read on my desk at work, thanks to the eclectic Isotoma library (and Wes).

It was a packed couple of days with plenty of food for thought. Thank you to David Livingstone and Safwat Mansi for organising and hosting such an enjoyable and interesting event.

Querying Webtrends ODBC from the command line with WebtrendsQT

As I alluded to yesterday, and in my post about SQLAWebtrends, I’ve recently been doing a lot of work with the Webtrends analytics service, concerned mostly with getting data out of it via the old Windows ODBC drivers.

While turn around on new data available from reports could cause Methuselah to yawn, it could still be exceedingly time consuming loading up a spreadsheet app, defining queries in an ODBC query builder, and waiting for data to populate sheets; or at best writing several Python functions to query the last data; I would still have to spend tedious amounts of time tweaking and re-tweaking queries for different reports and/or datasets.

This lead me to make WebtrendsQT, a psql/mysql-like command line query tool for Webtrends using pyODBC.

WebtrendsQT is mostly just the ODBC extra tool provided by pyODBC, with some WT-specific changes. Namely the introduction of a “\p” command, which issues the {Call wtGetProfileList()} stored procedure against the WTSystem schema (via the system_cursor property), returning a list of profiles.
Similarly do_l (the handler for “\l”) instead of listing real schemas, lists the Webtrends ODBC equivalent templates.

do_c (“\c”) will work as you’d expect, taking a “schema” (e.g. template), and changing cursor to point to it, but also takes profile GUID as an optional first option to switch both profile and template (profiles define the data source and which report templates are available).

It took me some time to figure out that PyODBC‘s lovely columns() method wouldn’t work with the Webtrends driver, as some metadata isn’t provided by the driver and causes a segfault. Instead my hack is to use the DB API Cursor.description to get name and type details for columns on a table, unfortunately in order to get this information I need a cursor that specifically targets the table in question; and to get around this I make a simple query against the table that won’t return any information, but will still return a cursor:

@memoized()
def get_columns(self, name):
    columns = [['Column name', 'Type', 'Size',]]
    row = self.cursor.execute(
        'SELECT * FROM %s LIMIT 0' % (name,)
    ).fetchone()
    for r in row.cursor_description:
        columns.append(
            [r[0],
            self.db_types[r[1]],
            r[3],]
        )
    return columns

cursor_description is PyODBC’s special “always available even after query-set has been closed” reference to the cursor.description instance.

Unlike pyDBCLI.extras.odbc, WebtrendsQT takes a set of arguments rather than a single DSN string, due to the ODBC driver requiring a specific set of details to connect.

You most likely just want to install and run the tool under Windows, which if you have any experience with Python on Windows should be easy enough using easy_install or the included setup.py; if however you don’t have any Python-Windows experience and just want to get up and running with WebtrendsQT, the FAQ has a 5 step simple guide, including a pre-rolled pair of Windows scripts, that will install everything and create a batch script with all the Python paths set up to use.
When installed just type wtqt in the cmd.exe Window, provided by the batch script, and away you go.

C:\Users\test\Desktop> wtqt

ERROR: Must have a profile GUID, -p

Usage: wtqt.py [-u <user>] [-p <pass>] -d <system DSN> -h <host> [-P <port>] -t <template> -p <profile>

Options:
  -d, --systemdsn: Predefined system DSN
  -p, --profile : Webtrends profile GUID
  -t, --template : Template/schema
  -h, --host : Webtrends web instance
  -P, --port : Optional server port (default: 80)
  -u, --username: Optional username
  -k, --password: Optional password