There should be more than one way to do it in Python

Python has a philosophy of ‘There should be one– and preferably only one –obvious way to do it’ (http://www.python.org/dev/peps/pep-0020/) rather than Perl’s ‘There’s more than one way to do it’ (Programming Perl Third Edition, Larry Wall et al.). This is great, in theory – it leads to greater consistency between disparate programs and makes it easier for individual programmers to pick up someone else’s code.

The problem comes where the obvious way is not, for whatever reason, the practical way. For example, the obvious way to test if a string begins with another string is to use "string1".startswith("string2"). This is, however, significantly less performant than doing "string1"[:7] == "string2", which means that when doing a large number of these tests you have to use the slicing form, despite it not being the obvious method.

I do not find "string1"[:7] to be obvious unless you are already familiar with Python’s sequence slicing syntax (http://docs.python.org/library/stdtypes.html#typesseq), even though I would expect most programmers to be able to hazard a (probably correct) guess at what it does. That means I would precede that line with a comment such as # Check if string1 starts with "string2". When I feel the need to comment on a specific line of code it is usually because I do not think what it is doing is sufficiently obvious, which means it violates Python’s ‘There should be one– and preferably only one –obvious way to do it’.
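For anyone who wants to check the difference on their own machine, here is a minimal sketch. The function and variable names are made up for illustration, and the timings will vary with the Python version and the strings involved, so treat it as a way of measuring rather than as proof of which form is faster.

```python
import timeit

haystack = "string2 and then some more text"
prefix = "string2"


def starts_obvious(text, start):
    # The obvious way: ask the string directly.
    return text.startswith(start)


def starts_sliced(text, start):
    # The slicing way: compare the first len(start) characters.
    return text[:len(start)] == start


if __name__ == "__main__":
    # Time a million calls of each form and print the totals.
    for func in (starts_obvious, starts_sliced):
        seconds = timeit.timeit(lambda: func(haystack, prefix), number=1_000_000)
        print(f"{func.__name__}: {seconds:.3f}s")
```

Using len(start) rather than a hard-coded 7 at least removes the magic number, although the slice is arguably still less obvious than startswith.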

The problem with ‘middleware’

Both Python’s WSGI and Perl’s PSGI (and presumably Ruby’s Rack, but I have no experience of that) have a concept of ‘middleware’: a part of a web application which sits between the server interface (WSGI or PSGI) and the application itself. This middleware can act as a filter or manipulate the environment (using PSGI’s terminology) before the application sees it. This makes it great for implementing features such as authorisation and sessions, and indeed there are pre-built middlewares for both platforms which will do this.

The problem is that there is no standard for what parts of the environment get set by these useful middlewares which means (for the most part) they cannot be instantly swapped out for an alternative. I think what is needed is a simple definition of the bare minimum API for a given object (i.e. what Java would term an interface) and a defined location within the environment where the object will be found. Obviously objects could implement additional methods to provide bells-and-whistles specific to the implementation which applications can then use at the cost of no longer being able to do a straight swap out of the middleware.

For example, a ‘session’ object might implement ‘get(key)’ and ‘set(key,value)’ and be found under ‘session’ in the environment hash. A ‘user’ object (as part of a larger authentication middleware) might implement ‘login_id’ and ‘roles’ attributes and be found under ‘auth.user’ in the environment hash.
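To make that concrete, here is a rough WSGI sketch of the idea. Everything in it is hypothetical: the class names, the single shared in-memory session and the hard-coded ‘guest’ role exist only to show the shape of the interface, and no existing middleware is claimed to work this way.

```python
class DictSession:
    """Hypothetical session exposing only the minimal get/set interface."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None when the key has never been set.
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class User:
    """Hypothetical user object with the two suggested attributes."""

    def __init__(self, login_id, roles):
        self.login_id = login_id
        self.roles = roles


class SessionMiddleware:
    """Publishes a session object under environ['session']."""

    def __init__(self, app):
        self.app = app
        self._session = DictSession()  # one shared session: illustration only

    def __call__(self, environ, start_response):
        environ["session"] = self._session
        return self.app(environ, start_response)


class RemoteUserAuthMiddleware:
    """Publishes a user object under environ['auth.user']."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        login_id = environ.get("REMOTE_USER", "anonymous")
        environ["auth.user"] = User(login_id, roles=["guest"])
        return self.app(environ, start_response)


def application(environ, start_response):
    # The application relies only on the agreed keys, methods and attributes.
    session = environ["session"]
    user = environ["auth.user"]
    visits = (session.get("visits") or 0) + 1
    session.set("visits", visits)
    body = f"{user.login_id} has visited {visits} time(s)\n".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]


# Either middleware could be swapped for another that honours the same keys.
wrapped = SessionMiddleware(RemoteUserAuthMiddleware(application))
```

Because the application only touches environ['session'] and environ['auth.user'], replacing RemoteUserAuthMiddleware with an in-house alternative would not require any change to the application itself.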

An application developer would then be able to choose between just using the published interface standard or using some of the specific bells and whistles of a middleware. Even with the use of the extra features, the amount of refactoring involved in switching between middlewares would be limited to just where the extra features had been used. It would make switching from a generic authentication middleware to an in-house single sign-on solution very straightforward, for example.

Comments, suggestions?

Catalyst, Catalyst::Authentication::Credential::Remote and mod_perl

mod_perl does not pass REMOTE_USER as an environment variable to Catalyst, so the Catalyst::Authentication::Credential::Remote plugin, which allows convenient offloading of authentication to Apache, will not work.

This is easily solved by adding this code to the root auto function:

$c->req->remote_user($c->apache->user) if $c->can('apache');

Now I can offload authentication to Apache’s mod_kerb whilst using Catalyst for authorisation.

Windows FTS

Windows just helpfully rebooted itself (in order to install updates) because I did not spot that the “Rebooting in 5mins…” dialogue had appeared in the background. The last time it showed (10 minutes before) I only managed to catch it 3 seconds before it was about to restart the system. I find a number of things wrong with this:

1. A dialogue which will result in something happening if immediate action is not taken to prevent it (which is a bad idea for a dialogue in the first place) should draw as much attention as is (sensibly) possible to itself. Being always on top would be a good start, as would flashing the taskbar. That way it could not be ignored and the user would have to clear the dialogue by selecting the “reboot now” or “postpone” buttons. (Note I said “on top” not “focused” – focus stealing is evil and even a critical dialogue like this should not practice this method of grabbing the user’s attention.)

2. Rebooting the system should require elevated privileges, in my opinion. And if Windows Update is already running with elevated privileges, why am I not prompted to allow it to have these? (I am when most Linux distros wish to install updates.) What is the point in preventing programs from installing software, altering critical system files etc. without my being harassed by UAC if a process can reboot the system on a whim?

3. For the love of bob can IE please prompt me to save the tabs when a reboot is being forced upon me, in the same way as when I click on the big red ‘X’ in the top right? I not only lost the tabs I was still reading through, but also the contents of my shopping basket on an e-commerce site. As I now do not have the time to rebuild the contents of the basket (due to venting my frustration at my blog, which is far more fun ;) ) the site in question has lost a sale (for a few pounds shy of £100), thanks to Microsoft.

On a completely different note: I am currently debating what UI convention (specifically related to button placement in dialogues, at the moment) to follow for a new web app I am thinking about developing in my own time. The choice boils down to:

1. Follow MS Windows’ convention, which is likely to be most familiar to the user

or

2. Follow GNOME style convention, which makes logical sense and (following button labelling guidelines, as well as placement) significantly reduces the likelihood of dialogues in which it is ambiguous which button will perform which action.

The fact that this is a web app means that following a non-MS convention may be more easily accepted by unskilled (strictly in the sense of computer use) workers than if it were a stand-alone app which was designed to be used within a Windows environment.  On the flip-side, following MS’ convention would probably decrease the learning curve due to the user’s existing familiarity with Windows-style dialogues.

…and then there were two (posts)

Having survived another day at work, I’ve now gotten round to writing the final few things I missed off this morning’s blog post.

One thing I forgot to mention this morning was that, although MSSQL deleted over 1,000 records from a table by a cascaded delete, the output says “4 rows affected” as only four were deleted from the first table. If a higher number had been reported anywhere in the output it might have alerted us that there was a problem earlier than the customer calling support because their site no longer functioned correctly.

Rant aside, since my last blog post (in May, this is just an extension of this morning’s) my Grandfather, who was formerly a Commando and then a coal miner, died. He’d been ill for some time but we did not expect him to die quite so suddenly. Fortunately he died peacefully, in A&E where he’d been taken after coughing up some blood at home.
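The same kind of mismatch can be reproduced in miniature. The sketch below uses SQLite through Python’s sqlite3 module rather than MSSQL, purely because it is self-contained; the table names and row counts are invented, and it relies on my understanding that SQLite’s per-statement row count covers only the rows deleted directly and not those removed by the cascade.

```python
import sqlite3


def cascade_delete_counts():
    """Return (rows reported by the DELETE, child rows actually removed)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and so cascades) when asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES parent(id) ON DELETE CASCADE)"
    )
    # Four parent rows, with 1,000 child rows spread evenly across them.
    conn.executemany("INSERT INTO parent (id) VALUES (?)", [(i,) for i in range(4)])
    conn.executemany(
        "INSERT INTO child (parent_id) VALUES (?)",
        [(i % 4,) for i in range(1000)],
    )

    def count_children():
        return conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]

    children_before = count_children()
    cursor = conn.execute("DELETE FROM parent WHERE id < 4")
    reported = cursor.rowcount  # what the statement itself reports
    children_removed = children_before - count_children()
    conn.close()
    return reported, children_removed


if __name__ == "__main__":
    reported, children_removed = cascade_delete_counts()
    print(f"reported: {reported}, child rows removed by cascade: {children_removed}")
```

Counting the dependent rows before and after, as done here, is one crude way of surfacing how much a cascaded delete really removed.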

Yesterday Pete wrote about a document on maintainable code he found at work. The document makes some very good points for writing “maintainable code”. However, I would dispute the suggestion that “Every function should be most 20 lines of code”. The rule where I work is that a function should be the length necessary to perform its given task, no more and no less. Usually this means that the function will fall well within the 20 line limit suggested; however, it is not uncommon for a complex function which performs a very specific task (such as manipulating the contents of a particular input file, from a manufacturer, to fit the database schema) to be 100 or more lines in length. Setting a hard and fast limit on the length of a region of code, be it an if block, a function/method, a class, etc. is not, in my opinion, conducive to maintainable code.

Another interesting item I saw noted on Planet Compsoc was this BBC article about Lenovo (who made my wonderful T60) preparing to sell laptops with Linux pre-installed on them. At the bottom of the article it says “Analysts believe that approximately 6% of computers users run Linux, similar to the numbers choosing Apple Macs”. I find this fact extremely interesting as the company I previously worked for, in the holidays, had a statistics analyser (which I installed) for their web logs, which showed approximately 6% of visitors to the site used Linux. The Mac quotient of visitors was significantly less than that, however, and a full 90% of visitors used Windows XP. Another random fact I found interesting was that use of IE 7 and IE 6 to visit the site was evenly split at 45% each. It makes me wonder how many of those have IE 7 simply because Windows Automatic Updates have installed it for them, and how many of the IE 6 users only have that because they never run the Automatic Updates.

Finally: at Christmas I undertook the task of re-writing the stock management system I had previously written for my then employer. The re-write was necessary as the system had started out as a very small and simple thing, which had then had bits and pieces botched onto it as and when my boss decided that it would be nifty to have feature X (or Y or, more commonly, X, Y and Z. By lunchtime.). The result, as always with projects which develop like this, was a hideous mess which, for some reason, worked. Until it stopped working. And then something would hit the fan and land on my desk.

As a result I decided to dump the hacked-to-death PHP code and re-write it using an MVC framework. I settled on Rails as it promised great productivity, allowing the developer to concentrate on writing functionality while it worried about the nitty-gritty, such as interfacing with the database. I completely re-wrote, in 3 months, a system which had taken over 2 years to develop, and Rails did deliver on its promises. Since I’ve stuck to the (somewhat enforced) MVC separation of the Rails framework, adding functionality is a doddle, as is maintaining the code. I have, however, found a small flaw in my approach.

The Rails URL scheme operates on the theme of ‘[controller]/[action]/[id]’, where the controller is the name of the controller (duh!), action is the method within that controller which is being called (and is also the name of the view) and id is an identifier (intended for identifying a db record, for example). I am aware this can be hacked somewhat with the Rails configuration, but deviating from the intended path for such frameworks often leads to problems down the line when the framework developers decide to fundamentally change the framework such that these hacks no longer work as intended. Anyway, back to the URL scheme. This is all fine and dandy when I have a stock management system with a ‘browse’ controller, which has such actions as ‘list’, ‘view’, ‘pdflist’ and so on, and an ‘edit’ controller which (also) has a ‘list’, ‘edit’, ‘uploadimages’, ‘uploadpdf’ etc. (I know it looks like the two list actions violate the DRY (Don’t Repeat Yourself) philosophy, but they operate in fundamentally different ways; the browse one only operates on a specific subset of the database limited, among other things, to just what is in stock.)
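As a rough illustration of that convention, the sketch below splits a path into its three parts. It is written in Python rather than Ruby and is not how Rails’ router is implemented; the function name is made up, and falling back to an ‘index’ action when none is given is an assumption on my part.

```python
def split_route(path, default_action="index"):
    """Split '/controller/action/id' into (controller, action, id).

    This mimics the convention described above; it is not Rails' router.
    """
    parts = [part for part in path.split("/") if part]
    if not parts:
        raise ValueError("no controller in path")
    if len(parts) > 3:
        raise ValueError("too many segments in path")
    controller = parts[0]
    action = parts[1] if len(parts) > 1 else default_action
    record_id = parts[2] if len(parts) > 2 else None
    return controller, action, record_id


if __name__ == "__main__":
    for example in ("/browse/view/42", "/edit/list", "/browse"):
        print(example, "->", split_route(example))
```

With only three slots to play with, there is nowhere obvious to say whether ‘/edit/list’ means stock or parts, which is the crux of the problem that follows.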

My problem is that, although this is fine for a stock management system, I also need to integrate the old parts management system as well (on the old system this was a HORRIFIC kludge). There are two obvious solutions, neither of which I’m keen on. One is to create a ‘parts’ controller in the existing app, which contains ‘editlist’, ‘viewlist’, ‘edit’, ‘view’, ‘uploadphotos’ etc. This could possibly be extended to move all of the stock stuff into a ‘stock’ controller. I do not like this as a) it feels too much like bolting the thing on, like the old mess which I’m obviously keen to avoid recreating, and b) the controllers would then get very large and the maintainability provided by separating out these systems would vanish. The second alternative is to create a separate Rails app to do the parts management. As I mentioned, I’m trying to integrate these systems, so creating a separate app for it seems like a bad move towards that end. It would also mean hacking the Rails config to not assume it is at the root URL, and setting up the webserver to rewrite URLs. It is all hassle I’d like to avoid.

I’m now wondering if I should have used Django instead, where a project (or site) is supposed to be a collection of apps, and I suspect that, as a result, the integrated stock and parts management system would be a lot easier to realise. I’m now back in the realm of trying to justify, either way, another rewrite of the system. I will add that Rails has given me some major performance headaches, and in order to achieve something approaching acceptable performance I’ve had to re-write portions of my code to not use the Rails helper functions. I view this as bad, as my code now relies on certain aspects of the Rails framework not changing, whereas the helper functions should (I would hope) be updated to reflect changes made in the future.

IE is lame

This is an old draft which I never properly wrote up. Since I’m unlikely to find the time in the near future to do it, I’ve decided to just post it as-is. At some point I may edit it to make it a proper post.
The plan:
http://meyerweb.com/eric/css/edge/popups/demo.html

The bug:
http://www.xs4all.nl/~peterned/csshover.html

The spec:
http://www.w3.org/TR/CSS21/selector.html#dynamic-pseudo-classes

Why SVN sucks:
http://svn.haxx.se/users/archive-2004-02/0982.shtml

The solution:
http://users.ox.ac.uk/~oliver/teleworking.html

More bugs in IE (5):
http://www.richinstyle.com/bugs/ie5.html#border

Some interesting links

Here are some random URLs I thought might be interesting:

Browse Happy
A website explaining why Internet Explorer is unsafe for use on the web. Unlike most other websites of its kind it does not favour any particular ‘alternative’ (read: broadly safe) browser, but instead provides a list of alternatives and a positive description of each.

sorttable: Make all your tables sortable
This site has a nifty looking piece of JavaScript which instantly allows any table on the web page to be sorted by any column by defining it to be of the class ‘sortable’. Since this is JavaScript the sorting is done client-side, so there is no need to resubmit the page for re-ordering, nor will it whore over the server it’s running on with lots of needless (at least as far as serving web pages is concerned) sorts.

apt-get.org: Unofficial APT repositories
A place to share useful (unofficial) repositories for Debian.

Simple PHP Blog
The software my original blog was using – no SQL needed, it’s all stored as text files. Easy to configure and update – just decompress and go. Fantastic! My only gripe is that most of the themes are fixed-width, and the only non fixed-width theme is not configurable wrt colours. Creating themes doesn’t appear to be very straightforward either, unfortunately. Maybe I’ll have to write my own blog software which only uses CSS for theming, so creating a new theme simply means modifying a CSS file… hmm, yet another project I’ll probably never finish.