Perl vs Python speed

I needed to write a quick script to find which mailboxes in “~/Mail” had unread messages in them. I decided to knock it up in Python, but the script was not performing very well:

> time ./checkmail
...
./checkmail 56.17s user 1.86s system 98% cpu 59.181 total

A quick google found a Perl program which did pretty much the same thing (at http://www.perlmonks.org/?node_id=552218 if you’re interested). Run unaltered against the exact same files, it performed significantly better:

> time perl checkmail2
...
perl checkmail2 16.66s user 1.27s system 99% cpu 18.043 total

Not only did the Perl version take about a third of the time of the Python implementation, it also counted the number of unread messages and displayed the total, while my simple Python script broke out of the loop on the first match to avoid needlessly looping over the rest of the messages. The Perl version manages to avoid fully decoding every message through the use of the Mail::MboxParser library; I could not find a straightforward way to achieve the same result with the standard Python libraries. Indeed, looking at the documented examples in the Python docs (http://docs.python.org/library/mailbox.html#examples), my loop appears to be the suggested way of doing it: in essence all I need to do is examine the ‘Status’ header, and the example uses a very similar loop to examine just the ‘Subject’ header.

My Python script is here:

#!/usr/bin/env python

import mailbox
from os import listdir, environ

MAILHOME = environ['HOME'] + '/Mail'

# Collect the name of every mbox containing a message with a Status header
new_mail_mailboxes = []
for name in listdir(MAILHOME):
    for message in mailbox.mbox(MAILHOME + '/' + name):
        if message['status']:
            new_mail_mailboxes.append(name)
            break  # one hit is enough; skip the rest of this mbox
print "\n".join(new_mail_mailboxes)

Catalyst, Catalyst::Authentication::Credential::Remote and mod_perl

mod_perl does not pass REMOTE_USER to Catalyst as an environment variable, so the Catalyst::Authentication::Credential::Remote plugin, which allows convenient offloading of authentication to Apache, will not work.

This is easily solved by adding this code to the root auto function:

$c->req->remote_user($c->apache->user) if $c->can('apache');

Now I can offload authentication to Apache’s mod_kerb whilst using Catalyst for authorisation.

Migrating from fastcgi to cgi

I’ve just migrated all of the sites on my VM (including this blog) from running on a fastcgi backend to running on a cgi backend. Why, you may ask? Well, although fastcgi is substantially more responsive than plain old cgi (since it keeps the processes running between requests, so the process start-up and take-down times are removed), it consumes much more memory (due to keeping those processes around). Nowadays this is not normally a problem, but on a virtual machine with only 128MB available and no swap, memory usage becomes a big issue.

By moving from fastcgi to “normal” cgi and tweaking my mysql config I have increased the free memory when the machine is idle from 4MB to 80MB, and now have lots of headroom before the Linux low-memory reaper starts killing processes should a single process (e.g. DenyHosts, which is idling at 20MB) decide it wants loads of RAM.

As a slight aside, the biggest memory hogs on my box are Python programs (DenyHosts [~20M] and rss2email [~45M]). I don’t know if this is a fault with Python or the way the scripts are written (or both; not being intimately familiar with Python, I don’t know whether it encourages conservative use of memory or not).
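For the record, those per-process figures are resident set sizes. Here is a quick, Linux-specific sketch of one way to pull the same sort of numbers out of /proc (not the exact method I used, just an illustration):

#!/usr/bin/env python
# Sketch: print the resident set size (VmRSS) of every process, largest first,
# by reading /proc/<pid>/status. Linux-specific.

import os

def rss_kb(pid):
    try:
        for line in open('/proc/%s/status' % pid):
            if line.startswith('VmRSS:'):
                return int(line.split()[1])  # value is reported in kB
    except IOError:
        pass                                 # process exited, or no permission
    return 0

def cmdline(pid):
    try:
        return open('/proc/%s/cmdline' % pid).read().replace('\0', ' ').strip()
    except IOError:
        return '?'

procs = [(rss_kb(pid), pid) for pid in os.listdir('/proc') if pid.isdigit()]
for kb, pid in sorted(procs, reverse=True)[:15]:
    print "%8d kB  %5s  %s" % (kb, pid, cmdline(pid))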

Coping when languages change their API between minor versions

I have a Rails application (yes, I know, I’ve been regretting it for some time) which I wrote over two years ago using the (then) stable Rails version, 1.2.3, and whatever Ruby version was around at the time (Debian Sarge was the OS of choice for the server). Then Debian Etch was released, so I dist-upgraded to that, including the new version of Ruby (1.8.5), without any major headaches. Now Lenny is the stable version, but the version of Rails I used originally does not work with the current version of Ruby (1.8.7) because, apparently, `[]’ is no longer a valid method for ‘Enumerable::Enumerator’. This error is thrown in the Rails libraries themselves, not my code.

There are two obvious solutions: upgrade the version of Rails (which involves re-writing large portions of my code due to API changes; Rails is now at v2.3, http://rubyonrails.org/) or stick with the old version of Rails and install an older version of Ruby.

I did the latter. Originally I did this by <cringe>holding back the version of Ruby, and its dependencies, when doing the dist-upgrade from Etch to Lenny</cringe>. (I apologise for the kittens that were inevitably killed by me doing this!) This did, despite the horribleness (is that a word?) of the method, work.

Today I am installing a new server. Not only am I installing Lenny, which makes manually going to fetch the old versions a pain, but the new box is “amd64” (it’s an Intel, actually, so x86_64 is more accurate, but Debian refers to it as amd64) so I can’t just steal the packages from the cache on the old box. Thankfully all this means that I have been forced to install the old version in some sort of sane manner, by installing Etch in a chroot and calling the old Rails app from there. Here are the steps I took:
(prerequisites: debootstrap, dchroot)

# mkdir -p /var/chroot/etch # Make the new chroot directory
# debootstrap --arch amd64 etch /var/chroot/etch http://ftp.uk.debian.org/debian/
# mkdir -p /var/chroot/etch/var/rails # Where the rails app is going to live (well, it'll actually live outside the chroot and be mounted here)
# mkdir -p /var/chroot/etch/XXXXXXX # removed to protect the innocent
# mkdir -p /var/chroot/etch/var/run/mysqld # this will be bound outside the chroot so that rails can access the mysql instance on the server

I added the following to /etc/fstab, and mounted them:

# Etch chroot (for rails)
/proc /var/chroot/etch/proc none rw,bind 0 0
/tmp /var/chroot/etch/tmp none rw,bind 0 0
/dev /var/chroot/etch/dev none rw,bind 0 0
/var/rails /var/chroot/etch/var/rails none rw,bind 0 0
XXXXXXX /var/chroot/etch/XXXXXXX none rw,bind 0 0
/var/run/mysqld /var/chroot/etch/var/run/mysqld none rw,bind 0 0

From here I could enter the chroot and install some needed applications:

# chroot /var/chroot/etch
# apt-get install ruby
# apt-get install rubygems
# apt-get install libfcgi-ruby
# apt-get install rake
# gem install -y -v=1.2.3 rails
# gem install -y pdf-writer

Then I can configure dchroot by adding this to /etc/schroot/schroot.conf:

[etch]
description=Debian etch (oldstable) for rails
location=/var/chroot/etch
groups=www-data

And finally a quick change to the lighttpd config which runs the fcgi program:
Old:

"bin-path" => "/var/rails/$app/public/dispatch.fcgi",

New:

"bin-path" => "/usr/bin/dchroot -c etch -d -q /var/rails/$app/public/dispatch.fcgi",

and it all works quite nicely. Now I have a stable Lenny system which I can keep up to date, and an Etch chroot for the legacy code.

Spot the deliberate mistake…

I just returned from tea to find a computer, which I’d left copying a large amount of data (~300GB worth of backups) from one RAID array to another, displaying a friendly message:
cp: cannot create directory `./cpool/0': No space left on device.

In order to get to this state I’d done the following:

# mdadm --create -n3 -x1 /dev/md2 /dev/sdc1 /dev/sdd1 /dev/sdf1 missing
(several steps to setup lvm on and format the new device)
# cd /var/lib/backuppc
(some more steps, including adding the new mountpoint to /etc/fstab)
# mount .
# cp -a /mnt/oldbackuppc/pool ./pool
# cp -a /mnt/oldbackuppc/cpool ./cpool

In case you’ve not spotted it: after `mount`ing ‘.’ I needed to `cd .` to actually get onto the new mount. As it was, I was still on the device that /var is on (~40G), not my nice new 1.8TB RAID device!
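A cheap sanity check I could have run before starting the copy would be to confirm that the current directory really is on the newly-mounted filesystem. A rough sketch (the mount point here is assumed to be /var/lib/backuppc, as in the steps above):

#!/usr/bin/env python
# Sketch: complain if '.' is not on the filesystem mounted at MOUNTPOINT.

import os, sys

MOUNTPOINT = '/var/lib/backuppc'  # assumed mount point for the new array

here = os.stat('.').st_dev
target = os.stat(MOUNTPOINT).st_dev
parent = os.stat(os.path.join(MOUNTPOINT, '..')).st_dev

if target == parent:
    print >> sys.stderr, "Warning: nothing appears to be mounted at %s" % MOUNTPOINT
    sys.exit(1)
if here != target:
    print >> sys.stderr, "Warning: '.' is not on the filesystem mounted at %s" % MOUNTPOINT
    sys.exit(1)
print "OK: '.' is on the new filesystem"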

Kerberised authenticated printing to Windows printers within a 2003 Active Directory with smbclient

I needed to print to a printer shared via a Windows Server 2003 print server from my GNU/Linux box. Allegedly this should be possible using smbspool, which is provided by samba as a cups back-end to print to such devices, but despite spending some time looking at it I was unable to hit upon the right incantation to make it do this. In the end I wrote a short script which uses smbclient and the given user’s Kerberos ticket to authenticate, based upon a similar script which used stored credentials to print (to avoid putting the username and password in the device URI in CUPS).


#!/bin/bash

# CCLAH 30-March-2009
# Kerberised CUPS printing
# Based upon http://willem.engen.nl/projects/cupssmb/smbc (http://willem.engen.nl/projects/cupssmb/)

if [ "$1" = "" ]; then
# list supported output types
echo 'network smbc "Unknown" "Windows Printer using smbclient"'
exit 0
fi

job="$1"
account="$2"
title="$3"
numcopies="$4"
options="$5"
filename="$6"

if [ "$filename" = "" ]; then
filename=-
fi

# strip protocol from printer
printer=`echo "${DEVICE_URI}" | sed 's/^.*://'`

# Obtain the user's id in order to determine the kerberos cache file name
uid=`id -u $account`

echo "NOTICE: Account: $account uid: $uid" 1>&2

# and print using smbclient
echo "NOTICE: KRB5CCNAME=/tmp/krb5cc_$uid smbclient -k -c \"print ${filename}\" \"${printer}\"" 1>&2

errtxt=`KRB5CCNAME=/tmp/krb5cc_$uid smbclient -k -c "print ${filename}" "${printer}" 2>&1`
ret=${?}

echo "NOTICE: Return value: $ret" 1>&2

#
# Handle errors
# see backend(7) for error codes

# log message
if [ "$ret" = "0" ]; then
echo "$errtxt" | sed 's/^/NOTICE: /' 1>&2
else
echo "$errtxt" | sed 's/^/ERROR: /' 1>&2
fi

# "NT_STATUS_LOGON_FAILURE" -> CUPS_BACKEND_AUTH_REQUIRED
echo "$errtxt" | grep -i 'LOGON_FAILURE' >/dev/null && exit 2
# "NT_STATUS_BAD_NETWORK_NAME" -> CUPS_BACKEND_STOP
echo "$errtxt" | grep -i 'BAD_NETWORK_NAME' >/dev/null && exit 4

# something went wrong, don't know what -> CUPS_BACKEND_FAILED
[ "$ret" != "0" ] && exit 1

echo "NOTICE: Everything OK"

# success! -> CUPS_BACKEND_OK
exit 0

To use (at least on my Debian box): save as ‘/usr/lib/cups/backend/smbc’, then add a “Windows Printer using smbclient” type printer with a URI of ‘smbc:///’ and the appropriate driver for the printer at the other end. The only problem is, at the moment, that if smbclient fails then the script exits with status 1 and the cups print queue enters a stopped state, which means that a given (unprivileged) user could theoretically craft a print job which would stop printing working for all users on the local machine.

Setting up a Windows-only network printer for Linux/Mac access.

Due to a lack of a sensible place for this documentation, it’s going on my blog:

We have a Canon MF5770 printer/fax machine which only has Windows drivers. I’m only interested in printing, not faxing, and have the following workaround (which requires a Windows XP machine as a go-between):

Install necessary software:

  1. Install Ghostscript, GSview and RedMon from the Ghostscript site
  2. Install “Print services for Unix” (under “Other Network File and Print Services” in Add/Remove Windows Components).

Add a Postscript printer to work as a go-between:

  1. Start “add printer” wizard
  2. Choose “local printer” and untick “Automatically detect my plug and play printer”
  3. Choose “Create a new port” and the “Redirected Port” type
  4. Use RPT1: as the port name
  5. Choose a Postscript printer – I chose HP Color Laserjet 4550 PS
  6. Call the printer something without a space (I used “mf5770gs”), and no to default
  7. I shared the printer
  8. No to test page
  9. Open the port settings for RPT1: (printer properties – Ports – Port Settings)
  10. Redirect this port to program: “C:\Program Files\Ghostgum\gsview\gsprint.exe”
  11. Arguments for this program: “-printer “Canon MF5700 Series” -color -“
  12. Run: Hidden

Print a test page – if all is well it should come out.

I did have to add TCP port 515 to the XP firewall exceptions for it to work, but apart from that the system works flawlessly.

SELinux

One of my colleagues gave me a VMWare image to use to test authenticating Linux (CentOS in this case) with Active Directory. Unfortunately the image in question is about 10GB and, after the existing images on the machine, there was not enough space for it in /var (a 17GB partition on a 20GB disk). As I could not find any more space on the existing drive I clearly needed to add another disk to the machine. Three dead disks later I finally found a (250GB! – effectively winning the hard-drive lottery) hard disk which no-one was using. Now just to move /var to the new disk.

I partitioned and formatted an 80GB partition, mounted it and copied the existing contents of /var across. One edit to fstab later, I rebooted. The disk mounted fine, but various services refused to initialise with “permission denied” errors on /var. I checked the permissions against the old /var and they appeared to be identical. Some head scratching later I decided to go and ask the advice of one of my colleagues. He was equally bemused, but suggested that I tar up the old /var and untar it over the new partition in case the copy had not preserved the permissions (even though it had been told to, and they appeared to be correct). I did this, however it had no effect. When I returned to my colleague’s office another colleague was talking to the first, and the first suggested that the second take a look. He had a quick glance at the problem and asked if SELinux was enabled. It was. One quick `restorecon -R /var` later everything worked. We then proceeded to have a rant from colleague #2 about how Fedora and RHEL now have SELinux in enforcing mode by default, whereas it used to just warn by default, which was better in a production environment where it needs to be run in warning-only mode for a while to check nothing that should be allowed is hitting it. Still, it is all good fun.

…and then there were two (posts)

Having survived another day at work, I’ve now gotten round to writing the final few things I missed off this morning’s blog post.

One thing I forgot to mention this morning was that, although MSSQL deleted over 1,000 records from a table via a cascaded delete, the output said “4 rows affected”, as only four were deleted from the first table. If a higher number had been reported anywhere in the output it might have alerted us to the problem earlier than the customer calling support because their site no longer functioned correctly.

Rant aside, since my last blog post (in May; this is just an extension of this morning’s) my Grandfather, who was formerly a Commando and then a coal miner, died. He’d been ill for some time but we did not expect him to die quite so suddenly. Fortunately he died peacefully, in A&E, where he’d been taken after coughing up some blood at home.

Yesterday Pete wrote about a document on maintainable code he found at work. The document makes some very good points about writing “maintainable code”. However I would dispute the suggestion that “Every function should be at most 20 lines of code”. The rule where I work is that a function should be the length necessary to perform its given task, no more and no less. Usually this means that the function will fall well within the 20 line limit suggested, however it is not uncommon for a complex function which performs a very specific task (such as manipulating the contents of a particular input file, from a manufacturer, to fit the database schema) to be 100 or more lines in length. Setting a hard and fast limit on the length of a region of code, be it an if block, a function/method, a class, etc., is not, in my opinion, conducive to maintainable code.

Another interesting item I saw noted on Planet Compsoc was this BBC article about Lenovo (who made my wonderful T60) preparing to sell laptops with Linux pre-installed on them. At the bottom of the article it says “Analysts believe that approximately 6% of computers users run Linux, similar to the numbers choosing Apple Macs”. I find this extremely interesting, as the company I previously worked for in the holidays had a statistics analyser (which I installed) for their web logs, which showed approximately 6% of visitors to the site used Linux. The Mac quotient of visitors was significantly less than that, however, and a full 90% of visitors used Windows XP. Another random fact I found interesting was that use of IE 7 and IE 6 to visit the site was evenly split at 45% each. It makes me wonder how many of those have IE 7 simply because Windows Automatic Updates installed it for them, and how many of the IE 6 users only have that because they never run the Automatic Updates.

Finally: at Christmas I undertook the task of re-writing the stock management system I had previously written for my then employer. The re-write was necessary as the system had started out as a very small and simple thing, which had then had bits and pieces botched onto it as and when my boss decided that it would be nifty to have feature X (or Y or, more commonly, X, Y and Z. By lunchtime.). The result, as always with projects which develop like this, was a hideous mess which, for some reason, worked. Until it stopped working. And then something would hit the fan and land on my desk.

As a result I decided to dump the hacked-to-death PHP code and re-write it using an MVC framework. I settled on Rails as it promised great productivity, allowing the developer to concentrate on writing functionality while the framework worries about the nitty-gritty, such as interfacing with the database. I completely re-wrote a system which had taken over 2 years to develop in 3 months, and Rails did deliver on its promises. Since I’ve stuck to the (somewhat enforced) MVC separation of the Rails framework, adding functionality is a doddle, as is maintaining the code. I have, however, found a small flaw in my approach.

The Rails URL scheme operates on the theme of ‘[controller]/[action]/[id]’, where the controller is the name of the controller (duh!), action is the method within that controller which is being called (and is also the name of the view) and id is an identifier (intended for identifying a db record, for example). I am aware this can be hacked somewhat with the Rails configuration, but deviating from the intended path for such frameworks often leads to problems down the line when the framework developers decide to fundamentally change the framework such that these hacks no longer work as intended. Anyway, back to the URL scheme. This is all fine and dandy when I have a stock management system with a ‘browse’ controller, which has such actions as ‘list’, ‘view’, ‘pdflist’ and so on, and an ‘edit’ controller which (also) has ‘list’, ‘edit’, ‘uploadimages’, ‘uploadpdf’ etc. (I know it looks like the two list actions violate the DRY (don’t repeat yourself) philosophy, but they operate in fundamentally different ways; the browse one only operates on a specific subset of the database, limited, among other things, to just what is in stock.)

My problem is that, although this is fine for a stock management system, I also need to integrate the old parts management system as well (on the old system this was a HORRIFIC kludge). There are two obvious solutions, neither of which I’m keen on. One is to create a ‘parts’ controller in the existing app, which contains ‘editlist’, ‘viewlist’, ‘edit’, ‘view’, ‘uploadphotos’ etc. This could possibly be extended to move all of the stock stuff into a ‘stock’ controller. I do not like this as it a) feels too much like bolting the thing on, like the old mess which I’m obviously keen to avoid recreating, and b) the controllers would then get very large and the maintainability provided by separating out these systems would vanish. The second alternative is to create a separate Rails app to do the parts management. As I mentioned, I’m trying to integrate these systems, so creating a separate app for it seems like a bad move towards that end. It would also mean hacking the Rails config to not assume it is at the root URL, and setting up the webserver to rewrite URLs. It is all hassle I’d like to avoid.

I’m now wondering if I should have used Django instead, where a project (or site) is supposed to be a collection of apps, and I suspect that, as a result, the integrated stock and parts management system would be a lot easier to realise. I’m now back into the realm of trying to justify, either way, another rewrite of the system. I will add that Rails has given me some major performance headaches, and I’ve had to re-write portions of my code to not use the Rails helper functions in order to achieve something approaching acceptable performance. I view this as bad, as my code now relies on certain aspects of the Rails framework not changing, whereas the helper functions should (I would hope) be updated to reflect changes made in the future.
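To illustrate what I mean about Django’s project-of-apps layout, here is a hypothetical sketch (made-up app names, not code from either system, using the Django 1.0-era URL API): the stock and parts systems could each be a self-contained app with its own views and URL file, glued together only at the project level.

# urls.py for a hypothetical project: each system is its own app, wired in
# under its own URL prefix.
from django.conf.urls.defaults import patterns, include

urlpatterns = patterns('',
    (r'^stock/', include('stock.urls')),  # stock management app
    (r'^parts/', include('parts.urls')),  # parts management app
)

# stock/urls.py: the app keeps its own URL-to-view mapping
from django.conf.urls.defaults import patterns

urlpatterns = patterns('stock.views',
    (r'^$',            'list'),
    (r'^view/(\d+)/$', 'view'),
    (r'^pdflist/$',    'pdflist'),
)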

It’s been a while…

I’ve not posted to my blog since the end of May, so after two-and-a-bit months it’s high time I wrote something.

Whilst I’ve not been writing, I’ve also not been checking the comments. Due to the amount of spam, I require all comments to be approved by me before appearing on the site, so apologies to all the people who had comments stuck in moderation.

I’ve now been working in my new job for 2 months and it is generally okay. Windows, VisualStudio (2003) and Sourcesafe are all colluding to slowly drive me insane but for the time being I’m keeping the urge to take a Linux LiveCD into work at bay with healthy doses of Ruby and Debian in the evenings.

The one major cock-up I’ve made at work was an MS-SQL script to delete four rows from a table. Another, related, table had been corrupted and every row had been altered to point to the same (one) record in the first table. I had written a script to delete the four faulty records and then fix the data in the associated table. Since I was deleting data I, as I make a point of always doing, only used the primary key column of the table I was deleting from, to ensure only the specific records which needed deleting were dropped. Unfortunately I was not aware of SQL Server’s ability to cascade delete records, nor was I aware that this feature was in use on the tables in question. As a result the related table ended up with nothing in it. Whoops! We are waiting for the backup tape to be sent from Derby to Nottingham in order to restore the data to a point before the script was run. Fortunately all scripts which are run on live database servers have to be peer-reviewed, both for syntactic correctness and to check that they perform the task intended, before they are run, so I have someone to share the blame with. I am, as the script writer, ultimately responsible for this mistake (through my own ignorance), however my colleague who reviewed the script should have been aware of the cascade delete and he did not spot the potential problem either. Never mind.

For the past week I have also been shadowing another colleague, who left the company yesterday, to learn about the systems for which he was the only person with any knowledge. Last night hosting services, in their infinite wisdom, decided to move all of the servers involved in these systems from one location to an entirely different part of the country. The one thing that could possibly break everything was performed the very night after the last day of the only person who knew these systems! Go go gadget forward planning.
I have a number of other things to write about, but I have to go to work early today in order to be there should the server move cause any problems. Maybe I’ll find time to write some more tonight (I wouldn’t hold your breath, though).