Catalyst, Catalyst::Authentication::Credential::Remote and mod_perl

mod_perl does not pass REMOTE_USER as an environment variable to Catalyst, so the Catalyst::Authentication::Credential::Remote plugin, which allows convenient offloading of authentication to Apache, will not work.

This is easily solved by adding this code to the root controller's auto method:

$c->req->remote_user($c->apache->user) if $c->can('apache');

Now I can offload authentication to Apache’s mod_kerb whilst using Catalyst for authorisation.

Coping when languages change their API between minor versions

I have a Rails application (yes, I know, I’ve been regretting it for some time) which I wrote over 2 years ago using the (then) stable Rails version, 1.2.3, and whatever Ruby version was around at the time (Debian Sarge was the OS of choice for the server). Then Debian Etch was released, so I dist-upgraded to that, including the new version of Ruby (1.8.5), without any major headaches. Now Lenny is the stable version, but the version of Rails I used originally does not work with the current version of Ruby (1.8.7) because, apparently, `[]’ is no longer a valid method for ‘Enumerable::Enumerator’. This error is thrown in the Rails libraries themselves, not my code.

There are two obvious solutions: upgrade the version of Rails (which involves re-writing large portions of my code due to API changes; Rails is now at v2.3 (http://rubyonrails.org/)), or stick with the old version of Rails and install an older version of Ruby.

I did the latter. Originally I did this by <cringe>holding back the version of Ruby, and its dependencies, when doing the dist-upgrade from Etch to Lenny</cringe>. (I apologise for the kittens that were inevitably killed by me doing this!) This did, despite the horribleness (is that a word?) of the method, work.

Today I am installing a new server. Not only am I installing Lenny, which makes manually going to fetch the old versions a pain, but the new box is “amd64” (it’s an Intel, actually, so x86_64 is more accurate, but Debian refers to it as amd64), so I can’t just steal the packages from the cache on the old box. Thankfully all this means that I have been forced to install the old version in some sort of sane manner, by installing Etch in a chroot and calling the old Rails app from there. Here are the steps I took:
(prerequisites: debootstrap, dchroot)

# mkdir -p /var/chroot/etch # Make the new chroot directory
# debootstrap --arch amd64 etch /var/chroot/etch http://ftp.uk.debian.org/debian/
# mkdir -p /var/chroot/etch/var/rails # Where the rails app is going to live (well, it'll actually live outside the chroot and be mounted here)
# mkdir -p /var/chroot/etch/XXXXXXX # removed to protect the innocent
# mkdir -p /var/chroot/etch/var/run/mysqld # this will be bound outside the chroot so that rails can access the mysql instance on the server

I added the following to /etc/fstab, and mounted them:

# Etch chroot (for rails)
/proc /var/chroot/etch/proc none rw,bind 0 0
/tmp /var/chroot/etch/tmp none rw,bind 0 0
/dev /var/chroot/etch/dev none rw,bind 0 0
/var/rails /var/chroot/etch/var/rails none rw,bind 0 0
XXXXXXX /var/chroot/etch/XXXXXXX none rw,bind 0 0
/var/run/mysqld /var/chroot/etch/var/run/mysqld none rw,bind 0 0
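
To bring the new entries up without a reboot, something along these lines should do (a sketch; the redacted mountpoint is left out and would need adding):

# sketch: mount each new fstab entry; a single-argument mount looks the rest up in fstab
for m in proc tmp dev var/rails var/run/mysqld; do
    mount "/var/chroot/etch/$m"
done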

From here I could enter the chroot and install some needed applications:

# chroot /var/chroot/etch
# apt-get install ruby
# apt-get install rubygems
# apt-get install libfcgi-ruby
# apt-get install rake
# gem install -y -v=1.2.3 rails
# gem install -y pdf-writer

Then I can configure dchroot by adding this to /etc/schroot/schroot.conf:

[etch]
description=Debian etch (oldstable) for rails
location=/var/chroot/etch
groups=www-data

And finally a quick change to the lighttpd config which runs the fcgi program:
Old:

"bin-path" => "/var/rails/$app/public/dispatch.fcgi",

New:

"bin-path" => "/usr/bin/dchroot -c etch -d -q /var/rails/$app/public/dispatch.fcgi",

and it all works quite nicely. Now I have a stable Lenny system which I can keep up to date, and an Etch chroot for the legacy code.
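
As a rough sanity check that the chroot is actually supplying the old interpreter, something like this sketch works (the version strings are what I would expect from Etch and Lenny and may differ):

# sketch: dchroot with no command drops into a shell inside the chroot
dchroot -c etch -d
ruby -v        # inside the chroot: expect Etch's ruby (1.8.5-ish)
exit
ruby -v        # back on the host: Lenny's ruby (1.8.7)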

Spot the deliberate mistake…

I just returned from tea to find a computer, which I’d left copying a large amount of data (~300GB worth of backups) from one RAID array to another, displaying a friendly message:
cp: cannot create directory `./cpool/0': No space left on device.

In order to get to this state I’d done the following:

# mdadm --create -n3 -x1 /dev/md2 /dev/sdc1 /dev/sdd1 /dev/sdf1 missing
(several steps to setup lvm on and format the new device)
# cd /var/lib/backuppc
(some more steps, including adding the new mountpoint to /etc/fstab)
# mount .
# cp -a /mnt/oldbackuppc/pool ./pool
# cp -a /mnt/oldbackuppc/cpool ./cpool

In case you’ve not spotted it: after `mount`ing ‘.’ I needed to `cd .` to get onto the new mount. As it was, I was still on the device that /var is on (~40GB), not my nice new 1.8TB RAID device!
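
For the record, the corrected sequence looks something like this sketch; the `df` is a cheap way of confirming the shell really is sitting on the new array before copying anything:

# sketch of the corrected sequence
cd /var/lib/backuppc
mount .
cd .        # re-enter the directory so the shell moves onto the new mount
df -h .     # sanity check: should report the new RAID volume, not /var
cp -a /mnt/oldbackuppc/pool ./pool
cp -a /mnt/oldbackuppc/cpool ./cpool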

Kerberised printing to Windows printers in a 2003 Active Directory with smbclient

I needed to print to a printer shared via a Windows Server 2003 print server from my GNU/Linux box. Allegedly this should be possible using smbspool, which is provided by Samba as a CUPS back-end for printing to such devices. I spent some time looking at it, but I was unable to hit upon the right incantation to make it do this. In the end I wrote a short script which uses smbclient and the given user’s Kerberos ticket to authenticate, based upon a similar script which used stored credentials to print (to avoid putting the username and password in the device URI in CUPS).


#!/bin/bash

# CCLAH 30-March-2009
# Kerberised CUPS printing
# Based upon http://willem.engen.nl/projects/cupssmb/smbc (http://willem.engen.nl/projects/cupssmb/)

if [ "$1" = "" ]; then
# list supported output types
echo 'network smbc "Unknown" "Windows Printer using smbclient"'
exit 0
fi

job="$1"
account="$2"
title="$3"
numcopies="$4"
options="$5"
filename="$6"

if [ "$filename" = "" ]; then
filename=-
fi

# strip protocol from printer
printer=`echo "${DEVICE_URI}" | sed 's/^.*://'`

# Obtain the user's id in order to determine the kerberos cache file name
uid=`id -u $account`

echo "NOTICE: Account: $account uid: $uid" 1>&2

# and print using smbclient
echo "NOTICE: KRB5CCNAME=/tmp/krb5cc_$uid smbclient -k -c \"print ${filename}\" \"${printer}\"" 1>&2

errtxt=`KRB5CCNAME=/tmp/krb5cc_$uid smbclient -k -c "print ${filename}" "${printer}" 2>&1`
ret=${?}

echo "NOTICE: Return value: $ret" 1>&2

#
# Handle errors
# see backend(7) for error codes

# log message
if [ "$ret" = "0" ]; then
echo "$errtxt" | sed 's/^/NOTICE: /' 1>&2
else
echo "$errtxt" | sed 's/^/ERROR: /' 1>&2
fi

# "NT_STATUS_LOGON_FAILURE" -> CUPS_BACKEND_AUTH_REQUIRED
echo "$errtxt" | grep -i 'LOGON_FAILURE' >/dev/null && exit 2
# "NT_STATUS_BAD_NETWORK_NAME" -> CUPS_BACKEND_STOP
echo "$errtxt" | grep -i 'BAD_NETWORK_NAME' >/dev/null && exit 4

# something went wrong, don't know what -> CUPS_BACKEND_FAILED
[ "$ret" != "0" ] && exit 1

echo "NOTICE: Everything OK"

# success! -> CUPS_BACKEND_OK
exit 0

To use (at least on my Debian box): save it as ‘/usr/lib/cups/backend/smbc’, then add a “Windows Printer using smbclient” type printer with a URI of the form ‘smbc://server/printer’ and the appropriate driver for the printer at the other end. The only problem, at the moment, is that if smbclient fails the script exits with status 1 and the CUPS print queue enters a stopped state, which means that a given (unprivileged) user could theoretically craft a print job which would stop printing for all users on the local machine.
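
For reference, adding such a printer from the command line might look something like the sketch below; ‘printserver’, ‘officeprinter’ and the PPD path are placeholders for the real server, share and driver:

# sketch: register a CUPS queue that uses the smbc backend (placeholder names)
lpadmin -p officeprinter -E \
    -v 'smbc://printserver/officeprinter' \
    -P /path/to/driver.ppd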

Vista has got to go.

I’ve finally reached the end of my tether with Vista and it has got to go. Why? Not because of UAC, the constant RAM usage of 900MB with nothing else running, the rebooting without prompting me to save whenever Windows Update feels like it, or the fact that a third of my PC games won’t run on it. It’s going because I played two games of Minesweeper last night and both times Minesweeper crashed (“Minesweeper is not responding”) before I got to the end of the game, in response to me doing nothing more than right-clicking on a square to mark it as a mine (and yes, I did let it send the crash reports off to MS).

On a not entirely unrelated note, I spent Monday afternoon slipstreaming SP3 and last Thursday’s emergency hotfix into an OEM XP Pro CD (having discovered I didn’t have an OEM disk at home after taking my boss’s laptop home to reinstall it) using nLite. It was surprisingly easy, and having a CD into which I only have to type the product key and the owner information to get a properly localised British XP install was a very nice experience. One thing I don’t understand is why Microsoft cannot supply localised install disks in the first place. I appreciate that there would be additional cost in producing the different disks, but Microsoft is large enough, and should be profitable enough, to be able to do that. Failing that, they could always re-write the installer so that once I tell it I’m in the UK it automatically sets the keyboard, timezone and language preferences (like most Linux distributions do) rather than me having to change it in 5 different places.

Oh, incidentally, I am still alive ;) .

SELinux

One of my colleagues gave me a VMware image to use to test authenticating Linux (CentOS in this case) with Active Directory. Unfortunately the image in question is about 10GB and, after the existing images on the machine, there was not enough room for it in /var (a 17GB partition on a 20GB disk). As I could not find any more space on the existing drive, I clearly needed to add another disk to the machine. Three dead disks later I finally found a (250GB! – effectively winning the hard-drive lottery) hard disk which no-one was using. Now just to move /var to the new disk.

I partitioned and formatted an 80GB partition, mounted it and copied the existing contents of /var across. One edit to fstab later, I rebooted. The disk mounted fine, but various services refused to initialise with “permission denied” errors on /var. I checked the permissions against the old /var and they appeared to be identical. Some head scratching later, I decided to go and ask the advice of one of my colleagues. He was equally bemused, but suggested that I tar up the old /var and untar it over the new partition in case the copy had not preserved the permissions (even though it had been told to, and they appeared to be correct). I did this; however, it had no effect.

When I returned to my colleague’s office, another of my colleagues was talking to the first, and the first suggested that he take a look. He had a quick glance at the problem and asked if SELinux was enabled. It was. One quick `restorecon -R /var` later, everything worked. We then had a rant from colleague #2 about how Fedora and RHEL now have SELinux in enforcing mode by default, whereas it used to just warn by default, which was better in a production environment where it needs to be run in warning-only mode for a while to check that nothing which should be allowed is hitting it. Still, it is all good fun.
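
For anyone hitting the same thing, the give-away and the fix boil down to something like this sketch (assuming the standard SELinux tools are installed):

# sketch: spotting and fixing missing SELinux labels after moving /var
getenforce            # reports whether SELinux is enforcing
ls -Z /var            # -Z shows the security contexts that plain ls -l hides
restorecon -R /var    # relabel everything under /var from the policy's file contexts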

About time for another post.

Having written nothing on my blog since the 1st of September, I feel it’s about time to flex my inability to spell (made worse by the fact that my sister has stolen (“borrowed”) the dictionary I keep by my computer) again and write something.

Since my last post:

  • I have quit my job.
  • I have started a new job (“IT Services Specialist”) at Loughborough University, part time.
  • I have returned to my job at Startin Tractors for the other half of the week.
  • my sister has been moved to a more secure secure ward – she’s now locked up with the likes of mentally ill prisoners.

Still it’s all good.

I only wish I had something interesting to put here, but I can not think of anything so instead I’m just going to provide a link to http://bash.org, to annoy anyone trying to work at this point.

…and then there were two (posts)

Having survived another day at work, I’ve now gotten round to writing the final few things I missed off this morning’s blog post.

One thing I forgot to mention this morning was that, although MSSQL deleted over 1,000 records from a table via a cascaded delete, the output says “4 rows affected”, as only four were deleted from the first table. If a higher number had been reported anywhere in the output, it might have alerted us to the problem earlier than the customer calling support because their site no longer functioned correctly.

Rant aside, since my last blog post (in May – this is just an extension of this morning’s) my Grandfather, who was formerly a Commando and then a coal miner, died. He’d been ill for some time, but we did not expect him to die quite so suddenly. Fortunately he died peacefully, in A&E, where he’d been taken after coughing up some blood at home.

Yesterday Pete wrote about a document on maintainable code he found at work. The document makes some very good points for writing “maintainable code”. However, I would dispute the suggestion that “Every function should be at most 20 lines of code”. The rule where I work is that a function should be the length necessary to perform its given task, no more and no less. Usually this means that the function will fall well within the suggested 20-line limit; however, it is not uncommon for a complex function which performs a very specific task (such as manipulating the contents of a particular input file, from a manufacturer, to fit the database schema) to be 100 or more lines in length. Setting a hard and fast limit on the length of a region of code, be it an if block, a function/method, a class, etc., is not, in my opinion, conducive to maintainable code.

Another interesting item I saw noted on Planet Compsoc was this BBC article about Lenovo (who made my wonderful T60) preparing to sell laptops with Linux pre-installed on them. At the bottom of the article it says “Analysts believe that approximately 6% of computer users run Linux, similar to the numbers choosing Apple Macs”. I find this fact extremely interesting, as the company I previously worked for, in the holidays, had a statistics analyser (which I installed) for their web logs, which showed approximately 6% of visitors to the site used Linux. The Mac quotient of visitors was significantly less than that, however, and a full 90% of visitors used Windows XP. Another random fact I found interesting was that use of IE 7 and IE 6 to visit the site was evenly split at 45% each. It makes me wonder how many of those have IE 7 simply because Windows Automatic Updates has installed it for them, and how many of the IE 6 users only have that because they never run the Automatic Updates.

Finally: at Christmas I undertook the task of re-writing the stock management system I had previously written for my then employer. The re-write was necessary as the system had started out as a very small and simple thing, which had then had bits and pieces botched onto it as and when my boss decided that it would be nifty to have feature X (or Y or, more commonly, X, Y and Z. By lunchtime.). The result, as always with projects which develop like this, was a hideous mess which, for some reason, worked. Until it stopped working. And then something would hit the fan and land on my desk.

As a result I decided to dump the hacked-to-death PHP code and re-write it using an MVC framework. I settled on Rails as it promised great productivity, allowing the developer to concentrate on writing functionality while it worried about the nitty-gritty, such as interfacing with the database. I completely re-wrote, in 3 months, a system which had taken over 2 years to develop, and Rails did deliver on its promises. Since I’ve stuck to the (somewhat enforced) MVC separation of the Rails framework, adding functionality is a doddle, as is maintaining the code. I have, however, found a small flaw in my approach.

The Rails URL scheme operates on the theme of ‘[controller]/[action]/[id]’, where the controller is the name of the controller (duh!), action is the method within that controller which is being called (and is also the name of the view), and id is an identifier (intended for identifying a db record, for example). I am aware this can be hacked somewhat with the Rails configuration, but deviating from the intended path for such frameworks often leads to problems down the line when the framework developers decide to fundamentally change the framework such that these hacks no longer work as intended. Anyway, back to the URL scheme. This is all fine and dandy when I have a stock management system with a ‘browse’ controller, which has such actions as ‘list’, ‘view’, ‘pdflist’ and so on, and an ‘edit’ controller which (also) has ‘list’, ‘edit’, ‘uploadimages’, ‘uploadpdf’, etc. (I know it looks like the two list actions violate the DRY (Don’t Repeat Yourself) philosophy, but they operate in fundamentally different ways; the browse one only operates on a specific subset of the database, limited, among other things, to just what is in stock.)

My problem is that, although this is fine for a stock management system, I also need to integrate the old parts management system as well (on the old system this was a HORRIFIC kludge). There are two obvious solutions, neither of which I’m keen on. One is to create a ‘parts’ controller in the existing app, which contains ‘editlist’, ‘viewlist’, ‘edit’, ‘view’, ‘uploadphotos’, etc. This could possibly be extended to move all of the stock stuff into a ‘stock’ controller. I do not like this as a) it feels too much like bolting the thing on, like the old mess which I’m obviously keen to avoid recreating, and b) the controllers would then get very large and the maintainability provided by separating out these systems would vanish. The second alternative is to create a separate Rails app to do the parts management. As I mentioned, I’m trying to integrate these systems, so creating a separate app for it seems like a bad move towards that end. It would also mean hacking the Rails config to not assume it is at the root URL, and setting up the webserver to rewrite URLs. It is all hassle I’d like to avoid.

I’m now wondering if I should have used Django instead, where a project (or site) is supposed to be a collection of apps, and I suspect that, as a result, the integrated stock and parts management system would be a lot easier to realise. I’m now back in the realm of trying to justify, either way, another rewrite of the system. I will add that Rails has given me some major performance headaches, and I’ve had to re-write portions of my code to not use the Rails helper functions in order to achieve something approaching acceptable performance. I view this as bad, as my code now relies on certain aspects of the Rails framework not changing, whereas the helper functions should (I would hope) be updated to reflect any changes made in the future.

It’s been a while…

I’ve not posted to my blog since the end of May, so after two-and-a-bit months it’s high time I wrote something.

Whilst I’ve not been writing, I’ve also not been checking the comments. Due to the amount of spam, I require all comments to be approved by me before appearing on the site, so apologies to all the people who had comments stuck in moderation.

I’ve now been working in my new job for 2 months and it is generally okay. Windows, Visual Studio (2003) and SourceSafe are all colluding to slowly drive me insane, but for the time being I’m keeping the urge to take a Linux LiveCD into work at bay with healthy doses of Ruby and Debian in the evenings.

The one major cock-up I’ve made at work was an MS-SQL script to delete four rows from a table. Another, related, table had been corrupted and every row had been altered to point to the same (one) record in the first table. I had written a script to delete the four faulty records and then fix the data in the associated table. Since I was deleting data I, as I make a point of always doing, only used the primary key column of the table I was deleting from to ensure that only the specific records which needed deleting were dropped. Unfortunately I was not aware of SQL Server’s ability to cascade deletes, nor was I aware that this feature was in use on the tables in question. As a result the related table ended up with nothing in it. Whoops! We are waiting for the backup tape to be sent from Derby to Nottingham in order to restore the data to a point before the script was run. Fortunately, all scripts which are run on live database servers have to be peer-reviewed, both for syntactic correctness and that they perform the task intended, before they are run, so I have someone to share the blame with. I am, as the script writer, ultimately responsible for this mistake (through my own ignorance); however, my colleague who reviewed the script should have been aware of the cascade delete and he did not spot the potential problem either. Never mind.

For the past week I have also been shadowing another colleague, who left the company yesterday, to learn about the systems for which he was the only person with any knowledge. Last night hosting services, in their infinite wisdom, decided to move all of the servers involved in these systems from one location to an entirely different part of the country. So the one thing that could possibly break everything was, of course, performed the very night after the last day of the only person who knew these systems! Go go gadget forward planning.
I have a number of other things to write about, but I have to go to work early today in order to be there should the server move cause any problems. Maybe I’ll find time to write some more tonight (I wouldn’t hold your breath, though).