A Year of The Guardian
Many years ago, when Mark Leggott first moved to Prince Edward Island to become University Librarian, we had lunch at the old Interlude restaurant on Kent Street. Mark and I first met in 1994 in St. John’s at the Access conference, and kept bumping into each other at Access every couple of years, so this was more of a “so, what are you up to these days” lunch than an introductory lunch. And, as is usual with Mark, the conversation was wide-ranging. I casually mentioned, at the tail end of our lunch, that one of my great desires was to see the archive of The Guardian, Prince Edward Island’s newspaper of record, scanned and made available as a digital archive.
Several months later I was happily surprised to find that a project had been created at UPEI to do just this: meetings were held with The Guardian, scanners were acquired, a software workflow was developed, and the long job of scanning and archiving began. Early results, 890 issues wrapped inside a prototype, have been available for several years at IslandNewspapers.ca, and a new version, with a more extensive archive and improved viewer, is in development now. I’ve been poking around this beta for the last month, and have figured out enough about how it’s structured to automatically scrape scanned newspaper images out of it. In short, I wrote a PHP script that uses cURL to log into Drupal and then, because the image path in Islandora is standard and contains the date of the issue in question, grabs images for a given range of dates.
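For anyone curious about the date-range part of that idea, here is a minimal sketch in Python rather than the PHP I actually used. The URL template is purely hypothetical (it is not the real Islandora path), and the login step is left out entirely; the point is only that once a path contains the issue date, walking a range of dates gives you a list of images to fetch.

```python
from datetime import date, timedelta

# Hypothetical pattern for illustration only; the real image path is different.
URL_TEMPLATE = "https://example.org/newspapers/guardian/{issue_date}/cover.jp2"


def dates_in_range(start, end):
    """Yield every calendar date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def cover_urls(start, end, template=URL_TEMPLATE):
    """Build one candidate URL per calendar day in the range.

    Days on which no paper was published would simply fail to download,
    so a real script needs to tolerate missing issues.
    """
    return [template.format(issue_date=d.isoformat())
            for d in dates_in_range(start, end)]


if __name__ == "__main__":
    for url in cover_urls(date(1912, 4, 15), date(1912, 4, 17)):
        print(url)
```

Generating a candidate for every calendar day and skipping the ones that fail is the simplest approach when you don't know the publication schedule in advance.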
I’m a strong believer that projects like this need active hacking around their fringes to expose new opportunities for how information might be reused, resorted, redisplayed, reimagined. And so I decided that, by way of taking the expanded archive out for a ride, by way of marking The Guardian’s 125th anniversary this year, and by way of thanking the paper for its support of the archiving project, I’d make The Guardian a Christmas present.
I used my script to pull every cover from the 1912 volume of the paper, 304 issues in all, as high-resolution JPEG2000 images. I then converted these images to TIFF (using GraphicConverter, because my local ImageMagick install can’t read JPEG2000 images), and then used ImageMagick to create a very large composite image using the montage command:
montage 1912*.tiff -tile 16x19 -geometry +20+20 ../1912-montage.tiff
(This says “take all of the TIFF files with names starting with 1912 and make a composite 16 images wide by 19 images high, leaving 20 pixels of horizontal and vertical padding around each image.”)
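The 16 by 19 grid wasn't arbitrary: 16 × 19 is exactly 304, so every cover gets a cell and none are left empty. If you wanted to try this with a different year, something like the following rough sketch would work out the row count and assemble the same command. The function name and parameters are my own invention for illustration; only the montage options themselves come from the command above.

```python
import math


def montage_command(pattern, count, columns, spacing, output):
    """Return the montage arguments for `count` images laid out `columns` wide.

    The number of rows is rounded up so that every image gets a cell.
    """
    rows = math.ceil(count / columns)
    return [
        "montage", pattern,
        "-tile", f"{columns}x{rows}",
        "-geometry", f"+{spacing}+{spacing}",
        output,
    ]


if __name__ == "__main__":
    # Joined into a single string, this is meant to be pasted into a shell,
    # which is what expands the 1912*.tiff wildcard into file names.
    print(" ".join(montage_command("1912*.tiff", 304, 16, 20,
                                   "../1912-montage.tiff")))
```

With a count that doesn't divide evenly by the number of columns, the last row simply ends up partly empty.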
Generating the composite was surprisingly quick (I sometimes forget just how much raw horsepower modern computers pack). Once it was generated I loaded it into GraphicConverter, added text at the bottom with title and credits, and then resized it to a more reasonable 8140 x 13300 pixels. Here’s a smaller version of what I ended up with:
Among other things, hidden away inside the composite you’ll find mention of the sinking of the Titanic:
I emailed the resulting TIFF to Kwik Kopy for some experimentation; they called me up the next day to look at some test prints, and after deciding to increase the size by 25% to make the covers slightly more readable, they went to print.
I picked up the result on Friday afternoon; I have a bad head for estimating size, so I was surprised by how huge it was:
I packed the print back in its box and ran it over to The Guardian office on Prince Street where I left it for editor Gary MacDougall. Less than an hour later a nice thank you photo showed up on Twitter:
You can grab the high-resolution image for yourself (100 MB TIFF image) if you’d like to explore it in more detail (headlines are very readable when you zoom in) or print a version for yourself (Kwik Kopy does great work and they’re familiar with the image now so printing additional copies shouldn’t be an issue!).