Earlier this month, by chance, I noticed an ad in The Buzz for a Japanese bookbinding course in Bonshaw this past weekend. I registered right away, as I’ve always been fascinated by bookbinding. After my experiments with perfect-bound books last year, I wanted to try my hand at more sophisticated bookbinding techniques, and Jennifer Brown’s course seemed like just the thing.
It was.
A happy group of about a dozen of us gathered in Bonshaw yesterday – Home of the Fisherman’s Breakfast! – and, over three hours, were walked through two bookbinding techniques. Jennifer is a kind and patient teacher, and her setup was well-outfitted with tools and materials, so all we needed to bring was our creativity and willingness to learn.
We started off making an accordion-fold book: two pieces of cardboard covered with Japanese paper and joined together with accordion-folded paper. This involved a lot of exact folding and persnickety gluing, but was relatively easy to pull off (and, to boot, I learned a lot of good, basic paper-folding skills).
Our second book used “kangxi” binding: we took two pieces of heavy Japanese paper for the covers, sandwiched 12 sheets of paper inside, punched five holes through the edge with an awl, and then sewed the binding using heavy cord. The geometry of the sewing pattern makes perfect sense once you’ve done it once or twice (and is somewhat perplexing up until that point), and the main challenge of this technique, at least for me, who seldom touches needle and thread, was simply the physics of sewing.
The hole in the cover over a stamp glued on top of a red piece of paper glued to the first page was an after-market-upgrade that I installed once the book was finished.
With the basics of these two techniques under my belt, I’m really excited to make an end-to-end book now: making the paper, printing on the letterpress, and then binding it all together. Stay tuned.
I have been a fan of, and contributor to, the OpenStreetMap project for several years now, and recently became interested in the work that’s being done on indoor mapping. To allow me to dip my toe in these waters, I decided to try my hand at creating an indoor layer for Robertson Library.
While the library has patron-focused floor plans on its website, they aren’t georeferenced; as static GIF files they’re useful as wayfinders, but they aren’t much use if the goal is integration with other GIS systems to allow for things like annotation and editing.
Fortunately I was able to come up with more accurate floor plans for the library. To do this I first found the URL for the Facilities Management office and then did a targeted Google search for:
site:http://www.upei.ca/facilities pdf robertson
hoping that this would turn up some PDF files for the library floor plan. As it turns out, it did:
As the URL for the “level 3” floor plan was:
- http://www.upei.ca/facilities/files/facilities/floorplans/Robertson%20Library%20Level%203.pdf
I guessed, as it turns out correctly, that the floor plans for levels 1 and 2 would follow the same pattern:
- http://www.upei.ca/facilities/files/facilities/floorplans/Robertson%20Library%20Level%201.pdf
- http://www.upei.ca/facilities/files/facilities/floorplans/Robertson%20Library%20Level%202.pdf
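The pattern-guessing above can be written down as a few lines of Python. This is only a sketch: the base URL and file-name pattern are copied from the level 3 link, and the function name floor_plan_url is made up for illustration.

```python
from urllib.parse import quote

# Base path copied from the "level 3" URL found via the Google search.
BASE = "http://www.upei.ca/facilities/files/facilities/floorplans/"


def floor_plan_url(level):
    """Build the floor plan URL for a level, assuming every level
    follows the same naming pattern as the level 3 file."""
    # quote() percent-encodes the spaces in the file name as %20.
    return BASE + quote(f"Robertson Library Level {level}.pdf")


for level in (1, 2, 3):
    print(floor_plan_url(level))
```

Whether a guessed URL actually exists still has to be checked by requesting it; the code only generates the candidates.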
So, now I had geographically accurate floor plans for all three levels of the library.
Next, I grabbed an updated copy of the excellent JOSM editor for OpenStreetMap. I downloaded the OpenStreetMap data for the area around the UPEI campus, and zoomed in on the Robertson Library building, which had earlier (perhaps by me?) been roughed in.
Next, I installed the PicLayer plugin (plugins can be installed from within JOSM from the Preferences | Plugins section) and then, following the guidance on the OpenStreetMap site, I copied the level 1 floor plan to the clipboard on my Mac and then selected PicLayer | New picture layer from clipboard. This loaded the floor plan into my current OpenStreetMap view. Selecting the new layer, I clicked on the “green arrow” icon in the JOSM toolbar to allow me to identify reference points on the layer that I could match with the satellite imagery and roughed-in Robertson Library footprint. Finally, clicking the “red arrow” icon, I dragged the reference points to match the underlying points on the OpenStreetMap layer (adjusting the opacity of the PicLayer layer helped a lot here), with the result looking like this:
With the georeferenced floor plan in place, matched to the shape of the building itself as it appears on the Bing satellite map, I could then make minor adjustments to accommodate the nooks and crannies of the building that hadn’t been a part of the original OpenStreetMap building object.
Next step: use the floor plan to add interior detail, using the IndoorOSM markup as a guide, and then repeat for the other two levels of the building. Once I’m done, I should be able to create a visualizer similar to this one to expose the new data to the public.
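For the curious, matching reference points on a picture to points on the map amounts to fitting a transform between two coordinate systems. I don't know exactly how PicLayer computes this internally, so what follows is a sketch under the assumption that three reference points define an affine transform (scale, rotation, skew and shift). The function names and the sample coordinates are invented for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve the 3x3 linear system m * x = v using Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("reference points are collinear; pick three that form a triangle")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for row in range(3):
            replaced[row][col] = v[row]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(src, dst):
    """Fit x' = a*x + b*y + c and y' = d*x + e*y + f from three point pairs.

    src: three (x, y) points on the floor plan image
    dst: the three matching (x, y) points on the map
    """
    m = [[x, y, 1.0] for x, y in src]
    a, b, c = solve3(m, [p[0] for p in dst])
    d, e, f = solve3(m, [p[1] for p in dst])
    return (a, b, c, d, e, f)


def apply_affine(t, point):
    """Map one image point into map coordinates using a fitted transform."""
    a, b, c, d, e, f = t
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Made-up example: three image corners and where they were dragged to.
transform = fit_affine([(0, 0), (1, 0), (0, 1)], [(10, 20), (12, 20), (10, 23)])
print(apply_affine(transform, (1, 1)))
```

Three non-collinear points are the minimum needed to pin down the six numbers in an affine transform, which is one reason the reference points should be spread across the building rather than lined up along one wall.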
It’s been a long time since I took a look at the state of the art in optical character recognition (OCR): the last time I really paid attention was when Delrina’s Winfax program gained OCR capabilities in 1994 (I used to do DIY OCR by faxing myself things).
Man, has the state of the art ever advanced. And the shiny object attracting my eye tonight was Tesseract, an open source OCR engine that was developed originally at HP Labs.
Tesseract has the benefit of being dead simple to install on a Mac with Homebrew; you just:
brew install tesseract
And, blamo, about 8 minutes later your Mac is a powerful OCR machine.
To take Tesseract out for a short ride, I used Robertson Library’s Plustek OpticBook A300 scanner (which is awesomely fast) to scan the 1924 book by D.B. Updike, In the Day’s Work, into 44 TIFF files (each 330ppi, and about 8MB in size). And then, proof-positive of how easy it is to use Tesseract, I did:
tesseract printing0008.tif page8
And, about 3 seconds later (yes, it is fast), I had:
On the Planning if Printing ,T must of necessity be,” said Sir Ioshua Reynolds, “ that even works of genius, like every other effect, as they must have their cause, must also have their rules; it cannot be by chance that excellen- cies are produced with any constancy or any certainty, for this is not the nature of chance: but the rules by which men of extraordinary parts—-and such as are called men of genius—- work, are either such as they discover by their own peculiar observations, or of such a nice texture as not easily to admit being expressed in words. Unsubstantial, however, as these rules may seem, and difficult as it may be to convey them in writing, they are still seen and felt in the mind of the artist; and he works from them with as much certainty as if they were embod- ied upon paper. It is true these refined princi- ples cannot always be made palpable, as the [3]
from this:
By my count, there were only 3 errors: “if” instead of “of” in the italic title, an understandable issue with pulling the “I” out of the ornament at the beginning of the paragraph, and Joshua being read as Ioshua.
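Running the same command over all 44 scans is a natural next step. Here is a minimal Python sketch of that loop. It assumes the scans are all named like printing0008.tif (only that one file name appears above, so the pattern is a guess) and that the tesseract command is on the PATH; the helper names are made up.

```python
import re
import shutil
import subprocess
from pathlib import Path


def output_base(tiff_name):
    """Turn a scan name like 'printing0008.tif' into an output base like 'page8'."""
    match = re.search(r"(\d+)\.tiff?$", tiff_name)
    if match is None:
        raise ValueError(f"no page number found in {tiff_name!r}")
    return f"page{int(match.group(1))}"


def ocr_commands(tiff_names):
    """Build one 'tesseract INPUT OUTPUTBASE' command per scan, in page order."""
    return [["tesseract", name, output_base(name)] for name in sorted(tiff_names)]


def run_all(directory="."):
    """Run Tesseract on every printing*.tif file in a directory."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found; install it first (e.g. brew install tesseract)")
    names = [path.name for path in Path(directory).glob("printing*.tif")]
    for command in ocr_commands(names):
        subprocess.run(command, cwd=directory, check=True)


# Usage, from the directory holding the scans:
#   run_all(".")
```

Tesseract adds the .txt extension to the output base itself, so page8 ends up as page8.txt.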
Back in the mid-1980s I volunteered for a time as a FORTRAN programmer in the Vertebrate Palaeontology department of the Royal Ontario Museum, working with the great and kind curator and palaeontologist Chris McGowan.
Despite being a simple volunteer, I was allowed access to the ROM whenever I liked, and, on occasion, I would find myself in the department after closing, when everyone else had gone home, hacking away on FORTRAN (doing statistical analysis of swordfish larvae).
One night like this I found myself suddenly very hungry. Scrounging around in the departmental fridge, I found the motherlode: a package of chocolate covered almonds hidden away in the freezer. I wolfed down a mouthful. Only. To. Find. That. They were not chocolate covered almonds at all, but rather chocolate covered coffee beans. Espresso beans, in fact.
This, I realized, was my punishment for taking food that wasn’t mine, and I resolved never to cross that line again (a line I have, nonetheless, crossed many times since, often at my peril).
I thought of that night tonight as I arrived at the second floor of Robertson Library, after library closing hours. I picked up an LG E2551 external monitor this afternoon ($139 on sale at Future Shop), and I wanted to get it set up in the office, and figured “after dark” would be a good time to do this. (And, after all, who doesn’t have boyhood dreams about being inside the library after everyone else has gone home for the night?)
I opened the door to room 322 only to find that the light wouldn’t turn on. I reasoned that I must be the victim of an after-hours-electricity-conservation program and wandered around for a while trying to find the master switch. But it evaded me.
Not wanting to waste the trip, I placed a quick call to Chief Librarian Mark Leggott, who, alas, had never found himself in the same situation, and couldn’t point me right. A second call, however, to crack Digitization Initiatives & Systems Librarian Don Moses, paid off: “walk into the tiny room beside the circulation desk, and then look on the wall on the left before you turn,” he told me. And, sure enough, there I found:
I hit the switch marked “Upper” and poked my head out to see what had happened: all the lights on the second floor had turned on (the granularity of the lighting panel leaves something to be desired, obviously), including the light in room 322. So, here I am in my tiny light-filled room, lighting up the entire second floor of the library as a result, working away (note to self: buy a floor lamp to avoid having to do this in future).
To avoid being arrested by eagle-eyed security (“breaker, breaker, we’ve got a lights-on in the library, code 79, code 79, swarm… swarm”), I placed a quick call to alert them of my presence.
And now I am ready for hacktion.
Catherine and I started off as neighbours on George Street in Peterborough. I remember clearly the first time I laid eyes on her: she was wearing clothes covered in paint. Over the months of that summer as I began my slow, slow wooing process I experienced her life as a working artist mostly through the sounds of her pounding on some piece of metal or another in her backyard-cum-studio. Fortunately, Catherine accelerated the wooing process – my “five year plan” was much under her patience threshold – and by Thanksgiving we were a couple. We’ve been together ever since – 22 years.
When we moved to Prince Edward Island in 1993, one of the first magical things that happened was that Catherine took up residence in a studio on Victoria Row, a spot that, until the week before we arrived, had been Lester O’Donnell’s law office. She worked there for two years, and then reluctantly gave it up when we moved to the country. When we returned to town and Catherine went looking for a studio again, the magic was obviously still in the air, as she was able to move back in when Ben Stahl, who had been there in the intervening 13 years, moved out.
Almost since the day she moved back in, she has been working on a body of work – in fibre these days, mostly, not metal – related to Prince Edward Island and land use and the environment. Along the way Oliver and I have had a backstage pass to the creation of everything from fabric potato plants to large wall hangings. I’ve schlepped a dory around, helped her figure out how to scan $20 bills, and, mostly, watched from the sidelines as her studio slowly filled with work.
The rest of you, I am happy to say, will get a chance to see what we’ve been seeing, as in February the Confederation Centre Art Gallery is mounting a show of Catherine’s recent work, Changing Environs. The show opens to the public on February 2, there’s an opening on February 24th, 2013 in the afternoon (Facebook event), and the show will be in place all spring. You should make sure to see it; it is, if I don’t say so myself, rather wonderful.
Just over a year ago, in December of 2011, Oliver created a Christmas Word Search and I posted it here for all to see. A year later it proved very popular (relatively speaking) over the Christmas 2012 season, garnering 3,055 page views (5.85% of the total traffic to this blog) from 63 countries:
When I was 12 years old (in 1978, when pocket calculators were just coming on the scene) if you told me that I could make up a word search and it would end up on screens around the world I would have thought you were talking magic.
Such is life for today’s adolescent; blows my mind.
Regular readers may recall a project I undertook last month to create a montage of covers of The Guardian newspaper from 1912 to mark the newspaper’s 125th anniversary. It only seemed appropriate that, when 2012 ended, I would do the same thing for the 2012 volume of the paper.
So I wrote some code – you can grab it here and try it out for yourself – to scrape cover thumbnails from the Pressdisplay.com site where they are cached for use in presenting the digital edition of the newspaper, then stitched the images together using ImageMagick and the result looks like this (you can download a 62 MB JPEG image or look at this version in zoom.it if you want to explore in more detail):
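The stitching step can be sketched with ImageMagick's montage command. This is not the code linked above, just an illustration of the general idea: the thumbnail file names, the 20-column layout and the 2-pixel gap are all assumptions, and the scraping step is left out entirely.

```python
import subprocess


def montage_command(thumbnails, output, columns=20, gap=2):
    """Build an ImageMagick 'montage' command that tiles thumbnails into a grid.

    -tile '20x' fixes the number of columns and lets the rows grow as needed;
    -geometry '+2+2' keeps each image at its own size with a small gap around it.
    """
    if not thumbnails:
        raise ValueError("no thumbnails to stitch")
    return [
        "montage",
        *sorted(thumbnails),  # sorted so date-named files land in date order
        "-tile", f"{columns}x",
        "-geometry", f"+{gap}+{gap}",
        output,
    ]


# Hypothetical file names, one cover per publishing day:
command = montage_command(["2012-01-03.jpg", "2012-01-02.jpg"], "guardian-2012.jpg")
print(" ".join(command))

# To actually build the image (requires ImageMagick to be installed):
#   subprocess.run(command, check=True)
```

Naming the thumbnails by date is what makes a plain sort put the covers in chronological order; on newer ImageMagick installs the command may need to be invoked as "magick montage" instead.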
Just in case you missed a point buried in the comments, I’m doing my Hacker in Residence blogging over at http://hack.ruk.ca/ (a server that, despite the domain, is hosted at UPEI and will shortly be blessed with an institutional name).
While I’m usually loath to fork a blog – part of the appeal, I think, of this space is its wide-ranging coverage of different topics – I decided it better to spare the readership here from the minutiae of discussion of repositories and geopresence archiving and diagrams of my university office.
If, however, you’re into such minutiae, come on over: read the blog, or subscribe to its RSS feed.
So, apparently, Islandora isn’t quite the “point Drupal at some documents and it magically transforms them into a repository” solution I’d been imagining in my dreams. And so my visions of “take a collection of Homburg Invest insolvency PDF files and turn them into a repository” got bogged down, yesterday, in issues like “global deny-apim policies,” exacerbated by the fact that the virtual server I’m using here isn’t a “fresh install,” but rather a clone of an earlier Islandora server with some loose edges left to be tied up. I’m sure I’ll get there with Islandora shortly but, in the meantime, I had some PDF files I wanted to dive into.
Which led me to Evernote, a web app and desktop application that’s as close as you can come to a “repository” without actually calling yourself a repository. I’ve been using Evernote for several years on my Mac, my iPad, my iPod Touch and my Windows Phone to manage my personal documents: I dump all my bank statements, household bills and other ephemera of daily life into Evernote. The app provides me with handy ways of organizing these notes: not only can I tag notes, and organize them into “notebooks,” but Evernote also does a kind of OCR on them to enable full-text search of any text – even scanned images – in any note.
So, for example, if I’m trying to find a receipt for filling up my rental car with gasoline at the Gulf station at Logan Airport in Boston, I can just search for keyword “gulf” and Evernote shows me all the notes containing that keyword:
So, as a before-I-figure-out-Islandora way of searching my collection of 210 documents related to the Homburg Invest insolvency, I decided to try out Evernote as a repository-in-everything-but-name: I simply dragged the PDF files into a new notebook in Evernote, waited a few minutes for the app to sync and OCR them, and then took it out for a ride. A search for keyword “Holman Grand,” for example, shows me all of the PDFs containing one or more references to that hotel project:
It not only displays the list of 16 documents containing the keyword, but it also highlights the keyword inside the PDF files themselves:
What’s even nicer about Evernote is that with a single click I can make any local notebook a shared notebook, with a public URL, and at that URL all of the same searching and organizing features are available in a web interface, and anyone with Evernote on their local machine can add a local version of the notebook to their personal collection:
As a result, you can now search the collection of Homburg Invest insolvency documents in Evernote yourself (the only thing you’ll miss compared to the desktop application is keyword-highlighting inside PDF files).
I’m going to keep plugging away at Islandora, perhaps setting myself up with a Drupal 6 install so I can use the more battle-tested version of Islandora targeted at that version of Drupal; if nothing else, Evernote’s utility will be a good yardstick against which to measure whatever I can come up with there.
In September of 2011, Homburg Invest Inc. and various related companies filed a motion to obtain protection under the Companies’ Creditors Arrangement Act (CCAA). The firm of Samson Bélair/Deloitte & Touche Inc. was appointed by the court as monitor, and has been, ever since, publishing a rich collection of documents related to the companies and the process. In Prince Edward Island, Homburg Invest Inc. is well known for its development of the downtown Holman Grand Hotel, a significant development for many reasons, including the loaning of public money to the effort.
The complex web of companies involved in this action is difficult for the layperson to understand; the sheer breadth of documentation available – 210 as of this writing – should, in theory, make understanding possible, but with the data locked inside PDF files, and written in “inside baseball” investment terminology, this is a challenge.
This situation seems like one tailor-made for a repository driven by Islandora: in theory I should be able to ingest the PDF files, allow them to be searched in myriad ways, and allow them to be annotated.
As a starting point, I need to harvest the PDF documents from the Deloitte Touche website, which is easily done with the help of the wget command (easily installed on OS X if you don’t have it already; generally available pre-installed in most modern Linux distributions).
To grab all of the PDF files in a single go is as easy as:
wget -l 1 -r -A.pdf "http://www.deloitte.com/..."
I use -l 1 (that’s hyphen-el-space-one) to limit the depth of the download to a single level; this prevents wget from scraping other unrelated PDFs from other parts of the site. The -r makes the process recursive (needed even at a depth of 1, because without it wget would fetch only the page itself and not the PDFs it links to) and the -A.pdf says “limit yourself to PDF files”.
Two minutes and 47 seconds later I have 210 PDF files sitting in a local directory. Next step: get Islandora running and figure out how to ingest the PDF files and add meta-data to them.
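To make the wget flags a little less magical, here is a rough Python sketch of what “one level deep, PDFs only” boils down to: read one page, collect its links, and keep the ones whose path ends in .pdf. It is an approximation rather than a description of how wget works inside (it ignores things like wget's rules about staying on the same host), the names are made up, and the example.com page is invented for the demonstration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def pdf_links(page_url, html):
    """Return absolute URLs of the PDF files linked from a single page."""
    parser = LinkCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        is_pdf = urlparse(absolute).path.endswith(".pdf")
        if is_pdf and absolute not in links:
            links.append(absolute)
    return links


# Invented sample page standing in for the real document listing:
sample = '<a href="docs/report.pdf">Report</a> <a href="/about.html">About</a>'
print(pdf_links("http://example.com/list/index.html", sample))
```

Each URL in the returned list would then be downloaded in turn, which is the part wget handles so nicely on its own.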