Creating an RSS feed of the books you have checked out of the library

I am always forgetting when my library books are due. The PEI Provincial Library sends out handy email reminders, and they have a website that lists checked out items, but I need more (and more obvious) prodding.

The result is opac2rss.pl, a Perl script that automatically connects to the web-based Dynix (aka Epixtech) OPAC and grabs a list of the items I’ve got checked out and the date they are due. It then creates an RSS feed that I can read in my newsreader every morning.

The result looks like this:

The Alibi project in Nova Scotia was invaluable in getting this working, and the O’Reilly book Spidering Hacks was useful in understanding how to use the Perl HTML::TreeBuilder module.
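As a rough illustration of the last step (turning a list of checked-out items into a feed), here is a minimal sketch. It is not the code from opac2rss.pl: the item list is hard-coded where a real script would scrape it from the OPAC, the feed is assembled by hand rather than with a feed-building module, and the names `xml_escape` and `items_to_rss`, the sample titles, and the sample dates are all made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the five characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
                  '"' => '&quot;', "'" => '&apos;');
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Build a bare-bones RSS 2.0 document from a list of { title, due } hashes.
sub items_to_rss {
    my ($channel_title, $channel_link, @items) = @_;
    my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n};
    $xml .= qq{<rss version="2.0">\n<channel>\n};
    $xml .= '<title>' . xml_escape($channel_title) . "</title>\n";
    $xml .= '<link>' . xml_escape($channel_link) . "</link>\n";
    $xml .= "<description>Books I have checked out of the library</description>\n";
    for my $item (@items) {
        $xml .= "<item>\n";
        $xml .= '<title>' . xml_escape($item->{title}) . "</title>\n";
        $xml .= '<description>'
              . xml_escape("Due back on: $item->{due}")
              . "</description>\n";
        $xml .= "</item>\n";
    }
    $xml .= "</channel>\n</rss>\n";
    return $xml;
}

# Hypothetical sample data standing in for the scraped "items out" list.
my @items = (
    { title => 'An Example Title', due => '2004-09-13' },
    { title => 'Cats & Dogs',      due => '2004-09-20' },
);

print items_to_rss('My Library Books', 'http://localhost/mybooks.xml', @items);
```

Redirecting the output to a file that a newsreader can reach is enough to subscribe to it; escaping the titles matters because book titles often contain ampersands.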

Comments

Buzz on August 30, 2004 - 02:30

Very cool!

This would be way too cool for most libraries to understand; they would rather get the overdue fines.

Jeff Chow on August 30, 2004 - 05:08

Very neat! We’re in the beta testing stage of a service (www.libraryelf.com) that helps users keep tabs on their library books. It works similarly to what you are describing, except it uses email as the delivery vehicle.

Buzz’s comment about libraries wanting the overdue fines is mostly correct, from our experience in trying to market our service to them (even though our service is free for the beta phase). However, a few libraries are sensible enough to see beyond the fines and give our service a try.

I’d like to know what the advantages of RSS over email are. Our testers don’t seem to be taking advantage of RSS, but everyone has email. If there are distinct advantages to RSS for library notification, then perhaps we should consider adding that to our service as well.

Great work, continued success!

Matt Farra on August 30, 2004 - 13:33

I’m not having any luck getting the script to work with my OPAC. It appears to use a different URL method for accessing my records.

The base URL is:

http://opac.cadl.org/patroninf…

When accessing my record, the above URL changes only slightly, appearing to include a number that might be the session ID?

http://opac.cadl.org/patroninf…

Any idea how I might edit the Perl to get it to work on my system? Thanks in advance!

Peter Rukavina on August 30, 2004 - 14:15

Matt, the script I wrote is tuned exclusively for Dynix (aka Epixtech) OPACs. Each OPAC has its own URL scheme, and its own set of “screens” for delivering information, and so the script would have to be customized to work with any other OPAC (and, probably, any other library system if there is a different HTML template in use).

Buzz, the head of the PEI Provincial Library said here that “[t]he main goal for the new policy [of charging fines] was to get library books back, not generate lots of revenue.”

Matt Farra on August 30, 2004 - 15:10

Peter, I see what you mean. Do you have any suggestions for how I might go about adapting your script for an OPAC that doesn’t include all of the variables such as username/password directly in the URL? The login screens for my OPAC seem to use forms to authenticate, and the URL stays pretty static in the browser.

art on August 30, 2004 - 15:40

Brilliant, Peter (as always!). You should post a note here about the work you did with the Amazon API and InterLibrary Loans. There is a circulation protocol called SIP that I have even used, but I don’t know if it gives fines information. If it did, it might be a way to get around variations in OPACs.

Peter Rukavina on August 30, 2004 - 18:13

It’s great when your own tools surprise you: Catherine took back two of the books I had checked out. The RSS feed (which I have updating, via cron, every hour), updated appropriately. Neato.

nestor on August 31, 2004 - 01:11

If you want some real fun with your OPAC, get rid of the #focus part and stick &GetXML=true on the end: voila, you get the data on the page as XML, which you can then run through your own stylesheets or parse as XML. That should make hacks like this much easier in the future.
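The URL rewrite described here can be sketched in a few lines of Perl. The helper name `to_getxml_url` and the example address are hypothetical; only the `#focus` fragment and the `GetXML=true` parameter come from the comment above, and whether a given OPAC honours the parameter is something to check against your own system.

```perl
use strict;
use warnings;

# Strip any "#fragment" and append GetXML=true, using "?" or "&" as needed.
sub to_getxml_url {
    my ($url) = @_;
    $url =~ s/#.*\z//s;    # remove "#focus" (or any other fragment)
    # Leave the URL alone if the parameter is already there.
    return $url if $url =~ /[?&]GetXML=true(?:&|\z)/;
    my $sep = ($url =~ /\?/) ? '&' : '?';
    return $url . $sep . 'GetXML=true';
}

# Hypothetical OPAC address, for illustration only.
print to_getxml_url('http://opac.example.org/ipac20/ipac.jsp?menu=account#focus'), "\n";
```

The separator check keeps the result valid whether or not the original address already carries a query string.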

Leland Johnson on August 31, 2004 - 01:35

I did the same thing, starting about a year ago with my library’s website. You can see the code at my website, it’s under code/partonman. I originally had it running RSS, but that was before I actually started using RSS, so I reverted to email. Mine renews books, tells you if holds are in, and tells you if you need to return a book. Of course, my library uses a much worse frontend (didn’t generate valid HTML, etc).

Looks nice though. I should really refactor my table reading function to something like you did, but I have my sights on the CPL right now. Please email me, then we could exchange notes.

Paul on August 31, 2004 - 16:28

Yay! I posted something about this to the LazyWeb a while back and you’ve done it!

Paul on August 31, 2004 - 16:48

Well, it seems my local library has diverged from what yours does: the due dates are replaced with a boldface 1 if they have been renewed(?). I’ll see if I can figure out what the difference is.

Jeff Chow on September 1, 2004 - 11:19

nestor — the &GetXML=true is a neat find!!! I think it’s going to help with the parsing of the ItemsOut and Hold tables for our email reminder service. Any more tricks for Dynix? Or perhaps for III or Sirsi?

carolyne on September 1, 2004 - 20:29

How about some hints for those of us who don’t know enough to use this, but would like to? Such as “how do you implement it on your own computer?”

Peter Rukavina on September 1, 2004 - 21:52

Carolyne: I’m not in a position to do what you request — the code is designed for people with access to a system that can run Perl with some additional modules, and who understand enough about Perl to customize the script to their own situation. Sorry.

SpiceDog on September 23, 2004 - 19:48

Okay, call me paranoid, but your library is publishing who checks out what, on the web!?! That means Homeland Security sees when you check out a chemistry book, and puts you on the no-fly list. Don’t be surprised when they ground your next flight in Maine and haul you off it.

Paul Beard on January 7, 2005 - 06:02

Peter: are you planning on exposing more of how you do what you do? I can run your script, but my library seems to have shifted things around, such that the due date appears in the checked out area and the due date is blank.

And SpiceDog? You need to switch to decaf ;-) You can log in to many libraries and check your account: all Peter has done is automate it. You then publish it (in whatever secure but accessible way you choose) so you can access it in a newsreader. It could be on your local system (http://localhost/mybooks.xml). Can’t get more secure than that.

Donna on January 29, 2005 - 00:20

I’m not sure what to put as the $rssfile. I guess I don’t understand enough about how this works. Would you help me?

Neil S. Verkland on March 2, 2005 - 00:44

I have taken the original script and modified it to work with GetXML=true and to be called as a CGI. This way you don’t store any data anywhere except in the RSS reader. Unfortunately, the net effect is that any “Windows” RSS reader will probably show many instances of the messages from the server, because there is a new session ID for every access. We have installed the script on our website to be accessed from the Sun Microsystems Portal server for our students.

You can try it out:
#!/usr/bin/perl -w

use strict;
use LWP::Simple;
use HTML::TreeBuilder;
use XML::RSS;
use XML::Simple; ## qw(:strict);

my $DEBUG = 0;
my $Q_STRING = "";
my $opacurl = "http://<your webpac server>/ipac20/ipac.jsp"; ## The main URL of HIP

## If this file is called as a URL, call like:
## http://foo.com/file.pl?<bar…
$Q_STRING = $ENV{'QUERY_STRING'};

## IF THERE IS NO QUERY STRING THEN ASSUME COMMANDLINE CALL:
## file.pl <barcode>:<pin>
if (!defined($Q_STRING)) {
    $Q_STRING = $ARGV[0];
}

## when in debug mode dump some stuff to a LOGFILE
if ($DEBUG) {
    ## ">>" opens the log in append mode
    open(LOGFILE, ">>" . "/tmp/sunportal.html") || die "Cannot open LOG file\n";
    printf(LOGFILE "\n");
    printf(LOGFILE "** new request **\n");
    printf(LOGFILE "\n");
    printf(LOGFILE "RAW Q STRING: '%s'\n", $Q_STRING);
}

# Separate the command arguments
my ($barcode, $PIN) = split(':', $Q_STRING);

## when in debug mode dump some stuff to a LOGFILE
if ($DEBUG) {
    printf(LOGFILE "BARCODE: '%s'\n", $barcode);
    printf(LOGFILE "PIN: '%s'\n", $PIN);
}

###-------------------------------------------
### Get the SessionID
###-------------------------------------------
my $url = "$opacurl?profile=pac&menu=account&submenu=itemsout&GetXML=true";
## LWP::Simple's get() returns undef on failure and does not set $!
my $page = get( $url ) or die "Could not fetch $url\n";
my $xmldata = XMLin($page);
my $session = $xmldata->{session};
die "No H.I.P. session could be started" unless($session);

$url = "$opacurl?profile=gmct&menu=account&submenu=itemsouti&session=$session&sec1=$barcode&sec2=$PIN";

###-------------------------------------------
### Grab the Items Out page
###-------------------------------------------
my $item_url = "$url&GetXML=true";
$page = get( $item_url ) or die "Could not fetch the Items Out page\n";
$xmldata = XMLin($page);
my $p = $xmldata->{itemsoutdata};
my $secure_ele = $xmldata->{security};
my $name = $secure_ele->{name};
printf(LOGFILE "NAME: '%s'\n", $name) if ($DEBUG);

###-------------------------------------------
### Start a new RSS object
###-------------------------------------------
my $rss = new XML::RSS(version => '0.91', encoding => 'ISO-8859-1');
$rss->channel(
    title       => $name . "'s Library Books",
    link        => "$url",   ## note: $url still carries the barcode and PIN
    description => 'Books I have checked out of the library',
);

###-------------------------------------------
### Parse Items Out
###-------------------------------------------
my $list = $p->{itemout};
for my $item (keys(%$list)) {
    my ($title, $cko, $due, $lnk);
    $title = $list->{$item}->{disptitle};
    $lnk   = $list->{$item}->{TITLE}->{data}->{link}->{func};
    $cko   = $list->{$item}->{ckodate};
    $due   = $list->{$item}->{duedate};

    $rss->add_item(
        title       => $title,
        link        => "$opacurl?profile=gmct&menu=search&session=$session&uri=$lnk&source=~!training",
        description => "Checked out: " . $cko . "<br />" . "Due back on: <b>" . $due . "</b>"
    );
}

###-------------------------------------------
### Output to STDOUT for browser who called me.
###-------------------------------------------

print "Content-type: text/xml\n\n";
print $rss->as_string;

## when in debug mode dump some stuff to a LOGFILE
if ($DEBUG) {
    printf(LOGFILE "Content-type: text/xml\n\n");
    if ($DEBUG > 1) {
        printf(LOGFILE "\n");
        ## print, not printf: the feed text may contain "%" characters
        print LOGFILE $rss->as_string;
        printf(LOGFILE "\n");
    }
    printf(LOGFILE "** end request **\n");
    printf(LOGFILE "\n");
    printf(LOGFILE "\n");
    close(LOGFILE);
}
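
One possible way to soften the duplicate-item problem mentioned above is to give each item a link that does not change between requests. The following is only a sketch under two unconfirmed assumptions: that the reader identifies items by their link, and that the OPAC will still resolve a link with no session parameter. The helper name `strip_session` and the example URLs are hypothetical.

```perl
use strict;
use warnings;

# Remove the "session=..." query parameter so the same book always
# yields the same link string, whichever session fetched it.
sub strip_session {
    my ($url) = @_;
    my ($base, $query) = split /\?/, $url, 2;
    return $url unless defined $query;
    my @kept = grep { $_ !~ /^session=/ } split /&/, $query;
    return @kept ? $base . '?' . join('&', @kept) : $base;
}

# Hypothetical item links produced by two different sessions.
my $first  = strip_session('http://opac.example.org/ipac20/ipac.jsp?profile=gmct&session=AAA111&uri=link42');
my $second = strip_session('http://opac.example.org/ipac20/ipac.jsp?profile=gmct&session=BBB222&uri=link42');
print $first eq $second ? "same item\n" : "different items\n";
```

If the OPAC refuses session-less links, the stripped string could instead serve purely as an identifier while the clickable link keeps its session.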

Neil S. Verkland on March 2, 2005 - 00:47

Please add in the original CopyLeft and author detail with a small add for “modified By Neil S. Verkland Mar 1 2005”. Thanks.

Neil S Verkland on March 2, 2005 - 00:52

I see that the BLOG writer took out my "<" VAR_NAME ">" references in the following lines:

my $opacurl = "http://"<" your URL here ">"/ipac20/ipac.jsp"; ## The main URL
## http://foo.com/file.pl?"<" secondID ">":"<" pin ">"
## file.pl "<" secondID ">":"<" pin ">"

Please look for these lines and add in what is missing.

malaga on March 27, 2005 - 13:49

Greetings from Malaga (Spain). Antonio :-)

reid holmes on October 19, 2006 - 18:13

I’ve created a Yahoo widget that can display a library user’s current books out and holds. It should work for any library that uses the Horizon portal. It’s free and can be downloaded at: http://dev.ulti.org/libraryWid…

—reid holmes

Kris on January 2, 2007 - 21:23

I’m looking for a generic widget to remind me about library books as well. I don’t even mind entering the info myself as long as it does what it should. I tried putting the books in my gmail account and sending a reminder, but due to the number of wrong #s I get, I have to keep my phone off for the most part. Please help! My library doesn’t put the dates in the book, but on a receipt and that gets lost. GRRRRR.