Creating an RSS feed of the books you have checked out of the library

Peter Rukavina

I am always forgetting when my library books are due. The PEI Provincial Library sends out handy email reminders, and they have a website that lists checked out items, but I need more (and more obvious) prodding.

The result is opac2rss.pl, a Perl script that automatically connects to the web-based Dynix (aka Epixtech) OPAC and grabs a list of the items I’ve got checked out and the date they are due. It then creates an RSS feed that I can read in my newsreader every morning.

The result looks like this:

The Alibi project in Nova Scotia was invaluable in getting this working, and the O’Reilly book Spidering Hacks was useful in understanding how to use the Perl HTML::TreeBuilder module.
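For readers curious about the final step, here is a minimal sketch of turning a list of checked-out items into an RSS feed. It is not the code from opac2rss.pl: the sample items, the `build_feed` and `xml_escape` helpers, and the localhost link are all made up for illustration, and it builds the XML by hand with core Perl rather than relying on a module such as XML::RSS.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data; a real script would scrape these from the OPAC.
my @items = (
    { title => 'Spidering Hacks',        due => '2005-03-15' },
    { title => 'Tom & Jerry: A History', due => '2005-03-01' },
);

# Escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build a minimal RSS 2.0 document, soonest due date first.
# Dates are assumed to be in YYYY-MM-DD form so they sort as strings.
sub build_feed {
    my ($channel_title, $channel_link, @items) = @_;
    my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
            . qq{<rss version="2.0">\n<channel>\n}
            . '<title>' . xml_escape($channel_title) . "</title>\n"
            . '<link>' . xml_escape($channel_link) . "</link>\n"
            . "<description>Books I have checked out of the library</description>\n";
    for my $item (sort { $a->{due} cmp $b->{due} } @items) {
        $xml .= "<item>\n"
              . '<title>' . xml_escape("$item->{title} (due $item->{due})") . "</title>\n"
              . '<description>' . xml_escape("Due back on: $item->{due}") . "</description>\n"
              . "</item>\n";
    }
    return $xml . "</channel>\n</rss>\n";
}

print build_feed('My Library Books', 'http://localhost/mybooks.xml', @items);
```

Putting the due date in the item title is just one way to make the deadline hard to miss in a newsreader's headline list.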

Comments

Submitted by Buzz on

Very cool!

This would be way too cool for most libraries to understand; they would rather get the overdue fines.

Submitted by Jeff Chow on

Very neat! We’re in the beta testing stage of a service (www.libraryelf.com) that helps users keep tabs on their library books - it works similarly to what you are describing except it uses email as the delivery vehicle.

Buzz’s comment about libraries wanting the overdue fines is mostly correct, in our experience of trying to market our service to them (even though our service is free for the beta phase). However, a few libraries are sensible enough to see beyond the fines and give our service a try.

I’d like to know what the advantages of RSS over email are. Our testers don’t seem to be taking advantage of RSS, but everyone has email. If there are distinct advantages to RSS for library notifications, then perhaps we should consider adding that to our service as well.

Great work, continued success!

Submitted by Matt Farra on

I’m not having any luck getting the script to work with my OPAC. It appears to use a different URL method for accessing my records.

The base URL is:

http://opac.cadl.org/patroninf…

When accessing my record, the above URL changes only slightly, appearing to include a number that might be the session ID?

http://opac.cadl.org/patroninf…

Any idea how I might edit the Perl to get it to work on my system? Thanks in advance!

Submitted by Peter Rukavina on

Matt, the script I wrote is tuned exclusively for Dynix (aka Epixtech) OPACs. Each OPAC has its own URL scheme, and its own set of “screens” for delivering information, and so the script would have to be customized to work with any other OPAC (and, probably, any other library system if there is a different HTML template in use).

Buzz, the head of the PEI Provincial Library said here that “[t]he main goal for the new policy [of charging fines] was to get library books back, not generate lots of revenue.”

Submitted by Matt Farra on

Peter, I see what you mean. Do you have any suggestions for how I might go about adapting your script for an OPAC that doesn’t include all of the variables such as username/password directly in the URL? The login screens for my OPAC seem to use forms to authenticate, and the URL stays pretty static in the browser.

Submitted by art on

Brilliant, Peter (as always!) You should post a note here about the work you did with the Amazon API and InterLibrary Loans. There is a circulation protocol called SIP that I have even used, but I don’t know if it gives fines information. If it did, it might be a way to get around variations in OPACs.

Submitted by Peter Rukavina on

It’s great when your own tools surprise you: Catherine took back two of the books I had checked out. The RSS feed (which I have updating, via cron, every hour), updated appropriately. Neato.

Submitted by nestor on

if you want some real fun with your OPAC, get rid of the #focus part and stick &GetXML=true on the end — voila, you get the data on the page as XML which you can then run through your own stylesheets or parse as XML… should make hacks like this much easier in the future.
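A minimal sketch of that URL tweak in Perl, for anyone who wants to script it. The `xml_url` helper and the opac.example.org host are made up for illustration, and whether a given OPAC honours GetXML=true is something to check against your own library’s server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn an OPAC page URL into its XML equivalent: drop the #fragment
# (such as "#focus") and append GetXML=true to the query string.
sub xml_url {
    my ($url) = @_;
    $url =~ s/#.*\z//s;                                 # remove "#focus" or any other fragment
    return $url if $url =~ /[?&]GetXML=true(?:&|\z)/;   # already asks for XML
    my $sep = $url =~ /\?/ ? '&' : '?';                 # start or extend the query string
    return $url . $sep . 'GetXML=true';
}

# Hypothetical host name; substitute your own library's server.
print xml_url('http://opac.example.org/ipac20/ipac.jsp?profile=pac&menu=account#focus'), "\n";
# prints http://opac.example.org/ipac20/ipac.jsp?profile=pac&menu=account&GetXML=true
```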

Submitted by Leland Johnson on

I did the same thing, starting about a year ago with my library’s website. You can see the code at my website, it’s under code/partonman. I originally had it running RSS, but that was before I actually started using RSS, so I reverted to email. Mine renews books, tells you if holds are in, and tells you if you need to return a book. Of course, my library uses a much worse frontend (didn’t generate valid HTML, etc).

Looks nice though. I should really refactor my table reading function to something like you did, but I have my sights on the CPL right now. Please email me, then we could exchange notes.

Submitted by Paul on

Well, it seems my local library has diverged from what yours does: the due dates are replaced with a boldface 1 if they have been renewed(?). I’ll see if I can figure out what the difference is.

Submitted by Jeff Chow on

nestor - the &GetXML=true is a neat find!!! I think it’s going to help with the parsing of the ItemsOut and Hold tables for our email reminder service. Any more tricks for Dynix? Or perhaps for III or Sirsi?

Submitted by carolyne on

How about some hints for those of us who don’t know enough to use this, but would like to?
Such as “how do you implement it on your own computer”?

Submitted by Peter Rukavina on

Carolyne: I’m not in a position to do what you request — the code is designed for people with access to a system that can run Perl with some additional modules, and who understand enough about Perl to customize the script to their own situation. Sorry.

Submitted by SpiceDog on

Okay, call me paranoid, but your library is publishing who checks out what, on the web!?! That means Homeland Security sees when you check out a chemistry book, and puts you on the no-fly list. Don’t be surprised when they ground your next flight in Maine and haul you off it.

Submitted by Paul Beard on

Peter: are you planning on exposing more of how you do what you do? I can run your script, but my library seems to have shifted things around, such that the due date appears in the checked out area and the due date is blank.

and SpiceDog? You need to switch to decaf ;-) you can login to many libraries and check your account: all Peter has done is automate it. You then publish it (in whatever secure but accessible way you choose) so you can access it in a newsreader. It could be on your local system (http://localhost/mybooks.xml). Can’t get more secure than that . .

Submitted by Donna on

I’m not sure what to put as the $rssfile. I guess I don’t understand enough about how this works. Would you help me?

Submitted by Neil S. Verkland on

I have taken the original script and modified it to work with GetXML=true and to be called as a CGI. This way you don’t store any data anywhere except in the RSS reader. Unfortunately, the net effect is that any “Windows” RSS reader will probably show many instances of the messages from the server, because there is a new session ID for every access. We have installed the script on our website to be accessed from the Sun Microsystems Portal server for our students.

you can try it out:
#!/usr/bin/perl -w

use strict;
use LWP::Simple;
use HTML::TreeBuilder;
use XML::RSS;
use XML::Simple; ## qw(:strict);

my $DEBUG = 0;
my $Q_STRING = "";
my $opacurl = "http://<your webpac server>/ipac20/ipac.jsp"; ## The main URL of HIP

## If this file is called as a URL, call like:
## http://foo.com/file.pl?<bar…
$Q_STRING = $ENV{'QUERY_STRING'};

## IF THERE IS NO QUERY STRING THEN ASSUME COMMANDLINE CALL:
## file.pl <barcode>:<pin>
if (!defined($Q_STRING)) {
    $Q_STRING = $ARGV[0];
}

## when in debug mode dump some stuff to a LOGFILE
if ($DEBUG) {
    open(LOGFILE, ">>" . "/tmp/sunportal.html") || die "Cannot open LOG file\n";
    printf(LOGFILE "\n");
    printf(LOGFILE "** new request **\n");
    printf(LOGFILE "\n");
    printf(LOGFILE "RAW Q STRING: '%s'\n", $Q_STRING);
}

# Separate the command arguments
my ($barcode, $PIN) = split(':', $Q_STRING);

## when in debug mode dump some stuff to a LOGFILE
if ($DEBUG) {
    printf(LOGFILE "BARCODE: '%s'\n", $barcode);
    printf(LOGFILE "PIN: '%s'\n", $PIN);
}

###----------------------
### Get the SessionID
###----------------------
my $url = "$opacurl?profile=pac&menu=account&submenu=itemsout&GetXML=true";
my $page = get( $url ) or die "Could not fetch $url\n";
my $xmldata = XMLin($page);
my $session = $xmldata->{session};
die "No H.I.P. session could be started" unless ($session);

$url = "$opacurl?profile=gmct&menu=account&submenu=itemsouti&session=$session&sec1=$barcode&sec2=$PIN";

###----------------------
### Grab the Items Out page
###----------------------
my $item_url = "$url&GetXML=true";
$page = get( $item_url ) or die "Could not fetch $item_url\n";
$xmldata = XMLin($page);
my $p = $xmldata->{itemsoutdata};
my $secure_ele = $xmldata->{security};
my $name = $secure_ele->{name};
printf(LOGFILE "NAME: '%s'\n", $name) if ($DEBUG);

###----------------------
### Start a new RSS object
###----------------------
my $rss = XML::RSS->new(version => '0.91', encoding => 'ISO-8859-1');
$rss->channel(
    title       => $name . "'s Library Books",
    link        => "$url",
    description => 'Books I have checked out of the library',
);

###----------------------
### Parse Items Out
###----------------------
my $list = $p->{itemout};
for my $item (keys(%$list)) {
    my ($title, $cko, $due, $lnk);
    $title = $list->{$item}->{disptitle};
    $lnk   = $list->{$item}->{TITLE}->{data}->{link}->{func};
    $cko   = $list->{$item}->{ckodate};
    $due   = $list->{$item}->{duedate};

    $rss->add_item(
        title       => $title,
        link        => "$opacurl?profile=gmct&menu=search&session=$session&uri=$lnk&source=~!training",
        description => "Checked out: " . $cko . "<br />" . "Due back on: <b>" . $due . "</b>"
    );
}

###----------------------
### Output to STDOUT for browser who called me.
###----------------------

print "Content-type: text/xml\n\n";
print $rss->as_string;

## when in debug mode dump some stuff to a LOGFILE
if ($DEBUG) {
    printf(LOGFILE "Content-type: text/xml\n\n");
    if ($DEBUG > 1) {
        printf(LOGFILE "\n");
        ## print, not printf: the feed may contain % characters
        print LOGFILE $rss->as_string;
        printf(LOGFILE "\n");
    }
    printf(LOGFILE "** end request **\n");
    printf(LOGFILE "\n");
    printf(LOGFILE "\n");
    close(LOGFILE);
}
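
One possible way to soften the repeated-items problem mentioned above is to give each item an identifier that does not change between fetches. The sketch below strips the session parameter from an item link. The `stable_id` helper and the example URL are hypothetical, and it is only an assumption that a given reader would de-duplicate on such an identifier (for instance an RSS 2.0 guid) rather than on the link itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove the per-request "session" parameter so that the same book
# yields the same string on every fetch.
sub stable_id {
    my ($url) = @_;
    my ($base, $query) = split /\?/, $url, 2;
    return $base unless defined $query;                   # nothing to strip
    my @kept = grep { !/^session=/ } split /&/, $query;   # keep every other parameter
    return @kept ? $base . '?' . join('&', @kept) : $base;
}

# Hypothetical item link of the kind the script above builds.
print stable_id('http://opac.example.org/ipac20/ipac.jsp?profile=gmct&session=1A2B3C&uri=full'), "\n";
# prints http://opac.example.org/ipac20/ipac.jsp?profile=gmct&uri=full
```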

Submitted by Neil S. Verkland on

Please add in the original CopyLeft and author detail with a small add for “modified By Neil S. Verkland Mar 1 2005”. Thanks.

Submitted by Neil S Verkland on

I see that the BLOG writer took out my “<” VAR_NAME “>” references in the following lines:

my $opacurl = “http://”<” your URL here “>”/ipac20/ipac.jsp”; ## The main URL
## http://foo.com/file.pl?“<” secondID “>”:”<” pin “>”
## file.pl “<” secondID “>”:”<” pin “>”

Please look for these lines and add in what is missing

Submitted by Kris on

I’m looking for a generic widget to remind me about library books as well. I don’t even mind entering the info myself as long as it does what it should. I tried putting the books in my gmail account and sending a reminder, but due to the number of wrong #s I get, I have to keep my phone off for the most part. Please help! My library doesn’t put the dates in the book, but on a receipt and that gets lost. GRRRRR.
