Freedom to Iterate

The CBC reports that a Nova Scotia teenager has been charged with programmatically accessing public information from a public Government of Nova Scotia website. This is an issue near and dear to my heart, as the accused was charged for engaging in exactly the sort of thing that I do regularly.

For example, earlier this year I was contacted by a Prince Edward Island journalist who was researching the latest plan by the Government of Prince Edward Island to equip the Island with a high-speed Internet backbone. They’d been looking for the tender documents associated with earlier attempts to solve the same problem, but had been stymied by the fact that these older tender documents were seemingly no longer available online, and their access to information requests had remained unfilled; they wanted to know if I could help.

The dropping-out-of-view of the older tenders appears to have been a side-effect of the Province’s migration to a new website: the legacy tender website contained records of tenders extending back to 2002; when it was replaced with a new section on the new website, older tenders were not included in the migration of the search feature, and so the tender they were looking for was no longer online. The confusing thing for the journalist was that, until very recently, they’d been able to use Google to locate older tender documents, and they were showing in Google search results; this was no longer the case, though, and they were wondering whether the documents were still online in some form.

I agreed to assist them in their search, as I’ve a personal interest both in why we haven’t collectively been able to solve the Internet access issue on PEI despite trying for 25 years, and in open data and transparency and access to information.

What I discovered is that the older tender documents are, in fact, still online, on the legacy government website; they’re just not exposed to the new site’s search.

If you take the URL:$tender

and replace $tender with a number from 1 to 5884 (a limit I found simply by experimenting), you can retrieve information for tenders back to this June 6, 2002 tender for a heavy-duty tire changer.

One of the things computers are very good at is iterating. Which is to say, doing the same thing over and over and over again in a loop.

So, with what I learned about how the legacy tenders are stored and remain publicly available, I wrote a short computer program to grab all of them:

# $url stands for the legacy tender address above, up to the tender number (omitted here).
for ($tender = 1; $tender <= 5884; $tender++) {
  system("wget \"$url$tender\" -O html/$tender.html");
}

This left me with a collection of HTML documents, one for each tender. And this included the 2002 RFP FOR CONSULTING SERVICES TELECOMMUNICATIONS NETWORK INFRASTRUCTURE, the 2006 REQUEST FOR PROPOSALS FOR PROVINCIAL FIBRE OPTIC BASED NETWORK, and the 2007 REQUEST FOR PROPOSALS FOR OPTO-ELECTRONICS AND RELATED SERVICES. None of these tender documents are accessible through the updated search tool, but they are all still publicly available online, and references to them continue to appear in Google search results:

Screen shot of Google search results showing PEI tender document.

In the end, then, I was able to supply the journalist with the materials they were looking for, using publicly available copies on the Province’s publicly available webserver. That I happened to access these materials by iterating (“show me tender #1, show me tender #2, show me tender #3, etc.”), with a computer doing the heavy lifting, is immaterial.
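For readers who would like to see the shape of that "show me #1, show me #2" loop in a little more detail, here is a minimal sketch in Python. It is not the program I ran; the function names, the `example.org` address, and the one-second pause between requests are all assumptions made for illustration. It builds each address by appending a number to a base URL, saves whatever comes back, and skips numbers that fail to load.

```python
import time
from pathlib import Path
from typing import Callable, Iterator, List, Tuple
from urllib.request import urlopen


def tender_urls(base_url: str, first: int, last: int) -> Iterator[Tuple[int, str]]:
    """Yield (number, url) pairs by appending each number to base_url."""
    for number in range(first, last + 1):
        yield number, f"{base_url}{number}"


def fetch_bytes(url: str) -> bytes:
    """Retrieve one public URL, exactly as a browser would request it."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_all(
    base_url: str,
    first: int,
    last: int,
    out_dir: str,
    fetch: Callable[[str], bytes] = fetch_bytes,
    delay: float = 1.0,  # assumed pause between requests, to go easy on the server
) -> List[Path]:
    """Save each page as <number>.html in out_dir and return the saved paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for number, url in tender_urls(base_url, first, last):
        try:
            body = fetch(url)
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses;
            # a number with no page behind it is simply skipped.
            continue
        path = out / f"{number}.html"
        path.write_bytes(body)
        saved.append(path)
        time.sleep(delay)
    return saved


# Hypothetical usage (the address below is a placeholder, not the real one):
# download_all("https://example.org/tender?number=", 1, 5884, "html")
```

Passing the fetching function in as a parameter is only a convenience: it lets the loop be tried out against a stand-in before it is ever pointed at a real server.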

I’ve used similar techniques to retrieve information dozens of times over the years. It’s how I was able to provide an RSS feed of Charlottetown Building Permits (which only recently went dark when the City updated its website and made this impossible). It’s how I can continue to provide a visualization of PEI’s electricity load and generation. And it’s how I was able to create an alternative, more flexible corporations search a decade ago when I was otherwise unable to find information about Richard Homburg.

The freedom to retrieve materials that are online at a public URL, whether in a browser, via a script, or through some other automated process, is one of the fundamental freedoms of the Internet, and it’s one of the things that distinguish the Internet from all networks that came before it.

The story of the accused teenager in Nova Scotia bears a striking similarity to my search for tenders and my search for corporations information; from the CBC story (emphasis mine):

Around the same time, his Grade 3 class adopted an animal at a shelter, receiving an electronic adoption certificate.

That led to a discovery on the classroom computer.

“The website had a number at the end, and I was able to change the last digit of the number to a different number and was able to see a certificate for someone else’s animal that they adopted,” he said. “I thought that was interesting.”

The teenager’s current troubles arose because he used the same trick on Nova Scotia’s freedom-of-information portal, downloading about 7,000 freedom-of-information requests.

He says his interest stemmed from the government’s recent labour troubles with teachers.

“I wanted more transparency on the teachers’ dispute,” he said.

After a few searches for teacher-related releases on the provincial freedom-of-information portal, he didn’t find what he was looking for.

“A lot of them were just simple questions that people were asking. Like some were information about Syrian refugees. Others were about student grades and stuff like that,” he said.

So instead, he decided to download all the files to search later.

“I decided these are all transparency documents that the government is displaying. I decided to download all of them just to save,” he said.

He says it took a single line of code and a few hours of computer time to copy 7,000 freedom-of-information requests.

“I didn’t do anything to try to hide myself. I didn’t think any of this would be wrong if it’s all public information. Since it was public, I thought it was free to just download, to save,” he said.

The sad irony of this tale is that it was freedom of information documents that the accused downloaded, and in among those files were ones that had not been properly redacted. That is certainly a lapse in security on the part of the government that bears investigation, but the accused bears no responsibility for downloading publicly available files, whether mistakenly made public or not.

As you can imagine, I feel a great sense of solidarity with the accused, and I’ve reached out to their lawyer with an offer to help in any way I can. Beyond the immediate concern for them, however, I’ve a broader concern about the chilling effect of this heavy-handed action that the Nova Scotia government has launched against the freedom to iterate.

They are threatening one of the pillars of a free and open Internet; we need to stand up for that.

A GoFundMe page has been established to support the accused.


David on April 20, 2018 - 13:50

What an absolutely ludicrous charge - I too have used this exact technique to work around shoddy search engines, to gain access to content whose hyperlink has moved/expired, or to access pages that are not indexed.

My hope here is that a judge with enough technical knowledge comes around and realizes that what happened here is ENTIRELY the fault of public servants or the contractors they hire. This is akin to having a secret conversation in a public area and then charging someone for listening in; it is not the public's responsibility to protect the government's secrets.

Laurent Beaulieu on April 21, 2018 - 21:37

I see in the papers today that the Nova Scotia Government now admits it overreacted before it knew the facts, which looks incompetent. Indeed it is chilling to see governments react this way. Thank you for writing about this and explaining it so well. Having worked in the past on access to information requests at the federal level, I know this type of mistake can happen if the public servant is not careful and does not do a good job. Unfortunately, access to information requests are often seen as a nuisance and not given much priority, so mistakes can happen; the job is delegated to junior staff with little training.