I don’t have strong opinions about the propriety of the PNP program.

I do, however, have strong opinions about what it means for a government to “release data.”

The Province of PEI “complied” with a court order to release the names of companies that received investment under the Provincial Nominee Program by releasing a PDF file on its website.

PDF, while certainly readable on a variety of digital devices, is not “open data” in any real sense.

To solve that issue, I ran the PDF file through pdftotext and then stripped out the page headers with a text editor using regular expressions.

The result is pnp.txt, a simple ASCII text file with the 1,354 company names from the PDF.
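The header cleanup described above could be sketched in Python along these lines. This is a rough illustration rather than the exact steps I took: the header patterns are hypothetical, since the actual page headers in the PDF are not reproduced here, and the function name `strip_headers` is made up for the example.

```python
import re

# Hypothetical header patterns (assumptions): adjust these to whatever
# actually repeats at the top of each page in the pdftotext output.
HEADER_PATTERNS = [
    re.compile(r"^\s*Page \d+ of \d+\s*$"),
    re.compile(r"^\s*Company Name\s*$", re.IGNORECASE),
]


def strip_headers(text):
    """Return the non-blank, non-header lines of pdftotext output."""
    names = []
    # splitlines() also breaks on the form feed that pdftotext
    # places between pages, so page breaks need no special handling.
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if any(p.match(line) for p in HEADER_PATTERNS):
            continue  # skip repeated page headers
        names.append(line)
    return names
```

The input would be the text produced by something like `pdftotext input.pdf output.txt`, with the file names being placeholders.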

Update: I’ve merged the PNP data with the 2008 data and found 933 matches (69% of the PNP companies); you can download this as pnp_plus.csv (comma-delimited ASCII) or pnp_plus.xml (XML). Or search with this tool.
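A merge like this can be sketched as a lookup on normalized company names. This is only an illustration under assumptions: the matching rule shown (case-insensitive, punctuation-insensitive exact match) is one plausible approach, not necessarily the one used, and the `name` field and function names are hypothetical.

```python
import re


def normalize(name):
    """Uppercase, drop punctuation, and collapse runs of whitespace."""
    name = name.upper()
    name = re.sub(r"[^A-Z0-9 ]", " ", name)
    return " ".join(name.split())


def match_companies(pnp_names, registry_rows):
    """Pair each PNP name with the first registry row sharing its normalized name.

    registry_rows is a list of dicts with a hypothetical "name" key.
    """
    by_name = {}
    for row in registry_rows:
        # setdefault keeps the first row seen for each normalized name
        by_name.setdefault(normalize(row["name"]), row)

    matched = []
    for name in pnp_names:
        row = by_name.get(normalize(name))
        if row is not None:
            matched.append((name, row))
    return matched


def match_rate(match_count, total):
    """Percentage of names matched, rounded to the nearest whole number."""
    return round(100 * match_count / total)
```

With the figures above, `match_rate(933, 1354)` gives 69.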

Update: I’ve merged the PNP data with the 2008 data and included the shareholders and officers of each company, getting 2,708 matches. Use with caution: this data is now 4 years old, and the shareholders and officers listed were not necessarily in place when PNP investment was received. Download this as pnp_people.csv (comma-delimited ASCII). Or search with this tool.


Oliver on November 7, 2012 - 03:26

Even though I suppose it’s lack of savvy behind their use of PDF, it bears an unseemly resemblance to the strategic hindrances lawyers and governments are notorious for using. They really ought to get with it.

Joe on November 7, 2012 - 17:25

I thought too that a PDF was a conveniently ‘inconvenient’ format…a spreadsheet would have been more user friendly, but I suppose it may not have been released with friendliness to users in mind!