I Paid $166 for Data about Parked Cars

In October of 2016 the Board of Governors of the University of Prince Edward Island approved an updated Access to Information and Protection of Personal Information and Privacy policy, to take effect May 1, 2017 and to be overseen by a new Access to Information and Privacy Office.

As someone with an interest in both data and the university, I decided that it would be a useful exercise to take the new policy out for a ride.

In mid-June of last year I happened to be on campus and had some difficulty finding a place to park my car; this inspired me to look into how UPEI manages its parking lots, and this, in turn, led me to the helpful Parking at UPEI page. From there I found Parking Services, and from there Buy a student parking permit, and ultimately the Student Parking Application Form, which students use to request and pay for a parking permit.

At the end of this journey I decided that it would be interesting to know what vehicles are driven by UPEI students, and so I used that as a basis for my first access request.

I downloaded the Request for Access to Information template, and filled it out. My request was as follows:

I require non-personally-identifying information, for as many years as it is available, from the “Student Parking Application” form, specifically: the application date, make, model, year and province listed on each application for a parking pass contained in the “Vehicle Information” section under subsections Vehicle 1, Vehicle 2 and Vehicle 3.

I require this information in digital form, ideally as a comma-separated ASCII file, but also acceptable as a Microsoft Excel Spreadsheet.

An example of the end product I would receive would look like this:

Application Date,Make,Model,Year,Province
2016-09-01,VW,Jetta,2000,PE
2016-09-03,Hyundai,Elantra,2014,AB
2016-09-04,Honda,Accord,2012,PE
...
2013-10-01,Honda,CRv,2012,PE
2013-11-09,Hyundai,Elantra,2014,AB

and so on, where the “Application Date” column is the calendar date when the application was received (the contents of the “Date” field at the top of the form). If it is not possible to include the application date, the application year is also acceptable.
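The format I asked for is about as simple as structured data gets. As a minimal sketch (not anything UPEI runs), here is how the sample rows from my request could be read with Python's standard csv module; the function name read_applications is my own invention for illustration.

```python
import csv
import io

# The sample rows from my request, minus the "..." placeholder line.
SAMPLE = """Application Date,Make,Model,Year,Province
2016-09-01,VW,Jetta,2000,PE
2016-09-03,Hyundai,Elantra,2014,AB
2016-09-04,Honda,Accord,2012,PE
2013-10-01,Honda,CRv,2012,PE
2013-11-09,Hyundai,Elantra,2014,AB
"""


def read_applications(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_applications(SAMPLE)
for row in rows:
    print(row["Application Date"], row["Year"], row["Make"], row["Model"], row["Province"])
```

Any tool that can write one row per vehicle with a header line would satisfy the request as written.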

Before I could submit the request I had to remit the $25 non-refundable application fee to the UPEI Accounting Office. This turned out to be the most confusing part of the process, as there were no instructions provided as to how to do this, outside of an instruction not to put cash in the mail and instead to hand-deliver it to the Financial Services Office. It took me a number of phone calls and dead-end voicemail trees to find the person in that office who could take my credit card over the phone, but once I’d found that person they took my payment quickly and efficiently and emailed me a receipt.

I emailed the request form, along with the payment receipt, to the Access to Information and Privacy Office on June 15, 2017 at 10:30 a.m. and, 14 minutes later, received a reply:

RE: Request Number 4

Your request for access to records related to student parking under the Access to Information and Protection of Personal Information and Privacy policy (the “Policy”) was received by the University of Prince Edward Island (“UPEI”) on June 15, 2017.

We confirm that you have paid the required $25.00 access request fee.  We endeavor to respond to your request under the Policy as quickly as possible.

I assume that means that mine was the fourth request under the new policy.

On June 22, just 7 days later, I received a scanned letter by email in response to my request from the Chief Access to Information and Privacy Officer:

I am responding to your request of June 15, 2017 for access to information under the University of Prince Edward Island Access to Information and Protection of Personal Information policy (the “Policy”).

While the Policy requires the payment of fees, in this case because the request required less than one (1) hour of time, no additional payment of fees is required.

Unfortunately, access to the majority of the information that you requested is outside of the scope of the Policy. Section 5 makes the Policy effective from May 1, 2017 going forward. Records created prior to May 1, 2017 will not be released through a request made under the Policy.

Regarding records created after May 1, 2017 I regret to inform you that after completing a search, UPEI has failed to retrieve any records relating to the subject of your request.

To provide context to this outcome, I can advise that as you can see from the UPEI website http://www.upei.ca/facilities/services/get-student-parking-permit student parking passes can only be purchased beginning August 24th of each year. Therefore, it is not unusual that there have been no applications submitted in this period.

If you have any questions regarding your request, please contact the undersigned.

In other words, the new access policy wasn’t retroactive, and so the data clock started to tick, so to speak, on May 1, 2017.

While this seemed odd, I did understand an overarching logic to the blanket lack of retroactiveness. In a perfect world the response would have been “but as this information is non-personally-identifying, we’re going to make an exception,” but I decided not to press the point immediately.

I waited six months before returning to the file, and then followed up with the office by email:

Per my access request of June 22, 2017, would you now entertain the notion of a re-submission of this request without additional fee, as I was unaware, when I made the original request, that the requests would not cover data retroactive to May 1, 2017.

The next day I received a reply:

While I am sympathetic that you were not aware of that aspect of the Policy, I am unfortunately not able to consider a new submission of the original request.  Access requests are tracked to ensure time lines are respected, and for this reason it would be necessary to start a new request file.  Further, I appreciate from your perspective it was not fruitful as you did not receive any documents; however, it was still necessary for your request to be handled by myself, as well as staff in the Finance and Facilities Management departments in order to process and search to determine that there were in fact no responsive records.

And so I was left to submit an entirely new request; I updated the PDF of my June request (simply changing the dates), paid another $25 fee through the accounting office, and sent everything in on December 21, 2017 by email. Again my request was quickly acknowledged, the same day:

Your request for access to records related to student parking under the Access to Information and Protection of Personal Information and Privacy policy (the “Policy”) was received by the University of Prince Edward Island (“UPEI”) on December 21, 2017.

We confirm that you have paid the required $25.00 access request fee. We endeavor to respond to your request under the Policy as quickly as possible.

Yesterday, January 17, 2018, I received a response to my request, again as a scanned PDF file via email:

I am responding to your request of December 21, 2017 for access to information under the University of Prince Edward Island Access to Information and Protection of Personal Information Policy (the “Policy”).

We are pleased to be in a position to provide access to parking information specified in your Request. The Policy requires the payment of fees. Your request has now been processed and fees must be paid before access can be provided.

You will receive an invoice directly from the UPEI Accounting Office in relation to this request for information. That invoice will contain details on payment. Payment must be made through that process before documents will be released. Once you have made payment of the invoice please contact the undersigned, and the information will release.

Later in the day I did, indeed, receive an email from the Accounting Office with a scanned invoice for $116, broken down as follows:

  • Locating & Retrieving Records: $96.00
  • Preparing & Handling for disclosure: $20.00

On a second page there was a more detailed breakdown of the costs:

  • Time spent related to the locating and retrieving records: 2.4 hours x $40 = $96.00
  • Time spent preparing and handling record for disclosure: 0.5 hours x $40 = $20.00
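To double-check the arithmetic, here is a small sketch that rebuilds the invoice from the hours and hourly rate shown on it, and then adds the two $25 application fees I paid for the June and December requests. The variable names are mine; Decimal is used only so that 2.4 hours is not subject to floating-point rounding.

```python
from decimal import Decimal

HOURLY_RATE = Decimal("40")       # dollars per hour, from the invoice
APPLICATION_FEE = Decimal("25")   # non-refundable, paid once per request

locate = Decimal("2.4") * HOURLY_RATE    # locating and retrieving records
prepare = Decimal("0.5") * HOURLY_RATE   # preparing and handling for disclosure
invoice = locate + prepare

# Two application fees (June and December requests) plus the invoice.
total = 2 * APPLICATION_FEE + invoice

print(f"invoice: ${invoice:.2f}, total: ${total:.2f}")
```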

The $116 fee, sight-unseen, seemed a little steep. And it did seem odd that the data had been prepared before I’d agreed to the cost. I decided to ask for a taste, so as to ensure that I was getting what I asked for; I sent a follow-up email:

Before I pay the invoice, can you:

  1. Tell me how many records were identified, and what date range they include.
  2. Tell me what file format the records will be provided to me in.
  3. Provide me with a single sample record.

Just 20 minutes later, I received a reply:

The materials contain all requested information located relating to the period between May 1, 2017 and December 21, 2017.  The information will be provided in a single Excel (.xlsx) document listing requested information.  I have copy and pasted the first two rows of that document into the table below.

Screen shot of email with sample data from parking permit spreadsheet

At this point I debated whether to continue or not, but ultimately decided that it would be useful to take the request through to its logical conclusion. So I paid the $116 invoice over the phone by credit card with the Accounting Office, received a receipt by email, and forwarded this email on to the Access office.

Twelve minutes later, I got the data, an Excel spreadsheet with 1066 parking permit records.

The first thing I did upon receiving the file was to clean up the data:

  1. I removed 7 records that were missing a model year for the vehicle.
  2. I converted the model years to four-digit dates.
  3. I trimmed spaces from the ends of the vehicle information rows (it looks like they were exported as a fixed-width file, with spaces for padding).
  4. I removed the anonymous identifier column, as I didn’t require it for my purposes.
  5. I split the “vehicle information” field into 3 fields: year, make and model (this is what I’d originally specified in my request, but they came as one field). In doing so I found there were several records that were missing the model, but I left them in place with only the year and make.

The result was a cleaned up CSV file with 1059 rows.
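I did the cleanup by hand, but the five steps above could be scripted. What follows is a minimal sketch rather than what I actually ran: the keys "id", "vehicle", "province" and "date" are hypothetical stand-ins for the spreadsheet's columns, the sample rows are made up for illustration, and the two-digit-year pivot of 18 is an assumption based on the newest vehicle being a 2018.

```python
def four_digit_year(token, pivot=18):
    """Expand a two-digit model year; values <= pivot are read as 20xx.

    The pivot is an assumption, not something stated in the data.
    """
    year = int(token)
    if year >= 100:
        return year  # already four digits
    return 2000 + year if year <= pivot else 1900 + year


def clean_row(raw):
    """Return a cleaned dict, or None if the row has no model year.

    `raw` uses the hypothetical keys "id", "vehicle", "province", "date".
    """
    vehicle = raw["vehicle"].strip()            # step 3: trim padding spaces
    parts = vehicle.split(None, 2)              # step 5: year, make, model
    if not parts or not parts[0].isdigit():     # step 1: drop rows with no year
        return None
    return {                                    # step 4: "id" is not copied over
        "date": raw["date"].strip(),
        "year": four_digit_year(parts[0]),      # step 2: four-digit year
        "make": parts[1] if len(parts) > 1 else "",
        "model": parts[2] if len(parts) > 2 else "",  # some rows lack a model
        "province": raw["province"].strip(),
    }


# Made-up rows, padded the way a fixed-width export might pad them.
RAW_ROWS = [
    {"id": "A1", "vehicle": "09 Toyota Corolla      ", "province": "PE ", "date": "2017-08-28"},
    {"id": "A2", "vehicle": "84 Harley FLH          ", "province": "PE ", "date": "2017-08-29"},
    {"id": "A3", "vehicle": "Honda Civic            ", "province": "NS ", "date": "2017-09-06"},
    {"id": "A4", "vehicle": "14 Hyundai             ", "province": "NB ", "date": "2017-09-06"},
]

cleaned = [row for row in map(clean_row, RAW_ROWS) if row is not None]
for row in cleaned:
    print(row)
```

Splitting on whitespace treats the second word as the make, so a two-word make would be divided incorrectly; a real script would need a list of known makes to handle that.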

So what can we learn from this data?

I loaded the CSV file into LibreOffice and did some analysis and made some charts.

Vehicles by Model Year

The most popular model year was 2009, with 110 vehicles; this is also the median model year. The oldest vehicle is a 1984 Harley FLH motorcycle; the newest vehicle is a 2018 Subaru Outback. Here’s a chart showing the number of vehicles by model year:

Chart showing cars by model year
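I worked these figures out in LibreOffice, but the same summary takes only a few lines of code. This is a sketch on a made-up list of model years, not the real column; summarize_years is a name I've invented for illustration.

```python
from collections import Counter
from statistics import median


def summarize_years(years):
    """Return the most common year, its count, the median, and the range."""
    counts = Counter(years)
    top_year, top_count = counts.most_common(1)[0]
    return {
        "mode": top_year,
        "mode_count": top_count,
        "median": median(years),
        "oldest": min(years),
        "newest": max(years),
    }


# Made-up sample, not the actual data.
years = [1984, 2007, 2009, 2009, 2009, 2012, 2018]
summary = summarize_years(years)
print(summary)
```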

Vehicles by Make

There’s a lot of variability in the vehicle manufacturer column–misspellings, variations like “Chev” and “Chevrolet,” and I didn’t attempt to normalize for that, so keep that in mind.

The 221 Toyotas account for 21% of all parking permits; that’s followed by Honda (183 vehicles, 17%), Hyundai (98 vehicles, 9%) and Ford (94 vehicles, 9%).

The least popular vehicles, with one each, are Cadillac, Hummer, Mercury, Porsche, Smart and Volvo.
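For anyone who does want to normalize the makes before counting, here is a rough sketch. The alias table is hypothetical and far from complete, and the list of makes is made up; only the counts in the final check (221, 183, 98 and 94 out of the 1059 cleaned rows) come from my spreadsheet.

```python
from collections import Counter

# Hypothetical alias table; a real one would need many more entries.
ALIASES = {"chev": "Chevrolet", "chevy": "Chevrolet", "vw": "Volkswagen"}


def normalize_make(make):
    """Map known variants to one spelling; leave everything else as typed."""
    stripped = make.strip()
    return ALIASES.get(stripped.lower(), stripped)


def share(count, total):
    """Percentage of the total, rounded to a whole number."""
    return round(100 * count / total)


def make_shares(makes):
    """Return (make, count, percent) tuples, most common first."""
    counts = Counter(normalize_make(m) for m in makes)
    total = sum(counts.values())
    return [(make, n, share(n, total)) for make, n in counts.most_common()]


# Made-up sample, not the actual data.
sample = ["Toyota", "Chev", "Toyota", "Chevrolet ", "Honda", "Toyota"]
for make, n, pct in make_shares(sample):
    print(f"{make}: {n} ({pct}%)")

# The shares reported above, recomputed from the counts and the 1059 rows.
reported = {make: share(n, 1059) for make, n in
            [("Toyota", 221), ("Honda", 183), ("Hyundai", 98), ("Ford", 94)]}
print(reported)
```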

Here’s a chart showing all makes with more than 5 vehicles:

Chart showing parking permits by make of vehicle

Here’s a word cloud (larger type size equals more vehicles) that I created with Word Cloud Generator:

Word Cloud of Vehicle Makes

Vehicles by Model

While Toyota is the most popular make of vehicle, the Honda Civic, with 120 cars, 124 if you account for spelling and model variations, is the single most popular model; Civics account for 12% of all parking permits.

Here’s a chart showing the model breakdown by make; again, with models there’s some variability in spelling (is it Chev Cruise, Chev Cruize, Chev Cruz, or Chevy Cruze?).
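Spelling variations like these are a natural fit for fuzzy matching. As a sketch only (I counted the variations by eye), Python's standard difflib module can map a misspelling to the closest name in a list of known models. The CANONICAL list and the 0.7 cutoff are assumptions I chose for this example, and a cutoff that loose will eventually produce a false match, so the results would need a human check.

```python
from difflib import get_close_matches

# Hypothetical list of correct model names; a real one would be much longer.
CANONICAL = ["Civic", "Corolla", "Cruze", "Elantra"]
_LOOKUP = {name.lower(): name for name in CANONICAL}


def canonical_model(model, cutoff=0.7):
    """Return the closest known model name, or the input if nothing is close."""
    cleaned = model.strip()
    matches = get_close_matches(cleaned.lower(), list(_LOOKUP), n=1, cutoff=cutoff)
    return _LOOKUP[matches[0]] if matches else cleaned


for spelling in ["Cruise", "Cruize", "Cruz", "Carolla", "Matrix"]:
    print(spelling, "->", canonical_model(spelling))
```

A name with no close match in the list, like Matrix here, is returned unchanged rather than forced into the nearest bucket.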

Toyota

The Corolla is the most popular model of Toyota with 86 vehicles (90 if you account for it being spelled as “Carolla” as well), followed by Matrix (27), Yaris (27), and Camry (20).

Chart showing Toyotas by model

Honda

Beyond the 124 Civics, the most popular Honda models are the Accord (20), the CRV (13) and the Fit (13).

Chart showing Hondas by model

Hyundai

The Hyundai Elantra is the most popular model (43 vehicles if you account for spelling), followed by the Accent (29), Sonata (9) and Santa Fe (8).

Chart showing Hyundai by model

Province, Territory, or State of Registration

767 vehicles (72%) are registered in Prince Edward Island, followed by Nova Scotia (99), New Brunswick (82), Maine (15), and Newfoundland and Labrador (14). There are 33 provinces, territories and states represented; Manitoba, Yukon and Northwest Territories are the only ones missing from Canada.

Vehicles by Province or State of Registration

Date Parking Permit Issued

As outlined on the parking permit page:

Students can purchase a campus parking permit from August 28 to September 18, 2017 at the Parking Kiosk located on the main level of the W.A. Murphy Student Centre. On Monday, August 28 permits will be available from 1:00 pm to 3:30 pm. After August 28, permits are available for sale from 9:00 am to 3:30 pm. Permits will be available Monday to Friday until September 18, after which time permits will be available from the Facilities Management Office (in the Central Utility Building) from 9:00 am to 3:30 pm weekdays or until permits sell out.

As expected, then, the first permits for the fall semester, 123 in total, were issued on that first sale date, August 28. The next day 59 were issued, and then there was a big uptick on September 6 (169), which was the first day of classes. Almost all passes were issued by September 19.

Chart showing when parking permits were issued
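A chart like this comes from counting permits per issue date, and a running total shows how quickly sales tail off. Here is a sketch of that calculation on a handful of made-up dates (not the real column); daily_and_cumulative is a name invented for this example.

```python
from collections import Counter
from datetime import date


def daily_and_cumulative(issue_dates):
    """Return (day, issued_that_day, cumulative_share) tuples sorted by day."""
    counts = Counter(issue_dates)
    total = sum(counts.values())
    running = 0
    result = []
    for day in sorted(counts):
        running += counts[day]
        result.append((day, counts[day], running / total))
    return result


# Made-up issue dates, not the actual data.
dates = [date.fromisoformat(s) for s in [
    "2017-08-28", "2017-08-28", "2017-08-29", "2017-09-06",
    "2017-09-06", "2017-09-06", "2017-10-02", "2017-09-19",
]]

timeline = daily_and_cumulative(dates)
peak_day = max(timeline, key=lambda entry: entry[1])[0]

for day, issued, cumulative in timeline:
    print(day.isoformat(), issued, f"{cumulative:.0%}")
print("busiest day:", peak_day.isoformat())
```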

Why would I spend $166 for data about parked cars?

That’s a good question, with several answers.

If we don’t use access policies, we won’t understand them…

Assuming that my second request receiving “request number 6” means that there were only 6 access requests to UPEI from May 1 to December 21 of 2017, people aren’t exactly breaking down the door under the aegis of this new policy. That’s a shame: the only way to understand a policy, and to point out opportunities for improving it, is to take the policy out for a ride.

I didn’t know it would cost me $166…

I naively assumed that the data would cost me the initial $25 application fee, plus a nominal amount for what I assumed would be “File > Save as > CSV” from the parking permit system.

That it took someone 2.4 hours to “locate and retrieve” the data, and 30 minutes to “prepare it” suggests that this wasn’t the case, and that some more advanced data manipulation was required. That in itself is a data point, as it suggests that the system underlying the parking permits isn’t built with openness in mind.

Truth be told, when I got the $116 invoice yesterday I experienced some sticker shock, but this was trumped by my desire to finish what I started, so as to end up with a complete case study.

Because this should be open data…

One of the most frequent questions I’m asked by public servants and administrators when I speak to them about open data is “how do we figure out what data to make public,” to which my reply is “everything.”

At that point they start talking about stakeholder engagements and prioritization surveys and I stop listening.

But my “everything” isn’t facetious: the best way to be open with your data is to keep your data in open systems. The world is filled with proprietary data management systems that never anticipated that the data stored within them would need to leave the system, let alone be prepared for public access, so there’s a significant re-engineering challenge lurking behind my “everything.”

The best result from my request would be that UPEI uses my request as a data point itself, and that the process by which the data was extracted for me was generalized such that posting the data every year at the end of September on the UPEI website becomes a matter of course rather than a matter of $166.

It’s Your Turn

Use the Data

Please use the data yourself for whatever purposes you like: I paid $166 for it and we need to wring the value from it! I imagine it would be an excellent data set for introductory programming, visualization, statistics, and poetry courses.

There’s a handy GitHub repository with the data and associated documentation from my access request; send me pull requests to supplement and improve it.

Make Access Requests

Now that you know that you can, please make access requests of your own, and, if you’re in a position to, advocate for an evolution from a “you must hop over this wall and pay us $166” to an “everything is open; come and get it” approach.

Comments

FAS on January 18, 2018 - 23:05

Any idea how many permit-only and visitor parking spaces exist on campus?

Peter Rukavina on January 19, 2018 - 17:04

Still gotta pay off the $166 loan before I can take on new access requests ;-)

Paul Pival on January 19, 2018 - 11:58

Great post, thanks! As soon as I saw your disclaimer that, "I didn’t attempt to normalize for that" I thought, "That's a job for OpenRefine!". Forked and submitted. (or whatever the correct jargon would be). They're sure not paying much attention to accuracy of spelling with those models :-0

Peter Rukavina on January 19, 2018 - 14:13

Google Refine Lives Again!

I didn’t know about this.

Bookmarked; thank you.

I can always use “A free, open source, powerful tool for working with messy data” as I eat a lot of messy data.

Josh Biggley on January 23, 2018 - 08:04

I am infinitely curious whether there is a data architect who is responsible for defining data standards within the public space. The whole "padding spaces" thing drives me crazy when I am trying to parse through data. In spite of my official job duties having nothing to do with being a manager (err, janitor?) of data, I spend a significant portion of my day trying to figure out how to work through data sets, define standards for future data, and measure compliance.

Thanks for taking the plunge into the murky depths of this obscure data set!

Peter Rukavina on January 23, 2018 - 11:40

There’s a big difference between coding an app for a tightly subscribed universe–issuing parking permits–where, historically, all you need to do is to be able to look up the data internally, and coding an app with a recognition that you’re creating a new pool of data that will be useful both internally and externally. The challenge of open data, once we’ve crossed the perceptual challenge (which, it’s possible to argue, we’ve already done here in PEI) is the engineering challenge.