Parsing Prince Edward Island Legislation: Understanding Styles

I have a longtime interest in the presentation of Prince Edward Island statutes and regulations online: I worked with the Government of PEI on its website for 8 years, and getting this material online was a significant project that ended up taking many more years than it should have. Most of the delay came from technology challenges that had nothing to do with the Internet, chiefly the word processor that was used to maintain the material. We also had to wait for the PDF file format to emerge as a de facto standard for distributing complex documents on the web before it was really feasible.

But there’s a limit to PDFs, especially when it comes to programmatic parsing of documents, and so I have an interest in “beyond the PDF” for distributing statutes and regulations. And, handily enough, I have a test case to use: because of my involvement with the PEI Home and School Federation I have more than just a passing interest in the School Act and its regulations, and I’m interested in ways of presenting it and annotating it that would enliven the document and spread it to a wider audience.

To begin this process I requested a Microsoft Word-formatted copy of the School Act from Legislative Counsel’s office, which they were quick to provide. When I opened this file, however, it was presented to me as a “Read Only” document, meaning that I couldn’t edit it, and I couldn’t see any of its formatting, so I couldn’t understand the way that styles were used in Word to structure it. Fortunately this was quickly resolved by saving it as a native document (File | Save As… | ODF Text Document). Once I did this, the names of the styles in the document were revealed.

So, for example, the definitions are all assigned the “Definition” style.

Looking in the “Format | Styles and Formatting” tool with the School Act open, the styles listed under “Applied Styles” are as follows:

  • Act Title
  • AmendingSubsection
  • CenteredText
  • Chapter
  • Clause
  • ClauseCont
  • Default
  • Definition
  • DefSidenote
  • Footer
  • Header
  • Part
  • SecSubCont
  • SecSubSidenote
  • Section
  • Subclause
  • Subsection
  • Topic1
  • Topic2
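The same list can, in principle, be pulled out of the saved ODF file without opening a word processor. What follows is a minimal sketch, not a tested tool: it assumes the file is a standard ODF Text Document, which is a zip archive containing a content.xml in which each paragraph carries its style in a text:style-name attribute. The function names styled_paragraphs and read_content_xml are my own, chosen for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard ODF namespace URIs.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def styled_paragraphs(content_xml):
    """Return a (style name, text) pair for each paragraph in content.xml."""
    root = ET.fromstring(content_xml)

    # Automatic styles (P1, P2, ...) hold direct formatting; map each one
    # back to the named style it inherits from.
    parents = {}
    for style in root.iter(f"{{{STYLE_NS}}}style"):
        name = style.get(f"{{{STYLE_NS}}}name")
        parent = style.get(f"{{{STYLE_NS}}}parent-style-name")
        if name and parent:
            parents[name] = parent

    pairs = []
    for para in root.iter(f"{{{TEXT_NS}}}p"):
        name = para.get(f"{{{TEXT_NS}}}style-name", "")
        name = parents.get(name, name)
        # A space in a style name is stored as _20_ (so "Act Title"
        # appears as "Act_20_Title").
        text = "".join(para.itertext()).strip()
        pairs.append((name.replace("_20_", " "), text))
    return pairs


def read_content_xml(odt_path):
    """Read content.xml out of an .odt file (hypothetical path argument)."""
    with zipfile.ZipFile(odt_path) as odt:
        return odt.read("content.xml")
```

With a saved copy of the Act, something like sorted({style for style, _ in styled_paragraphs(read_content_xml("school-act.odt"))}) should give a list comparable to the one above; the file name here is a placeholder. Note that this sketch only looks at text:p elements, so anything stored as a heading (text:h) would need the same treatment.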

Rearranging that list so that it reflects the hierarchy of the School Act transforms it to:




The only inconsistency in the document appears to be the use of the “Topic1” style for “PART I” at the beginning of the Act, which should, I think, be assigned the “Part” style. Otherwise the styling appears consistent enough to allow for automatic parsing of the document, which will be my next step.
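As a rough idea of what that parsing might look like, here is a small sketch that takes (style, text) pairs, however they were extracted, corrects the one misstyled “PART I” heading, and assigns each paragraph a nesting depth. The depths in ASSUMED_DEPTH are my own guess at how a few of the styles nest, included only for illustration; they would need to be checked against the Act. The names normalize_style and outline are likewise hypothetical.

```python
# Assumed nesting depths for a handful of styles. This is an illustrative
# guess, not the confirmed hierarchy of the School Act.
ASSUMED_DEPTH = {
    "Part": 0,
    "Section": 1,
    "Subsection": 2,
    "Clause": 3,
    "Subclause": 4,
}


def normalize_style(style, text):
    """Treat a Topic1 paragraph that reads like a Part heading as a Part."""
    if style == "Topic1" and text.upper().startswith("PART "):
        return "Part"
    return style


def outline(paragraphs):
    """Yield (depth, style, text) for paragraphs whose style has a depth."""
    for style, text in paragraphs:
        style = normalize_style(style, text)
        depth = ASSUMED_DEPTH.get(style)
        if depth is not None:
            yield depth, style, text
```

Printing each row as "  " * depth + text would give an indented outline of the Act. Styles not listed in ASSUMED_DEPTH (sidenotes, headers, footers and so on) are simply skipped here, which is a simplification rather than a recommendation.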