Data Portability and Microformats

This is a work in progress. Catch me on Twitter @doublej42 if you have suggestions.

What

Local governments hold lots of data: addresses, events, geographical data, and even legal information. They often publish it in proprietary formats, such as a GIS application that must be installed to view maps on their website, or in formats that machines cannot easily read, such as PDFs and images. This is fine for most, but not all, humans. Blind users, for example, are often limited to what is machine readable, so you can also think of this as an accessibility issue.

Data portability aims to resolve these issues by making information free, not only to humans but to other computers and their applications.


Why

By providing this information you open up the area to growth by allowing the citizens to provide services for you.

David Eaves put it well:

“3. Create new businesses and attract talent: As the city shares more data and uses more open source software new businesses that create services out of this data and that support this software will spring up. More generally, I think this motion, over time could attract talent”

http://eaves.ca/2009/05/14/vancouver-enters-the-age-of-the-open-city/

An example: if a city were to provide two pieces of information, geographical information on all lots in the city and a list of all the business licences registered in the city with the lot number each is registered to, some neat applications could be built. Imagine a website that tells you which new businesses have opened in your town. It could take a location, maybe from your mobile phone, and use it to say “I see that 3 months ago a new restaurant opened three blocks away. Why not go and check it out?” It could also link this information to reviews, and because it has a direct link to the city’s information it would know if the restaurant went out of business (after a year, if the licence was not renewed). Websites and applications like this do exist, but they rely on data from crawling the web and the phone book and as such are not always accurate.
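The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Lot` and `Licence` record layouts, the sample data and the function names are all hypothetical, since no city publishes data in exactly this shape.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass
class Lot:  # hypothetical shape of the city's lot data
    lot_number: str
    lat: float
    lon: float


@dataclass
class Licence:  # hypothetical shape of the city's licence data
    business_name: str
    lot_number: str
    issued: date


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def new_businesses_nearby(lots, licences, lat, lon, since, max_km):
    """Join licences to lots by lot number, then filter by date and distance."""
    lots_by_number = {lot.lot_number: lot for lot in lots}
    found = []
    for licence in licences:
        lot = lots_by_number.get(licence.lot_number)
        if lot is None or licence.issued < since:
            continue
        if distance_km(lat, lon, lot.lat, lot.lon) <= max_km:
            found.append(licence.business_name)
    return found


# Made-up sample data for illustration only.
LOTS = [Lot("A-1", 49.1660, -123.9400), Lot("B-7", 49.2500, -124.1000)]
LICENCES = [
    Licence("New Bistro", "A-1", date(2009, 3, 1)),
    Licence("Old Diner", "A-1", date(2001, 6, 15)),
    Licence("Far Cafe", "B-7", date(2009, 4, 1)),
]
```

Calling `new_businesses_nearby(LOTS, LICENCES, 49.1650, -123.9390, date(2009, 1, 1), 1.0)` with this sample data returns only the recently licensed business on the nearby lot.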

How

Data portability is always growing and changing. For a lot of data there are standards, but for some more specific data types, such as business licence information, new standards will need to be created. This is an area where communities need to work closely together. They need to talk to each other about what is needed from the format, and they need to remember that they can’t spend years or even months debating a standard: just start simple and expand.

XML

XML is a core technology that enables most of the technologies on this page. You see it every day; you just might not realize it, as it is the base technology for XHTML (the XML-based version of HTML), which describes the layout of this web page. At its core it’s just a way of representing a list of information. For example, if I wanted to represent a list of companies and their license numbers I would have a file like:

<Companies>
  <Company>
    <BusinessName>ABC towing</BusinessName>
    <LicenseNumber>123456789</LicenseNumber>
  </Company>
  <Company>
    <BusinessName>John’s Campers</BusinessName>
    <LicenseNumber>987654321</LicenseNumber>
  </Company>
</Companies>

Here we have two companies, ABC towing and John’s Campers. A computer can read this file easily and make it searchable.
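To show how little work that is, here is a minimal Python sketch that reads the company list with the standard library's `xml.etree.ElementTree` module and runs a simple search. The helper names `load_companies` and `search` are my own, chosen for illustration.

```python
import xml.etree.ElementTree as ET

COMPANIES_XML = """<Companies>
  <Company>
    <BusinessName>ABC towing</BusinessName>
    <LicenseNumber>123456789</LicenseNumber>
  </Company>
  <Company>
    <BusinessName>John's Campers</BusinessName>
    <LicenseNumber>987654321</LicenseNumber>
  </Company>
</Companies>"""


def load_companies(xml_text):
    """Return a dict mapping license number to business name."""
    root = ET.fromstring(xml_text)
    return {
        company.findtext("LicenseNumber"): company.findtext("BusinessName")
        for company in root.findall("Company")
    }


def search(companies, term):
    """Case-insensitive search over business names."""
    term = term.lower()
    return sorted(name for name in companies.values() if term in name.lower())


companies = load_companies(COMPANIES_XML)
print(search(companies, "camp"))  # prints ["John's Campers"]
```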

Microformats

 

Microformats are conventions for marking up data with ordinary XHTML attributes, such as class and rel, that you can incorporate into your existing website. As the official website, http://microformats.org/, puts it: “Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards.”

 

There are more microformats than I can list here and new ones are always evolving; a better list can be found at http://microformats.org/wiki/Main_Page#Specifications . Some of the most popular are hCard for contact details such as addresses, hCalendar for events and other date-based items, and rel="tag" for tag-based grouping of links.

 

Going back to our example of business licences, you can create a page that lists all this information for a given time period and simply use hCard to mark it up. Applications such as Google Maps can then scan this information and add the business to their search results, and users with the Operator add-on for Firefox (https://addons.mozilla.org/en-US/firefox/addon/4106) can add the information directly to Outlook’s address book without a single keystroke.
Example hCard

<p class="vcard">
  <!-- The href below is a placeholder; point it at your own site. -->
  <a class="url fn org" href="http://www.example.com/">City Hall</a><br/>
  <span class="adr">
    <span class="street-address">123 Main Street</span><br/>
    <span class="locality">MyCity</span>, <span class="region">BC</span>
    <span class="postal-code">A9A 1A1</span>
  </span><br/>
  <strong>Telephone:</strong> <span class="tel">250-754-4251</span>
</p>
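To give a feel for what a machine does with this markup, here is a rough Python sketch that pulls the hCard fields out of a card like the one above using only the standard library's `html.parser`. It is a simplification, not a full microformats parser: it just collects the text inside elements whose class matches a handful of hCard property names, and the `HCardParser` class and `parse_hcard` helper are names I made up for the example.

```python
from html.parser import HTMLParser

HCARD_HTML = """
<p class="vcard">
  <span class="fn org">City Hall</span><br/>
  <span class="adr">
    <span class="street-address">123 Main Street</span><br/>
    <span class="locality">MyCity</span>, <span class="region">BC</span>
    <span class="postal-code">A9A 1A1</span>
  </span><br/>
  <strong>Telephone:</strong> <span class="tel">250-754-4251</span>
</p>
"""

# Elements that never have a closing tag, so they must not be tracked as open.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class HCardParser(HTMLParser):
    """Collects text for a small, fixed set of hCard properties."""

    FIELDS = {"fn", "street-address", "locality", "region", "postal-code", "tel"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._open = []  # one entry per open element: the hCard fields it carries

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self._open.append([c for c in classes if c in self.FIELDS])

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to every enclosing element that carries an hCard field.
        for fields in self._open:
            for field in fields:
                self.card[field] = self.card.get(field, "") + data


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left over from the HTML layout.
    return {field: " ".join(text.split()) for field, text in parser.card.items()}
```

For the sample card, `parse_hcard(HCARD_HTML)["postal-code"]` gives `"A9A 1A1"`. A real consumer would use a dedicated microformats library rather than this hand-rolled approach.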

 


APIs

Another way to allow access to your data is to create an Application Programming Interface (API). An API is a way for other sites, applications and services to access your information. For example, a business licence API might have a remote command that returns a list of licences given a date range.
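As a rough idea of what sits behind such a command, here is a Python sketch of the date-range lookup itself. Everything in it is hypothetical: the record fields, the sample data and the function name `licences_issued_between` are made up for illustration, and a real service would read from the licence database and expose the function over the web. The result is returned as JSON, one of the formats listed further down this page.

```python
import json
from datetime import date

# Hypothetical records; a real service would query the licence database.
LICENCES = [
    {"business_name": "ABC towing", "licence_number": "123456789", "issued": "2009-02-10"},
    {"business_name": "John's Campers", "licence_number": "987654321", "issued": "2009-05-20"},
]


def licences_issued_between(start, end):
    """Return licences issued from start to end (inclusive) as a JSON string."""
    matches = [
        record
        for record in LICENCES
        if start <= date.fromisoformat(record["issued"]) <= end
    ]
    return json.dumps(matches)


# Everything issued in May 2009.
print(licences_issued_between(date(2009, 5, 1), date(2009, 5, 31)))
```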

 

Many sites already have APIs that you can use, such as:

·         Twitter

o   http://apiwiki.twitter.com/

·         Google

o   http://code.google.com

·         Facebook

o   http://developers.facebook.com/

·         Many many more

o   http://www.programmableweb.com/apitag/government

 

There are a few technologies and data formats that make these APIs possible, but knowing them is more the realm of the IT team.

·         SOAP

o   http://en.wikipedia.org/wiki/SOAP

·         REST

o   http://en.wikipedia.org/wiki/Representational_State_Transfer

·         JSON

o   http://www.json.org/

o   JSON is a great way to transfer data if you want something with less overhead than XML that can be read directly by a web page. For example, if you provide a JSON feed of your current weather, any website that wants to show the current weather need only write some JavaScript to run on the client computer, and the client will always have the newest weather.

·         GeoJSON

o   http://geojson.org

o   JSON for geographic data

·         KML

o   http://code.google.com/apis/kml/documentation/

o   XML for geographic data

·         SVG

o   http://en.wikipedia.org/wiki/SVG

o   Useful any time you want to represent lines or other vector graphics

·         Many many others
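To make the JSON and GeoJSON entries above a little more concrete, here is a small Python sketch of a weather feed. It assumes a made-up feed layout: a single GeoJSON Feature whose geometry is the weather station's location and whose properties hold the reading. The property names, the sample values and the functions `weather_feed` and `describe` are illustrative only. On a real web page the consuming side would be written in JavaScript, as described above; Python is used here just to show the round trip.

```python
import json


def weather_feed(temperature_c, conditions, lon, lat):
    """Build the feed as a GeoJSON Feature; coordinates are [longitude, latitude]."""
    feature = {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"temperature_c": temperature_c, "conditions": conditions},
    }
    return json.dumps(feature)


def describe(feed_text):
    """What a consuming site might do: parse the feed and format a message."""
    feature = json.loads(feed_text)
    props = feature["properties"]
    lon, lat = feature["geometry"]["coordinates"]
    return f"{props['conditions']}, {props['temperature_c']} C at ({lat}, {lon})"


# Sample values are made up for illustration.
feed = weather_feed(18.5, "Sunny", -123.94, 49.17)
print(describe(feed))  # prints "Sunny, 18.5 C at (49.17, -123.94)"
```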

 

OpenID

This one is a little different: it’s not a way to share your information, it’s a way to use a specific piece of shared data, specifically a login (a username and password or other type of authentication). OpenID (http://openid.net/) is a way to accept logins from other sources to authenticate your users. It is not a complete replacement for user accounts, but it does make things easier for the end user.

 

How it works:

A user arrives at your site to sign up for a service, maybe to register for a course. The user clicks on one of the provided sign-in options (Yahoo, Google, Myspace, AOL) or types in the address of their login provider, such as their blog. For now we will assume the user clicked Google, but they all work the same way. Because the user had checked their Gmail, they were already signed into Google. This is the first time the user has visited your site, so Google asks them if they want to sign into your site and shows them your site’s domain. The user says yes, and Google sends them back to your site along with a special URL that represents the user. This URL will never change, so you can store it as the username of the person that signed up. At any point the user can come back, click Google, and instantly see their courses without having to sign in again. Also, because you never accept a username or password, you never have to worry about securing one. If they change their password with Google, the change applies to every site where they use that Google account.
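The part of this flow that your own site is responsible for, storing the identifier URL in place of a username and password, can be sketched as follows. This is only an illustration: the `CourseRegistry` class and the example URL are hypothetical, and the actual exchange with the provider (the redirect and the verification of the provider's response) is left out entirely and should be handled by an existing OpenID library.

```python
class CourseRegistry:
    """Keeps course registrations keyed by the identifier URL the provider returns."""

    def __init__(self):
        self._courses_by_identifier = {}

    def sign_in(self, identifier_url):
        """Call this only after the provider has vouched for the user.

        Returns True if this is the first time we have seen the identifier.
        """
        first_visit = identifier_url not in self._courses_by_identifier
        self._courses_by_identifier.setdefault(identifier_url, [])
        return first_visit

    def register(self, identifier_url, course):
        self._courses_by_identifier[identifier_url].append(course)

    def courses(self, identifier_url):
        return list(self._courses_by_identifier.get(identifier_url, []))


registry = CourseRegistry()
user = "https://provider.example/id/abc123"  # hypothetical identifier URL

registry.sign_in(user)            # first visit: an empty account is created
registry.register(user, "Pottery 101")
registry.sign_in(user)            # later visit: same URL, same account
print(registry.courses(user))     # prints ['Pottery 101']
```

Note that no password appears anywhere in this code, which is the point: the only thing the site stores is the URL.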

 

Providers

These are sites whose accounts can be used to sign in elsewhere:

·         Google

·         Yahoo

·         Livejournal.com

·         Myopenid.com

·         Myspace

·         AOL

·         Many many more

o   http://openid.net/get/

·         Microsoft (someday)

o   Microsoft has been saying for a long time that they plan on offering OpenID; they are just slow to implement it. Here is hoping Hotmail and live.com will join in soon.

 

 

The City of Nanaimo will be providing an OpenID provider for use only with city services.

 

Relying party

A relying party is any site that uses OpenID to log users in. Some examples are:

·         Livejournal.com

·         Blogspot.com

·         Stackoverflow.com

·         Facebook (maybe)

o   They say they accept OpenID but I have been unable to get it working, so if anyone who reads this gets it to work please let me know.

o   http://developers.facebook.com/news.php?blog=1&story=246

·         Many others

o   The list is growing, but other than non-AOL blogs no one has jumped on. I think this is because they want you to create an account. You can still require an account and use OpenID; it just means users don’t need a username and password for their account.

·         And soon the City of Nanaimo.

o   If any other cities join this movement please update this page.

Links:

http://www.wolframalpha.com/ An interesting use of open data to answer most questions with a numerical answer. If it has the data it will try to answer it.

http://wiki.civiccommons.com/ "The Civic Commons network is an effort to provide a permanent, sustainable organization to assist public agencies in the adoption of open systems and collaborative technologies, and to coordinate the co-creation of these technologies among agencies to ensure interoperability and shareability."