What is geocoding?
Since people and things are located somewhere on the surface of the planet, having a precise numerical position for them enables many powerful applications. In the real world you will often be given addresses rather than coordinates, so you will have to use geocoding to convert a verbal description into latitude and longitude.
"Via della Farfalla 32, 00155 Roma" => [lat: 41.895, lon: 12.585]
How does geocoding work?
The same address can be written in a variety of ways. Note the components in different order, language and nuance:
- Via della Farfalla 32, 00155 Roma
- via farfalla 32 RM
- 32 Farfalla street, Rome, p.o. 00155
The address (1 in the diagram below) is first converted to a structured format (each geocoder uses its own), in which the street name, civic number, locality and postal code are explicitly separated (2). This data structure is then matched against a Geographic Information System (GIS) to find the coordinates (3).
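To make step (2) concrete, here is a minimal sketch of how the three spellings above could be reduced to one structured record. Everything in it is an assumption made for illustration: the `StructuredAddress` fields, the `parse_address` function and the tiny lookup tables are hypothetical, and real geocoders rely on much larger dictionaries and more robust parsing.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup tables, just large enough for the three spellings above.
STREET_WORDS = {"via", "street", "della"}        # generic words dropped from the street key
NOISE = {"p", "o", "po"}                         # leftovers of "p.o."
LOCALITY_ALIASES = {"roma": "Roma", "rome": "Roma", "rm": "Roma"}


@dataclass
class StructuredAddress:
    street: Optional[str]
    number: Optional[str]
    postcode: Optional[str]
    locality: Optional[str]


def parse_address(raw: str) -> StructuredAddress:
    """Split a free-text address into explicitly separated components."""
    street_parts = []
    number = postcode = locality = None
    for tok in re.findall(r"\w+", raw.lower()):
        if re.fullmatch(r"\d{5}", tok):          # five digits: Italian postal code
            postcode = tok
        elif tok.isdigit():                      # any other number: civic number
            number = tok
        elif tok in LOCALITY_ALIASES:            # known spellings of the locality
            locality = LOCALITY_ALIASES[tok]
        elif tok in STREET_WORDS or tok in NOISE:
            continue
        else:                                    # whatever remains names the street
            street_parts.append(tok)
    street = " ".join(street_parts).title() or None
    return StructuredAddress(street, number, postcode, locality)


for text in (
    "Via della Farfalla 32, 00155 Roma",
    "via farfalla 32 RM",
    "32 Farfalla street, Rome, p.o. 00155",
):
    print(parse_address(text))
```

The first and third spellings produce the same record, while the second one simply lacks a postal code. Dropping "della" from the street key is a shortcut of this sketch, not something a production geocoder would necessarily do.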
Think of a GIS as a big dataset containing geometric shapes that represent real places such as roads, intersections and buildings. Each shape is associated with one structured address and is geometrically defined by a set of nodes (latitude and longitude). So if you know an address you can obtain its coordinates, and if you know the coordinates you can obtain an address (this last process is called reverse geocoding). Let’s see an example of such a shape in OpenStreetMap XML data format:
<tag k="name" v="Via della Farfalla"/>
That was Via della Farfalla, a residential street comprising a number of node points (nd). Each node contains in turn the coordinates and metadata such as contributor and latest update:
<node id="302010593" visible="true" version="2" changeset="7364009" timestamp="2011-02-22T14:43:46Z" user="Davio" uid="217070" lat="41.8958580" lon="12.5854057"/>
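As a small illustration, the node record above can be read with Python's standard `xml.etree.ElementTree` module. The `parse_node` and `haversine_m` helpers are names invented for this sketch. The distance function is an addition that is not part of OSM itself: it shows the kind of calculation a reverse geocoder needs in order to find the node closest to a query point, here the rounded coordinates from the first example.

```python
import math
import xml.etree.ElementTree as ET

# The node element shown above, copied verbatim.
NODE_XML = (
    '<node id="302010593" visible="true" version="2" changeset="7364009" '
    'timestamp="2011-02-22T14:43:46Z" user="Davio" uid="217070" '
    'lat="41.8958580" lon="12.5854057"/>'
)


def parse_node(xml_text):
    """Return (node id, latitude, longitude) from one OSM <node> element."""
    node = ET.fromstring(xml_text)
    return node.get("id"), float(node.get("lat")), float(node.get("lon"))


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres, assuming a spherical Earth."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


node_id, lat, lon = parse_node(NODE_XML)
# Distance between the rounded coordinates [41.895, 12.585] and the actual node.
distance = haversine_m(41.895, 12.585, lat, lon)
print(node_id, lat, lon, round(distance))
```

The rounded pair lands roughly a hundred metres from the node, which is a reminder that three decimal places of latitude and longitude are not enough to pinpoint a single building.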
The fun part is that this information is directly available at web addresses. Check both the street and the specific node. If you feel brave you can download the world data dumps, both in XML and as linked data.
Even though the data is publicly available (thank you OSM!!!), the algorithms for optimal geocoding are quite complex. Luckily there are web APIs that offer free geocoding, under some common conditions:
- Registration: to use the service you must sign up and obtain an API key (a password)
- Rate limiting: you can geocode a fixed number of addresses per minute, or a fixed total number of addresses
- Data license: you cannot save the response provided by the geocoder or use it commercially
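The rate limiting condition is the one you will most often have to handle in your own code. A minimal sketch, assuming a limit expressed as a number of calls per period, is to space requests evenly. The `Throttle` class below is hypothetical and not part of any vendor's client library; the `clock` and `sleep` parameters exist only so the behaviour can be checked without actually waiting.

```python
import time


class Throttle:
    """Space calls evenly so that at most max_calls happen per period (seconds)."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period / max_calls
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

In use, you would create one instance, for example `Throttle(max_calls=60, period=60.0)`, and call its `wait()` method right before each geocoding request. The actual limits depend on the service, so check its terms before picking the numbers.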
Geocoding services operate on data coming from non-profits (e.g. OpenStreetMap) or companies (e.g. Google). You can generally expect corporate data (and the algorithms built on it) to be higher in quality, but the collaborative resources are getting better and better. Lately a number of open source geocoders have been published, and they look very promising.
Example of geocoding with an API
The procedure is very similar across vendors. After registering, you get an API key. To obtain the coordinates of a place you send an HTTP request that follows a similar syntax: