Unlock Places is a service that searches across many different sources of data, mostly open geographic data, and provides the user with points or with detailed shapes (footprints, bounding boxes) representing the places that they're searching for. Unlock also provides the Unlock Text service, a geoparsing service which extracts location information from documents and leaves them geotagged with the most likely locations. We developed this software in partnership with the Language Technology Group (LTG) at the School of Informatics in Edinburgh, which is quite a serious, dedicated research group of computational linguists.
The Language Technology Group weren't particularly specialists in geographic information, so EDINA came to them and helped develop different algorithms for picking the best known locations. That's when we wanted to start adding more data sources to Unlock Places to provide better coverage, and particularly worldwide coverage, so that the service could be used by researchers around the world, not just in the UK.
Unlock began based on Ordnance Survey's MasterMap data. EDINA provided this under a license for educational use only, and a lot of work was done extracting more semantic detail from MasterMap than the Ordnance Survey products necessarily provide, looking at names to figure out feature types and returning more information to the user. So it provided a very rich search of shapes of rivers, detailed outlines of towns and so on. But, of course, what it was missing was the ability to reuse the data in other applications, to take the data and republish it as part of an academic publication and so on. Soon after I started at EDINA, we started looking at different open data sources to add to the service, the first of which was GeoNames, a public domain data set of many millions of points worldwide.
We also incorporated Ordnance Survey Open Data very soon after its launch; we were among the first to reuse that data in an application. Ordnance Survey Open Data for the UK provides us with detailed boundaries for political areas. It provides postcode lookup and geocoding. So it's great to be able to offer that without registration, to anybody, and really turn Unlock into a more free and open service. And quite recently we added Natural Earth to the service, which is another public domain project that provides quite detailed political boundaries worldwide.
So EDINA's collaboration with the Language Technology Group began as a series of projects, mostly looking at text mining of historic archives: taking 19th-century parliamentary reports and population reports, digitizing them, and then extracting and georeferencing the content, sort of geo-enabling archival collections.
LTG discovered that the quality of place name extraction from a document is much higher the more place names you can successfully identify. And in a lot of the historical cases, place names were being missed because they weren't in the contemporary gazetteers; the names had changed. As a result of this, we started looking at ways to augment contemporary gazetteer data sources with deeper, historically rich data.
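The effect described here can be sketched in a few lines of Python. This is only an illustration, not how Unlock or the LTG geoparser works internally: the two dictionaries, the `resolve` and `match_rate` functions, and the rounded coordinates are all made up for the example. It shows how a list of historic variants layered on top of a contemporary gazetteer raises the share of names that resolve.

```python
from typing import Dict, List, Optional, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude)

# Hypothetical contemporary gazetteer: modern name -> approximate coordinate.
CONTEMPORARY: Dict[str, Coordinate] = {
    "edinburgh": (55.95, -3.19),
    "york": (53.96, -1.08),
}

# Hypothetical historic supplement: older name -> modern name.
HISTORIC_ALIASES: Dict[str, str] = {
    "eboracum": "york",
    "jorvik": "york",
}


def resolve(name: str, use_historic: bool = True) -> Optional[Coordinate]:
    """Look a name up in the contemporary gazetteer, then in the aliases."""
    key = name.strip().lower()
    if key in CONTEMPORARY:
        return CONTEMPORARY[key]
    if use_historic and key in HISTORIC_ALIASES:
        return CONTEMPORARY.get(HISTORIC_ALIASES[key])
    return None


def match_rate(names: List[str], use_historic: bool) -> float:
    """Fraction of names that resolve to a coordinate."""
    if not names:
        return 0.0
    hits = sum(1 for n in names if resolve(n, use_historic) is not None)
    return hits / len(names)


if __name__ == "__main__":
    mentions = ["York", "Eboracum", "Jorvik", "Edinburgh"]
    print("contemporary only:", match_rate(mentions, use_historic=False))
    print("with historic aliases:", match_rate(mentions, use_historic=True))
```

In this toy case the historic forms are simply mapped onto a modern entry; a real historic gazetteer would also need to handle places that have moved, merged or disappeared.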
Unlock is being used in various different ways. It's being used by a map search and ranking service that various national libraries use to publish their map collections. And it's also being used by several web services, mainly for postcode geocoding, such as being able to search by postcode and get the approximate location, or to geocode a collection of records based on their postcodes. Obviously, the further back in history, the more problematic that becomes. But postcode geocoding will get you a rough level of accuracy. We've used Unlock within the Chalice Project, the historic text mining project. We used it to align some of the historic names in the English Place Name Survey with contemporary sources: GeoNames and also Ordnance Survey Open Data.
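Geocoding a collection of records by postcode, as mentioned above, might look roughly like the following Python sketch. It does not call the Unlock API. The `POSTCODE_CENTROIDS` table stands in for a real postcode lookup, its coordinates are rough illustrative values rather than authoritative data, and the record fields (`id`, `postcode`, `lat`, `lon`) are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude)

# Hypothetical lookup table: normalised postcode -> approximate centroid.
# The coordinates are rough illustrative values, not authoritative data.
POSTCODE_CENTROIDS: Dict[str, Coordinate] = {
    "SW1A1AA": (51.501, -0.142),
    "EH89YL": (55.947, -3.187),
}


def normalise_postcode(raw: str) -> str:
    """Strip all whitespace and upper-case, so 'sw1a 1aa' matches 'SW1A1AA'."""
    return "".join(raw.split()).upper()


def geocode_records(
    records: List[dict], lookup: Dict[str, Coordinate]
) -> List[dict]:
    """Return copies of the records with 'lat' and 'lon' added.

    Records whose postcode is missing or unknown get None for both fields,
    so the caller can see which ones failed to geocode.
    """
    enriched_records = []
    for record in records:
        key = normalise_postcode(record.get("postcode", ""))
        coord: Optional[Coordinate] = lookup.get(key)
        enriched = dict(record)
        enriched["lat"], enriched["lon"] = coord if coord else (None, None)
        enriched_records.append(enriched)
    return enriched_records


if __name__ == "__main__":
    sample = [
        {"id": 1, "postcode": "sw1a 1aa"},
        {"id": 2, "postcode": "EH8 9YL"},
        {"id": 3, "postcode": "ZZ99 9ZZ"},  # not in the toy table
    ]
    for row in geocode_records(sample, POSTCODE_CENTROIDS):
        print(row)
```

Each record gets the centroid of its postcode, which is why the result is only approximate: every record sharing a postcode lands on the same point.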
Another project that's used Unlock has taken transcripts of interviews covering a certain geographic area and used Unlock Text to pick out likely places within the transcripts and geolocate the centres of activity. That same process is being applied to parliamentary transcripts, again to show where the parliamentary discussion is focused. The Unlock Text service is also being plugged into institutional repositories, so that when a new learning resource or a publication goes into a repository, Unlock Text is experimentally used to pick out the locations, provide more metadata for the users, make the content more searchable and more linkable across geography, and provide the sort of missing links between archival collections.