Geocoding and disambiguation issues
The collected data refer to biographical events in the careers of several clerics ("Domherren") active in Electoral Mainz. The first geocoding attempt (v1) was carried out in QGIS, using the MMQGIS plug-in and the OpenStreetMap / Nominatim web service for geodata. For the 281 entries in the input file, the geocoding API returned 1641 results, attributing alternative coordinates to several non-unique place names. This problem generally occurs when geocoding European towns that have "colonial twins" in North America, South Africa or Australia.
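The mismatch between entries and results can be checked by counting how many candidates the geocoder returned per place name. The following is a minimal sketch, assuming the results have been exported as (name, latitude, longitude) tuples; the function name and the sample values are illustrative and not taken from the Domherren data.

```python
from collections import Counter


def ambiguous_names(results):
    """Return {place name: candidate count} for names with more than one hit.

    `results` is assumed to be an iterable of (name, lat, lon) tuples,
    one tuple per candidate returned by the geocoder.
    """
    counts = Counter(name for name, _lat, _lon in results)
    return {name: n for name, n in counts.items() if n > 1}


# Made-up placeholder data, for illustration only.
sample = [
    ("Exampletown", 50.0, 8.0),
    ("Exampletown", -33.0, 18.0),  # a second candidate on another continent
    ("Uniqueville", 49.0, 8.5),
]

flagged = ambiguous_names(sample)
```

Names that appear in the returned dictionary are the ones that need an additional hint, such as a country, before they can be matched to a single location.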
View raw file of 1st geocoding attempt: Domherren_v1.geojson
Improved geocoding based on enriched CSV table
For the second geocoding test, the input data needed to be enriched and cleaned. In order to improve the automated place-matching, a separate table column named modern_region was introduced to specify in which modern countries (e.g. Germany and France) the places ought to be located. In cases where the country was not immediately clear, Europe was added to at least exclude overseas locations.
In MMQGIS, both the
country fields could thus be filled with input data to retrieve better geocoding results (cf. QGIS screenshot above). Out of 281 table entries, 215 could now be matched with a unique location. The entries not geocoded did not contain spatial information in the first place.
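Outside MMQGIS, the same idea can be approximated by appending the region hint to the free-form query sent to Nominatim. The sketch below only builds the request URL and does not contact the service; the function name `build_query_url` is hypothetical, and the choice of `limit=1` is an assumption made for illustration, not a setting documented for this project.

```python
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_query_url(place, modern_region=""):
    """Build a Nominatim free-form search URL for one table entry.

    The value of the modern_region column (e.g. "Germany" or "Europe")
    is appended to the place name so that overseas namesakes rank lower.
    """
    query = f"{place}, {modern_region}" if modern_region else place
    params = {"q": query, "format": "jsonv2", "limit": 1}
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


url = build_query_url("Mainz", "Germany")
```

When sending such requests in bulk, the public Nominatim instance expects a descriptive User-Agent header and a low request rate.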
In addition, the four date columns (
end) specifying the time-frames of events in the original CSV table were merged into one
display date column in order to make creating a chronologically categorised map easier. The more detailed date information, however, has been kept for display in the map labels.
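The text does not state the rule by which the date columns were merged. As one possible reading, the sketch below takes the first non-empty value among the date columns as the display date and leaves the original columns untouched. The column names passed in the example are hypothetical placeholders, apart from `end`, which is the only name visible above.

```python
def display_date(row, date_columns):
    """Return the first non-empty value among the given date columns.

    `row` is a dict as produced by csv.DictReader; the detailed date
    columns are left in place so they can still be shown in map labels.
    """
    for column in date_columns:
        value = (row.get(column) or "").strip()
        if value:
            return value
    return ""


# Hypothetical column names and values, for illustration only.
example_row = {"start": "", "end": "1750"}
merged = display_date(example_row, ["start", "end"])
```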
Finally, a column counting the overall occurrence of each place name in the data set was added. These integers can be used as a weight that determines the size at which point geometries are displayed in maps.
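Such an occurrence count can be derived with a simple frequency table. The following is a minimal sketch, assuming the table has been read into a list of dicts; the column names `place` and `weight` are assumptions, since the actual header names are not given here.

```python
from collections import Counter


def add_weights(rows, place_column="place", weight_column="weight"):
    """Return copies of the rows with an added occurrence count per place.

    Rows without a place name receive a weight of 0.
    """
    counts = Counter(row[place_column] for row in rows if row[place_column])
    return [
        {**row, weight_column: counts.get(row[place_column], 0)}
        for row in rows
    ]


# Illustrative rows only.
rows = [{"place": "Mainz"}, {"place": "Worms"}, {"place": "Mainz"}, {"place": ""}]
weighted = add_weights(rows)
```

In QGIS, a numeric column like this can then drive a data-defined symbol size.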
Locating abolished institutions and destroyed buildings
One issue that even enriching the data could not solve, however, is that many early modern (religious) institutions no longer exist today, and that the buildings associated with them have been destroyed. As modern geocoding APIs do not include geodata for past structures, the locations of such places need to be reconstructed from primary sources and secondary works. In the case of our Mainz dataset, neither the Google nor the GeoNames API could find the locations of the former Stift St. Viktor and Stift Mariengreden. Assigning approximate geodata was possible thanks to information provided by the Institut für Geschichtliche Landeskunde at Mainz University: