git.openstreetmap.org Git - nominatim.git/commitdiff
document what country_osm_grid does
author marc tobias <mtmail@gmx.net>
Wed, 28 Nov 2018 17:57:17 +0000 (18:57 +0100)
committer marc tobias <mtmail@gmx.net>
Thu, 29 Nov 2018 16:06:04 +0000 (17:06 +0100)
data-sources/country-grid/README.md [new file with mode: 0644]
data-sources/country-grid/country_grid.sql [moved from sql/country_grid.sql with 100% similarity]
data-sources/country-grid/mexico.quad.png [new file with mode: 0644]
docs/CMakeLists.txt
docs/mkdocs.yml

diff --git a/data-sources/country-grid/README.md b/data-sources/country-grid/README.md
new file mode 100644 (file)
index 0000000..c94929a
--- /dev/null
@@ -0,0 +1,77 @@
+# Fallback Country Boundaries
+
+Each place is assigned a `country_code` and a partition. The partition is derived from the `country_code`.
+
+Nominatim imports two pre-generated files
+
+   * `data/country_name.sql` (country code, name, default language, partition)
+   * `data/country_osm_grid.sql` (country code, geometry)
+
+before creating places in the database. This helps with fast lookups and missing data (e.g. if the data the user wants to import doesn't contain any country places).
+
+The number of countries in the world can change (e.g. the creation of South Sudan in 2011 or German reunification), and so can their boundaries. This document explains how the pre-generated files can be updated.
+
+
+
+## Country code
+
+Each place is assigned a two-letter `country_code` based on its location, e.g. `gb` for Great Britain, or `NULL` if no suitable country is found (usually because the place is in open water).
+
+In `sql/functions.sql: get_country_code(geometry)` the place's center is checked against
+
+   1. country places already imported from the user's data file. Places are imported by rank from low to high. Countries have the lowest rank (2), so most places should be matched. Still, the data file might be incomplete.
+   2. if unmatched: OSM grid boundaries
+   3. if still unmatched: OSM grid boundaries, but allow a small distance
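The three-step lookup order above can be sketched outside the database. This is a minimal illustration in Python, not the code of `get_country_code(geometry)`: the `Box` class, the `lookup_country_code` name and the `tolerance` value are all assumptions made for the example, and axis-aligned rectangles stand in for real country polygons.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a country geometry: an axis-aligned rectangle."""
    country_code: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, p: Point) -> bool:
        return self.min_x <= p[0] <= self.max_x and self.min_y <= p[1] <= self.max_y

    def distance(self, p: Point) -> float:
        # Euclidean distance from the point to the rectangle (0 if inside).
        dx = max(self.min_x - p[0], 0.0, p[0] - self.max_x)
        dy = max(self.min_y - p[1], 0.0, p[1] - self.max_y)
        return (dx * dx + dy * dy) ** 0.5


def lookup_country_code(
    center: Point,
    imported: Sequence[Box],
    grid: Sequence[Box],
    tolerance: float = 0.5,  # assumed value, only for this sketch
) -> Optional[str]:
    # Step 1: country places already imported from the user's data file.
    for box in imported:
        if box.contains(center):
            return box.country_code
    # Step 2: the pre-generated grid boundaries.
    for box in grid:
        if box.contains(center):
            return box.country_code
    # Step 3: the grid again, accepting the first cell within a small distance.
    for box in grid:
        if box.distance(center) <= tolerance:
            return box.country_code
    # No suitable country found, e.g. open water.
    return None
```

With an empty `imported` list the function falls through to the grid, which mirrors the case where the user's data file contains no country places.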
+
+
+
+## Partitions
+
+Each place is assigned a partition, which is a number from 0 to 250. 0 is the fallback/other partition.
+
+During place indexing (`sql/functions.sql: placex_insert()`) a place is assigned its partition based on its country code (`sql/functions.sql: get_partition(country_code)`), which looks the partition up in the `country_name` table.
+
+Most countries have their own partition, some share a partition. Thus the number of places per partition varies greatly.
+
+Several database tables are split by partition to allow queries to run against fewer indices and to improve caching.
+
+   * `location_area_large_<partition>`
+   * `search_name_<partition>`
+   * `location_road_<partition>`
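The mapping from country code to partition, and from partition to table names, might look roughly like this. It is a sketch under assumptions: the dictionary stands in for the `country_name` table, the partition numbers in the usage example are invented, and treating an unknown or missing country code as partition 0 is inferred from the "fallback/other" description rather than taken from the SQL.

```python
from typing import List, Mapping, Optional

FALLBACK_PARTITION = 0  # "0 is fallback/other"


def get_partition(country_code: Optional[str], country_partitions: Mapping[str, int]) -> int:
    """Return the partition for a country code, or the fallback if there is none."""
    if country_code is None:
        return FALLBACK_PARTITION
    return country_partitions.get(country_code, FALLBACK_PARTITION)


def partition_tables(partition: int) -> List[str]:
    """Names of the per-partition tables listed above."""
    prefixes = ("location_area_large", "search_name", "location_road")
    return [f"{prefix}_{partition}" for prefix in prefixes]


# Hypothetical sample data: two countries sharing one partition.
sample = {"de": 3, "li": 7, "ch": 7}
print(get_partition("li", sample), partition_tables(get_partition("li", sample)))
```

Because a query only needs the tables of one partition, it touches smaller indices than it would against a single table covering all countries.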
+
+
+
+
+
+## Data files
+
+### `data/country_name.sql`
+
+Export from an existing database table plus manual changes. `country_default_language_code` is mostly taken from [https://wiki.openstreetmap.org/wiki/Nominatim/Country_Codes](https://wiki.openstreetmap.org/wiki/Nominatim/Country_Codes), see `utils/country_languages.php`.
+
+
+
+### `data/country_osm_grid.sql`
+
+`country_grid.sql` merges territories by country. It then uses `sql/functions.sql: quad_split_geometry` to split each country into multiple [Quadtree](https://en.wikipedia.org/wiki/Quadtree) polygons for faster point-in-polygon lookups.
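The idea behind a quadtree split can be sketched in a few lines of Python. This is not the implementation of `quad_split_geometry`: the function names, the `max_area` and `max_depth` parameters, and the use of rectangle unions in place of real polygons are assumptions chosen to keep the example self-contained.

```python
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def clip(rect: Rect, cell: Rect) -> Optional[Rect]:
    """Intersection of two rectangles, or None if they do not overlap."""
    min_x, min_y = max(rect[0], cell[0]), max(rect[1], cell[1])
    max_x, max_y = min(rect[2], cell[2]), min(rect[3], cell[3])
    if min_x >= max_x or min_y >= max_y:
        return None
    return (min_x, min_y, max_x, max_y)


def area(rect: Rect) -> float:
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


def quad_split(shape: List[Rect], cell: Rect, max_area: float, max_depth: int = 8) -> List[List[Rect]]:
    """Split a shape (union of non-overlapping rectangles) into quadtree pieces.

    Each returned piece lies inside one quadtree cell and covers at most
    max_area, unless the depth limit stopped the recursion first.
    """
    clipped = []
    for rect in shape:
        part = clip(rect, cell)
        if part is not None:
            clipped.append(part)
    if not clipped:
        return []  # this cell contains none of the shape
    if sum(area(r) for r in clipped) <= max_area or max_depth == 0:
        return [clipped]
    mid_x = (cell[0] + cell[2]) / 2
    mid_y = (cell[1] + cell[3]) / 2
    quadrants = [
        (cell[0], cell[1], mid_x, mid_y),
        (mid_x, cell[1], cell[2], mid_y),
        (cell[0], mid_y, mid_x, cell[3]),
        (mid_x, mid_y, cell[2], cell[3]),
    ]
    pieces: List[List[Rect]] = []
    for quadrant in quadrants:
        pieces.extend(quad_split(clipped, quadrant, max_area, max_depth - 1))
    return pieces
```

A point-in-polygon test then only has to examine the small piece whose cell contains the point, instead of the full country outline.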
+
+To visualize one country as a GeoJSON feature collection, e.g. for loading into [geojson.io](http://geojson.io/):
+
+```sql
+-- http://www.postgresonline.com/journal/archives/267-Creating-GeoJSON-Feature-Collections-with-JSON-and-PostGIS-functions.html
+
+SELECT row_to_json(fc)
+FROM (
+  SELECT 'FeatureCollection' As type, array_to_json(array_agg(f)) As features
+  FROM (
+    SELECT 'Feature' As type,
+    ST_AsGeoJSON(lg.geometry)::json As geometry,
+    row_to_json((country_code, area)) As properties
+    FROM country_osm_grid As lg where country_code='mx'
+  ) As f
+) As fc;
+```
+
+Save the query as `/tmp/query.sql`, then run `cat /tmp/query.sql | psql -At nominatim > /tmp/mexico.quad.geojson`.
+
+![mexico](mexico.quad.png)
diff --git a/data-sources/country-grid/mexico.quad.png b/data-sources/country-grid/mexico.quad.png
new file mode 100644 (file)
index 0000000..61c1280
Binary files /dev/null and b/data-sources/country-grid/mexico.quad.png differ
diff --git a/docs/CMakeLists.txt b/docs/CMakeLists.txt
index 68af5429257b2501858f0b9fce0ccdd02ce02160..87bb3cd534c4aba676aade3f4c45c94847d2249d 100644 (file)
@@ -14,6 +14,8 @@ ADD_CUSTOM_TARGET(doc
    COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_SOURCE_DIR}/index.md ${CMAKE_CURRENT_BINARY_DIR}/index.md
    COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_SOURCE_DIR}/extra.css ${CMAKE_CURRENT_BINARY_DIR}/extra.css
    COMMAND ${CMAKE_COMMAND} -E create_symlink ${PROJECT_SOURCE_DIR}/data-sources/us-tiger/README.md ${CMAKE_CURRENT_BINARY_DIR}/data-sources/US-Tiger.md
+   COMMAND ${CMAKE_COMMAND} -E create_symlink ${PROJECT_SOURCE_DIR}/data-sources/country-grid/README.md ${CMAKE_CURRENT_BINARY_DIR}/data-sources/Country-Grid.md
+   COMMAND ${CMAKE_COMMAND} -E create_symlink ${PROJECT_SOURCE_DIR}/data-sources/country-grid/mexico.quad.png ${CMAKE_CURRENT_BINARY_DIR}/data-sources/mexico.quad.png
    COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Centos-7.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Centos-7.md
    COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-16.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-16.md
    COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/bash2md.sh ${PROJECT_SOURCE_DIR}/vagrant/Install-on-Ubuntu-18.sh ${CMAKE_CURRENT_BINARY_DIR}/appendix/Install-on-Ubuntu-18.md
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 1a690e7b096427eca08d387123c8559758fe6375..271fd207e09e0346fd5d7b34bb5c3035c74dab6f 100644 (file)
@@ -23,6 +23,7 @@ pages:
     - 'External Data Sources':
         - 'Overview' : 'data-sources/overview.md'
         - 'US Census (Tiger)': 'data-sources/US-Tiger.md'
+        - 'Country Grid': 'data-sources/Country-Grid.md'
     - 'Appendix':
         - 'Installation on CentOS 7' : 'appendix/Install-on-Centos-7.md'
         - 'Installation on Ubuntu 16' : 'appendix/Install-on-Ubuntu-16.md'