From: Sarah Hoffmann Date: Tue, 12 Oct 2021 21:07:41 +0000 (+0200) Subject: docs: move import style description to customize section X-Git-Tag: v4.0.0~17^2~7 X-Git-Url: https://git.openstreetmap.org/nominatim.git/commitdiff_plain/a3f8a097a19b3f46f0a75f13ee0710c769a6def2 docs: move import style description to customize section --- diff --git a/docs/admin/Import.md b/docs/admin/Import.md index 88d3ba5b..befe989d 100644 --- a/docs/admin/Import.md +++ b/docs/admin/Import.md @@ -160,15 +160,15 @@ Nominatim normally sets up a full search database containing administrative boundaries, places, streets, addresses and POI data. There are also other import styles available which only read selected data: -* **settings/import-admin.style** +* **admin** Only import administrative boundaries and places. -* **settings/import-street.style** +* **street** Like the admin style but also adds streets. -* **settings/import-address.style** +* **address** Import all data necessary to compute addresses down to house number level. -* **settings/import-full.style** +* **full** Default style that also includes points of interest. -* **settings/import-extratags.style** +* **extratags** Like the full style but also adds most of the OSM tags into the extratags column. @@ -191,8 +191,8 @@ full | 54h | 640 GB | 330 GB extratags | 54h | 650 GB | 340 GB You can also customize the styles further. -A [description of the style format](../develop/Import.md#configuring-the-import) -can be found in the development section. +A [description of the style format](../customize/Import-Styles.md) +can be found in the customization guide. 
## Initial import of the data diff --git a/docs/customize/Import-Styles.md b/docs/customize/Import-Styles.md new file mode 100644 index 00000000..f319284a --- /dev/null +++ b/docs/customize/Import-Styles.md @@ -0,0 +1,153 @@ +## Configuring the Import + +Which OSM objects are added to the database and which of the tags are used +can be configured via the import style configuration file. This +is a JSON file which contains a list of rules which are matched against every +tag of every object and then assign the tag its specific role. + +The style to use is given by the `NOMINATIM_IMPORT_STYLE` configuration +option. There are a number of default styles, which are explained in detail +in the [Import section](../admin/Import/#filtering-imported-data). These +standard styles may be referenced by their name. + +You can also create your own custom style. Put the style file into your +project directory and then set `NOMINATIM_IMPORT_STYLE` to the name of the file. +It is always recommended to start with one of the standard styles and customize +those. You find the standard styles under the name `import-.style` +in the standard Nominatim configuration path (usually `/etc/nominatim` or +`/usr/local/etc/nominatim`). + +The remainder of the page describes the format of the file. + +### Configuration Rules + +A single rule looks like this: + +```json +{ + "keys" : ["key1", "key2", ...], + "values" : { + "value1" : "prop", + "value2" : "prop1,prop2" + } +} +``` + +A rule first defines a list of keys to apply the rule to. This is always a list +of strings. The string may have four forms. An empty string matches against +any key. A string that ends in an asterisk `*` is a prefix match and accordingly +matches against any key that starts with the given string (minus the `*`). A +suffix match can be defined similarly with a string that starts with a `*`. Any +other string constitutes an exact match. 
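The four key forms described above can be illustrated with a short Python sketch. It is not part of this commit and not Nominatim's actual implementation; the function name `key_matches` and the sample keys are hypothetical and only mirror the rules as stated in the text.

```python
def key_matches(pattern: str, key: str) -> bool:
    """Check an OSM tag key against one key string from a style rule.

    Hypothetical helper: it follows the four forms described in the text,
    not the code Nominatim really uses.
    """
    if pattern == "":
        # Empty string: matches any key.
        return True
    if pattern.endswith("*"):
        # Trailing asterisk: prefix match on the string minus the '*'.
        return key.startswith(pattern[:-1])
    if pattern.startswith("*"):
        # Leading asterisk: suffix match on the string minus the '*'.
        return key.endswith(pattern[1:])
    # Any other string: exact match.
    return key == pattern


if __name__ == "__main__":
    samples = [("", "highway"), ("addr:*", "addr:street"),
               ("*:wikipedia", "brand:wikipedia"), ("name", "name:en")]
    for pattern, key in samples:
        print(repr(pattern), key, key_matches(pattern, key))
```

Under these assumptions, `key_matches("addr:*", "addr:street")` is true, while `key_matches("name", "name:en")` is false because a plain string only matches exactly.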
+ +The second part of the rules defines a list of values and the properties that +apply to a successful match. Value strings may be either empty, which +means that they match any value, or describe an exact match. Prefix +or suffix matching of values is not possible. + +For a rule to match, it has to find a valid combination of keys and values. The +resulting property is that of the matched values. + +The rules in a configuration file are processed sequentially and the first +match for each tag wins. + +A rule where key and value are the empty string is special. This defines the +fallback when none of the rules match. The fallback is always used as a last +resort when nothing else matches, no matter where the rule appears in the file. +Defining multiple fallback rules is not allowed. What happens in this case, +is undefined. + +### Tag Properties + +One or more of the following properties may be given for each tag: + +* `main` + + A principal tag. A new row will be added for the object with key and value + as `class` and `type`. + +* `with_name` + + When the tag is a principal tag (`main` property set): only really add a new + row, if there is any name tag found (a reference tag is not sufficient, see + below). + +* `with_name_key` + + When the tag is a principal tag (`main` property set): only really add a new + row, if there is also a name tag that matches the key of the principal tag. + For example, if the main tag is `bridge=yes`, then it will only be added as + an extra row, if there is a tag `bridge:name[:XXX]` for the same object. + If this property is set, all other names that are not domain-specific are + ignored. + +* `fallback` + + When the tag is a principal tag (`main` property set): only really add a new + row, when no other principal tags for this object have been found. Only one + fallback tag can win for an object. + +* `operator` + + When the tag is a principal tag (`main` property set): also include the + `operator` tag in the list of names. 
This is a special construct for an + out-dated tagging practise in OSM. Fuel stations and chain restaurants + in particular used to have the name of the chain tagged as `operator`. + These days the chain can be more commonly found in the `brand` tag but + there is still enough old data around to warrant this special case. + +* `name` + + Add tag to the list of names. + +* `ref` + + Add tag to the list of names as a reference. At the moment this only means + that the object is not considered to be named for `with_name`. + +* `address` + + Add tag to the list of address tags. If the tag starts with `addr:` or + `is_in:`, then this prefix is cut off before adding it to the list. + +* `postcode` + + Add the value as a postcode to the address tags. If multiple tags are + candidates for postcodes, one wins out and the others are dropped. + +* `country` + + Add the value as a country code to the address tags. The value must be a + two letter country code, otherwise it is ignored. If there are multiple + tags that match, then one wins out and the others are dropped. + +* `house` + + If no principal tags can be found for the object, still add the object with + `class`=`place` and `type`=`house`. Use this for address nodes that have no + other function. + +* `interpolation` + + Add this object as an address interpolation (appears as `class`=`place` and + `type`=`houses` in the database). + +* `extra` + + Add tag to the list of extra tags. + +* `skip` + + Skip the tag completely. Useful when a custom default fallback is defined + or to define exceptions to rules. + +A rule can define as many of these properties for one match as it likes. For +example, if the property is `"main,extra"` then the tag will open a new row +but also have the tag appear in the list of extra tags. + +### Changing the Style of Existing Databases + +There is normally no issue changing the style of a database that is already +imported and now kept up-to-date with change files. 
Just be aware that any +change in the style applies to updates only. If you want to change the data +that is already in the database, then a reimport is necessary. diff --git a/docs/customize/Overview.md b/docs/customize/Overview.md index b86a7164..7070be92 100644 --- a/docs/customize/Overview.md +++ b/docs/customize/Overview.md @@ -3,9 +3,10 @@ work for most standard installations. If you have special requirements, there are many places where the configuration can be adapted. This chapter describes the following configurable parts: -* [Global Settings](Settings.md) - detailed description of all parameters that +* [Global Settings](Settings.md) has a detailed description of all parameters that can be set in your local `.env` configuration -* [Tokenizers](Tokenizers.md) - describe the configuration of the module +* [Import styles](Import-Styles.md) explains how to write your own import style. +* [Tokenizers](Tokenizers.md) describes the configuration of the module responsible for analysing and indexing names * [Special Phrases](Special-Phrases.md) are common nouns or phrases that can be used in search to identify a class of places diff --git a/docs/develop/Import.md b/docs/develop/Import.md index c9612bb3..0f98dafc 100644 --- a/docs/develop/Import.md +++ b/docs/develop/Import.md @@ -25,146 +25,3 @@ motorway bridge. In OSM, this would be a way which is tagged with `highway=motorway` and `bridge=yes`. This way would appear in the `place` table once with `class` of `highway` and once with a `class` of `bridge`. Thus the *unique key* for `place` is (`osm_type`, `osm_id`, `class`). - -## Configuring the Import - -How tags are interpreted and assigned to the different `place` columns can be -configured via the import style configuration file (`NOMINATIM_IMPORT_STYLE`). This -is a JSON file which contains a list of rules which are matched against every -tag of every object and then assign the tag its specific role. 
- -### Configuration Rules - -A single rule looks like this: - -```json -{ - "keys" : ["key1", "key2", ...], - "values" : { - "value1" : "prop", - "value2" : "prop1,prop2" - } -} -``` - -A rule first defines a list of keys to apply the rule to. This is always a list -of strings. The string may have four forms. An empty string matches against -any key. A string that ends in an asterisk `*` is a prefix match and accordingly -matches against any key that starts with the given string (minus the `*`). A -suffix match can be defined similarly with a string that starts with a `*`. Any -other string constitutes an exact match. - -The second part of the rules defines a list of values and the properties that -apply to a successful match. Value strings may be either empty, which -means that they match any value, or describe an exact match. Prefix -or suffix matching of values is not possible. - -For a rule to match, it has to find a valid combination of keys and values. The -resulting property is that of the matched values. - -The rules in a configuration file are processed sequentially and the first -match for each tag wins. - -A rule where key and value are the empty string is special. This defines the -fallback when none of the rules match. The fallback is always used as a last -resort when nothing else matches, no matter where the rule appears in the file. -Defining multiple fallback rules is not allowed. What happens in this case, -is undefined. - -### Tag Properties - -One or more of the following properties may be given for each tag: - -* `main` - - A principal tag. A new row will be added for the object with key and value - as `class` and `type`. - -* `with_name` - - When the tag is a principal tag (`main` property set): only really add a new - row, if there is any name tag found (a reference tag is not sufficient, see - below). 
- -* `with_name_key` - - When the tag is a principal tag (`main` property set): only really add a new - row, if there is also a name tag that matches the key of the principal tag. - For example, if the main tag is `bridge=yes`, then it will only be added as - an extra row, if there is a tag `bridge:name[:XXX]` for the same object. - If this property is set, all other names that are not domain-specific are - ignored. - -* `fallback` - - When the tag is a principal tag (`main` property set): only really add a new - row, when no other principal tags for this object have been found. Only one - fallback tag can win for an object. - -* `operator` - - When the tag is a principal tag (`main` property set): also include the - `operator` tag in the list of names. This is a special construct for an - out-dated tagging practise in OSM. Fuel stations and chain restaurants - in particular used to have the name of the chain tagged as `operator`. - These days the chain can be more commonly found in the `brand` tag but - there is still enough old data around to warrant this special case. - -* `name` - - Add tag to the list of names. - -* `ref` - - Add tag to the list of names as a reference. At the moment this only means - that the object is not considered to be named for `with_name`. - -* `address` - - Add tag to the list of address tags. If the tag starts with `addr:` or - `is_in:`, then this prefix is cut off before adding it to the list. - -* `postcode` - - Add the value as a postcode to the address tags. If multiple tags are - candidate for postcodes, one wins out and the others are dropped. - -* `country` - - Add the value as a country code to the address tags. The value must be a - two letter country code, otherwise it is ignored. If there are multiple - tags that match, then one wins out and the others are dropped. - -* `house` - - If no principle tags can be found for the object, still add the object with - `class`=`place` and `type`=`house`. 
Use this for address nodes that have no - other function. - -* `interpolation` - - Add this object as an address interpolation (appears as `class`=`place` and - `type`=`houses` in the database). - -* `extra` - - Add tag to the list of extra tags. - -* `skip` - - Skip the tag completely. Useful when a custom default fallback is defined - or to define exceptions to rules. - -A rule can define as many of these properties for one match as it likes. For -example, if the property is `"main,extra"` then the tag will open a new row -but also have the tag appear in the list of extra tags. - -There are a number of pre-defined styles in the `settings/` directory. It is -advisable to start from one of these styles when defining your own. - -### Changing the Style of Existing Databases - -There is normally no issue changing the style of a database that is already -imported and now kept up-to-date with change files. Just be aware that any -change in the style applies to updates only. If you want to change the data -that is already in the database, then a reimport is necessary. diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index c4579036..d00e0499 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -25,9 +25,10 @@ pages: - 'Troubleshooting' : 'admin/Faq.md' - 'Customization Guide': - 'Overview': 'customize/Overview.md' + - 'Import Styles': 'customize/Import-Styles.md' - 'Configuration Settings': 'customize/Settings.md' - - 'Special Phrases': 'customize/Special-Phrases.md' - 'Tokenizers' : 'customize/Tokenizers.md' + - 'Special Phrases': 'customize/Special-Phrases.md' - 'External data: US housenumbers from TIGER': 'customize/Tiger.md' - 'External data: Postcodes': 'customize/Postcodes.md' - 'Developers Guide':