From 9934421442bf1a815096ef06ffa3f8c9ea11c0ac Mon Sep 17 00:00:00 2001
From: Sarah Hoffmann
Date: Tue, 26 Oct 2021 09:37:57 +0200
Subject: [PATCH] make word count computation part of the import

Accurate word counts are now essential when using the ICU tokenizer
and don't hurt for the legacy one. Adds about an hour import time.
---
 docs/admin/Import.md      | 15 +--------------
 nominatim/clicmd/setup.py |  2 ++
 2 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/docs/admin/Import.md b/docs/admin/Import.md
index 576c0097..7ebebde3 100644
--- a/docs/admin/Import.md
+++ b/docs/admin/Import.md
@@ -271,20 +271,7 @@ reverse query, e.g. `http://localhost:8088/reverse.php?lat=27.1750090510034&lon=
 To run Nominatim via webservers like Apache or nginx, please read the
 [Deployment chapter](Deployment.md).
 
-## Tuning the database
-
-Accurate word frequency information for search terms helps PostgreSQL's query
-planner to make the right decisions. Recomputing them can improve the performance
-of forward geocoding in particular under high load. To recompute word counts run:
-
-```sh
-nominatim refresh --word-counts
-```
-
-This will take a couple of hours for a full planet installation. You can
-also defer that step to a later point in time when you realise that
-performance becomes an issue. Just make sure that updates are stopped before
-running this function.
+## Adding search through category phrases
 
 If you want to be able to search for places by their type through
 [special phrases](https://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases)
diff --git a/nominatim/clicmd/setup.py b/nominatim/clicmd/setup.py
index 27847920..07dacbb4 100644
--- a/nominatim/clicmd/setup.py
+++ b/nominatim/clicmd/setup.py
@@ -125,6 +125,8 @@ class SetupAll:
                 freeze.drop_update_tables(conn)
         tokenizer.finalize_import(args.config)
 
+        LOG.warning('Recompute word counts')
+        tokenizer.update_statistics()
 
         webdir = args.project_dir / 'website'
         LOG.warning('Setup website at %s', webdir)
-- 
2.39.5