git.openstreetmap.org Git - nominatim.git/commit
restrict partial word counting to names of reasonable length
author Sarah Hoffmann <lonvia@denofr.de>
Fri, 2 Jul 2021 13:05:17 +0000 (15:05 +0200)
committer Sarah Hoffmann <lonvia@denofr.de>
Sun, 4 Jul 2021 08:28:28 +0000 (10:28 +0200)
commit c32551b4e0978d2bd26b3fe6997a722562b3565b
tree 21debf24776c9c19091c3f8e39c865b768e7fb22
parent e85f7e7aa9b9c297b6b5f266d811c935af8cbb9e
restrict partial word counting to names of reasonable length

To save a bit of time, the partial word count does not split names
into words. As a result, it might encounter unreasonably long names
which in truth consist of multiple words. No accurate statistics
are needed, so simply restrict the count to words shorter than
75 characters.
nominatim/tokenizer/legacy_icu_tokenizer.py
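
The length restriction described in the message can be illustrated with a minimal Python sketch. This is not the code from the commit: the function name `count_partial_words`, the constant `MAX_NAME_LENGTH`, and the in-memory list input are assumptions made for illustration only. The only detail taken from the message is the 75-character cutoff.

```python
from collections import Counter

# Threshold taken from the commit message; the constant name is hypothetical.
MAX_NAME_LENGTH = 75


def count_partial_words(names):
    """Count each name as a single word, without splitting it.

    Names of MAX_NAME_LENGTH characters or more are skipped, on the
    assumption that they really consist of several words and would
    only distort the (approximate) statistics.
    """
    counts = Counter()
    for name in names:
        if len(name) < MAX_NAME_LENGTH:
            counts[name] += 1
    return counts
```

Because the statistics only need to be approximate, dropping the overlong names is cheaper than splitting every name into words before counting.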