Sarah Hoffmann [Sat, 17 Apr 2021 09:07:04 +0000 (11:07 +0200)]
add support index when continuing import at index phase
The placex table is scanned sequentially during indexing
on the initial import. That is okay because we know that all
rows need to be processed anyway. When continuing the import,
however, a large part might already be indexed, so that the
process spends a lot of time going through rows that are no
longer of interest. Create a supporting index for all unindexed
rows to speed up the scan. This is the same index as used later
for updates.
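
A minimal sketch of the supporting-index idea, using SQLite from the Python standard library as a stand-in for PostgreSQL (both support partial indexes). The reduced table layout, the `indexed_status` column, the index name `idx_placex_pending` and the `indexed_status > 0` condition are assumptions for illustration, not the schema or index used by the commit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, heavily reduced stand-in for the placex table.
conn.execute(
    "CREATE TABLE placex (place_id INTEGER PRIMARY KEY, indexed_status INTEGER)"
)

# Simulate an interrupted import: most rows are already indexed (status 0),
# only every hundredth row is still pending (status 1).
conn.executemany(
    "INSERT INTO placex (place_id, indexed_status) VALUES (?, ?)",
    [(i, 1 if i % 100 == 0 else 0) for i in range(1, 1001)],
)

# Partial index: it only contains the rows that still need processing,
# so it stays small even when the table itself is large.
conn.execute(
    "CREATE INDEX idx_placex_pending ON placex (place_id) "
    "WHERE indexed_status > 0"
)

# The continuation query repeats the index condition so that the planner
# is able to use the partial index instead of a full table scan.
pending = [
    row[0]
    for row in conn.execute(
        "SELECT place_id FROM placex WHERE indexed_status > 0 ORDER BY place_id"
    )
]
```

Whether the planner actually picks the index depends on the database and its statistics; the sketch only shows how the index and the query condition line up.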
Sarah Hoffmann [Thu, 1 Apr 2021 12:29:34 +0000 (14:29 +0200)]
use non-key index to speed up housenumber search
On PostgreSQL versions 11+ add an index to speed up the lookup
of housenumbers for terms found in search_name. This is really
just a band-aid around the query planner's interpretation of the
query.
Sarah Hoffmann [Mon, 29 Mar 2021 10:06:51 +0000 (12:06 +0200)]
allow sorting by housenumbers for rare street names
Usually we don't narrow down search results by house number when
only a street name is given because there may be a lot of rows
to cross check when the street name is very frequent. However,
when it is known to be rare, the housenumber check may be done
anyway.
Sarah Hoffmann [Sun, 21 Mar 2021 15:47:22 +0000 (16:47 +0100)]
avoid division by zero in progress meter
On Windows systems the timer may not be accurate enough to measure
the time between init() and done(). Avoid computing statistics with
a diff time of 0 in such cases.
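
A minimal sketch of such a guard. The function name `summary` and the message format are hypothetical and not taken from the actual progress meter; the point is only that the rate is computed when the measured time difference is positive.

```python
def summary(total, start, end):
    """Return a final progress line for `total` items.

    `start` and `end` are timestamps in seconds. On systems with a coarse
    timer both may be identical, so the rate is only computed when the
    difference is positive.
    """
    diff = end - start
    if diff <= 0:
        # Timer resolution too low: report the count without statistics.
        return f"Done {total} items"
    return f"Done {total} items in {diff:.1f} s ({total / diff:.1f} per second)"
```

For example, `summary(10, 5.0, 5.0)` reports only the item count, while `summary(10, 0.0, 2.0)` also includes the rate.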
AntoJvlt [Sat, 20 Mar 2021 17:55:08 +0000 (18:55 +0100)]
Ported functions for the import of special phrases from PHP to Python.
- the command is now --import-special-phrases
- the output is no longer an SQL file; data is imported directly into the database.
- the short passage in the documentation (section data import) has been updated.
Sarah Hoffmann [Tue, 16 Mar 2021 21:13:33 +0000 (22:13 +0100)]
bdd: run all setup via nominatim Python library
Drops all calls to PHP utility functions. nominatim cli functions
are used where possible, so that the tests stay as close to the
final code as possible.
By removing the PHP calls, the test code now only uses osm2pgsql and
the database module from the build directory.
Sarah Hoffmann [Thu, 11 Mar 2021 19:34:21 +0000 (20:34 +0100)]
higher penalty for special searches
Adds a general higher penalty for special search terms and an
additional one if the term is anywhere but the beginning or the
end. Also, housenumbers and special searches together are
considered less likely.
Sarah Hoffmann [Thu, 11 Mar 2021 16:14:46 +0000 (17:14 +0100)]
fix result splitting for last search group
When we are in the final iteration of the search groups, it is not
possible to further delay the results. Unconditionally use the
results with the best rank instead.
Sarah Hoffmann [Thu, 11 Mar 2021 14:03:36 +0000 (15:03 +0100)]
give preference to full words in address, too
Full word terms are already preferred for the name part. Adding
only one-word partials to the address makes it impossible to
give a similar preference for the address part. Each term adds
a rank penalty. The problem here is that we interpret the query
forwards and backwards. Having different penalty systems for
name and address means that the same term ends up with different
penalties, and that often leads to interpretations in the wrong
direction getting in the way.
Sarah Hoffmann [Tue, 2 Mar 2021 20:26:13 +0000 (21:26 +0100)]
automatic migration from 3.6 release
Adds an 'admin --migrate' command that checks the current
database version and runs any necessary migrations. Also
includes migrations going back to 3.6.
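
A minimal sketch of how version-gated migrations can be organised. All names here (`MIGRATIONS`, `migration`, `migrate`, the two example migrations) and the four-part version tuples are assumptions for illustration and are not taken from the commit.

```python
# Hypothetical registry of (target version, migration function) pairs.
MIGRATIONS = []


def migration(version):
    """Register a function that brings the database up to `version`."""
    def decorator(func):
        MIGRATIONS.append((version, func))
        return func
    return decorator


@migration((3, 6, 0, 1))
def add_example_index(log):
    # Stand-in for real work such as creating a missing index.
    log.append("add_example_index")


@migration((3, 7, 0, 0))
def add_example_column(log):
    # Stand-in for real work such as adding a column.
    log.append("add_example_column")


def migrate(db_version, log):
    """Run every migration newer than `db_version`, oldest first.

    Returns the version the database is at afterwards.
    """
    for version, func in sorted(MIGRATIONS, key=lambda m: m[0]):
        if version > db_version:
            func(log)
            db_version = version
    return db_version
```

Calling `migrate((3, 6, 0, 0), log)` runs both example migrations in order, while a database already at `(3, 7, 0, 0)` is left untouched.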
Sarah Hoffmann [Thu, 4 Mar 2021 09:55:24 +0000 (10:55 +0100)]
port index creation to python
Also switches to jinja-based preprocessing, which allows
simplifying the SQL files. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
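
A minimal sketch of why 'if not exists' makes the step safe to rerun, again using SQLite from the Python standard library in place of PostgreSQL. The table, column and index names are hypothetical.

```python
import sqlite3


def create_indexes(conn):
    """Create all indexes; existing ones are skipped, missing ones are added."""
    conn.execute("CREATE INDEX IF NOT EXISTS idx_place_name ON place (name)")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_place_rank ON place (address_rank)"
    )


def index_names(conn):
    """Return the names of all indexes in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE place (name TEXT, address_rank INTEGER)")

create_indexes(conn)
conn.execute("DROP INDEX idx_place_rank")  # simulate a missing index
create_indexes(conn)  # the rerun restores it and does not fail on the other
```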
Sarah Hoffmann [Wed, 3 Mar 2021 16:37:22 +0000 (17:37 +0100)]
introduce jinja2 for preprocessing SQL
Replaces various hand-crafted replacements of varying format with
a single Jinja2 templating mechanism. Allows full access to
configuration if necessary.