From: Sarah Hoffmann Date: Sun, 26 Apr 2020 08:20:30 +0000 (+0200) Subject: Merge pull request #1754 from mtmail/nominatim-db-tests-against-postgres X-Git-Tag: v3.5.0~30 X-Git-Url: https://git.openstreetmap.org/nominatim.git/commitdiff_plain/65ee7a80025fae93287960a6cb1f4026f99cd7f3?hp=a5d0657d9b7a7d975082d2d65a20918c1ca5b108 Merge pull request #1754 from mtmail/nominatim-db-tests-against-postgres Nominatim::DB tests against separate postgresql database --- diff --git a/README.md b/README.md index 1ae0dab0..7a75fe93 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -[![Build Status](https://travis-ci.org/openstreetmap/Nominatim.svg?branch=master)](https://travis-ci.org/openstreetmap/Nominatim) +[![Build Status](https://travis-ci.org/osm-search/Nominatim.svg?branch=master)](https://travis-ci.org/osm-search/Nominatim) Nominatim ========= diff --git a/VAGRANT.md b/VAGRANT.md index 0cab24fa..4c8eb724 100644 --- a/VAGRANT.md +++ b/VAGRANT.md @@ -141,7 +141,7 @@ No. Long running Nominatim installations will differ once new import features (o bug fixes) get added since those usually only get applied to new/changed data. Also this document skips the optional Wikipedia data import which affects ranking -of search results. See [Nominatim installation](http://nominatim.org/release-docs/latest/Installation) for details. +of search results. See [Nominatim installation](https://nominatim.org/release-docs/latest/admin/Installation) for details. ##### Why Ubuntu? Can I test CentOS/Fedora/CoreOS/FreeBSD? diff --git a/docs/admin/Advanced-Installations.md b/docs/admin/Advanced-Installations.md new file mode 100644 index 00000000..b22d9a61 --- /dev/null +++ b/docs/admin/Advanced-Installations.md @@ -0,0 +1,109 @@ +# Advanced installations + +This page contains instructions for setting up multiple countries in +your Nominatim database. 
It is assumed that you have already successfully +installed the Nominatim software itself, if not return to the +[installation page](Installation.md). + +## Importing multiple regions + +To import multiple regions in your database, you need to configure and run `utils/import_multiple_regions.sh` file. This script will set up the update directory which has the following structure: + +```bash +update +    ├── europe +    │   ├── andorra +    │   │   └── sequence.state +    │   └── monaco +    │   └── sequence.state +    └── tmp + ├── combined.osm.pbf + └── europe + ├── andorra-latest.osm.pbf + └── monaco-latest.osm.pbf + + +``` + +The `sequence.state` files will contain the sequence ID, which will be used by pyosmium to get updates. The tmp folder is used for import dump. + +### Configuring multiple regions + +The file `import_multiple_regions.sh` needs to be edited as per your requirement: + +1. List of countries. eg: + + COUNTRIES="europe/monaco europe/andorra" + +2. Path to Build directory. eg: + + NOMINATIMBUILD="/srv/nominatim/build" + +3. Path to Update directory. eg: + + UPDATEDIR="/srv/nominatim/update" + +4. Replication URL. eg: + + BASEURL="https://download.geofabrik.de" + DOWNCOUNTRYPOSTFIX="-latest.osm.pbf" + +!!! tip + If your database already exists and you want to add more countries, replace the setting up part + `${SETUPFILE} --osm-file ${UPDATEDIR}/tmp/combined.osm.pbf --all 2>&1` + with `${UPDATEFILE} --import-file ${UPDATEDIR}/tmp/combined.osm.pbf 2>&1`. + +### Setting up multiple regions + +Run the following command from your Nominatim directory after configuring the file. + + bash ./utils/import_multiple_regions.sh + +!!! danger "Important" + This file uses osmium-tool. It must be installed before executing the import script. + Installation instructions can be found [here](https://osmcode.org/osmium-tool/manual.html#installation). 
+ +## Updating multiple regions + +To update multiple regions in your database, you need to configure and run ```utils/update_database.sh```. +This uses the update directory set up while setting up the DB. + +### Configuring multiple regions + +The file `update_database.sh` needs to be edited as per your requirement: + +1. List of countries. eg: + + COUNTRIES="europe/monaco europe/andorra" + +2. Path to Build directory. eg: + + NOMINATIMBUILD="/srv/nominatim/build" + +3. Path to Update directory. eg: + + UPDATEDIR="/srv/nominatim/update" + +4. Replication URL. eg: + + BASEURL="https://download.geofabrik.de" + DOWNCOUNTRYPOSTFIX="-updates" + +5. Followup can be set according to your installation. eg: For Photon, + + FOLLOWUP="curl http://localhost:2322/nominatim-update" + + will handle the indexing. + +### Updating the database + +Run the following command from your Nominatim directory after configuring the file. + + bash ./utils/update_database.sh + +This will get diffs from the replication server, import diffs and index the database. The default replication server in the script ([Geofabrik](https://download.geofabrik.de)) provides daily updates. + +## Verification and further setup + +Instructions for import verification and other details like importing Wikidata can be found in [import and update page](Import-and-Update.md) + diff --git a/docs/admin/Import-and-Update.md b/docs/admin/Import-and-Update.md index 554633ae..0d1bb027 100644 --- a/docs/admin/Import-and-Update.md +++ b/docs/admin/Import-and-Update.md @@ -318,6 +318,5 @@ compatibility reasons, Osmosis is not required to run this - it uses pyosmium behind the scenes.) If you have imported multiple country extracts and want to keep them -up-to-date, have a look at the script in -[issue #60](https://github.com/openstreetmap/Nominatim/issues/60). - +up-to-date, [Advanced installations section](Advanced-Installations.md) contains instructions +to set up and update multiple country extracts. 
\ No newline at end of file diff --git a/docs/api/Search.md b/docs/api/Search.md index 688d7e0c..c18655dc 100644 --- a/docs/api/Search.md +++ b/docs/api/Search.md @@ -92,8 +92,12 @@ comma-separated list of language codes. * `countrycodes=[,][,]...` Limit search results to one or more countries. `` must be the -ISO 3166-1alpha2 code, e.g. `gb` for the United Kingdom, `de` for Germany. +[ISO 3166-1alpha2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) code, +e.g. `gb` for the United Kingdom, `de` for Germany. +Each place in Nominatim is assigned to one country code based +on `admin_level=2` tags, in rare cases to none (for example in +international waters outside any country). * `exclude_place_ids= array('label' => 'Caravan Site', 'frequency' => 183, 'icon' => 'accommodation_caravan_park'), 'amenity:bus_station' => array('label' => 'Bus Station', 'frequency' => 181, 'icon' => 'transport_bus_station'), 'amenity:kindergarten' => array('label' => 'Kindergarten', 'frequency' => 179), - 'highway:construction' => array('label' => 'Construction', 'frequency' => 176), + 'highway:construction' => array('label' => 'Construction', 'frequency' => 176, 'simplelabel' => 'road'), 'amenity:atm' => array('label' => 'Atm', 'frequency' => 172, 'icon' => 'money_atm2'), 'amenity:emergency_phone' => array('label' => 'Emergency Phone', 'frequency' => 164), 'waterway:lock' => array('label' => 'Lock', 'frequency' => 146), diff --git a/lib/Shell.php b/lib/Shell.php new file mode 100644 index 00000000..59c4473b --- /dev/null +++ b/lib/Shell.php @@ -0,0 +1,80 @@ +baseCmd = $sBaseCmd; + $this->aParams = array(); + $this->aEnv = null; // null = use the same environment as the current PHP process + + $this->stdoutString = null; + + foreach ($aParams as $sParam) { + $this->addParams($sParam); + } + } + + public function addParams(...$aParams) + { + foreach ($aParams as $sParam) { + if (isset($sParam) && $sParam !== null && $sParam !== '') { + array_push($this->aParams, $sParam); + } + } + return 
$this; + } + + public function addEnvPair($sKey, $sVal) + { + if (isset($sKey) && $sKey && isset($sVal)) { + if (!isset($this->aEnv)) $this->aEnv = $_ENV; + $this->aEnv = array_merge($this->aEnv, array($sKey => $sVal), $_ENV); + } + return $this; + } + + public function escapedCmd() + { + $aEscaped = array_map(function ($sParam) { + return $this->escapeParam($sParam); + }, array_merge(array($this->baseCmd), $this->aParams)); + + return join(' ', $aEscaped); + } + + public function run() + { + $sCmd = $this->escapedCmd(); + // $aEnv does not need escaping, proc_open seems to handle it fine + + $aFDs = array( + 0 => array('pipe', 'r'), + 1 => STDOUT, + 2 => STDERR + ); + $aPipes = null; + $hProc = @proc_open($sCmd, $aFDs, $aPipes, null, $this->aEnv); + if (!is_resource($hProc)) { + throw new \Exception('Unable to run command: ' . $sCmd); + } + + fclose($aPipes[0]); // no stdin + + $iStat = proc_close($hProc); + return $iStat; + } + + + + private function escapeParam($sParam) + { + if (preg_match('/^-*\w+$/', $sParam)) return $sParam; + return escapeshellarg($sParam); + } +} diff --git a/lib/cmd.php b/lib/cmd.php index 77878c15..72b66608 100644 --- a/lib/cmd.php +++ b/lib/cmd.php @@ -1,5 +1,6 @@ addParams('--port', $aDSNInfo['port']); + $oCmd->addParams('--dbname', $aDSNInfo['database']); if (isset($aDSNInfo['hostspec']) && $aDSNInfo['hostspec']) { - $sCMD .= ' -h ' . escapeshellarg($aDSNInfo['hostspec']); + $oCmd->addParams('--host', $aDSNInfo['hostspec']); } if (isset($aDSNInfo['username']) && $aDSNInfo['username']) { - $sCMD .= ' -U ' . 
escapeshellarg($aDSNInfo['username']); + $oCmd->addParams('--username', $aDSNInfo['username']); } - $aProcEnv = null; - if (isset($aDSNInfo['password']) && $aDSNInfo['password']) { - $aProcEnv = array_merge(array('PGPASSWORD' => $aDSNInfo['password']), $_ENV); + if (isset($aDSNInfo['password'])) { + $oCmd->addEnvPair('PGPASSWORD', $aDSNInfo['password']); } if (!$bVerbose) { - $sCMD .= ' -q'; + $oCmd->addParams('--quiet'); } if ($bfatal && !$bIgnoreErrors) { - $sCMD .= ' -v ON_ERROR_STOP=1'; + $oCmd->addParams('-v', 'ON_ERROR_STOP=1'); } + $aDescriptors = array( 0 => array('pipe', 'r'), 1 => STDOUT, 2 => STDERR ); $ahPipes = null; - $hProcess = @proc_open($sCMD, $aDescriptors, $ahPipes, null, $aProcEnv); + $hProcess = @proc_open($oCmd->escapedCmd(), $aDescriptors, $ahPipes, null, $oCmd->aEnv); if (!is_resource($hProcess)) { fail('unable to start pgsql'); } @@ -193,23 +195,3 @@ function runSQLScript($sScript, $bfatal = true, $bVerbose = false, $bIgnoreError fail("pgsql returned with error code ($iReturn)"); } } - - -function runWithEnv($sCmd, $aEnv) -{ - $aFDs = array( - 0 => array('pipe', 'r'), - 1 => STDOUT, - 2 => STDERR - ); - $aPipes = null; - $hProc = @proc_open($sCmd, $aFDs, $aPipes, null, $aEnv); - if (!is_resource($hProc)) { - fail('unable to run command:' . 
$sCmd); - } - - fclose($aPipes[0]); // no stdin - - $iStat = proc_close($hProc); - return $iStat; -} diff --git a/lib/setup/SetupClass.php b/lib/setup/SetupClass.php index 7c1c628e..385eff70 100755 --- a/lib/setup/SetupClass.php +++ b/lib/setup/SetupClass.php @@ -3,6 +3,7 @@ namespace Nominatim\Setup; require_once(CONST_BasePath.'/lib/setup/AddressLevelParser.php'); +require_once(CONST_BasePath.'/lib/Shell.php'); class SetupFunctions { @@ -51,7 +52,7 @@ class SetupFunctions } // setting member variables based on command line options stored in $aCMDResult - $this->bQuiet = $aCMDResult['quiet']; + $this->bQuiet = isset($aCMDResult['quiet']) && $aCMDResult['quiet']; $this->bVerbose = $aCMDResult['verbose']; //setting default values which are not set by the update.php array @@ -76,7 +77,7 @@ class SetupFunctions $this->bEnableDiffUpdates = false; } - $this->bDrop = $aCMDResult['drop']; + $this->bDrop = isset($aCMDResult['drop']) && $aCMDResult['drop']; } public function createDB() @@ -88,19 +89,23 @@ class SetupFunctions fail('database already exists ('.CONST_Database_DSN.')'); } - $sCreateDBCmd = 'createdb -E UTF-8' - .' -p '.escapeshellarg($this->aDSNInfo['port']) - .' 
'.escapeshellarg($this->aDSNInfo['database']); + $oCmd = (new \Nominatim\Shell('createdb')) + ->addParams('-E', 'UTF-8') + ->addParams('-p', $this->aDSNInfo['port']); + if (isset($this->aDSNInfo['username'])) { - $sCreateDBCmd .= ' -U '.escapeshellarg($this->aDSNInfo['username']); + $oCmd->addParams('-U', $this->aDSNInfo['username']); + } + if (isset($this->aDSNInfo['password'])) { + $oCmd->addEnvPair('PGPASSWORD', $this->aDSNInfo['password']); } - if (isset($this->aDSNInfo['hostspec'])) { - $sCreateDBCmd .= ' -h '.escapeshellarg($this->aDSNInfo['hostspec']); + $oCmd->addParams('-h', $this->aDSNInfo['hostspec']); } + $oCmd->addParams($this->aDSNInfo['database']); - $result = $this->runWithPgEnv($sCreateDBCmd); - if ($result != 0) fail('Error executing external command: '.$sCreateDBCmd); + $result = $oCmd->run(); + if ($result != 0) fail('Error executing external command: '.$oCmd->escapedCmd()); } public function connect() @@ -174,39 +179,49 @@ class SetupFunctions { info('Import data'); - $osm2pgsql = CONST_Osm2pgsql_Binary; - if (!file_exists($osm2pgsql)) { + if (!file_exists(CONST_Osm2pgsql_Binary)) { echo "Check CONST_Osm2pgsql_Binary in your local settings file.\n"; echo "Normally you should not need to set this manually.\n"; - fail("osm2pgsql not found in '$osm2pgsql'"); + fail("osm2pgsql not found in '".CONST_Osm2pgsql_Binary."'"); } - $osm2pgsql .= ' -S '.escapeshellarg(CONST_Import_Style); + $oCmd = new \Nominatim\Shell(CONST_Osm2pgsql_Binary); + $oCmd->addParams('--style', CONST_Import_Style); if (!is_null(CONST_Osm2pgsql_Flatnode_File) && CONST_Osm2pgsql_Flatnode_File) { - $osm2pgsql .= ' --flat-nodes '.escapeshellarg(CONST_Osm2pgsql_Flatnode_File); - } - - if (CONST_Tablespace_Osm2pgsql_Data) - $osm2pgsql .= ' --tablespace-slim-data '.escapeshellarg(CONST_Tablespace_Osm2pgsql_Data); - if (CONST_Tablespace_Osm2pgsql_Index) - $osm2pgsql .= ' --tablespace-slim-index '.escapeshellarg(CONST_Tablespace_Osm2pgsql_Index); - if (CONST_Tablespace_Place_Data) - 
$osm2pgsql .= ' --tablespace-main-data '.escapeshellarg(CONST_Tablespace_Place_Data); - if (CONST_Tablespace_Place_Index) - $osm2pgsql .= ' --tablespace-main-index '.escapeshellarg(CONST_Tablespace_Place_Index); - $osm2pgsql .= ' -lsc -O gazetteer --hstore --number-processes 1'; - $osm2pgsql .= ' -C '.escapeshellarg($this->iCacheMemory); - $osm2pgsql .= ' -P '.escapeshellarg($this->aDSNInfo['port']); + $oCmd->addParams('--flat-nodes', CONST_Osm2pgsql_Flatnode_File); + } + if (CONST_Tablespace_Osm2pgsql_Data) { + $oCmd->addParams('--tablespace-slim-data', CONST_Tablespace_Osm2pgsql_Data); + } + if (CONST_Tablespace_Osm2pgsql_Index) { + $oCmd->addParams('--tablespace-slim-index', CONST_Tablespace_Osm2pgsql_Index); + } + if (CONST_Tablespace_Place_Data) { + $oCmd->addParams('--tablespace-main-data', CONST_Tablespace_Place_Data); + } + if (CONST_Tablespace_Place_Index) { + $oCmd->addParams('--tablespace-main-index', CONST_Tablespace_Place_Index); + } + $oCmd->addParams('--latlong', '--slim', '--create'); + $oCmd->addParams('--output', 'gazetteer'); + $oCmd->addParams('--hstore'); + $oCmd->addParams('--number-processes', 1); + $oCmd->addParams('--cache', $this->iCacheMemory); + $oCmd->addParams('--port', $this->aDSNInfo['port']); + if (isset($this->aDSNInfo['username'])) { - $osm2pgsql .= ' -U '.escapeshellarg($this->aDSNInfo['username']); + $oCmd->addParams('--username', $this->aDSNInfo['username']); + } + if (isset($this->aDSNInfo['password'])) { + $oCmd->addEnvPair('PGPASSWORD', $this->aDSNInfo['password']); } if (isset($this->aDSNInfo['hostspec'])) { - $osm2pgsql .= ' -H '.escapeshellarg($this->aDSNInfo['hostspec']); + $oCmd->addParams('--host', $this->aDSNInfo['hostspec']); } - $osm2pgsql .= ' -d '.escapeshellarg($this->aDSNInfo['database']).' 
'.escapeshellarg($sOSMFile); - - $this->runWithPgEnv($osm2pgsql); + $oCmd->addParams('--database', $this->aDSNInfo['database']); + $oCmd->addParams($sOSMFile); + $oCmd->run(); if (!$this->sIgnoreErrors && !$this->oDB->getRow('select * from place limit 1')) { fail('No Data'); @@ -529,39 +544,48 @@ class SetupFunctions public function index($bIndexNoanalyse) { - $sBaseCmd = CONST_BasePath.'/nominatim/nominatim.py' - .' -d '.escapeshellarg($this->aDSNInfo['database']) - .' -P '.escapeshellarg($this->aDSNInfo['port']) - .' -t '.escapeshellarg($this->iInstances); + $oBaseCmd = (new \Nominatim\Shell(CONST_BasePath.'/nominatim/nominatim.py')) + ->addParams('--database', $this->aDSNInfo['database']) + ->addParams('--port', $this->aDSNInfo['port']) + ->addParams('--threads', $this->iInstances); + if (!$this->bQuiet) { - $sBaseCmd .= ' -v'; + $oBaseCmd->addParams('-v'); } if ($this->bVerbose) { - $sBaseCmd .= ' -v'; + $oBaseCmd->addParams('-v'); } if (isset($this->aDSNInfo['hostspec'])) { - $sBaseCmd .= ' -H '.escapeshellarg($this->aDSNInfo['hostspec']); + $oBaseCmd->addParams('--host', $this->aDSNInfo['hostspec']); } if (isset($this->aDSNInfo['username'])) { - $sBaseCmd .= ' -U '.escapeshellarg($this->aDSNInfo['username']); + $oBaseCmd->addParams('--user', $this->aDSNInfo['username']); + } + if (isset($this->aDSNInfo['password'])) { + $oBaseCmd->addEnvPair('PGPASSWORD', $this->aDSNInfo['password']); } info('Index ranks 0 - 4'); - $iStatus = $this->runWithPgEnv($sBaseCmd.' -R 4'); + $oCmd = (clone $oBaseCmd)->addParams('--maxrank', 4); + echo $oCmd->escapedCmd(); + + $iStatus = $oCmd->run(); if ($iStatus != 0) { fail('error status ' . $iStatus . ' running nominatim!'); } if (!$bIndexNoanalyse) $this->pgsqlRunScript('ANALYSE'); info('Index ranks 5 - 25'); - $iStatus = $this->runWithPgEnv($sBaseCmd.' -r 5 -R 25'); + $oCmd = (clone $oBaseCmd)->addParams('--minrank', 5, '--maxrank', 25); + $iStatus = $oCmd->run(); if ($iStatus != 0) { fail('error status ' . $iStatus . 
' running nominatim!'); } if (!$bIndexNoanalyse) $this->pgsqlRunScript('ANALYSE'); info('Index ranks 26 - 30'); - $iStatus = $this->runWithPgEnv($sBaseCmd.' -r 26'); + $oCmd = (clone $oBaseCmd)->addParams('--minrank', 26); + $iStatus = $oCmd->run(); if ($iStatus != 0) { fail('error status ' . $iStatus . ' running nominatim!'); } @@ -753,21 +777,21 @@ class SetupFunctions { if (!file_exists($sFilename)) fail('unable to find '.$sFilename); - $sCMD = 'psql' - .' -p '.escapeshellarg($this->aDSNInfo['port']) - .' -d '.escapeshellarg($this->aDSNInfo['database']); + $oCmd = (new \Nominatim\Shell('psql')) + ->addParams('--port', $this->aDSNInfo['port']) + ->addParams('--dbname', $this->aDSNInfo['database']); + if (!$this->bVerbose) { - $sCMD .= ' -q'; + $oCmd->addParams('--quiet'); } if (isset($this->aDSNInfo['hostspec'])) { - $sCMD .= ' -h '.escapeshellarg($this->aDSNInfo['hostspec']); + $oCmd->addParams('--host', $this->aDSNInfo['hostspec']); } if (isset($this->aDSNInfo['username'])) { - $sCMD .= ' -U '.escapeshellarg($this->aDSNInfo['username']); + $oCmd->addParams('--username', $this->aDSNInfo['username']); } - $aProcEnv = null; if (isset($this->aDSNInfo['password'])) { - $aProcEnv = array_merge(array('PGPASSWORD' => $this->aDSNInfo['password']), $_ENV); + $oCmd->addEnvPair('PGPASSWORD', $this->aDSNInfo['password']); } $ahGzipPipes = null; if (preg_match('/\\.gz$/', $sFilename)) { @@ -776,12 +800,14 @@ class SetupFunctions 1 => array('pipe', 'w'), 2 => array('file', '/dev/null', 'a') ); - $hGzipProcess = proc_open('zcat '.escapeshellarg($sFilename), $aDescriptors, $ahGzipPipes); + $oZcatCmd = new \Nominatim\Shell('zcat', $sFilename); + + $hGzipProcess = proc_open($oZcatCmd->escapedCmd(), $aDescriptors, $ahGzipPipes); if (!is_resource($hGzipProcess)) fail('unable to start zcat'); $aReadPipe = $ahGzipPipes[1]; fclose($ahGzipPipes[0]); } else { - $sCMD .= ' -f '.escapeshellarg($sFilename); + $oCmd->addParams('--file', $sFilename); $aReadPipe = array('pipe', 'r'); } 
$aDescriptors = array( @@ -790,7 +816,8 @@ class SetupFunctions 2 => array('file', '/dev/null', 'a') ); $ahPipes = null; - $hProcess = proc_open($sCMD, $aDescriptors, $ahPipes, null, $aProcEnv); + + $hProcess = proc_open($oCmd->escapedCmd(), $aDescriptors, $ahPipes, null, $oCmd->aEnv); if (!is_resource($hProcess)) fail('unable to start pgsql'); // TODO: error checking while (!feof($ahPipes[1])) { @@ -831,21 +858,6 @@ class SetupFunctions return $sSql; } - private function runWithPgEnv($sCmd) - { - if ($this->bVerbose) { - echo "Execute: $sCmd\n"; - } - - $aProcEnv = null; - - if (isset($this->aDSNInfo['password'])) { - $aProcEnv = array_merge(array('PGPASSWORD' => $this->aDSNInfo['password']), $_ENV); - } - - return runWithEnv($sCmd, $aProcEnv); - } - /** * Drop table with the given name if it exists. * diff --git a/nominatim/nominatim.py b/nominatim/nominatim.py index 14643770..0db0777d 100755 --- a/nominatim/nominatim.py +++ b/nominatim/nominatim.py @@ -304,7 +304,7 @@ class Indexer(object): else: ready, _, _ = select.select(self.threads, [], []) - assert(False, "Unreachable code") + assert False, "Unreachable code" def nominatim_arg_parser(): diff --git a/settings/address-levels.json b/settings/address-levels.json index 9f32fc98..10cbf307 100644 --- a/settings/address-levels.json +++ b/settings/address-levels.json @@ -63,7 +63,11 @@ "sea" : [4, 0] }, "waterway" : { - "" : [17, 0] + "river" : [19, 0], + "stream" : [22, 0], + "ditch" : [22, 0], + "drain" : [22, 0], + "" : [20, 0] }, "highway" : { "" : 26, diff --git a/test/bdd/api/reverse/addressdetails.feature b/test/bdd/api/reverse/addressdetails.feature new file mode 100644 index 00000000..5aa3846b --- /dev/null +++ b/test/bdd/api/reverse/addressdetails.feature @@ -0,0 +1,10 @@ +@APIDB +Feature: Reverse addressdetails + Tests for addressdetails in reverse queries + + #github #1763 + Scenario: Correct translation of highways under construction + When sending jsonv2 reverse coordinates -34.0290514,-53.5832235 + 
Then result addresses contain + | road | + | Ruta 9 Coronel Leonardo Olivera | diff --git a/test/php/Nominatim/ShellTest.php b/test/php/Nominatim/ShellTest.php new file mode 100644 index 00000000..d0222ee1 --- /dev/null +++ b/test/php/Nominatim/ShellTest.php @@ -0,0 +1,120 @@ +expectException('ArgumentCountError'); + $this->expectExceptionMessage('Too few arguments to function'); + $oCmd = new \Nominatim\Shell(); + + + $oCmd = new \Nominatim\Shell('wc', '-l', 'file.txt'); + $this->assertSame( + "wc -l 'file.txt'", + $oCmd->escapedCmd() + ); + } + + public function testaddParams() + { + $oCmd = new \Nominatim\Shell('grep'); + $oCmd->addParams('-a', 'abc') + ->addParams(10); + + $this->assertSame( + 'grep -a abc 10', + $oCmd->escapedCmd(), + 'no escaping needed, chained' + ); + + $oCmd = new \Nominatim\Shell('grep'); + $oCmd->addParams(); + $oCmd->addParams(null); + $oCmd->addParams(''); + + $this->assertEmpty($oCmd->aParams); + $this->assertSame('grep', $oCmd->escapedCmd(), 'empty params'); + + $oCmd = new \Nominatim\Shell('echo', '-n', 0); + $this->assertSame( + 'echo -n 0', + $oCmd->escapedCmd(), + 'zero param' + ); + + $oCmd = new \Nominatim\Shell('/path with space/do.php'); + $oCmd->addParams('-a', ' b '); + $oCmd->addParams('--flag'); + $oCmd->addParams('two words'); + $oCmd->addParams('v=1'); + + $this->assertSame( + "'/path with space/do.php' -a ' b ' --flag 'two words' 'v=1'", + $oCmd->escapedCmd(), + 'escape whitespace' + ); + + $oCmd = new \Nominatim\Shell('grep'); + $oCmd->addParams(';', '|more&', '2>&1'); + + $this->assertSame( + "grep ';' '|more&' '2>&1'", + $oCmd->escapedCmd(), + 'escape shell characters' + ); + } + + public function testaddEnvPair() + { + $oCmd = new \Nominatim\Shell('date'); + + $oCmd->addEnvPair('one', 'two words') + ->addEnvPair('null', null) + ->addEnvPair(null, 'null') + ->addEnvPair('empty', '') + ->addEnvPair('', 'empty'); + + $this->assertEquals( + array('one' => 'two words', 'empty' => ''), + $oCmd->aEnv + ); + + 
$oCmd->addEnvPair('one', 'overwrite'); + $this->assertEquals( + array('one' => 'overwrite', 'empty' => ''), + $oCmd->aEnv + ); + } + + public function testClone() + { + $oCmd = new \Nominatim\Shell('wc', '-l', 'file.txt'); + $oCmd2 = clone $oCmd; + $oCmd->addParams('--flag'); + $oCmd2->addParams('--flag2'); + + $this->assertSame( + "wc -l 'file.txt' --flag", + $oCmd->escapedCmd() + ); + + $this->assertSame( + "wc -l 'file.txt' --flag2", + $oCmd2->escapedCmd() + ); + } + + public function testRun() + { + $oCmd = new \Nominatim\Shell('echo'); + + $this->assertSame(0, $oCmd->run()); + + // var_dump($sStdout); + } +} diff --git a/utils/import_multiple_regions.sh b/utils/import_multiple_regions.sh new file mode 100644 index 00000000..83323c2e --- /dev/null +++ b/utils/import_multiple_regions.sh @@ -0,0 +1,91 @@ +#!/bin/bash -xv + +# Script to set up Nominatim database for multiple countries + +# Steps to follow: + +# *) Get the pbf files from server + +# *) Set up sequence.state for updates + +# *) Merge the pbf files into a single file. + +# *) Setup nominatim db using 'setup.php --osm-file' + +# Hint: +# +# Use "bashdb ./update_database.sh" and bashdb's "next" command for step-by-step +# execution. 
+ +# ****************************************************************************** + +touch2() { mkdir -p "$(dirname "$1")" && touch "$1" ; } + +# ****************************************************************************** +# Configuration section: Variables in this section should be set according to your requirements + +# REPLACE WITH LIST OF YOUR "COUNTRIES": + +COUNTRIES="europe/monaco europe/andorra" + +# SET TO YOUR NOMINATIM build FOLDER PATH: + +NOMINATIMBUILD="/srv/nominatim/build" +SETUPFILE="$NOMINATIMBUILD/utils/setup.php" +UPDATEFILE="$NOMINATIMBUILD/utils/update.php" + +# SET TO YOUR update FOLDER PATH: + +UPDATEDIR="/srv/nominatim/update" + +# SET TO YOUR replication server URL: + +BASEURL="https://download.geofabrik.de" +DOWNCOUNTRYPOSTFIX="-latest.osm.pbf" + +# End of configuration section +# ****************************************************************************** + +COMBINEFILES="osmium merge" + +mkdir -p ${UPDATEDIR} +cd ${UPDATEDIR} +rm -rf tmp +mkdir -p tmp +cd tmp + +for COUNTRY in $COUNTRIES; +do + + echo "====================================================================" + echo "$COUNTRY" + echo "====================================================================" + DIR="$UPDATEDIR/$COUNTRY" + FILE="$DIR/configuration.txt" + DOWNURL="$BASEURL/$COUNTRY$DOWNCOUNTRYPOSTFIX" + IMPORTFILE=$COUNTRY$DOWNCOUNTRYPOSTFIX + IMPORTFILEPATH=${UPDATEDIR}/tmp/${IMPORTFILE} + FILENAME=${COUNTRY//[\/]/_} + + + touch2 $IMPORTFILEPATH + wget ${DOWNURL} -O $IMPORTFILEPATH + + touch2 ${DIR}/sequence.state + pyosmium-get-changes -O $IMPORTFILEPATH -f ${DIR}/sequence.state -v + + COMBINEFILES="${COMBINEFILES} ${IMPORTFILEPATH}" + echo $IMPORTFILE + echo "====================================================================" +done + + +echo "${COMBINEFILES} -o combined.osm.pbf" +${COMBINEFILES} -o combined.osm.pbf + +echo "====================================================================" +echo "Setting up nominatim db" +${SETUPFILE} --osm-file 
${UPDATEDIR}/tmp/combined.osm.pbf --all 2>&1 + +# ${UPDATEFILE} --import-file ${UPDATEDIR}/tmp/combined.osm.pbf 2>&1 +echo "====================================================================" \ No newline at end of file diff --git a/utils/update.php b/utils/update.php index c1dc2ab9..d03cbed6 100644 --- a/utils/update.php +++ b/utils/update.php @@ -65,30 +65,52 @@ if ($iCacheMemory + 500 > getTotalMemoryMB()) { $iCacheMemory = getCacheMemoryMB(); echo "WARNING: resetting cache memory to $iCacheMemory\n"; } -$sOsm2pgsqlCmd = CONST_Osm2pgsql_Binary.' -klas --number-processes 1 -C '.$iCacheMemory.' -O gazetteer -S '.CONST_Import_Style.' -d '.$aDSNInfo['database'].' -P '.$aDSNInfo['port']; -if (isset($aDSNInfo['username']) && $aDSNInfo['username']) { - $sOsm2pgsqlCmd .= ' -U ' . $aDSNInfo['username']; -} + +$oOsm2pgsqlCmd = (new \Nominatim\Shell(CONST_Osm2pgsql_Binary)) + ->addParams('--hstore') + ->addParams('--latlong') + ->addParams('--append') + ->addParams('--slim') + ->addParams('--number-processes', 1) + ->addParams('--cache', $iCacheMemory) + ->addParams('--output', 'gazetteer') + ->addParams('--style', CONST_Import_Style) + ->addParams('--database', $aDSNInfo['database']) + ->addParams('--port', $aDSNInfo['port']); + if (isset($aDSNInfo['hostspec']) && $aDSNInfo['hostspec']) { - $sOsm2pgsqlCmd .= ' -H ' . 
$aDSNInfo['hostspec']; + $oOsm2pgsqlCmd->addParams('--host', $aDSNInfo['hostspec']); +} +if (isset($aDSNInfo['username']) && $aDSNInfo['username']) { + $oOsm2pgsqlCmd->addParams('--user', $aDSNInfo['username']); } -$aProcEnv = null; if (isset($aDSNInfo['password']) && $aDSNInfo['password']) { - $aProcEnv = array_merge(array('PGPASSWORD' => $aDSNInfo['password']), $_ENV); + $oOsm2pgsqlCmd->addEnvPair('PGPASSWORD', $aDSNInfo['password']); } - if (!is_null(CONST_Osm2pgsql_Flatnode_File) && CONST_Osm2pgsql_Flatnode_File) { - $sOsm2pgsqlCmd .= ' --flat-nodes '.CONST_Osm2pgsql_Flatnode_File; + $oOsm2pgsqlCmd->addParams('--flat-nodes', CONST_Osm2pgsql_Flatnode_File); } -$sIndexCmd = CONST_BasePath.'/nominatim/nominatim.py'; -if (!$aResult['quiet']) { - $sIndexCmd .= ' -v'; -} + +$oIndexCmd = (new \Nominatim\Shell(CONST_BasePath.'/nominatim/nominatim.py')) + ->addParams('--database', $aDSNInfo['database']) + ->addParams('--port', $aDSNInfo['port']) + ->addParams('--threads', $aResult['index-instances']); + if ($aResult['verbose']) { - $sIndexCmd .= ' -v'; + $oIndexCmd->addParams('--verbose'); +} +if (isset($aDSNInfo['hostspec']) && $aDSNInfo['hostspec']) { + $oIndexCmd->addParams('--host', $aDSNInfo['hostspec']); +} +if (isset($aDSNInfo['username']) && $aDSNInfo['username']) { + $oIndexCmd->addParams('--username', $aDSNInfo['username']); +} +if (isset($aDSNInfo['password']) && $aDSNInfo['password']) { + $oIndexCmd->addEnvPair('PGPASSWORD', $aDSNInfo['password']); } + if ($aResult['init-updates']) { // sanity check that the replication URL is correct $sBaseState = file_get_contents(CONST_Replication_Url.'/state.txt'); @@ -104,9 +126,11 @@ if ($aResult['init-updates']) { echo "in your local settings file.\n\n"; fail('CONST_Pyosmium_Binary not configured'); } + $aOutput = 0; - $sCmd = CONST_Pyosmium_Binary.' 
--help';
-    exec($sCmd, $aOutput, $iRet);
+    $oCMD = new \Nominatim\Shell(CONST_Pyosmium_Binary, '--help');
+    exec($oCMD->escapedCmd(), $aOutput, $iRet);
+
     if ($iRet != 0) {
         echo "Cannot execute pyosmium-get-changes.\n";
         echo "Make sure you have pyosmium installed correctly\n";
@@ -132,8 +156,11 @@ if ($aResult['init-updates']) {
 
     // get the appropriate state id
     $aOutput = 0;
-    $sCmd = CONST_Pyosmium_Binary.' -D '.$sWindBack.' --server '.CONST_Replication_Url;
-    exec($sCmd, $aOutput, $iRet);
+    $oCMD = (new \Nominatim\Shell(CONST_Pyosmium_Binary))
+            ->addParams('--start-date', $sWindBack)
+            ->addParams('--server', CONST_Replication_Url);
+
+    exec($oCMD->escapedCmd(), $aOutput, $iRet);
     if ($iRet != 0 || $aOutput[0] == 'None') {
         fail('Error running pyosmium tools');
     }
@@ -158,7 +185,11 @@ if ($aResult['check-for-updates']) {
         fail('Updates not set up. Please run ./utils/update.php --init-updates.');
     }
 
-    system(CONST_BasePath.'/utils/check_server_for_updates.py '.CONST_Replication_Url.' '.$aLastState['sequence_id'], $iRet);
+    $oCmd = (new \Nominatim\Shell(CONST_BasePath.'/utils/check_server_for_updates.py'))
+            ->addParams(CONST_Replication_Url)
+            ->addParams($aLastState['sequence_id']);
+    $iRet = $oCmd->run();
+
     exit($iRet);
 }
 
@@ -171,12 +202,12 @@ if (isset($aResult['import-diff']) || isset($aResult['import-file'])) {
     }
 
     // Import the file
-    $sCMD = $sOsm2pgsqlCmd.' '.$sNextFile;
-    echo $sCMD."\n";
-    $iErrorLevel = runWithEnv($sCMD, $aProcEnv);
+    $oCMD = (clone $oOsm2pgsqlCmd)->addParams($sNextFile);
+    echo $oCMD->escapedCmd()."\n";
+    $iRet = $oCMD->run();
 
-    if ($iErrorLevel) {
-        fail("Error from osm2pgsql, $iErrorLevel\n");
+    if ($iRet) {
+        fail("Error from osm2pgsql, $iRet\n");
     }
 
     // Don't update the import status - we don't know what this file contains
@@ -223,11 +254,13 @@ if ($sContentURL) {
 
     if ($bHaveDiff) {
         // import generated change file
-        $sCMD = $sOsm2pgsqlCmd.' '.$sTemporaryFile;
-        echo $sCMD."\n";
-        $iErrorLevel = runWithEnv($sCMD, $aProcEnv);
-        if ($iErrorLevel) {
-            fail("osm2pgsql exited with error level $iErrorLevel\n");
+
+        $oCMD = (clone $oOsm2pgsqlCmd)->addParams($sTemporaryFile);
+        echo $oCMD->escapedCmd()."\n";
+
+        $iRet = $oCMD->run();
+        if ($iRet) {
+            fail("osm2pgsql exited with error level $iRet\n");
         }
     }
 
@@ -310,19 +343,11 @@ if ($aResult['recompute-word-counts']) {
 }
 
 if ($aResult['index']) {
-    $sCmd = $sIndexCmd
-            .' -d '.$aDSNInfo['database']
-            .' -P '.$aDSNInfo['port']
-            .' -t '.$aResult['index-instances']
-            .' -r '.$aResult['index-rank'];
-    if (isset($aDSNInfo['hostspec']) && $aDSNInfo['hostspec']) {
-        $sCmd .= ' -H ' . $aDSNInfo['hostspec'];
-    }
-    if (isset($aDSNInfo['username']) && $aDSNInfo['username']) {
-        $sCmd .= ' -U ' . $aDSNInfo['username'];
-    }
+    $oCmd = (clone $oIndexCmd)
+            ->addParams('--minrank', $aResult['index-rank']);
 
-    runWithEnv($sCmd, $aProcEnv);
+    // echo $oCmd->escapedCmd()."\n";
+    $oCmd->run();
 
     $oDB->exec('update import_status set indexed = true');
 }
@@ -354,22 +379,17 @@ if ($aResult['import-osmosis'] || $aResult['import-osmosis-all']) {
     //
     if (strpos(CONST_Replication_Url, 'download.geofabrik.de') !== false && CONST_Replication_Update_Interval < 86400) {
         fail('Error: Update interval too low for download.geofabrik.de. ' .
-        "Please check install documentation (http://nominatim.org/release-docs/latest/Import-and-Update#setting-up-the-update-process)\n");
+        "Please check install documentation (https://nominatim.org/release-docs/latest/admin/Import-and-Update#setting-up-the-update-process)\n");
     }
 
     $sImportFile = CONST_InstallPath.'/osmosischange.osc';
-    $sCMDDownload = CONST_Pyosmium_Binary.' --server '.CONST_Replication_Url.' -o '.$sImportFile.' -s '.CONST_Replication_Max_Diff_size;
-    $sCMDImport = $sOsm2pgsqlCmd.' '.$sImportFile;
-    $sCMDIndex = $sIndexCmd
-            .' -d '.$aDSNInfo['database']
-            .' -P '.$aDSNInfo['port']
-            .' -t '.$aResult['index-instances'];
-    if (isset($aDSNInfo['hostspec']) && $aDSNInfo['hostspec']) {
-        $sCMDIndex .= ' -H ' . $aDSNInfo['hostspec'];
-    }
-    if (isset($aDSNInfo['username']) && $aDSNInfo['username']) {
-        $sCMDIndex .= ' -U ' . $aDSNInfo['username'];
-    }
+
+    $oCMDDownload = (new \Nominatim\Shell(CONST_Pyosmium_Binary))
+                    ->addParams('--server', CONST_Replication_Url)
+                    ->addParams('--outfile', $sImportFile)
+                    ->addParams('--size', CONST_Replication_Max_Diff_size);
+
+    $oCMDImport = (clone $oOsm2pgsqlCmd)->addParams($sImportFile);
 
     while (true) {
         $fStartTime = time();
@@ -399,11 +419,13 @@ if ($aResult['import-osmosis'] || $aResult['import-osmosis-all']) {
         $fCMDStartTime = time();
         $iNextSeq = (int) $aLastState['sequence_id'];
         unset($aOutput);
-        echo "$sCMDDownload -I $iNextSeq\n";
+
+        $oCMD = (clone $oCMDDownload)->addParams('--start-id', $iNextSeq);
+        echo $oCMD->escapedCmd()."\n";
         if (file_exists($sImportFile)) {
             unlink($sImportFile);
         }
-        exec($sCMDDownload.' -I '.$iNextSeq, $aOutput, $iResult);
+        exec($oCMD->escapedCmd(), $aOutput, $iResult);
 
         if ($iResult == 3) {
             echo 'No new updates. Sleeping for '.CONST_Replication_Recheck_Interval." sec.\n";
@@ -419,7 +441,8 @@ if ($aResult['import-osmosis'] || $aResult['import-osmosis-all']) {
         // get the newest object from the diff file
         $sBatchEnd = 0;
         $iRet = 0;
-        exec(CONST_BasePath.'/utils/osm_file_date.py '.$sImportFile, $sBatchEnd, $iRet);
+        $oCMD = new \Nominatim\Shell(CONST_BasePath.'/utils/osm_file_date.py', $sImportFile);
+        exec($oCMD->escapedCmd(), $sBatchEnd, $iRet);
         if ($iRet == 5) {
             echo "Diff file is empty. skipping import.\n";
             if (!$aResult['import-osmosis-all']) {
@@ -435,9 +458,11 @@ if ($aResult['import-osmosis'] || $aResult['import-osmosis-all']) {
 
         // Import the file
         $fCMDStartTime = time();
-        echo $sCMDImport."\n";
+
+
+        echo $oCMDImport->escapedCmd()."\n";
         unset($sJunk);
-        $iErrorLevel = runWithEnv($sCMDImport, $aProcEnv);
+        $iErrorLevel = $oCMDImport->run();
         if ($iErrorLevel) {
             echo "Error executing osm2pgsql: $iErrorLevel\n";
             exit($iErrorLevel);
@@ -462,11 +487,11 @@ if ($aResult['import-osmosis'] || $aResult['import-osmosis-all']) {
 
         // Index file
         if (!$aResult['no-index']) {
-            $sThisIndexCmd = $sCMDIndex;
+            $oThisIndexCmd = clone($oIndexCmd);
             $fCMDStartTime = time();
 
-            echo "$sThisIndexCmd\n";
-            $iErrorLevel = runWithEnv($sThisIndexCmd, $aProcEnv);
+            echo $oThisIndexCmd->escapedCmd()."\n";
+            $iErrorLevel = $oThisIndexCmd->run();
             if ($iErrorLevel) {
                 echo "Error: $iErrorLevel\n";
                 exit($iErrorLevel);
diff --git a/utils/update_database.sh b/utils/update_database.sh
new file mode 100644
index 00000000..75d0de5d
--- /dev/null
+++ b/utils/update_database.sh
@@ -0,0 +1,80 @@
+#!/bin/bash -xv
+
+# Derived from https://gist.github.com/RhinoDevel/8a35ebd2a08166f328eca01ab005c6de and edited to work with Pyosmium
+# Related to https://github.com/osm-search/Nominatim/issues/1683
+
+# Steps being followed:
+
+# *) Get the diff file from server
+#     1) pyosmium-get-changes (with -f sequence.state for getting sequenceNumber)
+
+# *) Import diff
+#     1) utils/update.php --import-diff
+
+# *) Index for all the countries at the end
+
+# Hint:
+#
+# Use "bashdb ./update_database.sh" and bashdb's "next" command for step-by-step
+# execution.
+
+# ******************************************************************************
+
+# REPLACE WITH LIST OF YOUR "COUNTRIES":
+#
+
+
+COUNTRIES="europe/monaco europe/andorra"
+
+# SET TO YOUR NOMINATIM build FOLDER PATH:
+#
+NOMINATIMBUILD="/srv/nominatim/build"
+UPDATEFILE="$NOMINATIMBUILD/utils/update.php"
+
+# SET TO YOUR update data FOLDER PATH:
+#
+UPDATEDIR="/srv/nominatim/update"
+
+UPDATEBASEURL="https://download.geofabrik.de"
+UPDATECOUNTRYPOSTFIX="-updates"
+
+# If you do not use Photon, let Nominatim handle (re-)indexing:
+#
+FOLLOWUP="$UPDATEFILE --index"
+#
+# If you use Photon, update Photon and let it handle the index
+# (Photon server must be running and must have been started with "-database",
+# "-user" and "-password" parameters):
+#
+#FOLLOWUP="curl http://localhost:2322/nominatim-update"
+
+# ******************************************************************************
+
+
+for COUNTRY in $COUNTRIES;
+do
+
+    echo "===================================================================="
+    echo "$COUNTRY"
+    echo "===================================================================="
+    DIR="$UPDATEDIR/$COUNTRY"
+    FILE="$DIR/sequence.state"
+    BASEURL="$UPDATEBASEURL/$COUNTRY$UPDATECOUNTRYPOSTFIX"
+    FILENAME=${COUNTRY//[\/]/_}
+
+    # mkdir -p ${DIR}
+    cd ${DIR}
+
+    echo "Attempting to get changes"
+    pyosmium-get-changes -o ${DIR}/${FILENAME}.osc.gz -f ${FILE} --server $BASEURL -v
+
+    echo "Attempting to import diffs"
+    ${NOMINATIMBUILD}/utils/update.php --import-diff ${DIR}/${FILENAME}.osc.gz
+    rm ${DIR}/${FILENAME}.osc.gz
+
+done
+
+echo "===================================================================="
+echo "Reindexing"
+${FOLLOWUP}
+echo "===================================================================="
\ No newline at end of file
diff --git a/website/lookup.php b/website/lookup.php
index 39a17ebd..7675ae13 100644
--- a/website/lookup.php
+++ b/website/lookup.php
@@ -81,5 +81,7 @@ $bShowPolygons = '';
 $aExcludePlaceIDs = array();
 $sMoreURL = '';
 
+logEnd($oDB, $hLog, 1);
+
 $sOutputTemplate = ($sOutputFormat == 'jsonv2') ? 'json' : $sOutputFormat;
 include(CONST_BasePath.'/lib/template/search-'.$sOutputTemplate.'.php');