From affe1300d9941c87b21c2bcadfdf0803247d5531 Mon Sep 17 00:00:00 2001
From: Sarah Hoffmann
Date: Sun, 4 Jul 2021 10:44:58 +0200
Subject: [PATCH] add warning about experimental nature of ICU tokenizer

---
 docs/admin/Tokenizers.md | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/docs/admin/Tokenizers.md b/docs/admin/Tokenizers.md
index a4d6aa0d..782d50b8 100644
--- a/docs/admin/Tokenizers.md
+++ b/docs/admin/Tokenizers.md
@@ -9,11 +9,11 @@ different configuration options. This sections describes the tokenizers and how
 they can be configured.
 
 !!! important
-The use of a tokenizer is tied to a database installation. You need to choose
-and configure the tokenizer before starting the initial import. Once the import
-is done, you cannot switch to another tokenizer anymore. Reconfiguring the
-chosen tokenizer is very limited as well. See the comments in each tokenizer
-section.
+    The use of a tokenizer is tied to a database installation. You need to choose
+    and configure the tokenizer before starting the initial import. Once the import
+    is done, you cannot switch to another tokenizer anymore. Reconfiguring the
+    chosen tokenizer is very limited as well. See the comments in each tokenizer
+    section.
 
 ## Legacy tokenizer
 
@@ -44,6 +44,10 @@ normalization functions are hard-coded.
 
 ## ICU tokenizer
 
+!!! danger
+    This tokenizer is currently in active development and still subject
+    to backwards-incompatible changes.
+
 The ICU tokenizer uses the [ICU library](http://site.icu-project.org/) to
 normalize names and queries. It also offers configurable decomposition and
 abbreviation handling.
-- 
2.43.2