A thesaurus dictionary (sometimes abbreviated as TZ) is a collection of words that include relationships between words and phrases, such as broader terms (BT), narrower terms (NT), preferred terms, non-preferred terms, and related terms. A thesaurus dictionary replaces all non-preferred terms by one preferred term and, optionally, preserves the original terms for indexing as well. A thesaurus dictionary is an extension of the synonym dictionary with added phrase support.
1 2 | supernovae stars : sn crab nebulae : crab |
Run the following statement to create the TZ:
// Hard-coded or plaintext AK and SK are risky. For security purposes, encrypt your AK and SK and store them in the configuration file or environment variables.
1 2 3 4 5 6 | CREATE TEXT SEARCH DICTIONARY thesaurus_astro ( TEMPLATE = thesaurus, DictFile = thesaurus_astro, Dictionary = pg_catalog.english_stem, FILEPATH = 'obs://bucket_name/path accesskey=ak secretkey=sk region=rg' ); |
The full name of the thesaurus dictionary file is thesaurus_astro.ths, and the dictionary is stored in the obs://bucket_name/path accesskey=ak secretkey=sk region=rg directory. pg_catalog.english_stem is the subdictionary (a Snowball English stemmer) used for input normalization. The subdictionary has its own configuration (for example, stop words), which is not shown here. For details about the syntax and parameters for creating a TZ, see CREATE TEXT SEARCH DICTIONARY.
1 2 3 | ALTER TEXT SEARCH CONFIGURATION english ALTER MAPPING FOR asciiword, asciihword, hword_asciipart WITH thesaurus_astro, english_stem; |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | SELECT plainto_tsquery('english','supernova star'); plainto_tsquery ----------------- 'sn' (1 row) SELECT to_tsvector('english','supernova star'); to_tsvector ------------- 'sn':1 (1 row) SELECT to_tsquery('english','''supernova star'''); to_tsquery ------------ 'sn' (1 row) |
supernova star matches supernovae stars in thesaurus_astro because the english_stem stemmer is specified in the thesaurus_astro definition. The stemmer removed e and s.
1 2 3 4 5 6 7 8 9 10 11 | supernovae stars : sn supernovae stars ALTER TEXT SEARCH DICTIONARY thesaurus_astro ( DictFile = thesaurus_astro, FILEPATH = 'file:///home/dicts/'); SELECT plainto_tsquery('english','supernova star'); plainto_tsquery ----------------------------- 'sn' & 'supernova' & 'star' (1 row) |