Simple Dictionary

A Simple dictionary converts the input token to lowercase and checks it against a list of stop words. If the token is in the list, an empty array is returned and the token is discarded. Otherwise, the lowercase form of the word is returned as the normalized lexeme. Alternatively, you can set the Accept parameter of a Simple dictionary to false (default: true) so that non-stop words are reported as unrecognized and passed on to the next dictionary in the list.

Precautions

Procedure

  1. Create a Simple dictionary.

    CREATE TEXT SEARCH DICTIONARY public.simple_dict (
         TEMPLATE = pg_catalog.simple,
         STOPWORDS = english
    );
    

    Here, english is the base name of the stop-word file; the file's full name is english.stop. For details about the syntax and parameters for creating a Simple dictionary, see CREATE TEXT SEARCH DICTIONARY.

  2. Use the Simple dictionary.

    SELECT ts_lexize('public.simple_dict','YeS');
     ts_lexize 
    -----------
     {yes}
    (1 row)
    
    SELECT ts_lexize('public.simple_dict','The');
     ts_lexize 
    -----------
     {}
    (1 row)
    

  3. Set Accept = false so that the Simple dictionary returns NULL, rather than the lowercase form, for a non-stop word. Stop words are still discarded.

    ALTER TEXT SEARCH DICTIONARY public.simple_dict ( Accept = false );
    SELECT ts_lexize('public.simple_dict','YeS');
     ts_lexize 
    -----------
    
    (1 row)
    
    SELECT ts_lexize('public.simple_dict','The');
     ts_lexize 
    -----------
     {}
    (1 row)
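
Passing unrecognized tokens on to the next dictionary only takes effect when the Simple dictionary is chained with other dictionaries in a text search configuration. The following is a minimal sketch of such a chain. It assumes that Accept is still false from step 3 and that the built-in english configuration and english_stem dictionary are available. The configuration name public.simple_cfg is hypothetical and used only for illustration.

```sql
-- Hypothetical configuration name; start from a copy of the built-in
-- english configuration (assumed to exist in pg_catalog).
CREATE TEXT SEARCH CONFIGURATION public.simple_cfg ( COPY = pg_catalog.english );

-- For ASCII word tokens, consult simple_dict first, then english_stem.
ALTER TEXT SEARCH CONFIGURATION public.simple_cfg
    ALTER MAPPING FOR asciiword
    WITH public.simple_dict, pg_catalog.english_stem;

-- 'The' is a stop word in simple_dict and is discarded there.
-- 'Cats' is reported as unrecognized by simple_dict (Accept = false),
-- so it is handed to english_stem for normalization.
SELECT to_tsvector('public.simple_cfg', 'The Cats');
```

With Accept left at its default of true, the Simple dictionary would recognize every non-stop word itself, so a dictionary listed after it would never be consulted. In that case it usually makes sense to place the Simple dictionary last in the list.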