Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Lu, Huayi <luhuayi@huawei.com> Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
11 KiB
Parsing Queries
GaussDB(DWS) provides functions to_tsquery and plainto_tsquery for converting a query to the tsquery data type. to_tsquery offers access to more features than plainto_tsquery, but is less forgiving about its input.
to_tsquery
to_tsquery converts a query to the tsquery data type.
to_tsquery([ config regconfig, ] querytext text) returns tsquery
to_tsquery creates a tsquery value from querytext, which must consist of single tokens separated by the Boolean operators & (AND), | (OR), and ! (NOT). These operators can be grouped using parentheses. The input to to_tsquery must follow the general rules for tsquery input, as described in Text Search Types. The difference is that while basic tsquery input takes the tokens at face value, to_tsquery normalizes each token to a lexeme using the specified or default configuration, and discards any tokens that are stop words according to the configuration. For example:
1 2 3 4 5 | SELECT to_tsquery('english', 'The & Fat & Rats'); to_tsquery --------------- 'fat' & 'rat' (1 row) |
As in basic tsquery input, weight(s) can be attached to each lexeme to restrict it to match only tsvector lexemes of those weight(s). For example:
1 2 3 4 5 | SELECT to_tsquery('english', 'Fat | Rats:AB'); to_tsquery ------------------ 'fat' | 'rat':AB (1 row) |
In addition, the asterisk (*) can be attached to a lexeme to specify prefix matching.
1 2 3 4 5 | SELECT to_tsquery('supern:*A & star:A*B'); to_tsquery -------------------------- 'supern':*A & 'star':*AB (1 row) |
Such a lexeme will match any word having the specified string and weight in a tsquery.
plainto_tsquery
plainto_tsquery transforms unformatted text querytext to tsquery. The text is parsed and normalized much as for to_tsvector, then the & (AND) Boolean operator is inserted between surviving words.
plainto_tsquery([ config regconfig, ] querytext text) returns tsquery
For example:
1 2 3 4 5 | SELECT plainto_tsquery('english', 'The Fat Rats'); plainto_tsquery ----------------- 'fat' & 'rat' (1 row) |
Note that plainto_tsquery cannot recognize Boolean operators, weight labels, or prefix-match labels in its input:
1 2 3 4 5 | SELECT plainto_tsquery('english', 'The Fat & Rats:C'); plainto_tsquery --------------------- 'fat' & 'rat' & 'c' (1 row) |
Here, all the input punctuation was discarded as being space symbols.