There are three separate approaches to pattern matching provided by the database: the traditional SQL LIKE operator, the more recent SIMILAR TO operator, and POSIX-style regular expressions. Besides these basic operators, functions can be used to extract or replace matching substrings and to split a string at matching locations.
Checks whether the string matches the pattern string following LIKE. If the string matches the supplied pattern, the LIKE expression returns true (the NOT LIKE expression returns false). Otherwise, the LIKE expression returns false (the NOT LIKE expression returns true).
When standard_conforming_strings is set to off, any backslash you write in a literal string constant must be doubled. A pattern that matches a single literal backslash is therefore written with four backslashes in the statement. You can avoid this by choosing a different escape character with ESCAPE, so that the backslash is no longer special to LIKE. The backslash is still special to the string literal parser, however, so you still need two backslashes. You can also select no escape character by writing ESCAPE ''. This effectively disables the escape mechanism, which makes it impossible to turn off the special meaning of the underscore and percent sign in the pattern.
    SELECT 'abc' LIKE 'abc' AS RESULT;
     result
    -----------
     t
    (1 row)

    SELECT 'abc' LIKE 'a%' AS RESULT;
     result
    -----------
     t
    (1 row)

    SELECT 'abc' LIKE '_b_' AS RESULT;
     result
    -----------
     t
    (1 row)

    SELECT 'abc' LIKE 'c' AS RESULT;
     result
    -----------
     f
    (1 row)
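The ESCAPE clause described above can be sketched with a few queries. The escape character `#` and the sample strings are arbitrary choices for illustration, not values taken from the original text.

```sql
-- '#' is an arbitrary escape character chosen for this sketch.
-- '#%' stands for a literal percent sign instead of a wildcard.
SELECT '100%' LIKE '100#%' ESCAPE '#' AS result;  -- t
SELECT '1000' LIKE '100#%' ESCAPE '#' AS result;  -- f

-- '#_' stands for a literal underscore instead of "any single character".
SELECT 'a_c' LIKE 'a#_c' ESCAPE '#' AS result;    -- t
SELECT 'abc' LIKE 'a#_c' ESCAPE '#' AS result;    -- f

-- With ESCAPE '' there is no escape character, so '_' stays a wildcard.
SELECT 'abc' LIKE 'a_c' ESCAPE '' AS result;      -- t
```

Choosing an escape character other than the backslash keeps the pattern independent of the standard_conforming_strings setting.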
The SIMILAR TO operator returns true or false depending on whether its pattern matches the given string. It is similar to LIKE, except that it interprets the pattern using the SQL standard's definition of a regular expression.
| Metacharacter | Description |
|---|---|
| \| | Specifies alternation (either of two alternatives). |
| * | Specifies repetition of the previous item zero or more times. |
| + | Specifies repetition of the previous item one or more times. |
| ? | Specifies repetition of the previous item zero or one time. |
| {m} | Specifies repetition of the previous item exactly m times. |
| {m,} | Specifies repetition of the previous item m or more times. |
| {m,n} | Specifies repetition of the previous item at least m and at most n times. |
| () | Groups items into a single logical item. |
| [...] | Specifies a character class, just as in POSIX regular expressions. |
If a SIMILAR TO pattern has to match a large number of repeated characters, the statement can fail with the error "invalid regular expression: regular expression is too complex" because of the recursion depth limit. In this case, increase the value of max_stack_depth.
The substring(string from pattern for escape) function can be used to extract a substring that matches an SQL regular expression.
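As a minimal sketch of the three-argument substring form, the query below follows the PostgreSQL convention, in which the escape character followed by a double quote marks the start and end of the part to return. That this database uses the same delimiter convention is an assumption, and the sample string is illustrative.

```sql
-- Assumption: PostgreSQL-compatible behavior, where #" ... #" delimits the
-- part of the match to return and '#' is the escape character.
-- The whole pattern must match the whole string; only the delimited part is returned.
SELECT substring('foobar' from '%#"o_b#"%' for '#') AS result;  -- oob on PostgreSQL
```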
    SELECT 'abc' SIMILAR TO 'abc' AS RESULT;
     result
    -----------
     t
    (1 row)

    SELECT 'abc' SIMILAR TO 'a' AS RESULT;
     result
    -----------
     f
    (1 row)

    SELECT 'abc' SIMILAR TO '%(b|d)%' AS RESULT;
     result
    -----------
     t
    (1 row)

    SELECT 'abc' SIMILAR TO '(b|c)%' AS RESULT;
     result
    -----------
     f
    (1 row)
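The repetition and character-class metacharacters from the table can be illustrated in the same way. The strings below are made up for this sketch. As the `'abc' SIMILAR TO 'a'` example shows, the pattern has to match the entire string.

```sql
-- {m}: exactly m repetitions. The whole string must match.
SELECT 'aaa'  SIMILAR TO 'a{3}' AS result;        -- t
SELECT 'aaaa' SIMILAR TO 'a{3}' AS result;        -- f

-- (): grouping, combined with + (one or more repetitions of the group).
SELECT 'abab' SIMILAR TO '(ab)+' AS result;       -- t

-- ?: the previous item is optional.
SELECT 'ac'   SIMILAR TO 'ab?c' AS result;        -- t

-- [...]: character classes, as in POSIX regular expressions.
SELECT 'b7'   SIMILAR TO '[a-c][0-9]' AS result;  -- t
```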
A regular expression is a character sequence that is an abbreviated definition of a set of strings (a regular set). If a string is a member of the regular set described by a regular expression, the string matches the regular expression. POSIX regular expressions provide a more powerful means for pattern matching than the LIKE and SIMILAR TO operators. Table 2 lists all available operators for POSIX regular expression pattern matching.
| Operator | Description | Example |
|---|---|---|
| ~ | Matches regular expression, which is case-sensitive. | 'thomas' ~ '.*thomas.*' |
| ~* | Matches regular expression, which is case-insensitive. | 'thomas' ~* '.*Thomas.*' |
| !~ | Does not match regular expression, which is case-sensitive. | 'thomas' !~ '.*Thomas.*' |
| !~* | Does not match regular expression, which is case-insensitive. | 'thomas' !~* '.*vadim.*' |
| Metacharacter | Description |
|---|---|
| ^ | Matches at the start of a string. |
| $ | Matches at the end of a string. |
| . | Matches any single character. |
The regular expression split functions ignore zero-length matches that occur at the beginning or end of a string or immediately after a previous match. This is contrary to the strict definition of regular expression matching, which is what regexp_matches implements, but it is usually the most convenient behavior in practice.
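The effect of ignoring zero-length matches can be seen with a pattern that is allowed to match nothing. The sketch below assumes the split function is available under its PostgreSQL name, regexp_split_to_array, which the text above does not spell out. The commented results are what PostgreSQL returns.

```sql
-- Assumption: regexp_split_to_array is the PostgreSQL-compatible name of the
-- split function referred to above.

-- ' +' never matches the empty string, so the split happens at each run of spaces.
SELECT regexp_split_to_array('the quick brown fox', ' +') AS words;
-- {the,quick,brown,fox}

-- ' *' can match the empty string between any two characters. Zero-length
-- matches at the start, at the end, and right after a previous match are
-- ignored, so the result holds one element per letter and no empty strings.
SELECT regexp_split_to_array('the quick brown fox', ' *') AS chars;
-- {t,h,e,q,u,i,c,k,b,r,o,w,n,f,o,x}
```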
    SELECT 'abc' ~ 'Abc' AS RESULT;
     result
    --------
     f
    (1 row)

    SELECT 'abc' ~* 'Abc' AS RESULT;
     result
    --------
     t
    (1 row)

    SELECT 'abc' !~ 'Abc' AS RESULT;
     result
    --------
     t
    (1 row)

    SELECT 'abc' !~* 'Abc' AS RESULT;
     result
    --------
     f
    (1 row)

    SELECT 'abc' ~ '^a' AS RESULT;
     result
    --------
     t
    (1 row)

    SELECT 'abc' ~ '(b|d)' AS RESULT;
     result
    --------
     t
    (1 row)

    SELECT 'abc' ~ '^(b|c)' AS RESULT;
     result
    --------
     f
    (1 row)
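The `$` and `.` metacharacters can be sketched in the same style as the examples above. Unlike LIKE and SIMILAR TO, the `~` operator succeeds if the pattern matches anywhere in the string, unless the pattern is anchored with `^` or `$`. The sample strings are illustrative.

```sql
-- Unanchored: a match anywhere in the string is enough.
SELECT 'abc' ~ 'b' AS result;       -- t

-- $ anchors the match to the end of the string.
SELECT 'abc' ~ 'c$' AS result;      -- t
SELECT 'abc' ~ 'b$' AS result;      -- f

-- . matches any single character. With both anchors the whole string must match.
SELECT 'abc' ~ '^a.c$' AS result;   -- t
SELECT 'abc' ~ '^.$' AS result;     -- f
```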
Although most regular expression searches execute quickly, a regular expression can be contrived that takes an arbitrary amount of time and memory to process. Avoid accepting regular expression search patterns from untrusted sources. If you must do so, you are advised to impose a statement timeout. Searches that use SIMILAR TO patterns carry the same security risks, because SIMILAR TO provides many of the same capabilities as POSIX-style regular expressions. LIKE searches are much simpler than the other two options and are therefore safer to use with patterns from untrusted sources.
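One way to apply the statement timeout advice is sketched below. It assumes the PostgreSQL-compatible statement_timeout parameter is available in this database. The two-second limit and the pattern are hypothetical placeholders for a value and an input you would choose yourself.

```sql
-- Assumption: statement_timeout is supported and accepts a value with a unit.
-- Limit every following statement in this session to 2 seconds.
SET statement_timeout = '2s';

-- Hypothetical pattern standing in for one supplied by an untrusted source.
SELECT 'some input text' ~ '(a|b)*c' AS result;

-- Restore the default so later statements are not affected.
RESET statement_timeout;
```

A statement that runs past the limit is canceled with an error, so the calling application should be prepared to handle that case.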