doc-exports/docs/dws/dev/dws_06_0093.html
Lu, Huayi a24ca60074 DWS DEVELOPER 811 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Lu, Huayi <luhuayi@huawei.com>
Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
2023-01-19 13:37:49 +00:00


<a name="EN-US_TOPIC_0000001098670904"></a>
<h1 class="topictitle1">Parsing Queries</h1>
<div id="body8662426"><p id="EN-US_TOPIC_0000001098670904__en-us_topic_0059779203_p166391454117"><span id="EN-US_TOPIC_0000001098670904__text402633298">GaussDB(DWS)</span> provides the functions <strong id="EN-US_TOPIC_0000001098670904__en-us_topic_0058965654_b842352706114733">to_tsquery</strong> and <strong id="EN-US_TOPIC_0000001098670904__en-us_topic_0058965654_b842352706114736">plainto_tsquery</strong> for converting a query to the <strong id="EN-US_TOPIC_0000001098670904__en-us_topic_0058965654_b842352706114747">tsquery</strong> data type. <strong id="EN-US_TOPIC_0000001098670904__en-us_topic_0058965654_b84235270611484">to_tsquery</strong> offers access to more features than <strong id="EN-US_TOPIC_0000001098670904__en-us_topic_0058965654_b84235270611488">plainto_tsquery</strong>, but is stricter about the syntax of its input.</p>
<pre class="screen" id="EN-US_TOPIC_0000001098670904__scbe9397d2cf147b4b6d429c0388d9cdb"><strong id="EN-US_TOPIC_0000001098670904__a472ad0c69f1949a69ac2000f0fb6e21d">to_tsquery([ config regconfig, ] querytext text) returns tsquery</strong></pre>
<p id="EN-US_TOPIC_0000001098670904__aba1a3954bda6410898fa9d39489a1bd3"><strong id="EN-US_TOPIC_0000001098670904__b842352706114842">to_tsquery</strong> creates a <strong id="EN-US_TOPIC_0000001098670904__b842352706114846">tsquery</strong> value from <strong id="EN-US_TOPIC_0000001098670904__b842352706114849">querytext</strong>, which must consist of single tokens separated by the Boolean operators <strong id="EN-US_TOPIC_0000001098670904__b842352706114858">&amp;</strong> (AND), <strong id="EN-US_TOPIC_0000001098670904__b84235270611490">|</strong> (OR), and <strong id="EN-US_TOPIC_0000001098670904__b84235270611493">!</strong> (NOT). These operators can be grouped using parentheses. In other words, the input to <strong id="EN-US_TOPIC_0000001098670904__en-us_topic_0058965654_b842352706115025">to_tsquery</strong> must already follow the general rules for <strong id="EN-US_TOPIC_0000001098670904__en-us_topic_0058965654_b842352706115030">tsquery</strong> input, as described in <a href="dws_06_0018.html">Text Search Types</a>. The difference is that while basic <strong id="EN-US_TOPIC_0000001098670904__b842352706115144">tsquery</strong> input takes the tokens at face value, <strong id="EN-US_TOPIC_0000001098670904__b842352706115147">to_tsquery</strong> normalizes each token to a lexeme using the specified or default configuration, and discards any tokens that are stop words according to the configuration. For example:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098670904__s006208443ece4bd2b2b256fa71f6d124"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">to_tsquery</span><span class="p">(</span><span class="s1">'english'</span><span class="p">,</span><span class="w"> </span><span class="s1">'The &amp; Fat &amp; Rats'</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">to_tsquery</span><span class="w"> </span>
<span class="c1">---------------</span>
<span class="w"> </span><span class="s1">'fat'</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="s1">'rat'</span><span class="w"></span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
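<p>As a minimal sketch of operator grouping and of how the resulting <strong>tsquery</strong> is typically used, the following statements combine parentheses with <strong>!</strong> (NOT) and then match a query against a <strong>tsvector</strong> using the <strong>@@</strong> operator. The sample strings are illustrative assumptions, not data from a real table. The output of the first statement is omitted because its exact formatting may vary.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Group the OR inside parentheses, then require that 'cats' is absent.
SELECT to_tsquery('english', '(Fat | Rats) &amp; !Cats');

-- Match a document against a query. Both sides are normalized with the
-- same configuration, so 'rats' in the text matches 'rat' in the query.
-- This is expected to return true (t).
SELECT to_tsvector('english', 'The fat rats ate') @@ to_tsquery('english', 'fat &amp; rat');
</pre></div>
</div>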
<p id="EN-US_TOPIC_0000001098670904__a015dc2e053e24714b043a04db78886b4">As in basic <strong id="EN-US_TOPIC_0000001098670904__b842352706115223">tsquery</strong> input, one or more <strong id="EN-US_TOPIC_0000001098670904__b842352706115229">weights</strong> can be attached to each lexeme to restrict it to matching only <strong id="EN-US_TOPIC_0000001098670904__b842352706115234">tsvector</strong> lexemes that carry one of those <strong id="EN-US_TOPIC_0000001098670904__b842352706115232">weights</strong>. For example:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098670904__s946d37facc114c0fb767ce07bdb7c15a"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">to_tsquery</span><span class="p">(</span><span class="s1">'english'</span><span class="p">,</span><span class="w"> </span><span class="s1">'Fat | Rats:AB'</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">to_tsquery</span><span class="w"> </span>
<span class="c1">------------------</span>
<span class="w"> </span><span class="s1">'fat'</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="s1">'rat'</span><span class="p">:</span><span class="n">AB</span><span class="w"></span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
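<p>The effect of a weight restriction becomes visible only when the query is matched against a weighted <strong>tsvector</strong>. The following sketch assumes that the <strong>setweight</strong> function is available to label every lexeme of a <strong>tsvector</strong> with a given weight. The sample text is illustrative.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Assumption: setweight() labels every lexeme of the tsvector with weight A.
-- The query accepts 'rat' with weight A or B, so this is expected to return true (t).
SELECT setweight(to_tsvector('english', 'fat rats'), 'A') @@ to_tsquery('english', 'Rats:AB');

-- The query accepts 'rat' with weight C only, so this is expected to return false (f).
SELECT setweight(to_tsvector('english', 'fat rats'), 'A') @@ to_tsquery('english', 'Rats:C');
</pre></div>
</div>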
<p id="EN-US_TOPIC_0000001098670904__a412d28b154994ea78cf874d5fc811fb0">Also, the asterisk (*) can be attached to a lexeme to specify prefix matching:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098670904__s4ace06608d9c492989fd903575408649"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">to_tsquery</span><span class="p">(</span><span class="s1">'supern:*A &amp; star:A*B'</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">to_tsquery</span><span class="w"> </span>
<span class="c1">--------------------------</span>
<span class="w"> </span><span class="s1">'supern'</span><span class="p">:</span><span class="o">*</span><span class="n">A</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="s1">'star'</span><span class="p">:</span><span class="o">*</span><span class="n">AB</span><span class="w"></span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001098670904__a057cc73018994b6fa35e2ed9798e5e2e">Such a lexeme will match any word in a <strong id="EN-US_TOPIC_0000001098670904__b604939408202050">tsvector</strong> that begins with the specified string and, if weights are given, carries one of the specified weights.</p>
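<p>Prefix matching can be checked by matching a prefix query against a <strong>tsvector</strong>. The sketch below assumes that the <strong>simple</strong> text search configuration is available. It is used here so that no stemming interferes with the comparison. The sample text is illustrative.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- 'supern:*' matches any lexeme that begins with 'supern';
-- this is expected to return true (t).
SELECT to_tsvector('simple', 'supernovae stars') @@ to_tsquery('simple', 'supern:*');

-- Without the asterisk the lexeme must match exactly;
-- this is expected to return false (f).
SELECT to_tsvector('simple', 'supernovae stars') @@ to_tsquery('simple', 'supern');
</pre></div>
</div>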
<pre class="screen" id="EN-US_TOPIC_0000001098670904__s92ae10ff8d7546149b0d8723f6b26c31"><strong id="EN-US_TOPIC_0000001098670904__adec522a25fe04bc1aae1b1656db763df">plainto_tsquery([ config regconfig, ] querytext text) returns tsquery</strong></pre>
<p id="EN-US_TOPIC_0000001098670904__a4acac1205ba74d5d960d834a1d07e915"><strong id="EN-US_TOPIC_0000001098670904__b842352706115513">plainto_tsquery</strong> transforms unformatted text <strong id="EN-US_TOPIC_0000001098670904__b842352706115516">querytext</strong> to <strong id="EN-US_TOPIC_0000001098670904__b842352706115518">tsquery</strong>. The text is parsed and normalized much as for <strong id="EN-US_TOPIC_0000001098670904__b842352706115542">to_tsvector</strong>, then the <strong id="EN-US_TOPIC_0000001098670904__b842352706115538">&amp;</strong> (AND) Boolean operator is inserted between surviving words.</p>
<p id="EN-US_TOPIC_0000001098670904__add5e99c5f5534994a3fa8199cbdd820b">For example:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098670904__s370a833d4293479fb9b44925b49cce14"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">plainto_tsquery</span><span class="p">(</span><span class="s1">'english'</span><span class="p">,</span><span class="w"> </span><span class="s1">'The Fat Rats'</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">plainto_tsquery</span><span class="w"> </span>
<span class="c1">-----------------</span>
<span class="w"> </span><span class="s1">'fat'</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="s1">'rat'</span><span class="w"></span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001098670904__a37e1e987cdf14471b33773f455163784">Note that <strong id="EN-US_TOPIC_0000001098670904__b84235270611563">plainto_tsquery</strong> does not recognize Boolean operators, weight labels, or prefix-match labels in its input:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001098670904__s0623f94ca3ed43609b9396b0966b2eda"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">plainto_tsquery</span><span class="p">(</span><span class="s1">'english'</span><span class="p">,</span><span class="w"> </span><span class="s1">'The Fat &amp; Rats:C'</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">plainto_tsquery</span><span class="w"> </span>
<span class="c1">---------------------</span>
<span class="w"> </span><span class="s1">'fat'</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="s1">'rat'</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="s1">'c'</span><span class="w"></span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001098670904__a792f79814d82455684a5b0d5e47b6c24">Here, all punctuation in the input was discarded as space symbols, so the intended weight label C was parsed as an ordinary word and became the lexeme 'c'.</p>
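<p>The difference in strictness between the two functions can be sketched by passing the same free-form text to both. The input string is illustrative. The exact error text returned by the second statement is not shown because it may vary.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- plainto_tsquery accepts free-form text and inserts &amp; between the surviving words.
SELECT plainto_tsquery('english', 'The Fat Rats');

-- to_tsquery requires an operator between tokens, so the same text is
-- expected to be rejected with a tsquery syntax error.
SELECT to_tsquery('english', 'The Fat Rats');
</pre></div>
</div>
<p>For search strings typed directly by end users, <strong>plainto_tsquery</strong> is therefore usually the safer choice, while <strong>to_tsquery</strong> suits queries that an application assembles itself.</p>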
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_06_0091.html">Controlling Text Search</a></div>
</div>
</div>