doc-exports/docs/dws/dev/dws_06_0085.html
Lu, Huayi e6fa411af0 DWS DEV 830.201 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Lu, Huayi <luhuayi@huawei.com>
Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
2024-05-16 07:24:04 +00:00

<a name="EN-US_TOPIC_0000001188110544"></a>
<h1 class="topictitle1">Basic Text Matching</h1>
<div id="body8662426"><p id="EN-US_TOPIC_0000001188110544__a3a5f96b856ec47d5a4b76e8c04ff51eb">Full text search in <span id="EN-US_TOPIC_0000001188110544__text1018521066">GaussDB(DWS)</span> is based on the match operator <strong id="EN-US_TOPIC_0000001188110544__en-us_topic_0085031665_en-us_topic_0058965815_b842352706143755">@@</strong>, which returns <strong id="EN-US_TOPIC_0000001188110544__en-us_topic_0085031665_b31268278229">true</strong> if a <strong id="EN-US_TOPIC_0000001188110544__en-us_topic_0085031665_en-us_topic_0058965815_b842352706143759">tsvector</strong> (document) matches a <strong id="EN-US_TOPIC_0000001188110544__en-us_topic_0085031665_en-us_topic_0058965815_b84235270614383">tsquery</strong> (query). It does not matter which data type is written first:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188110544__s668dffc8df2246d2907533b4bb8be836"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="s1">'a fat cat sat on a mat and ate a fat rat'</span><span class="p">::</span><span class="n">tsvector</span><span class="w"> </span><span class="o">@@</span><span class="w"> </span><span class="s1">'cat &amp; rat'</span><span class="p">::</span><span class="n">tsquery</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="k">RESULT</span><span class="p">;</span>
<span class="w"> </span><span class="k">result</span>
<span class="c1">----------</span>
<span class="w"> </span><span class="n">t</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188110544__s22b6b4f720854bad9e517563cad93a12"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="s1">'fat &amp; cow'</span><span class="p">::</span><span class="n">tsquery</span><span class="w"> </span><span class="o">@@</span><span class="w"> </span><span class="s1">'a fat cat sat on a mat and ate a fat rat'</span><span class="p">::</span><span class="n">tsvector</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="k">RESULT</span><span class="p">;</span>
<span class="w"> </span><span class="k">result</span>
<span class="c1">----------</span>
<span class="w"> </span><span class="n">f</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001188110544__aa91a969a8f9a4428a407eac966fb5bb4">As the preceding examples suggest, a <strong id="EN-US_TOPIC_0000001188110544__b842352706144944">tsquery</strong> is not raw text, any more than a <strong id="EN-US_TOPIC_0000001188110544__b842352706144949">tsvector</strong> is. A tsquery contains search terms, which must be already-normalized lexemes, and can combine multiple terms using the <strong id="EN-US_TOPIC_0000001188110544__b84235270614500">AND</strong>, <strong id="EN-US_TOPIC_0000001188110544__b84235270614502">OR</strong>, and <strong id="EN-US_TOPIC_0000001188110544__b84235270614504">NOT</strong> operators. For details, see <a href="dws_06_0018.html">Text Search Types</a>. The <strong id="EN-US_TOPIC_0000001188110544__b842352706145045">to_tsquery</strong> and <strong id="EN-US_TOPIC_0000001188110544__b842352706145049">plainto_tsquery</strong> functions convert user-written text into a proper tsquery, for example by normalizing the words in the text. Similarly, <strong id="EN-US_TOPIC_0000001188110544__b842352706145110">to_tsvector</strong> parses and normalizes a document string. In practice, a text search match therefore looks more like this:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188110544__s0609509fca04482f949c2abc657fc911"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">to_tsvector</span><span class="p">(</span><span class="s1">'fat cats ate fat rats'</span><span class="p">)</span><span class="w"> </span><span class="o">@@</span><span class="w"> </span><span class="n">to_tsquery</span><span class="p">(</span><span class="s1">'fat &amp; rat'</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="k">RESULT</span><span class="p">;</span>
<span class="w"> </span><span class="k">result</span>
<span class="c1">----------</span>
<span class="w"> </span><span class="n">t</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001188110544__ae5ea3fe83d114ed1a8f3ce80f72a77df">Observe that this match would not succeed if written as follows:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188110544__s20fe230cad904146b02ab7b92fd2e506"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="s1">'fat cats ate fat rats'</span><span class="p">::</span><span class="n">tsvector</span><span class="w"> </span><span class="o">@@</span><span class="w"> </span><span class="n">to_tsquery</span><span class="p">(</span><span class="s1">'fat &amp; rat'</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="k">RESULT</span><span class="p">;</span>
<span class="w"> </span><span class="k">result</span>
<span class="c1">----------</span>
<span class="w"> </span><span class="n">f</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001188110544__adf158e901ff541a08d032e28163dbb03">In the preceding match, the word <strong id="EN-US_TOPIC_0000001188110544__b842352706145243">rats</strong> is not normalized, because the string is cast directly to tsvector instead of being passed through to_tsvector. Therefore, <strong id="EN-US_TOPIC_0000001188110544__b842352706145254">rats</strong> does not match <strong id="EN-US_TOPIC_0000001188110544__b842352706145256">rat</strong>.</p>
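<p>To see why the two matches differ, it can help to inspect the values on each side of <strong>@@</strong>. The following is a minimal sketch rather than a definitive reference. It passes the <strong>english</strong> configuration explicitly, which is an assumption made here for illustration (the examples above rely on the default configuration), and the exact lexemes returned depend on the configuration in use:</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Parsed and normalized: words are reduced to lexemes (for example, rats to rat).
SELECT to_tsvector('english', 'fat cats ate fat rats');

-- Plain cast: the words are stored as written, with no normalization.
SELECT 'fat cats ate fat rats'::tsvector;

-- Query terms passed to to_tsquery are normalized in the same way.
SELECT to_tsquery('english', 'fat &amp; rats');

-- OR (|) and NOT (!) can be combined with AND (&amp;) in a single tsquery.
SELECT to_tsvector('english', 'fat cats ate fat rats') @@ to_tsquery('english', '(cat | dog) &amp; !cow') AS RESULT;
</pre></div>
</div>
<p>Comparing the first two results shows which words were rewritten by normalization and which were kept verbatim by the cast.</p>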
<p id="EN-US_TOPIC_0000001188110544__a26d89563ac454838b16d042568682b12">The <strong id="EN-US_TOPIC_0000001188110544__b842352706145313">@@</strong> operator also accepts text input, so the explicit conversion of a text string to <strong id="EN-US_TOPIC_0000001188110544__b842352706145316">tsvector</strong> or <strong id="EN-US_TOPIC_0000001188110544__b842352706145319">tsquery</strong> can be skipped in simple cases. The available variants are:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188110544__s98d145b7a41e440697a3df9015a4816a"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><span class="n">tsvector</span><span class="w"> </span><span class="o">@@</span><span class="w"> </span><span class="n">tsquery</span>
<span class="n">tsquery</span><span class="w"> </span><span class="o">@@</span><span class="w"> </span><span class="n">tsvector</span>
<span class="nb">text</span><span class="w"> </span><span class="o">@@</span><span class="w"> </span><span class="n">tsquery</span>
<span class="nb">text</span><span class="w"> </span><span class="o">@@</span><span class="w"> </span><span class="nb">text</span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001188110544__a262f466ca9224e418e2b15510caf680b">The form <strong id="EN-US_TOPIC_0000001188110544__b1865124091717">text @@ tsquery</strong> is equivalent to <strong id="EN-US_TOPIC_0000001188110544__b686574014171">to_tsvector(text) @@ tsquery</strong>. The form <strong id="EN-US_TOPIC_0000001188110544__b286614402172">text @@ text</strong> is equivalent to <strong id="EN-US_TOPIC_0000001188110544__b17866154020175">to_tsvector(text) @@ plainto_tsquery(text)</strong>.</p>
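<p>The equivalences above can be sketched as follows. The explicit <strong>::text</strong> casts are an assumption added here so that the string literals are treated as text rather than resolved to another type. Within each pair, the two statements are expected to return the same result:</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- text @@ tsquery, followed by its explicit equivalent
SELECT 'fat cats ate fat rats'::text @@ to_tsquery('fat &amp; rat') AS RESULT;
SELECT to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat &amp; rat') AS RESULT;

-- text @@ text, followed by its explicit equivalent
SELECT 'fat cats ate fat rats'::text @@ 'fat rats'::text AS RESULT;
SELECT to_tsvector('fat cats ate fat rats') @@ plainto_tsquery('fat rats') AS RESULT;
</pre></div>
</div>
<p>In the <strong>text @@ text</strong> form, the right-hand string is plain text, not tsquery syntax, because <strong>plainto_tsquery</strong> normalizes the words and combines them with AND.</p>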
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_06_0082.html">Introduction</a></div>
</div>
</div>