doc-exports/docs/dws/dev/dws_06_0105.html
Lu, Huayi e6fa411af0 DWS DEV 830.201 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Lu, Huayi <luhuayi@huawei.com>
Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
2024-05-16 07:24:04 +00:00


<a name="EN-US_TOPIC_0000001188270516"></a>
<h1 class="topictitle1">Simple Dictionary</h1>
<div id="body1561195448344"><p id="EN-US_TOPIC_0000001188270516__p188122449501">A <strong id="EN-US_TOPIC_0000001188270516__b168661710143">Simple</strong> dictionary converts the input token to lowercase and checks it against a list of stop words. If the token is in the list, the dictionary returns an empty array and the token is discarded. Otherwise, the lowercase form of the word is returned as the normalized lexeme. Alternatively, you can set <strong id="EN-US_TOPIC_0000001188270516__b539613414264">Accept</strong> to <strong id="EN-US_TOPIC_0000001188270516__b20178125182613">false</strong> for a <strong id="EN-US_TOPIC_0000001188270516__b869653472714">Simple</strong> dictionary (default: <strong id="EN-US_TOPIC_0000001188270516__b9732101274">true</strong>) so that it reports non-stop words as unrecognized and they are passed on to the next dictionary in the list.</p>
<div class="section" id="EN-US_TOPIC_0000001188270516__section1750055382816"><h4 class="sectiontitle">Precautions</h4><ul id="EN-US_TOPIC_0000001188270516__ul1714725117371"><li id="EN-US_TOPIC_0000001188270516__li126961800117">Most types of dictionaries rely on dictionary configuration files. The name of a configuration file can contain only lowercase letters, digits, and underscores (_).</li><li id="EN-US_TOPIC_0000001188270516__li992044017418">A dictionary cannot be created in the <strong id="EN-US_TOPIC_0000001188270516__b2912104273014">pg_temp</strong> schema.</li><li id="EN-US_TOPIC_0000001188270516__li10147751193715">Dictionary configuration files must be stored in UTF-8 encoding. If the database encoding is different, the files are converted to it when they are read into the server.</li><li id="EN-US_TOPIC_0000001188270516__li13238454123714">Generally, a session reads a dictionary configuration file only once, when the dictionary is first used within the session. After modifying a configuration file, run the <strong id="EN-US_TOPIC_0000001188270516__b15668141163518">ALTER TEXT SEARCH DICTIONARY</strong> statement on the dictionary to update it and reload the file.</li></ul>
</div>
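<p>The reload described in the last precaution might look like the following. This is a minimal sketch, not a definitive procedure. It assumes that the <strong>public.simple_dict</strong> dictionary created in the procedure in this topic already exists, that re-specifying an unchanged option is sufficient to trigger the reload, and that the <strong>pg_ts_dict</strong> system catalog exposes the <strong>dictname</strong> and <strong>dictinitoption</strong> columns as it does in PostgreSQL. Verify the behavior on your cluster version.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Assumption: public.simple_dict already exists (see the procedure in this topic).
-- Re-specify the existing stop-word file so that the dictionary definition
-- is updated and the configuration file is read again.
ALTER TEXT SEARCH DICTIONARY public.simple_dict ( STOPWORDS = english );

-- Inspect the options currently stored for the dictionary.
SELECT dictname, dictinitoption
FROM pg_catalog.pg_ts_dict
WHERE dictname = 'simple_dict';
</pre></div>
</div>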
<div class="section" id="EN-US_TOPIC_0000001188270516__section75460100182"><h4 class="sectiontitle">Procedure</h4><ol id="EN-US_TOPIC_0000001188270516__ol1780938123215"><li id="EN-US_TOPIC_0000001188270516__li580914818324"><span>Create a <strong id="EN-US_TOPIC_0000001188270516__b106739406359">Simple</strong> dictionary.</span><p><div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188270516__screen20301419121412"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="nb">TEXT</span><span class="w"> </span><span class="k">SEARCH</span><span class="w"> </span><span class="k">DICTIONARY</span><span class="w"> </span><span class="k">public</span><span class="p">.</span><span class="n">simple_dict</span><span class="w"> </span><span class="p">(</span>
<span class="w"> </span><span class="k">TEMPLATE</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pg_catalog</span><span class="p">.</span><span class="k">simple</span><span class="p">,</span>
<span class="w"> </span><span class="n">STOPWORDS</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">english</span>
<span class="p">);</span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001188270516__p1981618018157">The <strong id="EN-US_TOPIC_0000001188270516__b10316410153714">english.stop</strong> file is the stop-word file; <strong>STOPWORDS</strong> is set to its name without the <strong>.stop</strong> extension. For details about the syntax and parameters for creating a <strong id="EN-US_TOPIC_0000001188270516__b93743285373">Simple</strong> dictionary, see <a href="dws_06_0183.html">CREATE TEXT SEARCH DICTIONARY</a>.</p>
</p></li><li id="EN-US_TOPIC_0000001188270516__li68101816323"><span>Use the <strong id="EN-US_TOPIC_0000001188270516__b1964773973715">Simple</strong> dictionary.</span><p><div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188270516__screen19455219264"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
<span class="normal"> 2</span>
<span class="normal"> 3</span>
<span class="normal"> 4</span>
<span class="normal"> 5</span>
<span class="normal"> 6</span>
<span class="normal"> 7</span>
<span class="normal"> 8</span>
<span class="normal"> 9</span>
<span class="normal">10</span>
<span class="normal">11</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">ts_lexize</span><span class="p">(</span><span class="s1">'public.simple_dict'</span><span class="p">,</span><span class="s1">'Yes'</span><span class="p">);</span>
<span class="w"> </span><span class="n">ts_lexize</span><span class="w"> </span>
<span class="c1">-----------</span>
<span class="w"> </span><span class="err">{</span><span class="n">yes</span><span class="err">}</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
<span class="k">SELECT</span><span class="w"> </span><span class="n">ts_lexize</span><span class="p">(</span><span class="s1">'public.simple_dict'</span><span class="p">,</span><span class="s1">'The'</span><span class="p">);</span>
<span class="w"> </span><span class="n">ts_lexize</span><span class="w"> </span>
<span class="c1">-----------</span>
<span class="w"> </span><span class="err">{}</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
</p></li><li id="EN-US_TOPIC_0000001188270516__li781010811322"><span>Set <strong id="EN-US_TOPIC_0000001188270516__b7787936388">Accept=false</strong> so that the <strong id="EN-US_TOPIC_0000001188270516__b194691210133814">Simple</strong> dictionary returns <strong id="EN-US_TOPIC_0000001188270516__b17855161443812">NULL</strong> instead of a lower-cased non-stop word.</span><p><div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188270516__screen1295520203119"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span>
<span class="normal"> 2</span>
<span class="normal"> 3</span>
<span class="normal"> 4</span>
<span class="normal"> 5</span>
<span class="normal"> 6</span>
<span class="normal"> 7</span>
<span class="normal"> 8</span>
<span class="normal"> 9</span>
<span class="normal">10</span>
<span class="normal">11</span>
<span class="normal">12</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">ALTER</span><span class="w"> </span><span class="nb">TEXT</span><span class="w"> </span><span class="k">SEARCH</span><span class="w"> </span><span class="k">DICTIONARY</span><span class="w"> </span><span class="k">public</span><span class="p">.</span><span class="n">simple_dict</span><span class="w"> </span><span class="p">(</span><span class="w"> </span><span class="n">Accept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">false</span><span class="w"> </span><span class="p">);</span>
<span class="k">SELECT</span><span class="w"> </span><span class="n">ts_lexize</span><span class="p">(</span><span class="s1">'public.simple_dict'</span><span class="p">,</span><span class="s1">'Yes'</span><span class="p">);</span>
<span class="w"> </span><span class="n">ts_lexize</span><span class="w"> </span>
<span class="c1">-----------</span>
<span class="w"> </span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
<span class="k">SELECT</span><span class="w"> </span><span class="n">ts_lexize</span><span class="p">(</span><span class="s1">'public.simple_dict'</span><span class="p">,</span><span class="s1">'The'</span><span class="p">);</span>
<span class="w"> </span><span class="n">ts_lexize</span><span class="w"> </span>
<span class="c1">-----------</span>
<span class="w"> </span><span class="err">{}</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
</p></li></ol>
</div>
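<p>With <strong>Accept</strong> set to <strong>false</strong>, the <strong>Simple</strong> dictionary is typically placed in front of another dictionary so that it only filters stop words. The following is a hedged sketch of such a chain. The configuration name <strong>public.simple_cfg</strong> is hypothetical and used only for illustration, and the example assumes that the built-in <strong>pg_catalog.english</strong> configuration and <strong>pg_catalog.english_stem</strong> dictionary are available in your cluster.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Assumption: public.simple_dict exists and has Accept = false (step 3 above).
-- public.simple_cfg is a hypothetical name chosen for this example.
CREATE TEXT SEARCH CONFIGURATION public.simple_cfg ( COPY = pg_catalog.english );

-- Consult simple_dict first. Tokens that it reports as unrecognized
-- are passed on to english_stem.
ALTER TEXT SEARCH CONFIGURATION public.simple_cfg
    ALTER MAPPING FOR asciiword WITH public.simple_dict, pg_catalog.english_stem;

-- Stop words are discarded by simple_dict; other words are normalized by english_stem.
SELECT to_tsvector('public.simple_cfg', 'The dictionaries');

-- Remove the example configuration when it is no longer needed.
DROP TEXT SEARCH CONFIGURATION public.simple_cfg;
</pre></div>
</div>
<p>Placing the stop-word filter first keeps the stop-word list in one place, while the later dictionary still decides how the remaining words are normalized.</p>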
<p id="EN-US_TOPIC_0000001188270516__p523815414719"></p>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_06_0102.html">Dictionaries</a></div>
</div>
</div>