forked from docs/doc-exports
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Lu, Huayi <luhuayi@huawei.com> Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
50 lines
8.4 KiB
HTML
<a name="EN-US_TOPIC_0000001145830699"></a><a name="EN-US_TOPIC_0000001145830699"></a>
<h1 class="topictitle1">Ispell Dictionary</h1>
<div id="body1561195448345"><p id="EN-US_TOPIC_0000001145830699__p8060118">The Ispell dictionary template supports morphological dictionaries, which can normalize many different linguistic forms of a word into the same lexeme. For example, an English Ispell dictionary can match all declensions and conjugations of the search term <strong id="EN-US_TOPIC_0000001145830699__b207017481619">bank</strong>, such as <strong id="EN-US_TOPIC_0000001145830699__b1416243965216">banking</strong>, <strong id="EN-US_TOPIC_0000001145830699__b86761741205220">banked</strong>, <strong id="EN-US_TOPIC_0000001145830699__b13762184605210">banks</strong>, <strong id="EN-US_TOPIC_0000001145830699__b1423135355214">banks'</strong>, and <strong id="EN-US_TOPIC_0000001145830699__b153516564527">bank's</strong>.</p>
<p id="EN-US_TOPIC_0000001145830699__p5892155115356"><span id="EN-US_TOPIC_0000001145830699__text1413194050">GaussDB(DWS)</span> does not provide any predefined Ispell dictionaries or dictionary files. The .dict files and .affix files support multiple open-source dictionary formats, including <strong id="EN-US_TOPIC_0000001145830699__b1159291018118">Ispell</strong>, <strong id="EN-US_TOPIC_0000001145830699__b334215128112">MySpell</strong>, and <strong id="EN-US_TOPIC_0000001145830699__b182667146117">Hunspell</strong>.</p>
<div class="section" id="EN-US_TOPIC_0000001145830699__section737061503610"><h4 class="sectiontitle">Procedure</h4><ol id="EN-US_TOPIC_0000001145830699__ol14501539114610"><li id="EN-US_TOPIC_0000001145830699__li450163974617"><span>Obtain the dictionary definition file (.dict) and affix file (.affix).</span><p><p id="EN-US_TOPIC_0000001145830699__p959419111211">You can use an open-source dictionary. Open-source dictionary files often use the file name extensions .aff and .dic. In this case, rename them to .affix and .dict. In addition, some dictionary files (for example, Norwegian dictionary files) must be converted to UTF-8 encoding. The following commands convert the files from ISO 8859-1 to UTF-8 and write the output under the required .affix and .dict names:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001145830699__screen8456192613377"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><span class="n">iconv</span><span class="w"> </span><span class="o">-</span><span class="n">f</span><span class="w"> </span><span class="n">ISO_8859</span><span class="o">-</span><span class="mi">1</span><span class="w"> </span><span class="o">-</span><span class="n">t</span><span class="w"> </span><span class="n">UTF</span><span class="o">-</span><span class="mi">8</span><span class="w"> </span><span class="o">-</span><span class="n">o</span><span class="w"> </span><span class="n">nn_no</span><span class="p">.</span><span class="n">affix</span><span class="w"> </span><span class="n">nn_NO</span><span class="p">.</span><span class="n">aff</span><span class="w"> </span>
<span class="n">iconv</span><span class="w"> </span><span class="o">-</span><span class="n">f</span><span class="w"> </span><span class="n">ISO_8859</span><span class="o">-</span><span class="mi">1</span><span class="w"> </span><span class="o">-</span><span class="n">t</span><span class="w"> </span><span class="n">UTF</span><span class="o">-</span><span class="mi">8</span><span class="w"> </span><span class="o">-</span><span class="n">o</span><span class="w"> </span><span class="n">nn_no</span><span class="p">.</span><span class="n">dict</span><span class="w"> </span><span class="n">nn_NO</span><span class="p">.</span><span class="n">dic</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
</p></li><li id="EN-US_TOPIC_0000001145830699__li18501639134619"><span>Create an Ispell dictionary.</span><p><div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001145830699__screen101864317208"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="nb">TEXT</span><span class="w"> </span><span class="k">SEARCH</span><span class="w"> </span><span class="k">DICTIONARY</span><span class="w"> </span><span class="n">norwegian_ispell</span><span class="w"> </span><span class="p">(</span><span class="w"></span>
<span class="w"> </span><span class="k">TEMPLATE</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ispell</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">DictFile</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">nn_no</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">AffFile</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">nn_no</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="n">FilePath</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'obs://bucket_name/path accesskey=ak secretkey=sk region=rg'</span><span class="w"></span>
<span class="p">);</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001145830699__p436810233391">The full names of the Ispell dictionary files are <strong id="EN-US_TOPIC_0000001145830699__b4121528527">nn_no.dict</strong> and <strong id="EN-US_TOPIC_0000001145830699__b111817280218">nn_no.affix</strong>. The FilePath parameter specifies the OBS directory where the files are stored, for example, <strong id="EN-US_TOPIC_0000001145830699__b5488121213548">'obs://bucket01/obs.xxx.xxx.com accesskey=xxxxx secretkey=xxxxx region=</strong><em id="EN-US_TOPIC_0000001145830699__i56911074305"><span id="EN-US_TOPIC_0000001145830699__ph15401172810204">xx-xx-xx</span></em>'. For details about the syntax and parameters for creating an Ispell dictionary, see <a href="dws_06_0183.html">CREATE TEXT SEARCH DICTIONARY</a>.</p>
</p></li><li id="EN-US_TOPIC_0000001145830699__li1550143934613"><span>Use the Ispell dictionary to split compound words.</span><p><div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001145830699__screen2527244202618"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">ts_lexize</span><span class="p">(</span><span class="s1">'norwegian_ispell'</span><span class="p">,</span><span class="w"> </span><span class="s1">'sjokoladefabrikk'</span><span class="p">);</span><span class="w"></span>
<span class="w"> </span><span class="n">ts_lexize</span><span class="w"> </span>
<span class="c1">---------------------</span>
<span class="w"> </span><span class="err">{</span><span class="n">sjokolade</span><span class="p">,</span><span class="n">fabrikk</span><span class="err">}</span><span class="w"></span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span><span class="w"></span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000001145830699__p199091334174211"><strong id="EN-US_TOPIC_0000001145830699__b437832232110">MySpell</strong> does not support compound words. <strong id="EN-US_TOPIC_0000001145830699__b08570353214">Hunspell</strong> supports compound words, but <span id="EN-US_TOPIC_0000001145830699__text786846749">GaussDB(DWS)</span> supports only the basic compound word operations of <strong id="EN-US_TOPIC_0000001145830699__b1239141172312">Hunspell</strong>. Generally, an Ispell dictionary recognizes only a limited set of words, so it should be followed by a broader dictionary, for example, a Snowball dictionary, which recognizes everything.</p>
</p></li></ol>
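<p>The dictionary ordering recommended in the last step could be set up roughly as follows. This is a minimal sketch under stated assumptions, not a verified procedure: the configuration name <strong>norwegian_conf</strong> is hypothetical, the predefined <strong>pg_catalog.english</strong> configuration is copied only to inherit the default parser, and <strong>norwegian_stem</strong> is assumed to be a Snowball dictionary that exists in your cluster (it is predefined in PostgreSQL; check before relying on it).</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Hypothetical configuration name; copied from english only to reuse the default parser.
CREATE TEXT SEARCH CONFIGURATION norwegian_conf (COPY = pg_catalog.english);

-- Consult the Ispell dictionary first. Tokens it does not recognize
-- are passed on to the Snowball dictionary (assumed to exist as norwegian_stem).
ALTER TEXT SEARCH CONFIGURATION norwegian_conf
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH norwegian_ispell, norwegian_stem;

-- Example use of the configuration.
SELECT to_tsvector('norwegian_conf', 'sjokoladefabrikk');
</pre></div></div>
<p>The order of the dictionaries in the WITH clause matters: because a Snowball dictionary recognizes everything, any dictionary listed after it would never be consulted.</p>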
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_06_0102.html">Dictionaries</a></div>
</div>
</div>