Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Lu, Huayi <luhuayi@huawei.com> Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
<a name="EN-US_TOPIC_0000001188270514"></a><a name="EN-US_TOPIC_0000001188270514"></a>
|
|
|
|
<h1 class="topictitle1">CREATE TEXT SEARCH DICTIONARY</h1>
|
|
<div id="body1560407392208"><div class="section" id="EN-US_TOPIC_0000001188270514__s11dbafd02fc245e8ab6884159af9fd0b"><h4 class="sectiontitle">Function</h4><p id="EN-US_TOPIC_0000001188270514__p55141914483"><strong id="EN-US_TOPIC_0000001188270514__b170791094413">CREATE TEXT SEARCH DICTIONARY</strong> creates a full-text retrieval dictionary. A dictionary is used to identify and process particular words during full-text retrieval.</p>
|
|
<p id="EN-US_TOPIC_0000001188270514__p5654125664618">Dictionaries are created from predefined templates (defined in the <strong id="EN-US_TOPIC_0000001188270514__b171508618712">PG_TS_TEMPLATE</strong> system catalog). Five types of dictionaries can be created: <strong id="EN-US_TOPIC_0000001188270514__b31501061778">Simple</strong>, <strong id="EN-US_TOPIC_0000001188270514__b91511961070">Ispell</strong>, <strong id="EN-US_TOPIC_0000001188270514__b61514615718">Synonym</strong>, <strong id="EN-US_TOPIC_0000001188270514__b415226976">Thesaurus</strong>, and <strong id="EN-US_TOPIC_0000001188270514__b415256372">Snowball</strong>. Each type handles a different task.</p>
|
|
</div>
|
|
<div class="section" id="EN-US_TOPIC_0000001188270514__s3885984ccb904e76a8b521140774edbb"><h4 class="sectiontitle">Precautions</h4><ul id="EN-US_TOPIC_0000001188270514__ul94641895249"><li id="EN-US_TOPIC_0000001188270514__li10464891245">A user with the <strong id="EN-US_TOPIC_0000001188270514__b178581246164813">SYSADMIN</strong> permission can create a dictionary and automatically becomes its owner.</li><li id="EN-US_TOPIC_0000001188270514__li24716712720">A dictionary cannot be created in the <strong id="EN-US_TOPIC_0000001188270514__b164661024916">pg_temp</strong> schema.</li><li id="EN-US_TOPIC_0000001188270514__li1695011287207">After a dictionary is created or modified, later changes to the user-defined dictionary definition file do not affect the dictionary in the database. To apply such changes to the dictionary in the database, run the <strong id="EN-US_TOPIC_0000001188270514__b789383117509">ALTER TEXT SEARCH DICTIONARY</strong> statement to update the definition file of the dictionary.</li></ul>
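<p>As a minimal sketch of the last precaution, the statement below reloads a stop word file after it has been edited. The dictionary name <strong>french_dict</strong> and the local directory <strong>/home/dicts</strong> are hypothetical, and the option list is assumed to mirror the one accepted by <strong>CREATE TEXT SEARCH DICTIONARY</strong>. See the <strong>ALTER TEXT SEARCH DICTIONARY</strong> reference for the exact rules.</p>
<pre class="screen">-- Assumption: french_dict already exists and /home/dicts/french.stop has been edited.
-- Re-specifying the file options asks the database to read the definition file again.
ALTER TEXT SEARCH DICTIONARY french_dict (
    STOPWORDS = french,
    FILEPATH = 'file:///home/dicts'
);</pre>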
|
|
</div>
|
|
<div class="section" id="EN-US_TOPIC_0000001188270514__s097e199c371040739945504301b7678d"><h4 class="sectiontitle">Syntax</h4><div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188270514__s49960349d09943568d3fc82d29c074fc"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="nb">TEXT</span><span class="w"> </span><span class="k">SEARCH</span><span class="w"> </span><span class="k">DICTIONARY</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="p">(</span>
<span class="w">    </span><span class="k">TEMPLATE</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">template</span>
<span class="w">    </span><span class="p">[,</span><span class="w"> </span><span class="k">option</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="p">[,</span><span class="w"> </span><span class="p">...</span><span class="w"> </span><span class="p">]]</span>
<span class="p">);</span>
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
</div>
|
|
<div class="section" id="EN-US_TOPIC_0000001188270514__s49eade590b0f4b098e648dd1dc5e3e60"><h4 class="sectiontitle">Parameter Description</h4><ul id="EN-US_TOPIC_0000001188270514__u94351f73b3e945628b8649a4f9cc4f3e"><li id="EN-US_TOPIC_0000001188270514__l58e147705fc64066994924862c8f66e5"><em id="EN-US_TOPIC_0000001188270514__i101021434544">name</em><p id="EN-US_TOPIC_0000001188270514__ab6d27b2b82d348be80745fa87fb975c7">Specifies the name of a dictionary to be created. (If you do not specify a schema name, the dictionary will be created in the current schema.)</p>
|
|
<p id="EN-US_TOPIC_0000001188270514__a9ee2ac78101e427f90c718c3df1d1e74">Value range: a string, which complies with the identifier naming convention. A value can contain a maximum of 63 characters.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li133625114011"><em id="EN-US_TOPIC_0000001188270514__i730815765716">template</em><p id="EN-US_TOPIC_0000001188270514__p63719564012">Specifies a template name.</p>
|
|
<p id="EN-US_TOPIC_0000001188270514__p93710584015">Valid values: the templates (<strong id="EN-US_TOPIC_0000001188270514__b2265320135913">Simple</strong>, <strong id="EN-US_TOPIC_0000001188270514__b1861622175915">Synonym</strong>, <strong id="EN-US_TOPIC_0000001188270514__b339422314596">Thesaurus</strong>, <strong id="EN-US_TOPIC_0000001188270514__b18562125125910">Ispell</strong>, and <strong id="EN-US_TOPIC_0000001188270514__b7692152835912">Snowball</strong>) defined in the <strong id="EN-US_TOPIC_0000001188270514__b12421515115215">PG_TS_TEMPLATE</strong> system catalog.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li1286812455448"><a name="EN-US_TOPIC_0000001188270514__li1286812455448"></a><a name="li1286812455448"></a><em id="EN-US_TOPIC_0000001188270514__i14228183602">option</em><p id="EN-US_TOPIC_0000001188270514__p145461744155218">Specifies a parameter name. Each dictionary type has a template that defines its own parameters. The order in which parameters are specified does not affect how they work.</p>
|
|
<ul id="EN-US_TOPIC_0000001188270514__ul73520317301"><li id="EN-US_TOPIC_0000001188270514__li132392610291">Parameters for a <strong id="EN-US_TOPIC_0000001188270514__b11844154418156">Simple</strong> dictionary<ul id="EN-US_TOPIC_0000001188270514__ul164711610132616"><li id="EN-US_TOPIC_0000001188270514__li1521442771710"><strong id="EN-US_TOPIC_0000001188270514__b101381554191714">STOPWORDS</strong><p id="EN-US_TOPIC_0000001188270514__p141621281174">Specifies the name of a file listing stop words. The default file name extension is .stop. For example, if the value of <strong id="EN-US_TOPIC_0000001188270514__b16938318264">STOPWORDS</strong> is <strong id="EN-US_TOPIC_0000001188270514__b394518162613">french</strong>, the actual file name is <strong id="EN-US_TOPIC_0000001188270514__b49451713262">french.stop</strong>. Each line in the file defines one stop word. Blank lines and spaces in the file are ignored, and stop words are converted to lowercase.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li83020302172"><strong id="EN-US_TOPIC_0000001188270514__b6753155581714">ACCEPT</strong><p id="EN-US_TOPIC_0000001188270514__p780343081713">Specifies whether to accept a non-stop word as recognized. The default value is <strong id="EN-US_TOPIC_0000001188270514__b1116105752117">true</strong>.</p>
|
|
<p id="EN-US_TOPIC_0000001188270514__p5644711224">If <strong id="EN-US_TOPIC_0000001188270514__b862819542228">ACCEPT=true</strong> is set for a <strong id="EN-US_TOPIC_0000001188270514__b14498204318300">Simple</strong> dictionary, no token will be passed to subsequent dictionaries. In this case, you are advised to place the <strong id="EN-US_TOPIC_0000001188270514__b322116331311">Simple</strong> dictionary at the end of the dictionary list. If <strong id="EN-US_TOPIC_0000001188270514__b1639975118311">ACCEPT=false</strong> is set, you are advised to place the <strong id="EN-US_TOPIC_0000001188270514__b18430613133217">Simple</strong> dictionary before at least one dictionary in the list.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li13533193132616"><a name="EN-US_TOPIC_0000001188270514__li13533193132616"></a><a name="li13533193132616"></a><strong id="EN-US_TOPIC_0000001188270514__b953215316264">FILEPATH</strong><p id="EN-US_TOPIC_0000001188270514__p45331531122613">Specifies the directory for storing the stop word file. The stop word file can be stored locally or on the OBS server. If the file is stored locally, the directory format is <strong id="EN-US_TOPIC_0000001188270514__b1507360411105918">'file://</strong><em id="EN-US_TOPIC_0000001188270514__i1059636523105918">absolute_path</em><strong id="EN-US_TOPIC_0000001188270514__b18212144632819">'</strong>. If the file is stored on the OBS server, the directory format is <strong id="EN-US_TOPIC_0000001188270514__b431417295919">'obs://bucket/<em id="EN-US_TOPIC_0000001188270514__i14591574597">path accesskey</em>=<em id="EN-US_TOPIC_0000001188270514__i172451111105914">ak</em> <em id="EN-US_TOPIC_0000001188270514__i20323201425912">secretkey</em>=<em id="EN-US_TOPIC_0000001188270514__i15725101715594">sk</em> <em id="EN-US_TOPIC_0000001188270514__i531015217597">region</em>=<em id="EN-US_TOPIC_0000001188270514__i14709525165911">region_name</em>'</strong>. The directory must be enclosed in single quotation marks ('). The default value is the directory where predefined dictionary files are located. Both the <strong id="EN-US_TOPIC_0000001188270514__b14384919303">FILEPATH</strong> and <strong id="EN-US_TOPIC_0000001188270514__b6445149203012">STOPWORDS</strong> parameters need to be specified.</p>
|
|
<p id="EN-US_TOPIC_0000001188270514__p8298173815119">To create a dictionary using the stop word file on the OBS server, perform the following steps:</p>
|
|
<ol id="EN-US_TOPIC_0000001188270514__ol226715512592"><li id="EN-US_TOPIC_0000001188270514__li226885105915">Upload the stop word file to the OBS server. For example, upload the <strong id="EN-US_TOPIC_0000001188270514__b12273131318331">french.stop</strong> file to the <strong id="EN-US_TOPIC_0000001188270514__b1149811357331">gaussdb</strong> bucket on the OBS server <strong id="EN-US_TOPIC_0000001188270514__b18954135413333">obsv3.sa-fb-1.externaldemo.com</strong>. The URL is <strong id="EN-US_TOPIC_0000001188270514__b97611414133411">https://gaussdb.obsv3.sa-fb-1.externaldemo.com/french.stop</strong>. For details about how to upload the file and query the URL, see the <em id="EN-US_TOPIC_0000001188270514__i1494491411357">OBS User Manual</em>.</li><li id="EN-US_TOPIC_0000001188270514__li135454231133">Add <strong id="EN-US_TOPIC_0000001188270514__b095484543515">"</strong><em id="EN-US_TOPIC_0000001188270514__i192894492350">region_name</em><strong id="EN-US_TOPIC_0000001188270514__b13649125214357">": "</strong><em id="EN-US_TOPIC_0000001188270514__i13416166123614">obs domain</em><strong id="EN-US_TOPIC_0000001188270514__b9646141663610">"</strong> to the <strong id="EN-US_TOPIC_0000001188270514__b15333734183616">$GAUSSHOME/etc/region_map</strong> file. <em id="EN-US_TOPIC_0000001188270514__i9918184133614">region_name</em> can be a string consisting of uppercase letters, lowercase letters, digits, slashes (/), or underscores (_). <em id="EN-US_TOPIC_0000001188270514__i198144993717">obs domain</em> indicates the domain name of the OBS server.<p id="EN-US_TOPIC_0000001188270514__p2121251319">For example, if <em id="EN-US_TOPIC_0000001188270514__i158921054163717">region_name</em> is set to <strong id="EN-US_TOPIC_0000001188270514__b346216584372">rg</strong>, <strong id="EN-US_TOPIC_0000001188270514__b192297134389">region_map</strong> is as follows: <strong id="EN-US_TOPIC_0000001188270514__b0324134123819">"rg": "obsv3.sa-fb-1.externaldemo.com"</strong>.</p>
|
|
<div class="notice" id="EN-US_TOPIC_0000001188270514__note20607337349"><span class="noticetitle"><img src="public_sys-resources/notice_3.0-en-us.png"> </span><div class="noticebody"><p id="EN-US_TOPIC_0000001188270514__p93611249047"><em id="EN-US_TOPIC_0000001188270514__i12316857123810">region_name</em> and <em id="EN-US_TOPIC_0000001188270514__i81091417193910">obs domain</em> are enclosed in double quotation marks. There is no space on the left of the colon and one space on the right of the colon.</p>
|
|
</div></div>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li743063518594">Run the <strong id="EN-US_TOPIC_0000001188270514__b11900162104020">CREATE TEXT SEARCH DICTIONARY</strong> command to create a dictionary. The command is as follows:</li></ol>
|
|
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188270514__screen18990927121313"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="nb">TEXT</span><span class="w"> </span><span class="k">SEARCH</span><span class="w"> </span><span class="k">DICTIONARY</span><span class="w"> </span><span class="n">french_dict</span><span class="w"> </span><span class="p">(</span><span class="w"> </span><span class="k">TEMPLATE</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pg_catalog</span><span class="p">.</span><span class="k">simple</span><span class="p">,</span><span class="w"> </span><span class="n">STOPWORDS</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">french</span><span class="p">,</span><span class="w"> </span><span class="n">FILEPATH</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'obs://gaussdb accesskey=xxx secretkey=yyy region=rg'</span><span class="w"> </span><span class="p">);</span>
|
|
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
<p id="EN-US_TOPIC_0000001188270514__p89722551171">The <strong id="EN-US_TOPIC_0000001188270514__b9235175918408">french.stop</strong> file is stored in the root directory of the <strong id="EN-US_TOPIC_0000001188270514__b10403626164114">gaussdb</strong> bucket. Therefore, the <em id="EN-US_TOPIC_0000001188270514__i152122041124113">path</em> is empty.</p>
|
|
</li></ul>
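<p>For comparison with the OBS procedure, a local stop word file might be referenced as sketched below. The dictionary name <strong>french_local_dict</strong> and the directory <strong>/home/dicts</strong> are assumptions for illustration. With these values the stop word file would be <strong>/home/dicts/french.stop</strong>.</p>
<pre class="screen">-- Hypothetical names: french_local_dict, /home/dicts
CREATE TEXT SEARCH DICTIONARY french_local_dict (
    TEMPLATE = pg_catalog.simple,
    STOPWORDS = french,                -- resolves to french.stop
    FILEPATH = 'file:///home/dicts',   -- local directory, absolute path
    ACCEPT = false                     -- pass non-stop words on to a later dictionary
);</pre>
<p><strong>ACCEPT = false</strong> is used here on the assumption that another dictionary follows this one in the dictionary list. If the <strong>Simple</strong> dictionary is the last one, the default <strong>true</strong> is the better fit.</p>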
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li3769818163018">Parameters for a <strong id="EN-US_TOPIC_0000001188270514__b165616225119">Synonym</strong> dictionary<ul id="EN-US_TOPIC_0000001188270514__ul1786729203116"><li id="EN-US_TOPIC_0000001188270514__li9908122611315"><strong id="EN-US_TOPIC_0000001188270514__b23599148445">SYNONYM</strong><p id="EN-US_TOPIC_0000001188270514__p149066276287">Specifies the name of the definition file for a <strong id="EN-US_TOPIC_0000001188270514__b133171681127">Synonym</strong> dictionary. The default file name extension is .syn.</p>
|
|
<p id="EN-US_TOPIC_0000001188270514__p39811136114718">The file is a list of synonyms. Each line is in the format of <em id="EN-US_TOPIC_0000001188270514__i1189211065">token synonym</em>, that is, token and its synonym separated by a space.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li5422183819471"><strong id="EN-US_TOPIC_0000001188270514__b162141117154410">CASESENSITIVE</strong><p id="EN-US_TOPIC_0000001188270514__p1271883924716">Specifies whether tokens and their synonyms are case sensitive. The default value is <strong id="EN-US_TOPIC_0000001188270514__b188471141172">false</strong>, indicating that tokens and synonyms in dictionary files will be converted into lowercase. If this parameter is set to <strong id="EN-US_TOPIC_0000001188270514__b15945102917820">true</strong>, they will not be converted into lowercase.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li2975203353118"><strong id="EN-US_TOPIC_0000001188270514__b14974833193111">FILEPATH</strong><p id="EN-US_TOPIC_0000001188270514__p4975333103113">Specifies the directory for storing <strong id="EN-US_TOPIC_0000001188270514__b156018264106">Synonym</strong> dictionary files. The directory can be a local directory or an OBS directory. The default value is the directory where predefined dictionary files are located. The directory format and the process of creating a <strong id="EN-US_TOPIC_0000001188270514__b78511945184217">Synonym</strong> dictionary using a file on the OBS server are the same as those of the <a href="#EN-US_TOPIC_0000001188270514__li13533193132616">FILEPATH of the Simple dictionary</a>.</p>
|
|
</li></ul>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li172887716330">Parameters for a <strong id="EN-US_TOPIC_0000001188270514__b882614720100">Thesaurus</strong> dictionary<ul id="EN-US_TOPIC_0000001188270514__ul0565165617272"><li id="EN-US_TOPIC_0000001188270514__li13265721131414"><strong id="EN-US_TOPIC_0000001188270514__b12812195313552">DICTFILE</strong><p id="EN-US_TOPIC_0000001188270514__p1623814161718">Specifies the name of a dictionary definition file. The default file name extension is .ths.</p>
|
|
<p id="EN-US_TOPIC_0000001188270514__p86584227140">The file is a list of synonyms. Each line is in the format of <em id="EN-US_TOPIC_0000001188270514__i175612031318">sample words</em> <strong id="EN-US_TOPIC_0000001188270514__b8402184141310">:</strong> <em id="EN-US_TOPIC_0000001188270514__i8215315121315">indexed words</em>. The colon (:) separates a phrase from its substitute word. If multiple sample words match the input, the thesaurus dictionary selects the longest one.</p>
|
|
</li></ul>
|
|
<ul id="EN-US_TOPIC_0000001188270514__ul10566175692712"><li id="EN-US_TOPIC_0000001188270514__li455652516147"><strong id="EN-US_TOPIC_0000001188270514__b1122613555554">DICTIONARY</strong><p id="EN-US_TOPIC_0000001188270514__p1822292741416">Specifies the name of a subdictionary used for word normalization. This parameter is mandatory and only one subdictionary name can be specified. The specified subdictionary must exist. It is used to identify and normalize input text before phrase matching.</p>
|
|
<p id="EN-US_TOPIC_0000001188270514__p1242051723714">If an input word cannot be recognized by the subdictionary, an error will be reported. In this case, remove the word or update the subdictionary to make the word recognizable. In addition, an asterisk (*) can be placed at the beginning of an indexed word to skip the application of a subdictionary on it, but all sample words must be recognizable by the subdictionary.</p>
|
|
<div class="p" id="EN-US_TOPIC_0000001188270514__p9897225193416">If the sample words defined in the dictionary file contain stop words defined in the subdictionary, use question marks (?) to replace them. Assume that <strong id="EN-US_TOPIC_0000001188270514__b280811397241">a</strong> and <strong id="EN-US_TOPIC_0000001188270514__b82315431241">the</strong> are stop words defined in the subdictionary.<pre class="screen" id="EN-US_TOPIC_0000001188270514__screen128832051145213">? one ? two : swsw</pre>
|
|
</div>
|
|
<p id="EN-US_TOPIC_0000001188270514__p314914310533"><strong id="EN-US_TOPIC_0000001188270514__b736785019279">a one the two</strong> and <strong id="EN-US_TOPIC_0000001188270514__b1672918591274">the one a two</strong> will be matched and output as <strong id="EN-US_TOPIC_0000001188270514__b883842032816">swsw</strong>.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li3693203211140"><strong id="EN-US_TOPIC_0000001188270514__b71872599556">FILEPATH</strong><p id="EN-US_TOPIC_0000001188270514__p201381234141411">Specifies the directory for storing dictionary definition files. The directory can be a local directory or an OBS directory. The default value is the directory where predefined dictionary files are located. The directory format and the process of creating a <strong id="EN-US_TOPIC_0000001188270514__b16928194554411">Thesaurus</strong> dictionary using a file on the OBS server are the same as those of the <a href="#EN-US_TOPIC_0000001188270514__li13533193132616">FILEPATH of the Simple dictionary</a>.</p>
|
|
</li></ul>
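<p>The three <strong>Thesaurus</strong> parameters could be combined as in the following sketch. The dictionary name <strong>thesaurus_sample</strong>, the file <strong>thesaurus_sample.ths</strong>, and the directory <strong>/home/dicts</strong> are hypothetical. The subdictionary <strong>pg_catalog.english_stem</strong> is assumed to be available as a predefined <strong>Snowball</strong> dictionary; substitute any existing dictionary if it is not.</p>
<pre class="screen">-- Hypothetical file /home/dicts/thesaurus_sample.ths contains lines such as:
--   supernovae stars : sn
CREATE TEXT SEARCH DICTIONARY thesaurus_sample (
    TEMPLATE = thesaurus,
    DICTFILE = thesaurus_sample,            -- resolves to thesaurus_sample.ths
    DICTIONARY = pg_catalog.english_stem,   -- subdictionary, must already exist
    FILEPATH = 'file:///home/dicts'
);</pre>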
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li3590340173413">Parameters for an <strong id="EN-US_TOPIC_0000001188270514__b56982286292">Ispell</strong> dictionary<ul id="EN-US_TOPIC_0000001188270514__ul8643134014350"><li id="EN-US_TOPIC_0000001188270514__li6403123817359"><strong id="EN-US_TOPIC_0000001188270514__b135851231145610">DICTFILE</strong><p id="EN-US_TOPIC_0000001188270514__p169415319334">Specifies the name of a dictionary definition file. The default file name extension is .dict.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li44331630183512"><strong id="EN-US_TOPIC_0000001188270514__b1543323093511">AFFFILE</strong><p id="EN-US_TOPIC_0000001188270514__p1743317304351">Specifies the name of an affix file. The default file name extension is .affix.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li19991204413517"><strong id="EN-US_TOPIC_0000001188270514__b2990044193516">STOPWORDS</strong><p id="EN-US_TOPIC_0000001188270514__p19991184416354">Specifies the name of a file listing stop words. The default file name extension is .stop. The file content format is the same as that of the file for a <strong id="EN-US_TOPIC_0000001188270514__b188616161340">Simple</strong> dictionary.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li821674983510"><strong id="EN-US_TOPIC_0000001188270514__b162161149203511">FILEPATH</strong><p id="EN-US_TOPIC_0000001188270514__p86651356113516">Specifies the directory for storing dictionary files. The directory can be a local directory or an OBS directory. The default value is the directory where predefined dictionary files are located. The directory format and the process of creating an <strong id="EN-US_TOPIC_0000001188270514__b03251811134510">Ispell</strong> dictionary using a file on the OBS server are the same as those of the <a href="#EN-US_TOPIC_0000001188270514__li13533193132616">FILEPATH of the Simple dictionary</a>.</p>
|
|
</li></ul>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li1585912584354">Parameters for a <strong id="EN-US_TOPIC_0000001188270514__b15421226163516">Snowball</strong> dictionary<ul id="EN-US_TOPIC_0000001188270514__ul622714279442"><li id="EN-US_TOPIC_0000001188270514__li1165615544373"><strong id="EN-US_TOPIC_0000001188270514__b19949171714574">LANGUAGE</strong><p id="EN-US_TOPIC_0000001188270514__p121185610373">Specifies the name of a language whose stemming algorithm will be used. According to spelling rules in the language, the algorithm normalizes the variants of an input word into a basic word or a stem.</p>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li10262357173720"><strong id="EN-US_TOPIC_0000001188270514__b46341815125710">STOPWORDS</strong><p id="EN-US_TOPIC_0000001188270514__p20978757173720">Specifies the name of a file listing stop words. The default file name extension is .stop. The file content format is the same as that of the file for a <strong id="EN-US_TOPIC_0000001188270514__b133746510423">Simple</strong> dictionary.</p>
|
|
</li></ul>
|
|
<ul id="EN-US_TOPIC_0000001188270514__ul1862715564217"><li id="EN-US_TOPIC_0000001188270514__li594740133814"><strong id="EN-US_TOPIC_0000001188270514__b8786121310577">FILEPATH</strong><p id="EN-US_TOPIC_0000001188270514__p196391115386">Specifies the directory for storing dictionary definition files. The directory can be a local directory or an OBS directory. The default value is the directory where predefined dictionary files are located. Both the <strong id="EN-US_TOPIC_0000001188270514__b36441155153111">FILEPATH</strong> and <strong id="EN-US_TOPIC_0000001188270514__b18645115593113">STOPWORDS</strong> parameters need to be specified. The directory format and the process of creating a <strong id="EN-US_TOPIC_0000001188270514__b10503105116458">Snowball</strong> dictionary using a file on the OBS server are the same as those of the <strong id="EN-US_TOPIC_0000001188270514__b2093014387464">Simple</strong> dictionary.</p>
|
|
</li></ul>
|
|
</li></ul>
|
|
<div class="note" id="EN-US_TOPIC_0000001188270514__note1970820111566"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="EN-US_TOPIC_0000001188270514__ul1036363368"><li id="EN-US_TOPIC_0000001188270514__li11364633618">Predefined dictionary files are stored in the <strong id="EN-US_TOPIC_0000001188270514__b1782102674716">$GAUSSHOME/share/postgresql/tsearch_data</strong> directory.</li></ul>
|
|
<ul id="EN-US_TOPIC_0000001188270514__ul10996181023617"><li id="EN-US_TOPIC_0000001188270514__li13996410103617">The name of a dictionary definition file can contain only lowercase letters, numbers, and underscores (_).</li></ul>
|
|
</div></div>
|
|
</li><li id="EN-US_TOPIC_0000001188270514__li6444162716327"><em id="EN-US_TOPIC_0000001188270514__i13518523428">value</em><p id="EN-US_TOPIC_0000001188270514__p15861137193115">Specifies a parameter value. If the value is not an identifier or a number, enclose it in single quotation marks ('). Identifiers and numbers can also be enclosed in single quotation marks.</p>
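<p>The quoting rule can be illustrated with a short sketch. The dictionary name <strong>quoting_demo</strong> and the directory <strong>/home/dicts</strong> are hypothetical, and the file <strong>/home/dicts/english.stop</strong> is assumed to exist.</p>
<pre class="screen">CREATE TEXT SEARCH DICTIONARY quoting_demo (
    TEMPLATE = pg_catalog.simple,
    STOPWORDS = 'english',            -- an identifier: quotation marks are optional
    FILEPATH = 'file:///home/dicts'   -- contains ':' and '/': quotation marks are required
);</pre>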
|
|
</li></ul>
|
|
</div>
|
|
<div class="section" id="EN-US_TOPIC_0000001188270514__s974b31b9463848259512367e2d285b74"><h4 class="sectiontitle">Examples</h4><p id="EN-US_TOPIC_0000001188270514__p20700113019291">Create an <strong id="EN-US_TOPIC_0000001188270514__b212790533133410">Ispell</strong> dictionary <strong id="EN-US_TOPIC_0000001188270514__b200967695533410">english_ispell</strong> (the dictionary definition file is from the open source dictionary):</p>
|
|
<div class="notice" id="EN-US_TOPIC_0000001188270514__note29448236569"><span class="noticetitle"><img src="public_sys-resources/notice_3.0-en-us.png"> </span><div class="noticebody"><p id="EN-US_TOPIC_0000001188270514__p1794462317569">Hard-coded or plaintext AK and SK are risky. For security purposes, encrypt your AK and SK and store them in the configuration file or environment variables.</p>
|
|
</div></div>
|
|
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001188270514__en-us_topic_0176515517_screen47831721181510"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span>
<span class="normal">7</span>
<span class="normal">8</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">DROP</span><span class="w"> </span><span class="nb">TEXT</span><span class="w"> </span><span class="k">SEARCH</span><span class="w"> </span><span class="k">DICTIONARY</span><span class="w"> </span><span class="k">IF</span><span class="w"> </span><span class="k">EXISTS</span><span class="w"> </span><span class="n">english_ispell</span><span class="p">;</span>
<span class="k">CREATE</span><span class="w"> </span><span class="nb">TEXT</span><span class="w"> </span><span class="k">SEARCH</span><span class="w"> </span><span class="k">DICTIONARY</span><span class="w"> </span><span class="n">english_ispell</span><span class="w"> </span><span class="p">(</span>
<span class="w">    </span><span class="k">TEMPLATE</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ispell</span><span class="p">,</span>
<span class="w">    </span><span class="n">DictFile</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">english</span><span class="p">,</span>
<span class="w">    </span><span class="n">AffFile</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">english</span><span class="p">,</span>
<span class="w">    </span><span class="n">StopWords</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">english</span><span class="p">,</span>
<span class="w">    </span><span class="n">FilePath</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'obs://bucket_name/path accesskey=ak secretkey=sk region=rg'</span>
<span class="p">);</span>
</pre></div></td></tr></table></div>
|
|
|
|
</div>
|
|
<p id="EN-US_TOPIC_0000001188270514__p1082126112110">Create a <strong id="EN-US_TOPIC_0000001188270514__b163901891475">Snowball</strong> dictionary <strong id="EN-US_TOPIC_0000001188270514__b13391399712">english_snowball</strong> (the dictionary definition file is from the open source dictionary):</p>
|
|
<div class="notice" id="EN-US_TOPIC_0000001188270514__note8911201315239"><span class="noticetitle"><img src="public_sys-resources/notice_3.0-en-us.png"> </span><div class="noticebody"><p id="EN-US_TOPIC_0000001188270514__p13912121322315">Hard-coded or plaintext AK and SK are risky. For security purposes, encrypt your AK and SK and store them in the configuration file or environment variables.</p>
|
|
</div></div>
|
|
<pre class="screen" id="EN-US_TOPIC_0000001188270514__screen925825682017">DROP TEXT SEARCH DICTIONARY IF EXISTS english_snowball;
CREATE TEXT SEARCH DICTIONARY english_snowball (
    TEMPLATE = snowball,
    Language = english,
    StopWords = english,
    FilePath = 'obs://bucket_name/path accesskey=ak secretkey=sk region=rg'
);</pre>
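<p>Once a dictionary has been created, it can be spot-checked with the <strong>ts_lexize</strong> function, which takes a dictionary name and a single token and returns the resulting lexemes as an array. The following is a sketch that assumes the <strong>english_snowball</strong> dictionary above was created successfully. The input tokens are arbitrary.</p>
<pre class="screen">-- An ordinary word: expected to return its stem as a one-element array.
SELECT ts_lexize('english_snowball', 'stars');

-- A word listed in the stop word file: expected to return an empty array.
SELECT ts_lexize('english_snowball', 'the');</pre>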
|
|
</div>
|
|
<div class="section" id="EN-US_TOPIC_0000001188270514__s56e26682f0c24a8987ba07e3099ff0a9"><h4 class="sectiontitle">Helpful Links</h4><p id="EN-US_TOPIC_0000001188270514__p63691819155"><a href="dws_06_0146.html">ALTER TEXT SEARCH DICTIONARY</a>, <a href="dws_06_0183.html">CREATE TEXT SEARCH DICTIONARY</a></p>
|
|
</div>
|
|
</div>
|
|
<div>
|
|
<div class="familylinks">
|
|
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_06_0118.html">DDL Syntax</a></div>
|
|
</div>
|
|
</div>
|
|
|