doc-exports/docs/css/umn/css_01_0124.html
Wuwan, Qi 050b395397 CSS UMN 23.2.1 20230926
Reviewed-by: Kacur, Michal <michal.kacur@t-systems.com>
Co-authored-by: Wuwan, Qi <wuwanqi1@noreply.gitea.eco.tsi-dev.otc-service.com>
Co-committed-by: Wuwan, Qi <wuwanqi1@noreply.gitea.eco.tsi-dev.otc-service.com>
2024-01-10 14:23:15 +00:00


<a name="css_01_0124"></a>
<h1 class="topictitle1">(Optional) Pre-Building and Registering a Center Point Vector</h1>
<div id="body0000001200074646"><p id="css_01_0124__en-us_topic_0000001223434468_p11686115115">If you select the <strong id="css_01_0124__en-us_topic_0000001223434468_b7219183093110">IVF_GRAPH</strong> or <strong id="css_01_0124__en-us_topic_0000001223434468_b967123213317">IVF_GRAPH_PQ</strong> index algorithm in <a href="css_01_0121.html#css_01_0121__en-us_topic_0000001309709789_section137344225249">Creating a Vector Index</a>, you need to pre-build and register the center point vectors.</p>
<div class="section" id="css_01_0124__en-us_topic_0000001223434468_section19987133515201"><h4 class="sectiontitle">Context</h4><p id="css_01_0124__en-us_topic_0000001223434468_p139720427323">The vector index acceleration algorithms <strong id="css_01_0124__en-us_topic_0000001223434468_b16704154831211">IVF_GRAPH</strong> and <strong id="css_01_0124__en-us_topic_0000001223434468_b1423720554124">IVF_GRAPH_PQ</strong> are suitable for ultra-large-scale computing. Both algorithms narrow the query range by dividing the vector space into subspaces through clustering or random sampling. Before pre-building, obtain all center point vectors through clustering or random sampling.</p>
<p id="css_01_0124__en-us_topic_0000001223434468_p1739815429328">Then, pre-build the center point vectors into a <strong id="css_01_0124__en-us_topic_0000001223434468_b2349530161212">GRAPH</strong> or <strong id="css_01_0124__en-us_topic_0000001223434468_b579103314129">GRAPH_PQ</strong> index and register that index with the Elasticsearch cluster. All nodes in the cluster can then share the index file. Reusing the center point index among shards reduces the training overhead and the number of center point index queries, which improves write and query performance.</p>
</div>
<div class="section" id="css_01_0124__en-us_topic_0000001223434468_section1266421212"><h4 class="sectiontitle">Procedure</h4><ol id="css_01_0124__en-us_topic_0000001223434468_ol16398134210325"><li id="css_01_0124__en-us_topic_0000001223434468_li1916520552213">On the <strong id="css_01_0124__en-us_topic_0000001223434468_b474592712565">Clusters</strong> page, locate the target cluster, and click <strong id="css_01_0124__en-us_topic_0000001223434468_b183162050172013">Access Kibana</strong> in the <strong id="css_01_0124__en-us_topic_0000001223434468_b71258553205">Operation</strong> column.</li><li id="css_01_0124__en-us_topic_0000001223434468_li823811672217">Click <strong id="css_01_0124__en-us_topic_0000001223434468_b1663985102118">Dev Tools</strong> in the navigation tree on the left.</li><li id="css_01_0124__en-us_topic_0000001223434468_li133981642183217">Create a center point index.<ul id="css_01_0124__en-us_topic_0000001223434468_ul3591359377"><li id="css_01_0124__en-us_topic_0000001223434468_li175913591075">The following example creates an index named <strong id="css_01_0124__en-us_topic_0000001223434468_b15188235293">my_dict</strong>. <strong id="css_01_0124__en-us_topic_0000001223434468_b1862525192917">number_of_shards</strong> of the index must be set to <strong id="css_01_0124__en-us_topic_0000001223434468_b2227182962919">1</strong>. Otherwise, the index cannot be registered.</li><li id="css_01_0124__en-us_topic_0000001223434468_li125928591978">To use the <strong id="css_01_0124__en-us_topic_0000001223434468_b1930185802910">IVF_GRAPH</strong> index, set <strong id="css_01_0124__en-us_topic_0000001223434468_b147699717309">algorithm</strong> of the center point index to <strong id="css_01_0124__en-us_topic_0000001223434468_b114851532303">GRAPH</strong>.</li><li id="css_01_0124__en-us_topic_0000001223434468_li8592259277">To use the <strong id="css_01_0124__en-us_topic_0000001223434468_b101647142302">IVF_GRAPH_PQ</strong> index, set <strong id="css_01_0124__en-us_topic_0000001223434468_b21641014163020">algorithm</strong> of the center point index to <strong id="css_01_0124__en-us_topic_0000001223434468_b01641514193012">GRAPH_PQ</strong>.</li></ul>
<pre class="screen" id="css_01_0124__en-us_topic_0000001223434468_screen1574012265910">PUT my_dict
{
  "settings": {
    "index": {
      "vector": true
    },
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "dimension": 2,
        "indexing": true,
        "algorithm": "GRAPH",
        "metric": "euclidean"
      }
    }
  }
}</pre>
</li><li id="css_01_0124__en-us_topic_0000001223434468_li13398134263215">Write the center point vectors to the created index.<p id="css_01_0124__en-us_topic_0000001223434468_p103981842143210"><a name="css_01_0124__en-us_topic_0000001223434468_li13398134263215"></a><a name="en-us_topic_0000001223434468_li13398134263215"></a>Write the center point vectors obtained through sampling or clustering into the created <strong id="css_01_0124__en-us_topic_0000001223434468_b1614514533211">my_dict</strong> index by referring to <a href="css_01_0121.html#css_01_0121__en-us_topic_0000001309709789_section137931314240">Importing Vector Data</a>.</p>
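<p>As a minimal sketch of this step, the following requests write center point vectors to <strong>my_dict</strong> with the standard Elasticsearch bulk API and then count the documents. The four two-dimensional vectors are hypothetical placeholder values, not real clustering or sampling output. Replace them with your own center point vectors, whose length must match the <strong>dimension</strong> set for <strong>my_vector</strong>.</p>
<pre class="screen">POST my_dict/_bulk
{"index": {}}
{"my_vector": [1.0, 1.0]}
{"index": {}}
{"my_vector": [1.0, 9.0]}
{"index": {}}
{"my_vector": [9.0, 1.0]}
{"index": {}}
{"my_vector": [9.0, 9.0]}

POST my_dict/_refresh

GET my_dict/_count</pre>
<p>After the refresh, the returned count should equal the number of center point vectors you wrote.</p>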
</li><li id="css_01_0124__en-us_topic_0000001223434468_li18398142103219">Call the registration API.<p id="css_01_0124__en-us_topic_0000001223434468_p5398104243220"><a name="css_01_0124__en-us_topic_0000001223434468_li18398142103219"></a><a name="en-us_topic_0000001223434468_li18398142103219"></a>Register the created <strong id="css_01_0124__en-us_topic_0000001223434468_b1638601243312">my_dict</strong> index as a <strong id="css_01_0124__en-us_topic_0000001223434468_b105061773312">Dict</strong> object with a globally unique identifier name (<strong id="css_01_0124__en-us_topic_0000001223434468_b851562473319">dict_name</strong>).</p>
<pre class="screen" id="css_01_0124__en-us_topic_0000001223434468_screen517103517010">PUT _vector/register/my_dict
{
  "dict_name": "my_dict"
}</pre>
</li><li id="css_01_0124__en-us_topic_0000001223434468_li53982422324">Create an <strong id="css_01_0124__en-us_topic_0000001223434468_b62141215191114">IVF_GRAPH</strong> or <strong id="css_01_0124__en-us_topic_0000001223434468_b625718201111">IVF_GRAPH_PQ</strong> index.<p id="css_01_0124__en-us_topic_0000001223434468_p153986429320">You do not need to specify the dimension and metric information. Simply specify the registered dictionary name.</p>
<pre class="screen" id="css_01_0124__en-us_topic_0000001223434468_screen570595413012">PUT my_index
{
  "settings": {
    "index": {
      "vector": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "indexing": true,
        "algorithm": "IVF_GRAPH",
        "dict_name": "my_dict",
        "offload_ivf": false
      }
    }
  }
}</pre>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="css_01_0124__en-us_topic_0000001223434468_table14306164211323" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Field mapping parameters</caption><thead align="left"><tr id="css_01_0124__en-us_topic_0000001223434468_row639912428329"><th align="left" class="cellrowborder" valign="top" width="27.27%" id="mcps1.3.3.2.6.5.2.3.1.1"><p id="css_01_0124__en-us_topic_0000001223434468_p53991842123212">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="72.72999999999999%" id="mcps1.3.3.2.6.5.2.3.1.2"><p id="css_01_0124__en-us_topic_0000001223434468_p1539964293215">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="css_01_0124__en-us_topic_0000001223434468_row12399104293212"><td class="cellrowborder" valign="top" width="27.27%" headers="mcps1.3.3.2.6.5.2.3.1.1 "><p id="css_01_0124__en-us_topic_0000001223434468_p173991842153218">dict_name</p>
</td>
<td class="cellrowborder" valign="top" width="72.72999999999999%" headers="mcps1.3.3.2.6.5.2.3.1.2 "><p id="css_01_0124__en-us_topic_0000001223434468_p7399144233215">Specifies the name of the registered center point index that this index depends on. The vector dimension and metric of this index are the same as those of the Dict index.</p>
</td>
</tr>
<tr id="css_01_0124__en-us_topic_0000001223434468_row039917425324"><td class="cellrowborder" valign="top" width="27.27%" headers="mcps1.3.3.2.6.5.2.3.1.1 "><p id="css_01_0124__en-us_topic_0000001223434468_p1739924293214">offload_ivf</p>
</td>
<td class="cellrowborder" valign="top" width="72.72999999999999%" headers="mcps1.3.3.2.6.5.2.3.1.2 "><p id="css_01_0124__en-us_topic_0000001223434468_p8399142153214">Offloads the IVF inverted index implemented by the underlying index to Elasticsearch. This reduces non-heap memory usage and the overhead of write and merge operations, but query performance also deteriorates. You can retain the default value.</p>
<p id="css_01_0124__en-us_topic_0000001223434468_p739994217321">Value: <strong id="css_01_0124__en-us_topic_0000001223434468_b2050147514103828">true</strong> or <strong id="css_01_0124__en-us_topic_0000001223434468_b1656082363103828">false</strong></p>
<p id="css_01_0124__en-us_topic_0000001223434468_p203991842103215">Default value: <strong id="css_01_0124__en-us_topic_0000001223434468_b1836607280103828">false</strong></p>
</td>
</tr>
</tbody>
</table>
</div>
</li></ol>
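<p>As a quick check after the procedure, you might write one document to <strong>my_index</strong> and read back its mapping. This is a sketch that assumes the <strong>my_index</strong> and <strong>my_dict</strong> examples from the procedure and uses only the standard Elasticsearch document and mapping APIs. The vector value is a hypothetical placeholder. It has two elements because the <strong>my_dict</strong> example sets <strong>dimension</strong> to <strong>2</strong>, and <strong>my_index</strong> takes its dimension from the registered Dict index.</p>
<pre class="screen">POST my_index/_doc
{
  "my_vector": [1.0, 2.0]
}

GET my_index/_mapping</pre>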
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="css_01_0117.html">Vector Retrieval</a></div>
</div>
</div>