doc-exports/docs/mrs/umn/ALM-45427.html
Yang, Tong 3b1f73dece MRS UMN 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-13 12:03:34 +00:00

<a name="ALM-45427"></a><a name="ALM-45427"></a>
<h1 class="topictitle1">ALM-45427 ClickHouse Service Capacity Quota Usage in ZooKeeper Exceeds the Threshold</h1>
<div id="body1606211201997"><div class="section" id="ALM-45427__section8280367"><h4 class="sectiontitle">Description</h4><p id="ALM-45427__p107817516304">The alarm module checks the quota usage of the ClickHouse service in ZooKeeper every 60 seconds. This alarm is generated when the alarm module detects that the usage exceeds the threshold (90%).</p>
<p id="ALM-45427__p5490163382810">This alarm is cleared when the system detects that the usage has dropped below the threshold.</p>
</div>
<div class="section" id="ALM-45427__section7414445"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-45427__table45079949" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-45427__row5683496"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-45427__p57710042">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-45427__p44001849">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-45427__p7380012">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-45427__row60910108"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-45427__p16488194717492">45427</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-45427__p588994817496">Major (default)</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-45427__p34071398">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-45427__section66730009"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-45427__table8319831" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-45427__row40868022"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-45427__p21975462">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-45427__p35182007">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-45427__row594512751512"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-45427__p8838358184914">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-45427__p837170125015">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-45427__row31170320"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-45427__p39123317">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-45427__p172628810500">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-45427__row2072013571152"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-45427__p16720175751516">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-45427__p1472014572158">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-45427__row1557014212165"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-45427__p05702027168">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-45427__p1057022151613">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-45427__section63699172"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-45427__p485055019508">After the ZooKeeper capacity quota usage of the ClickHouse service exceeds the threshold, you cannot perform cluster operations on the ClickHouse service on FusionInsight Manager. As a result, the ClickHouse service cannot be used.</p>
</div>
<div class="section" id="ALM-45427__section36421639"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-45427__ul193311222132313"><li id="ALM-45427__li1133120229232">When table data is created, inserted, or deleted, ClickHouse creates znodes in ZooKeeper. As the service volume increases, the capacity used by these znodes may exceed the configured threshold.</li><li id="ALM-45427__li1309112592316">No quota limit is set for the metadata directory <strong id="ALM-45427__b960122917418">/clickhouse</strong> of ClickHouse in ZooKeeper.</li></ul>
</div>
<div class="section" id="ALM-45427__section2425015133012"><h4 class="sectiontitle">Procedure</h4><p id="ALM-45427__p19193152810241"><strong id="ALM-45427__b10127162895113">Check the znode capacity of ClickHouse in ZooKeeper.</strong></p>
<ol id="ALM-45427__ol15833103011437"><li id="ALM-45427__li19429132053415"><a name="ALM-45427__li19429132053415"></a><a name="li19429132053415"></a><span>Log in to the host where the ZooKeeper client is located and log in to the ZooKeeper client.</span><p><p id="ALM-45427__p9605547340">Switch to the client installation directory.</p>
<p id="ALM-45427__p195631949103411">Example: <strong id="ALM-45427__b2048510321197">cd <span id="ALM-45427__ph381512063917">/opt/client</span></strong></p>
<p id="ALM-45427__p589718591346">Run the following command to configure environment variables:</p>
<p id="ALM-45427__p1289715917344"><strong id="ALM-45427__b1365028103518">source bigdata_env</strong></p>
<p id="ALM-45427__p57261392351">Run the following command to authenticate the user (skip this step in common mode):</p>
<p id="ALM-45427__p147265993516"><strong id="ALM-45427__b132617312353">kinit</strong> <em id="ALM-45427__i717193316356">Component service user</em></p>
<p id="ALM-45427__p148895181357">Run the following command to log in to the client tool:</p>
<p id="ALM-45427__p1788951812353"><strong id="ALM-45427__b39504339443330">zkCli.sh -server</strong> <em id="ALM-45427__i124127869643330">service IP address of the node where the ZooKeeper role instance is located</em><strong id="ALM-45427__b169567977343330">:</strong><em id="ALM-45427__i91068997243330">client port</em></p>
</p></li><li id="ALM-45427__li431239102815"><span>Run the following command to check the quota used by ClickHouse in ZooKeeper and check whether the quota information is correctly set:</span><p><div class="p" id="ALM-45427__p4443124614296"><strong id="ALM-45427__b4443154652920">listquota /clickhouse</strong><pre class="screen" id="ALM-45427__screen893413614316">absolute path is /zookeeper/quota/clickhouse
Quota for path /clickhouse does not exist.</pre>
</div>
<ul id="ALM-45427__ul141941229155118"><li id="ALM-45427__li8194192914513">If the output resembles the preceding example (the quota for /clickhouse does not exist), the quota configuration is incorrect. Go to <a href="#ALM-45427__li17669171018349">3</a>.</li><li id="ALM-45427__li15478123120517">If the quota information is displayed, go to <a href="#ALM-45427__li10833143016438">5</a>.</li></ul>
</p></li><li id="ALM-45427__li17669171018349"><a name="ALM-45427__li17669171018349"></a><a name="li17669171018349"></a><span>Log in to FusionInsight Manager and choose <strong id="ALM-45427__b114591835112">Cluster</strong> &gt; <strong id="ALM-45427__b1752141845117">Services</strong> &gt; <strong id="ALM-45427__b45221855112">ZooKeeper</strong>. On the displayed page, click <strong id="ALM-45427__b19532018175110">Configurations</strong> and click <strong id="ALM-45427__b65351813517">All Configurations</strong>. On this sub-tab page, search for <strong id="ALM-45427__b17541618125114">quotas.auto.check.enable</strong> to check whether its value is <strong id="ALM-45427__b65431819514">true</strong>.</span><p><p id="ALM-45427__p981130171519">If the value is not <strong id="ALM-45427__b290612119510">true</strong>, change the value to <strong id="ALM-45427__b4907192135119">true</strong> and click <strong id="ALM-45427__b390752118519">Save</strong>.</p>
</p></li><li id="ALM-45427__li1455151416374"><span>On FusionInsight Manager, choose <strong id="ALM-45427__b174600265516">Cluster</strong> &gt; <strong id="ALM-45427__b1746714266518">Services</strong> &gt; <strong id="ALM-45427__b2046714260516">ClickHouse</strong>, click <strong id="ALM-45427__b6467202685110">More</strong>, and select <strong id="ALM-45427__b18468102615110">Synchronize Configuration</strong>. After the synchronization is successful, go to <a href="#ALM-45427__li19429132053415">1</a>.</span></li><li id="ALM-45427__li10833143016438"><a name="ALM-45427__li10833143016438"></a><a name="li10833143016438"></a><span>Run the following command and check whether the ratio of the <strong id="ALM-45427__b106011629155117">bytes</strong> value of <strong id="ALM-45427__b660182919518">Output stat</strong> to the <strong id="ALM-45427__b46011290514">bytes</strong> value of <strong id="ALM-45427__b3601152985118">Output quota</strong> in the command output is greater than <strong id="ALM-45427__b960212919518">0.9</strong>:</span><p><div class="p" id="ALM-45427__p208331230174319"><strong id="ALM-45427__b133631336171117">listquota /clickhouse</strong><pre class="screen" id="ALM-45427__screen5140926162614">absolute path is /zookeeper/quota/clickhouse
<strong id="ALM-45427__b567616466261">Output quota</strong> for /clickhouse count=200000,<strong id="ALM-45427__b5701154162613">bytes</strong>=1000000000
<strong id="ALM-45427__b429414517267">Output stat</strong> for /clickhouse count=2667,<strong id="ALM-45427__b591125618260">bytes</strong>=60063</pre>
</div>
<div class="p" id="ALM-45427__p189093432418">In the preceding information, the <strong id="ALM-45427__b749858185916">bytes</strong> value of <strong id="ALM-45427__b372665710518">Output stat</strong> is <strong id="ALM-45427__b12732175775114">60063</strong>, and the <strong id="ALM-45427__b1484316419011">bytes</strong> value of <strong id="ALM-45427__b17732257105112">Output quota</strong> is <strong id="ALM-45427__b17733185765111">1000000000</strong>.<ul id="ALM-45427__ul1383313034316"><li id="ALM-45427__li883303064316">If yes, go to <a href="#ALM-45427__li157515124315">6</a>.</li><li id="ALM-45427__li48331930144319">If no, check whether the alarm is cleared 5 minutes later. If the alarm persists, go to <a href="#ALM-45427__li1460181994211">8</a>.</li></ul>
</div>
</p></li><li id="ALM-45427__li157515124315"><a name="ALM-45427__li157515124315"></a><a name="li157515124315"></a><span>On FusionInsight Manager, choose <strong id="ALM-45427__b1189171411532">Cluster</strong> &gt; <strong id="ALM-45427__b99091445319">Services</strong> &gt; <strong id="ALM-45427__b1890714165314">ClickHouse</strong> &gt; <strong id="ALM-45427__b690141475310">Configurations</strong> &gt; <strong id="ALM-45427__b79011148533">All Configurations</strong>, search for the <strong id="ALM-45427__b129141415316">clickhouse.zookeeper.quota.size</strong> parameter, and change the value of this parameter to twice the <strong id="ALM-45427__b721981120012">bytes</strong> value of <strong id="ALM-45427__b11918148538">Output stat</strong> in <a href="#ALM-45427__li10833143016438">5</a>.</span></li><li id="ALM-45427__li138331030114314"><span>Restart the ClickHouse instance for which the alarm is generated, and check whether the alarm is cleared 5 minutes later.</span><p><ul id="ALM-45427__ul78337307438"><li id="ALM-45427__li2833730184319">If yes, no further action is required.</li><li id="ALM-45427__li3833130144319">If no, perform <a href="#ALM-45427__li157515124315">6</a> again, and check whether the alarm is cleared 5 minutes later. If the alarm persists, go to <a href="#ALM-45427__li1460181994211">8</a>.</li></ul>
</p></li></ol>
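<p>The checks in steps 2, 5, and 6 can be sketched in Python as follows. This is a minimal illustration and not part of the product: the function names <strong>parse_listquota</strong> and <strong>check_capacity</strong> are hypothetical, the script only parses text that you paste from the <strong>listquota /clickhouse</strong> output, and it assumes the output has the same layout as the examples above. Treating a non-positive <strong>bytes</strong> quota as "no capacity quota set" is also an assumption.</p>

```python
import re

# 90% threshold taken from the alarm description.
ALARM_THRESHOLD = 0.9


def parse_listquota(output):
    """Return (quota_bytes, stat_bytes), or None if the quota does not exist.

    Hypothetical helper: assumes the text layout shown in steps 2 and 5.
    """
    if "does not exist" in output:
        return None
    quota = re.search(r"Output quota for \S+ count=-?\d+,bytes=(-?\d+)", output)
    stat = re.search(r"Output stat for \S+ count=-?\d+,bytes=(-?\d+)", output)
    if quota is None or stat is None:
        raise ValueError("unrecognized listquota output")
    return int(quota.group(1)), int(stat.group(1))


def check_capacity(output):
    """Summarize the decisions made in steps 2, 5, and 6."""
    parsed = parse_listquota(output)
    if parsed is None:
        # Step 2: quota missing, so continue with step 3.
        return {"quota_set": False}
    quota_bytes, stat_bytes = parsed
    if not quota_bytes > 0:
        # Assumption: a non-positive value means no capacity quota is set.
        return {"quota_set": False}
    ratio = stat_bytes / quota_bytes
    return {
        "quota_set": True,
        "ratio": ratio,
        # Step 5: compare the stat/quota ratio with 0.9.
        "exceeded": ratio > ALARM_THRESHOLD,
        # Step 6: new quota size is twice the current stat bytes value.
        "suggested_quota_size": 2 * stat_bytes,
    }


# Sample output copied from step 5.
sample = """absolute path is /zookeeper/quota/clickhouse
Output quota for /clickhouse count=200000,bytes=1000000000
Output stat for /clickhouse count=2667,bytes=60063"""

result = check_capacity(sample)
print(result)
```

<p>For the sample output in step 5, the ratio is about 0.00006, which is far below 0.9, so the script reports that the threshold is not exceeded. The suggested value is only relevant when the ratio is greater than 0.9 and you proceed to step 6.</p>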
<p class="tableheading" id="ALM-45427__p3847019615437"><strong id="ALM-45427__b1068744715437">Collect the fault information.</strong></p>
<ol start="8" id="ALM-45427__ol19460419184217"><li id="ALM-45427__li1460181994211"><a name="ALM-45427__li1460181994211"></a><a name="li1460181994211"></a><span>On FusionInsight Manager, choose <strong id="ALM-45427__b1793355233213">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-45427__b29349528320">Log</strong> &gt; <strong id="ALM-45427__b293585213215">Download</strong>.</span></li><li id="ALM-45427__li94601519114217"><span>Expand the <strong id="ALM-45427__b119673425211">Service</strong> drop-down list, and select <strong id="ALM-45427__b169681447529">ClickHouse</strong> for the target cluster.</span></li><li id="ALM-45427__li1686655576"><span>Select the corresponding host from the host list.</span></li><li id="ALM-45427__li1746031964212"><span>Click <span><img id="ALM-45427__image9460181994219" src="en-us_image_0295706662.png"></span> in the upper right corner, and set <strong id="ALM-45427__b88576093465053">Start Date</strong> and <strong id="ALM-45427__b73209833465053">End Date</strong> for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click <strong id="ALM-45427__b13519169765053">Download</strong>.</span></li><li id="ALM-45427__li194601019104215"><span>Contact <span id="ALM-45427__text396015513406">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-45427__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-45427__p55781648135011">This alarm is automatically cleared after the fault is rectified.</p>
</div>
<div class="section" id="ALM-45427__section53362350"><h4 class="sectiontitle">Related Information</h4><p id="ALM-45427__p7522741">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>