doc-exports/docs/cce/umn/cce_faq_00307.html
Dong, Qiu Jian 86fb05065f CCE UMN for 24.2.0 version -20240428
Reviewed-by: Eotvos, Oliver <oliver.eotvos@t-systems.com>
Co-authored-by: Dong, Qiu Jian <qiujiandong1@huawei.com>
Co-committed-by: Dong, Qiu Jian <qiujiandong1@huawei.com>
2024-06-10 08:19:07 +00:00


<a name="cce_faq_00307"></a>
<h1 class="topictitle1">How Do I Fix an Abnormal Container or Node Due to No Thin Pool Disk Space?</h1>
<div id="body0000001156293219"><div class="section" id="cce_faq_00307__section14642113816170"><h4 class="sectiontitle">Problem Description</h4><p id="cce_faq_00307__p9888191011010">When the disk space of a thin pool on a node is about to be used up, the following exceptions occasionally occur:</p>
<p id="cce_faq_00307__p15572103910018">Files or directories cannot be created in the container, the file system in the container becomes read-only, the node is tainted with disk-pressure, or the node becomes unavailable.</p>
<p id="cce_faq_00307__p59986431188">You can run the <strong id="cce_faq_00307__b3817227132716">docker info</strong> command on the node to view the used and remaining thin pool space to locate the fault. The following figure is an example.</p>
<p id="cce_faq_00307__p1214188816"><span><img id="cce_faq_00307__image747917188816" src="en-us_image_0000001851743812.png"></span></p>
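<p>As a minimal sketch, the thin pool figures can also be extracted from the command output as text. The function name <strong>thinpool_usage</strong> is hypothetical, and the sketch assumes the devicemapper storage driver, whose <strong>docker info</strong> output contains <strong>Data Space Used</strong>, <strong>Data Space Total</strong>, and <strong>Data Space Available</strong> lines.</p>

```shell
# Print the thin pool data-space lines from `docker info` output read on stdin.
# Assumption: the devicemapper storage driver is in use, so the output
# contains "Data Space Used/Total/Available" fields.
thinpool_usage() {
    awk -F': *' '/^ *Data Space (Used|Total|Available)/ {
        gsub(/^ +/, "", $1)   # drop leading indentation from the field name
        print $1 "=" $2
    }'
}

# Typical use on a node (requires Docker):
#   docker info 2>/dev/null | thinpool_usage
```

<p>A small <strong>Data Space Available</strong> value compared with <strong>Data Space Total</strong> suggests that the thin pool is close to being used up.</p>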
</div>
<div class="section" id="cce_faq_00307__section212331452219"><h4 class="sectiontitle">Possible Cause</h4><p id="cce_faq_00307__p8586426142211">When Docker uses Device Mapper, you can configure the <strong id="cce_faq_00307__b141401742193114">basesize</strong> parameter to limit the size of the <strong id="cce_faq_00307__b13997946133115">/home</strong> directory of a single container (10 GB by default). However, all containers on the node still share the node's thin pool for storage, so they are not completely isolated. When the total thin pool space used by some containers reaches the upper limit of the pool, other containers cannot run properly.</p>
<p id="cce_faq_00307__p189861515194810">In addition, after a file is deleted in the <strong id="cce_faq_00307__b13150191883413">/home</strong> directory of the container, the thin pool space occupied by the file is not released immediately. Therefore, even if <strong id="cce_faq_00307__b20842114612376">basesize</strong> is set to 10 GB, the thin pool space occupied by a container can keep growing toward 10 GB as files are created and deleted in it, because the space freed by deletion is reused only after a while. If <strong id="cce_faq_00307__b02511936193920">the number of service containers on the node multiplied by basesize</strong> is greater than the thin pool size of the node, the thin pool space may be used up.</p>
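<p>The condition above is simple arithmetic and can be sketched as a check. The function name <strong>thinpool_oversubscribed</strong> and the example figures (8 containers, a 67 GB thin pool) are illustrative assumptions, not values prescribed by CCE.</p>

```shell
# Succeed (exit status 0) when containers * basesize exceeds the thin pool size.
# All three arguments are whole numbers; sizes are in GB.
thinpool_oversubscribed() {
    containers=$1
    basesize_gb=$2
    thinpool_gb=$3
    [ $((containers * basesize_gb)) -gt "$thinpool_gb" ]
}

# Example: 8 containers at the default 10 GB basesize on a 67 GB thin pool.
if thinpool_oversubscribed 8 10 67; then
    echo "worst case of 80 GB exceeds the 67 GB thin pool"
fi
```

<p>This is a worst-case estimate: it assumes that every container eventually occupies its full <strong>basesize</strong>.</p>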
</div>
<div class="section" id="cce_faq_00307__section697162415220"><h4 class="sectiontitle">Solution</h4><p id="cce_faq_00307__p12267205414188">When the thin pool space of a node is used up, you can migrate some services to other nodes to recover them quickly. However, you are advised to use one of the following solutions to resolve the root cause:</p>
<p id="cce_faq_00307__p864914914813"><strong id="cce_faq_00307__b144303596403">Solution 1:</strong></p>
<p id="cce_faq_00307__p366414319226">Properly plan the service distribution and data plane disk space to avoid the scenario where <strong id="cce_faq_00307__b7771121712413">the number of service containers multiplied by basesize</strong> is greater than the thin pool size of the node. To expand the thin pool size, perform the following steps:</p>
<ol id="cce_faq_00307__ol41541435152513"><li id="cce_faq_00307__cce_bestpractice_00198_en-us_topic_0196817407_li1091823811013"><span>Expand the capacity of the data disk on the EVS console.</span></li><li id="cce_faq_00307__cce_bestpractice_00198_li15327184914542"><span>Log in to the CCE console and click the cluster. In the navigation pane, choose <strong id="cce_faq_00307__cce_bestpractice_00198_b176491516203817">Nodes</strong>. Click <strong id="cce_faq_00307__cce_bestpractice_00198_b464971673810">More</strong> &gt; <strong id="cce_faq_00307__cce_bestpractice_00198_b9649161615380">Sync Server Data</strong> in the row containing the target node.</span></li><li id="cce_faq_00307__cce_bestpractice_00198_en-us_topic_0196817407_li209187382011"><span>Log in to the target node.</span></li><li id="cce_faq_00307__cce_bestpractice_00198_li128005014232"><span>Run the <strong id="cce_faq_00307__cce_bestpractice_00198_b6455184022316">lsblk</strong> command to check the block device information of the node.</span><p><p id="cce_faq_00307__cce_bestpractice_00198_p980018092312">How the data disk is divided depends on the container storage <strong id="cce_faq_00307__cce_bestpractice_00198_b687813596016">Rootfs</strong>:</p>
<ul id="cce_faq_00307__cce_bestpractice_00198_ul89731919102417"><li id="cce_faq_00307__cce_bestpractice_00198_li1536084418247">Overlayfs: No independent thin pool is allocated. Image data is stored in the <strong id="cce_faq_00307__cce_bestpractice_00198_b14504233414">dockersys</strong> disk.<pre class="screen" id="cce_faq_00307__cce_bestpractice_00198_screen736044442417"># lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 8:0 0 50G 0 disk
└─vda1 8:1 0 50G 0 part /
<strong id="cce_faq_00307__cce_bestpractice_00198_b9542144551613">vdb</strong> 8:16 0 200G 0 disk
├─vgpaas-dockersys 253:0 0 90G 0 lvm /var/lib/docker # Space used by the container engine
└─vgpaas-kubernetes 253:1 0 10G 0 lvm /mnt/paas/kubernetes/kubelet # Space used by Kubernetes</pre>
<p id="cce_faq_00307__cce_bestpractice_00198_p1599151113360">Run the following commands on the node to add the new disk capacity to the <strong id="cce_faq_00307__cce_bestpractice_00198_b746642417811">dockersys</strong> disk:</p>
<pre class="screen" id="cce_faq_00307__cce_bestpractice_00198_screen10503202016363">pvresize /dev/vdb
lvextend -l+100%FREE -n vgpaas/dockersys
resize2fs /dev/vgpaas/dockersys</pre>
</li><li id="cce_faq_00307__cce_bestpractice_00198_li7973131913245">Devicemapper: A thin pool is allocated to store image data.<pre class="screen" id="cce_faq_00307__cce_bestpractice_00198_screen10480142251"># lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 8:0 0 50G 0 disk
└─vda1 8:1 0 50G 0 part /
<strong id="cce_faq_00307__cce_bestpractice_00198_b9505458151516">vdb</strong> 8:16 0 200G 0 disk
├─<strong id="cce_faq_00307__cce_bestpractice_00198_b170511380163">vgpaas-dockersys</strong> 253:0 0 18G 0 lvm /var/lib/docker
├─vgpaas-thinpool_tmeta 253:1 0 3G 0 lvm
│ └─<strong id="cce_faq_00307__cce_bestpractice_00198_b10865144019161">vgpaas-thinpool</strong> 253:3 0 67G 0 lvm # Space used by thinpool
│ ...
├─vgpaas-thinpool_tdata 253:2 0 67G 0 lvm
│ └─vgpaas-thinpool 253:3 0 67G 0 lvm
│ ...
└─vgpaas-kubernetes 253:4 0 10G 0 lvm /mnt/paas/kubernetes/kubelet</pre>
<ul id="cce_faq_00307__cce_bestpractice_00198_ul151541148142616"><li id="cce_faq_00307__cce_bestpractice_00198_li8154948152611">Run the following commands on the node to add the new disk capacity to the <strong id="cce_faq_00307__cce_bestpractice_00198_b169691932144611">thinpool</strong> disk:<pre class="screen" id="cce_faq_00307__cce_bestpractice_00198_screen1941742617282">pvresize /dev/vdb
lvextend -l+100%FREE -n vgpaas/thinpool</pre>
</li><li id="cce_faq_00307__cce_bestpractice_00198_li715464810269">Run the following commands on the node to add the new disk capacity to the <strong id="cce_faq_00307__cce_bestpractice_00198_b143201925134616">dockersys</strong> disk:<pre class="screen" id="cce_faq_00307__cce_bestpractice_00198_screen3309227102613">pvresize /dev/vdb
lvextend -l+100%FREE -n vgpaas/dockersys
resize2fs /dev/vgpaas/dockersys</pre>
</li></ul>
</li></ul>
</p></li></ol>
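<p>To decide which set of commands applies, the <strong>lsblk</strong> output can be checked for a thin pool volume. The following is a minimal sketch: the function name <strong>rootfs_layout</strong> is hypothetical, and it assumes the thin pool is named <strong>vgpaas-thinpool</strong> as in the sample output above.</p>

```shell
# Read `lsblk` output on stdin and report which layout the data disk uses.
# Assumption: a thin pool, when present, is named vgpaas-thinpool,
# as in the sample lsblk output in this topic.
rootfs_layout() {
    if grep -q 'vgpaas-thinpool'; then
        echo devicemapper
    else
        echo overlayfs
    fi
}

# Typical use on a node:
#   lsblk | rootfs_layout
```

<p>If the result is <strong>devicemapper</strong>, use the thinpool or dockersys commands listed for Devicemapper. Otherwise, use the dockersys commands listed for Overlayfs.</p>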
<p id="cce_faq_00307__p1172110432919"><strong id="cce_faq_00307__b1290802694118">Solution 2:</strong></p>
<p id="cce_faq_00307__p12992815191314">Have service containers create and delete files in local storage (such as emptyDir or hostPath volumes) or in a cloud storage directory mounted to the container. Such files do not occupy the thin pool space.</p>
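<p>As an illustration, a workload could write its temporary files to an emptyDir volume. The sketch below prints a minimal Pod manifest. The Pod name <strong>scratch-demo</strong>, the image, and the mount path <strong>/data</strong> are placeholders chosen for this example only.</p>

```shell
# Print a Pod manifest that mounts an emptyDir volume at /data.
# The Pod name, image, and mount path are placeholders for illustration.
emptydir_pod_manifest() {
    cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: app
    image: nginx:alpine
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}
EOF
}

# Typical use (requires kubectl and access to the cluster):
#   emptydir_pod_manifest | kubectl apply -f -
```

<p>In this sketch, files that the application writes under <strong>/data</strong> are stored in the emptyDir volume rather than in the container's own file system.</p>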
<p id="cce_faq_00307__p111731249131016"><strong id="cce_faq_00307__b156491415428">Solution 3:</strong></p>
<p id="cce_faq_00307__p11100181618187">Deploy services on nodes whose OS uses OverlayFS. This avoids the problem where the disk space occupied by files deleted in a container is not released immediately.</p>
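<p>One way to confirm which storage driver a node uses is to query Docker for the driver name. The helper <strong>is_overlay_driver</strong> below is a hypothetical name, and the sketch assumes that OverlayFS nodes report <strong>overlay</strong> or <strong>overlay2</strong> as the driver.</p>

```shell
# Succeed when the storage driver name passed as $1 is an OverlayFS driver.
# Assumption: OverlayFS nodes report "overlay" or "overlay2".
is_overlay_driver() {
    case "$1" in
        overlay|overlay2) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use on a node (requires Docker):
#   is_overlay_driver "$(docker info --format '{{.Driver}}')" && echo "OverlayFS node"
```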
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="cce_faq_00281.html">Node Running</a></div>
</div>
</div>