Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_0805"></a>
<h1 class="topictitle1">Configuring the Number of Files in a Single HDFS Directory </h1>
<div id="body1590130534714"><div class="section" id="mrs_01_0805__section37652976172719"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_0805__p1473263292323">A cluster typically runs multiple services, and most of them store their data in HDFS. While the cluster is running, components such as Spark and Yarn, as well as clients, may continuously write files to the same HDFS directory. However, HDFS limits the number of files that a single directory can contain. Plan the directory layout in advance so that no single directory accumulates too many files; otherwise, tasks that write to that directory may fail.</p>
<p id="mrs_01_0805__p28559559165643">You can set the maximum number of files allowed in a single directory using the <span class="parmname" id="mrs_01_0805__parmname1929322192336"><b>dfs.namenode.fs-limits.max-directory-items</b></span> parameter in HDFS.</p>
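<p>To judge how close an existing directory is to this limit, its direct children can be counted and compared with the configured value. The following is a minimal sketch, not part of the product: it assumes that the <b>hdfs</b> client is on the PATH and that <b>hdfs dfs -ls</b> prints a "Found N items" header as the Apache Hadoop shell does. The function names are hypothetical, and the default of 1048576 is taken from the parameter table on this page.</p>

```python
import re
import subprocess

# Default value from the parameter table; a cluster may be configured differently.
DEFAULT_MAX_DIRECTORY_ITEMS = 1048576


def parse_found_items(ls_output: str) -> int:
    """Return N from the 'Found N items' header of `hdfs dfs -ls` output.

    Assumption: an empty directory prints no header, so 0 is returned.
    """
    match = re.search(r"^Found (\d+) items?$", ls_output, flags=re.MULTILINE)
    return int(match.group(1)) if match else 0


def usage_ratio(item_count: int, limit: int = DEFAULT_MAX_DIRECTORY_ITEMS) -> float:
    """Fraction of the per-directory limit that is already used."""
    return item_count / limit


def count_direct_children(hdfs_dir: str) -> int:
    """Count the direct children of hdfs_dir by listing it (non-recursive).

    Listing a very large directory is slow, so run this sparingly.
    """
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", hdfs_dir],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_found_items(result.stdout)


if __name__ == "__main__":
    # Offline demonstration with a hand-written header line.
    sample_output = "Found 3 items\n"
    used = parse_found_items(sample_output)
    print(f"{used} items, {usage_ratio(used):.6%} of the default limit")
```

<p>Only <b>count_direct_children</b> contacts the cluster; the parsing and ratio helpers can be tried offline. Note that <b>hdfs dfs -count</b> reports recursive totals, which is why the sketch lists the directory instead.</p>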
</div>
<div class="section" id="mrs_01_0805__section56786474172755"><h4 class="sectiontitle">Procedure</h4><ol id="mrs_01_0805__ol14718031172843"><li id="mrs_01_0805__li12695852161413"><span>Go to the <strong id="mrs_01_0805__b188533301109">All Configurations</strong> page of HDFS by referring to <a href="mrs_01_2125.html">Modifying Cluster Service Configuration Parameters</a>.</span></li><li id="mrs_01_0805__li12273486172843"><span>Search for the configuration item <span class="parmname" id="mrs_01_0805__parmname1399152411411"><b>dfs.namenode.fs-limits.max-directory-items</b></span>.</span><p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_0805__table28710145174948" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_0805__row34676457174948"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.2.2.1.2.4.1.1"><p id="mrs_01_0805__p46204700174948">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.2.2.1.2.4.1.2"><p id="mrs_01_0805__p51593219174948">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.2.2.1.2.4.1.3"><p id="mrs_01_0805__p18301202174948">Default Value</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_0805__row30493097174948"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.2.2.1.2.4.1.1 "><p id="mrs_01_0805__p30677622175237">dfs.namenode.fs-limits.max-directory-items</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.2.2.1.2.4.1.2 "><p id="mrs_01_0805__p360698679294">Maximum number of items (files and subdirectories) that a single directory can contain</p>
<p id="mrs_01_0805__p13687076174948">Value range: 1 to 6,400,000</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.2.2.1.2.4.1.3 "><p id="mrs_01_0805__p34911332174948">1048576</p>
</td>
</tr>
</tbody>
</table>
</div>
</p></li><li id="mrs_01_0805__li23283829175725"><span>Set the parameter to the maximum number of files allowed in a single HDFS directory and save the configuration. Then restart the services or instances whose configuration has expired for the change to take effect.</span><p><div class="note" id="mrs_01_0805__note3366184318256"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_0805__p3452113418256">Plan data storage in advance, for example by organizing directories by time and service type, to prevent too many files from accumulating in a single directory. You are advised to keep the default value, which allows about 1 million items (1,048,576) in a single directory.</p>
</div></div>
</p></li></ol>
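<p>The planning advice in the note can be illustrated with a small path helper that spreads files across directories by service type and date. This is only a sketch of one possible layout: the function name, the base path <b>/user/etl</b>, and the year/month/day scheme are assumptions made for illustration, not something MRS prescribes.</p>

```python
from datetime import datetime


def planned_dir(base: str, service: str, when: datetime) -> str:
    """Build base/service/YYYY/MM/DD.

    Each leaf directory then holds one day of files for one service,
    which keeps the per-directory item count bounded over time.
    """
    return f"{base.rstrip('/')}/{service}/{when:%Y/%m/%d}"


if __name__ == "__main__":
    # Hypothetical base path and service name.
    print(planned_dir("/user/etl", "spark", datetime(2022, 12, 9)))
```

<p>A writer would create files under the returned directory rather than directly under the base path. If a single day of one service can still produce a very large number of files, an hour-level subdirectory could be added in the same way.</p>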
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0790.html">Using HDFS</a></div>
</div>
</div>