Reviewed-by: Kacur, Michal <michal.kacur@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
96 lines
9.0 KiB
HTML
<a name="mrs_01_24480"></a><a name="mrs_01_24480"></a>
<h1 class="topictitle1">Locating Abnormal Hive Files</h1>
<div id="body0000001533373546"><div class="section" id="mrs_01_24480__section1242183110355"><h4 class="sectiontitle">Scenario</h4><ul id="mrs_01_24480__ul1956226174119"><li id="mrs_01_24480__li1156266154110">Data files stored in Hive can become abnormal because of misoperations or disk damage, which causes tasks to fail or return incorrect results.</li><li id="mrs_01_24480__li105627604111">Abnormal data files in common non-text formats can be located using the tool described in this section.<div class="note" id="mrs_01_24480__note10418191614215"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_24480__p5420116174212">This section applies only to MRS <span id="mrs_01_24480__ph1961311204312">3.2.0</span> or later.</p>
</div></div>
</li></ul>
</div>
<div class="section" id="mrs_01_24480__section2102197134412"><h4 class="sectiontitle">Procedure</h4><ol id="mrs_01_24480__ol352113395447"><li id="mrs_01_24480__li16522193918446">Log in to the node where the Hive service is installed as user <strong id="mrs_01_24480__b9608847326">omm</strong> and run the following command to go to the bin directory of the Hive installation:<p id="mrs_01_24480__p112883198479"><strong id="mrs_01_24480__b18763202764716">cd ${BIGDATA_HOME}/FusionInsight_HD_*/install/FusionInsight-Hive-*/hive-*/bin</strong></p>
</li><li id="mrs_01_24480__li1475163714716">Run the following tool to locate abnormal Hive files:<p id="mrs_01_24480__p221371114489"><a name="mrs_01_24480__li1475163714716"></a><a name="li1475163714716"></a><strong id="mrs_01_24480__b13793875481">sh hive_parser_file.sh [--help] &lt;filetype&gt; &lt;command&gt; &lt;input-file|input-directory&gt;</strong></p>
<p id="mrs_01_24480__p7877172714542"><a href="#mrs_01_24480__table11352804551">Table 1</a> describes the related parameters.</p>
<p id="mrs_01_24480__p101611355373">Note: Only one command can be specified each time the tool is run.</p>
<div class="tablenoborder"><a name="mrs_01_24480__table11352804551"></a><a name="table11352804551"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_24480__table11352804551" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_24480__row1635250165519"><th align="left" class="cellrowborder" valign="top" width="12.55125512551255%" id="mcps1.3.2.2.2.4.2.4.1.1"><p id="mrs_01_24480__p235211018556">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="41.574157415741574%" id="mcps1.3.2.2.2.4.2.4.1.2"><p id="mrs_01_24480__p43528013551">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="45.87458745874587%" id="mcps1.3.2.2.2.4.2.4.1.3"><p id="mrs_01_24480__p1835210185514">Remarks</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_24480__row33528020553"><td class="cellrowborder" valign="top" width="12.55125512551255%" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p535217055520">filetype</p>
</td>
<td class="cellrowborder" valign="top" width="41.574157415741574%" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p1764815293013">Specifies the format of the data file to be parsed. Currently, only the ORC, RC (RCFile), and Parquet formats are supported.</p>
</td>
<td class="cellrowborder" valign="top" width="45.87458745874587%" headers="mcps1.3.2.2.2.4.2.4.1.3 "><p id="mrs_01_24480__p13522007550">Currently, data files in the RC format can only be viewed.</p>
</td>
</tr>
<tr id="mrs_01_24480__row8352110145518"><td class="cellrowborder" valign="top" width="12.55125512551255%" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p935220016553">-c</p>
</td>
<td class="cellrowborder" valign="top" width="41.574157415741574%" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p1535390185512">Prints the column information in the current metadata.</p>
</td>
<td class="cellrowborder" valign="top" width="45.87458745874587%" headers="mcps1.3.2.2.2.4.2.4.1.3 "><p id="mrs_01_24480__p16130458710">The column information includes the class name, file format, and sequence number.</p>
</td>
</tr>
<tr id="mrs_01_24480__row1435313016550"><td class="cellrowborder" valign="top" width="12.55125512551255%" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p153530045518">-d</p>
</td>
<td class="cellrowborder" valign="top" width="41.574157415741574%" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p52753337814">Prints data in a data file. You can limit the data volume using the <strong id="mrs_01_24480__b1838175314102">limit</strong> parameter.</p>
</td>
<td class="cellrowborder" valign="top" width="45.87458745874587%" headers="mcps1.3.2.2.2.4.2.4.1.3 "><p id="mrs_01_24480__p169641749383">The output is the content of the specified data file. Only one value can be specified for the <strong id="mrs_01_24480__b25362327124">limit</strong> parameter in each run, for example, limit=100.</p>
</td>
</tr>
<tr id="mrs_01_24480__row1035315017553"><td class="cellrowborder" valign="top" width="12.55125512551255%" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p153533015510">-t</p>
</td>
<td class="cellrowborder" valign="top" width="41.574157415741574%" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p65622041594">Prints the time zone in which the data was written.</p>
</td>
<td class="cellrowborder" valign="top" width="45.87458745874587%" headers="mcps1.3.2.2.2.4.2.4.1.3 "><p id="mrs_01_24480__p639761510917">This is the time zone that applied when the file was written.</p>
</td>
</tr>
<tr id="mrs_01_24480__row19353110165512"><td class="cellrowborder" valign="top" width="12.55125512551255%" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p2353160135519">-h</p>
</td>
<td class="cellrowborder" valign="top" width="41.574157415741574%" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p1348520333911">Prints the help information.</p>
</td>
<td class="cellrowborder" valign="top" width="45.87458745874587%" headers="mcps1.3.2.2.2.4.2.4.1.3 "><p id="mrs_01_24480__p7931142199">Help information.</p>
</td>
</tr>
<tr id="mrs_01_24480__row9353702557"><td class="cellrowborder" valign="top" width="12.55125512551255%" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p635330195510">-m</p>
</td>
<td class="cellrowborder" valign="top" width="41.574157415741574%" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p6156321017">Prints information about various storage formats.</p>
</td>
<td class="cellrowborder" valign="top" width="45.87458745874587%" headers="mcps1.3.2.2.2.4.2.4.1.3 "><p id="mrs_01_24480__p203585269107">The information varies based on the storage format. For example, if the file format is ORC, information such as the stripe and block size is printed.</p>
</td>
</tr>
<tr id="mrs_01_24480__row735390155512"><td class="cellrowborder" valign="top" width="12.55125512551255%" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p13353203550">-a</p>
</td>
<td class="cellrowborder" valign="top" width="41.574157415741574%" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p1164512529103">Prints detailed information.</p>
</td>
<td class="cellrowborder" valign="top" width="45.87458745874587%" headers="mcps1.3.2.2.2.4.2.4.1.3 "><p id="mrs_01_24480__p738719143117">The detailed information includes the information printed by the preceding parameters.</p>
</td>
</tr>
<tr id="mrs_01_24480__row16576125171112"><td class="cellrowborder" valign="top" width="12.55125512551255%" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p3577625121117">input-file</p>
</td>
<td class="cellrowborder" valign="top" width="41.574157415741574%" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p1857752513112">Specifies the data file to be parsed.</p>
</td>
<td class="cellrowborder" rowspan="2" valign="top" width="45.87458745874587%" headers="mcps1.3.2.2.2.4.2.4.1.3 "><p id="mrs_01_24480__p57169751817">If the input directory contains a file in a supported format, the file is parsed. Otherwise, the file is skipped. You can specify a local file or an HDFS/OBS file or directory.</p>
</td>
</tr>
<tr id="mrs_01_24480__row139064313114"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.2.4.2.4.1.1 "><p id="mrs_01_24480__p79071631151113">input-directory</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.2.4.2.4.1.2 "><p id="mrs_01_24480__p106524221179">Specifies the directory that contains the input data files. Use this parameter when the data consists of multiple files.</p>
</td>
</tr>
</tbody>
</table>
</div>
</li><li id="mrs_01_24480__li634193014190">Example:<p id="mrs_01_24480__p103411303190"><a name="mrs_01_24480__li634193014190"></a><a name="li634193014190"></a><strong id="mrs_01_24480__b143416306191">sh hive_parser_file.sh orc -d limit=100 hdfs://hacluster/user/hive/warehouse/orc_test</strong></p>
<p id="mrs_01_24480__p03411830101919">If the path does not start with a prefix such as <strong id="mrs_01_24480__b1997016261265">hdfs://hacluster</strong>, the tool reads a local file by default.</p>
</li></ol>
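<p>Because only one command can be specified per run, collecting several kinds of information about the same file takes one run per command. The following is a minimal sketch of that pattern, not part of the tool itself. It reuses the <strong>orc</strong> file type and the HDFS path from the example above. The <strong>run_parser</strong> function and the <strong>DRY_RUN</strong> variable are hypothetical helpers introduced for illustration. By default the script only prints the command lines it would run.</p>

```shell
#!/bin/sh
# Sketch: issue one hive_parser_file.sh run per command, because each run
# accepts only one command. INPUT_PATH reuses the path from the example above.
# DRY_RUN and run_parser are hypothetical helpers, not options of the tool.
INPUT_PATH="hdfs://hacluster/user/hive/warehouse/orc_test"
DRY_RUN="${DRY_RUN:-1}"

run_parser() {
    # $1 = file type; remaining arguments = the command and, for -d, its limit
    filetype="$1"
    shift
    if [ "$DRY_RUN" = "1" ]; then
        # Only show the command line that would be executed
        echo "sh hive_parser_file.sh $filetype $* $INPUT_PATH"
    else
        # Must be run from the bin directory entered in step 1
        sh hive_parser_file.sh "$filetype" "$@" "$INPUT_PATH"
    fi
}

run_parser orc -c             # column information
run_parser orc -m             # storage format information
run_parser orc -t             # time zone of the written data
run_parser orc -d limit=100   # data, limited as in the example above
```

<p>Set <strong>DRY_RUN=0</strong> to execute the commands instead of printing them.</p>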
</div>
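<p>The prefix rule stated in the example can be checked before the tool is run, so that a missing prefix does not silently cause a local file to be read. The sketch below is an illustration only. It assumes that "a prefix similar to hdfs://hacluster" can be approximated as any <strong>scheme://</strong> prefix, which may differ from how the tool actually decides. The <strong>path_kind</strong> function and the <strong>/tmp/orc_test</strong> path are hypothetical.</p>

```shell
#!/bin/sh
# Sketch: classify an input path the way the example describes.
# Assumption: any "scheme://" prefix counts as a remote (for example, HDFS)
# path; the tool's real detection logic is not documented here.
path_kind() {
    case "$1" in
        [a-zA-Z]*://*) echo "remote" ;;
        *)             echo "local" ;;
    esac
}

path_kind "hdfs://hacluster/user/hive/warehouse/orc_test"
path_kind "/tmp/orc_test"   # hypothetical local path, no prefix
```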
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0581.html">Using Hive</a></div>
</div>
</div>