Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00

<a name="mrs_01_24037"></a><a name="mrs_01_24037"></a>
<h1 class="topictitle1">Read</h1>
<div id="body0000001082071870"><p id="mrs_01_24037__p8060118">Hudi read operations work on three views: the real-time view, the incremental view, and the read-optimized view. Select the view that matches your query requirements.</p>
<p id="mrs_01_24037__p159699134611">Hudi supports multiple query engines, including Spark and Hive. For the views that each engine supports on copy-on-write (COW) and merge-on-read (MOR) tables, see <a href="#mrs_01_24037__table42155834714">Table 1</a> and <a href="#mrs_01_24037__table8194141519510">Table 2</a>.</p>
<div class="tablenoborder"><a name="mrs_01_24037__table42155834714"></a><a name="table42155834714"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_24037__table42155834714" frame="border" border="1" rules="all"><caption><b>Table 1 </b>COW tables</caption><thead align="left"><tr id="mrs_01_24037__row72145844713"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.3.2.4.1.1"><p id="mrs_01_24037__p72105804715">Query Engine</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.3.2.4.1.2"><p id="mrs_01_24037__p17215824712">Real-time View/Read-optimized View</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.3.2.4.1.3"><p id="mrs_01_24037__p16217587472">Incremental View</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_24037__row52155816471"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.1 "><p id="mrs_01_24037__p1324583475">Hive</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.2 "><p id="mrs_01_24037__p5395834720">Y</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.3 "><p id="mrs_01_24037__p6319589477">Y</p>
</td>
</tr>
<tr id="mrs_01_24037__row23115810478"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.1 "><p id="mrs_01_24037__p735580471">Spark (SparkSQL)</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.2 "><p id="mrs_01_24037__p13313588478">Y</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.3 "><p id="mrs_01_24037__p8375834720">Y</p>
</td>
</tr>
<tr id="mrs_01_24037__row1757009145016"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.1 "><p id="mrs_01_24037__p1557020910503">Spark (SparkDataSource API)</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.2 "><p id="mrs_01_24037__p557014913500">Y</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.3.2.4.1.3 "><p id="mrs_01_24037__p205704965015">Y</p>
</td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder"><a name="mrs_01_24037__table8194141519510"></a><a name="table8194141519510"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_24037__table8194141519510" frame="border" border="1" rules="all"><caption><b>Table 2 </b>MOR tables</caption><thead align="left"><tr id="mrs_01_24037__row2019431518513"><th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.4.2.5.1.1"><p id="mrs_01_24037__p619471515119">Query Engine</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.4.2.5.1.2"><p id="mrs_01_24037__p0194181517514">Real-time View</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.4.2.5.1.3"><p id="mrs_01_24037__p9194151545116">Incremental View</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.4.2.5.1.4"><p id="mrs_01_24037__p18194191595116">Read-optimized View</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_24037__row9194115135110"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.1 "><p id="mrs_01_24037__p5194121519518">Hive</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.2 "><p id="mrs_01_24037__p1819431513512">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.3 "><p id="mrs_01_24037__p14194715175110">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.4 "><p id="mrs_01_24037__p419461545120">Y</p>
</td>
</tr>
<tr id="mrs_01_24037__row4194101555118"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.1 "><p id="mrs_01_24037__p1419421525115">Spark (SparkSQL)</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.2 "><p id="mrs_01_24037__p51945151517">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.3 "><p id="mrs_01_24037__p419419151515">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.4 "><p id="mrs_01_24037__p18194101585117">Y</p>
</td>
</tr>
<tr id="mrs_01_24037__row61947158516"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.1 "><p id="mrs_01_24037__p71941115145120">Spark (SparkDataSource API)</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.2 "><p id="mrs_01_24037__p1119481545117">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.3 "><p id="mrs_01_24037__p10194181515517">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.2.5.1.4 "><p id="mrs_01_24037__p81941415155115">Y</p>
</td>
</tr>
</tbody>
</table>
</div>
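<p>As a rough illustration of how a view is selected through the Spark DataSource API, the following sketch builds the read options for each view. It assumes the Apache Hudi option keys <strong>hoodie.datasource.query.type</strong> and <strong>hoodie.datasource.read.begin.instanttime</strong>, and it assumes that the real-time view corresponds to the <strong>snapshot</strong> query type. The helper name <strong>hudi_read_options</strong> is hypothetical and not part of Hudi. Check the option names against the Hudi version in your cluster.</p>

```python
from typing import Dict, Optional

# Option keys as named in Apache Hudi's DataSource read configuration
# (assumed here; verify them against your Hudi version).
QUERY_TYPE_KEY = "hoodie.datasource.query.type"
BEGIN_INSTANT_KEY = "hoodie.datasource.read.begin.instanttime"

# Assumed mapping from the view names in this guide to Hudi query types.
VIEW_TO_QUERY_TYPE = {
    "real-time": "snapshot",
    "incremental": "incremental",
    "read-optimized": "read_optimized",
}


def hudi_read_options(view: str, begin_instant: Optional[str] = None) -> Dict[str, str]:
    """Build DataSource read options for one Hudi view (hypothetical helper)."""
    if view not in VIEW_TO_QUERY_TYPE:
        raise ValueError(f"unknown view: {view!r}")
    options = {QUERY_TYPE_KEY: VIEW_TO_QUERY_TYPE[view]}
    if view == "incremental":
        # An incremental read needs a commit time to start from.
        if begin_instant is None:
            raise ValueError("the incremental view requires a begin instant time")
        options[BEGIN_INSTANT_KEY] = begin_instant
    return options
```

<p>With an existing SparkSession <strong>spark</strong> and a table path <strong>base_path</strong> (both hypothetical here), the returned dictionary would typically be passed as <strong>spark.read.format("hudi").options(**opts).load(base_path)</strong>.</p>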
<div class="caution" id="mrs_01_24037__note9887111505217"><span class="cautiontitle"><img src="public_sys-resources/caution_3.0-en-us.png"> </span><div class="cautionbody"><ul id="mrs_01_24037__ul13576121955212"><li id="mrs_01_24037__li0577819145219">Currently, partition inference is not supported when the Spark DataSource API is used to read Hudi data. For example, when the DataSource API is used to query a bootstrap table, the partition field may be missing from the result or may be displayed as null.</li><li id="mrs_01_24037__li16578151919526">To query an incremental view, set <strong id="mrs_01_24037__b1483348029115521">hoodie.hudicow.consume.mode</strong> to <strong id="mrs_01_24037__b341154080115521">INCREMENTAL</strong>. This parameter applies only to incremental view queries. Do not use it for queries on other types of Hudi tables or on other tables. To restore the configuration after the incremental query, set <strong id="mrs_01_24037__b1379200793115521">hoodie.hudicow.consume.mode</strong> to <strong id="mrs_01_24037__b1111510204115521">SNAPSHOT</strong> or to any value other than <strong>INCREMENTAL</strong>.</li></ul>
</div></div>
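<p>The switch into and out of incremental mode can be sketched as a pair of generated session statements. This sketch assumes that <strong>hudicow</strong> in <strong>hoodie.hudicow.consume.mode</strong> is the name of the queried table, and that the companion parameters <strong>hoodie.&lt;table&gt;.consume.start.timestamp</strong> and <strong>hoodie.&lt;table&gt;.consume.max.commits</strong> from Apache Hudi are available in your version. The function names are hypothetical.</p>

```python
import re
from typing import List


def _check_table(table: str) -> None:
    # Keep the table name to a plain identifier so it is safe to embed in a key.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"invalid table name: {table!r}")


def incremental_mode_statements(table: str, start_timestamp: str, max_commits: int = 3) -> List[str]:
    """Return the SET statements to run before an incremental view query."""
    _check_table(table)
    return [
        f"set hoodie.{table}.consume.mode=INCREMENTAL;",
        # Assumed companion settings: where to start and how many commits to read.
        f"set hoodie.{table}.consume.start.timestamp={start_timestamp};",
        f"set hoodie.{table}.consume.max.commits={max_commits};",
    ]


def restore_statement(table: str) -> str:
    """Return the SET statement that restores the configuration afterwards."""
    _check_table(table)
    return f"set hoodie.{table}.consume.mode=SNAPSHOT;"
```

<p>Running the restore statement in the same session after the incremental query keeps later queries on the table from being limited to the incremental range.</p>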
</div>
<div>
<ul class="ullinks">
<li class="ulchildlink"><strong><a href="mrs_01_24098.html">Reading COW Table Views</a></strong><br>
</li>
<li class="ulchildlink"><strong><a href="mrs_01_24099.html">Reading MOR Table Views</a></strong><br>
</li>
</ul>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_24062.html">Basic Operations</a></div>
</div>
</div>