Reviewed-by: Kacur, Michal <michal.kacur@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="mrs_01_24089"></a><a name="mrs_01_24089"></a>
<h1 class="topictitle1">Cleaning</h1>
<div id="body32001227"><p id="mrs_01_24089__en-us_topic_0000001173789554_p161011423163211">Cleaning deletes old data versions that are no longer required.</p>
<p id="mrs_01_24089__en-us_topic_0000001173789554_p8060118">Hudi runs a cleaner in the background that continuously deletes old data versions that are no longer required. Use <strong id="mrs_01_24089__en-us_topic_0000001173789554_b123561012153714">hoodie.cleaner.policy</strong> to select the cleaning policy and <strong id="mrs_01_24089__en-us_topic_0000001173789554_b24381715193715">hoodie.cleaner.commits.retained</strong> to set the number of commits to retain.</p>
<p id="mrs_01_24089__en-us_topic_0000001173789554_p63519263813">You can use either of the following methods to perform cleaning:</p>
<ul id="mrs_01_24089__en-us_topic_0000001173789554_ul1215713174116"><li id="mrs_01_24089__en-us_topic_0000001173789554_li015761164114">Using Hudi CLI<p id="mrs_01_24089__en-us_topic_0000001173789554_p173606340425"><a name="mrs_01_24089__en-us_topic_0000001173789554_li015761164114"></a><a name="en-us_topic_0000001173789554_li015761164114"></a><strong id="mrs_01_24089__en-us_topic_0000001173789554_b95931311194217">cleans run --sparkMaster yarn --hoodieConfigs 'hoodie.cleaner.policy=KEEP_LATEST_COMMITS,hoodie.cleaner.commits.retained=1,hoodie.cleaner.incremental.mode=false,hoodie.keep.max.commits=3,hoodie.keep.min.commits=2'</strong></p>
</li><li id="mrs_01_24089__en-us_topic_0000001173789554_li4491111214115">Using APIs<p id="mrs_01_24089__en-us_topic_0000001173789554_p18602411144314"><a name="mrs_01_24089__en-us_topic_0000001173789554_li4491111214115"></a><a name="en-us_topic_0000001173789554_li4491111214115"></a><strong id="mrs_01_24089__b180142772614">spark-submit --master yarn --jars /opt/client/Hudi/hudi/lib/hudi-client-common-</strong><em id="mrs_01_24089__i1483427192620">xxx</em><strong id="mrs_01_24089__b19691343261">.jar --class org.apache.hudi.utilities.HoodieCleaner /opt/client/Hudi/hudi/lib/hudi-utilities_</strong><em id="mrs_01_24089__i6971103432615">xxx</em><strong id="mrs_01_24089__b096916341267">.jar --target-base-path /tmp/default/tb_test_mor</strong></p>
</li></ul>
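<p>The Hudi CLI command above can also be assembled from variables, which helps when the retention values differ between tables. The following is a minimal sketch, not part of the product: <strong>build_cleans_cmd</strong> is a hypothetical helper name, and the ordering check (commits retained &lt; <strong>hoodie.keep.min.commits</strong> &lt; <strong>hoodie.keep.max.commits</strong>) is an assumption that this topic does not state, although the example values 1, 2, and 3 satisfy it. The script only prints the command line. It does not start the Hudi CLI.</p>

```shell
#!/usr/bin/env bash

# Hypothetical helper (not shipped with Hudi): prints the Hudi CLI
# "cleans run" line for the given retention settings.
# Arguments: <commits_retained> <keep_min_commits> <keep_max_commits>
build_cleans_cmd() {
  local retained="$1" keep_min="$2" keep_max="$3"

  # Assumed sanity check (not stated in this topic):
  # retained < keep.min.commits < keep.max.commits.
  if [ "$retained" -ge "$keep_min" ] || [ "$keep_min" -ge "$keep_max" ]; then
    echo "expected retained < keep.min.commits < keep.max.commits" >&2
    return 1
  fi

  # Same keys, and the same order, as the CLI example in this topic.
  local configs="hoodie.cleaner.policy=KEEP_LATEST_COMMITS"
  configs="${configs},hoodie.cleaner.commits.retained=${retained}"
  configs="${configs},hoodie.cleaner.incremental.mode=false"
  configs="${configs},hoodie.keep.max.commits=${keep_max}"
  configs="${configs},hoodie.keep.min.commits=${keep_min}"

  printf "cleans run --sparkMaster yarn --hoodieConfigs '%s'\n" "$configs"
}

# Reproduce the command from this topic (retained=1, min=2, max=3).
build_cleans_cmd 1 2 3
```

<p>Paste the printed line into a Hudi CLI session that is already connected to the target table. Calling the helper with values that break the assumed ordering, for example <strong>build_cleans_cmd 2 2 3</strong>, prints an error and returns a non-zero status instead of a command.</p>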
<p id="mrs_01_24089__en-us_topic_0000001173789554_p1925630164318">For details about other cleaning parameters, see <a href="mrs_01_24032.html">Hudi Configuration Reference</a>.</p>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_24038.html">Data Management and Maintenance</a></div>
</div>
</div>