Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Lai, Weijian <laiweijian4@huawei.com> Co-committed-by: Lai, Weijian <laiweijian4@huawei.com>
<a name="EN-US_TOPIC_0000002043183656"></a><a name="EN-US_TOPIC_0000002043183656"></a>
<h1 class="topictitle1">Processing Data</h1>
<div id="body0000001147649176"><p id="EN-US_TOPIC_0000002043183656__p12922195161317">Data that has just been collected and imported usually cannot be used for training directly. Process the data during R&D to ensure its quality and to prevent negative impacts on subsequent operations, such as data labeling and model training. ModelArts provides data processing capabilities that extract valuable, meaningful data from large volumes of disordered, hard-to-understand data.</p>
<p id="EN-US_TOPIC_0000002043183656__p8060118">ModelArts provides four basic data processing functions:</p>
<ul id="EN-US_TOPIC_0000002043183656__ul76845422234"><li id="EN-US_TOPIC_0000002043183656__li11685184232313">Data validation: helps AI developers identify invalid data, such as damaged or unqualified data, and so prevents the loss of algorithm precision or the training failures that noisy data can cause.</li><li id="EN-US_TOPIC_0000002043183656__li1463444515237">Data cleansing: builds on data validation to check data consistency and correct some invalid values.</li><li id="EN-US_TOPIC_0000002043183656__li750114902317">Data selection: Data collected during AI development often contains many duplicates. Duplicate data does not improve model precision, and labeling it takes a long time. Use data selection to preprocess the collected data and remove the duplicates.</li><li id="EN-US_TOPIC_0000002043183656__li186891156162310">Data augmentation: increases the data volume by deriving new samples from existing data.</li></ul>
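<p>The idea behind data validation and data selection can be illustrated with a minimal sketch in plain Python. This is not the ModelArts API or its implementation. The function names (<code>is_valid</code>, <code>deduplicate</code>, <code>preprocess</code>) and the validity rule (a sample must be non-empty) are hypothetical and chosen only for illustration. The sketch also assumes that duplicates are byte-identical, so it would not catch near-duplicates.</p>

```python
import hashlib


def is_valid(sample):
    """Hypothetical validity rule: a sample must be non-empty bytes."""
    return len(sample) != 0


def deduplicate(samples):
    """Keep the first occurrence of each distinct sample.

    Samples are compared by SHA-256 digest, so only byte-identical
    duplicates are removed.
    """
    seen = set()
    unique = []
    for sample in samples:
        digest = hashlib.sha256(sample).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(sample)
    return unique


def preprocess(samples):
    """Drop invalid samples first, then deduplicate what remains."""
    return deduplicate([s for s in samples if is_valid(s)])


if __name__ == "__main__":
    raw = [b"cat", b"", b"dog", b"cat"]
    print(preprocess(raw))
```

<p>Validating before deduplicating keeps invalid samples from being hashed at all. Comparing digests instead of raw content keeps only a fixed-size digest per sample in memory, which may matter when samples are large files such as images.</p>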
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dataprepare-modelarts-0020.html">Data Analysis and Preview</a></div>
</div>
</div>