Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Lai, Weijian <laiweijian4@huawei.com> Co-committed-by: Lai, Weijian <laiweijian4@huawei.com>
<a name="EN-US_TOPIC_0000002079182993"></a>
<h1 class="topictitle1">Importing Data from an OBS Path</h1>
<div id="body0000001151066736"><div class="section" id="EN-US_TOPIC_0000002079182993__section113381459132217"><h4 class="sectiontitle">Prerequisites</h4><ul id="EN-US_TOPIC_0000002079182993__ul738713422317"><li id="EN-US_TOPIC_0000002079182993__li1338844142312">You have created a dataset.</li><li id="EN-US_TOPIC_0000002079182993__li105086910238">You have stored the data to be imported, and the manifest file if you use one, in OBS.</li><li id="EN-US_TOPIC_0000002079182993__li9197102013471">The OBS bucket and ModelArts are in the same region, and you have permission to operate the bucket.</li></ul>
</div>
<div class="section" id="EN-US_TOPIC_0000002079182993__section15125210017"><h4 class="sectiontitle">Importing File Data from an OBS Path</h4><p id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_p10503034142120">The import parameters displayed on the console vary depending on the dataset type. The following uses an image classification dataset as an example.</p>
<ol id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_ol577418325226"><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li67741232192212">Log in to the ModelArts management console. In the navigation pane on the left, choose <strong id="EN-US_TOPIC_0000002079182993__b144668923492754">Data Management</strong> &gt; <span class="parmname" id="EN-US_TOPIC_0000002079182993__parmname133985680992754"><b>Datasets</b></span>.</li><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li2561048162217">Locate the target dataset and click <strong id="EN-US_TOPIC_0000002079182993__b200159718692754">Import</strong> in the <strong id="EN-US_TOPIC_0000002079182993__b146203571092754">Operation</strong> column. Alternatively, click the dataset name to open the <span class="wintitle" id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_wintitle1054772151917"><b>Dashboard</b></span> tab of the dataset, then click <span class="uicontrol" id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_uicontrol11324159196"><b>Import</b></span> in the upper right corner.</li><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li819356132411">In the <strong id="EN-US_TOPIC_0000002079182993__b162365291992754">Import</strong> dialog box, set the parameters as follows and click <strong id="EN-US_TOPIC_0000002079182993__b182797395892754">OK</strong>.<ul id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_ul1991882712913"><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li864095614298"><strong id="EN-US_TOPIC_0000002079182993__b52967713292754">Data Source</strong>: <strong id="EN-US_TOPIC_0000002079182993__b190063172192754">OBS</strong></li><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li209181427142913"><strong id="EN-US_TOPIC_0000002079182993__b132105511192754">Import Mode</strong>: <strong id="EN-US_TOPIC_0000002079182993__b133430487092754">Path</strong></li>
<li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li4918627122910"><strong id="EN-US_TOPIC_0000002079182993__b129462288592754">Import Path</strong>: OBS path where the data is stored</li><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li03184213301"><strong id="EN-US_TOPIC_0000002079182993__b68951674392754">Labeling Status</strong>: <strong id="EN-US_TOPIC_0000002079182993__b119265958192754">Labeled</strong></li><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li1285513193419"><strong id="EN-US_TOPIC_0000002079182993__b205809962792754">Advanced Feature Settings</strong>: This function is disabled by default. Click the button on the right to enable it.<p id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_p1297210363417"><strong id="EN-US_TOPIC_0000002079182993__b109299594492754">Import by Tag</strong> enables the system to automatically obtain the labels of the current dataset. Optionally, click <strong id="EN-US_TOPIC_0000002079182993__b33848556492754">Add Label</strong> to add a label. After the data is imported, you can still add or delete labels during data labeling.</p>
</li></ul>
<p id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_p27841440192514"></p>
<p id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_p83371232132720">After the data is imported, it is automatically synchronized to the dataset. On the <strong id="EN-US_TOPIC_0000002079182993__b164883482592754">Datasets</strong> page, click the dataset name to view its details. You can then create a labeling job to label the data.</p>
</li></ol>
</div>
<div class="section" id="EN-US_TOPIC_0000002079182993__section6618144201717"><h4 class="sectiontitle">Labeling Status of File Data</h4><p id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_p12618944121711">The labeling status can be <strong id="EN-US_TOPIC_0000002079182993__b2687531292754">Unlabeled</strong> or <strong id="EN-US_TOPIC_0000002079182993__b69312347192754">Labeled</strong>.</p>
<ul id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_ul461824471720"><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li19618134401716"><strong id="EN-US_TOPIC_0000002079182993__b162086890292754">Unlabeled</strong>: Only the labeling objects (such as unlabeled images or text) are imported.</li><li id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_li090691254610"><strong id="EN-US_TOPIC_0000002079182993__b186829803392754">Labeled</strong>: Both the labeling objects and the labeling content are imported. Importing labeling content is not supported for free-format datasets.<p id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_p17599193014615">To ensure that the labeling content can be read correctly, store the data in strict accordance with the specifications.</p>
<p id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_p54911015164618">If <strong id="EN-US_TOPIC_0000002079182993__b1257824254119">Import Mode</strong> is set to <strong id="EN-US_TOPIC_0000002079182993__b165791642174116">Path</strong>, store the data to be imported according to the labeling file specifications. For details, see <a href="dataprepare-modelarts-0013.html">Specifications for Importing Data from an OBS Directory</a>.</p>
<p id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_p2966917143217">If <strong id="EN-US_TOPIC_0000002079182993__b44624065292754">Import Mode</strong> is set to <strong id="EN-US_TOPIC_0000002079182993__b74006288692754">manifest</strong>, the manifest file must comply with the manifest file specifications.</p>
<div class="note" id="EN-US_TOPIC_0000002079182993__en-us_topic_0000001193972801_note9813353123716"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="EN-US_TOPIC_0000002079182993__ul1771035121010"><li id="EN-US_TOPIC_0000002079182993__li1571185191012">If the labeling status is set to <strong id="EN-US_TOPIC_0000002079182993__b131988015392754">Labeled</strong>, ensure that the folder or manifest file complies with the format specifications. Otherwise, the import may fail.</li><li id="EN-US_TOPIC_0000002079182993__li182505542101">After importing labeled files, check that the imported data is in the labeled state.</li></ul>
</div></div>
</li></ul>
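<p>The labeling status rules above can be sketched as a small check. This is a minimal illustration only: the function check_import_settings, its arguments, and the string values it accepts and returns are hypothetical and are not part of any ModelArts API.</p>

```python
# Hypothetical names: these constants and this function are illustrative
# and are not part of any ModelArts API.
LABELING_STATUSES = {"Unlabeled", "Labeled"}
IMPORT_MODES = {"Path", "manifest"}


def check_import_settings(dataset_type, import_mode, labeling_status):
    """Return the specification the stored data must follow, or None."""
    if import_mode not in IMPORT_MODES:
        raise ValueError(f"unknown import mode: {import_mode}")
    if labeling_status not in LABELING_STATUSES:
        raise ValueError(f"unknown labeling status: {labeling_status}")
    if labeling_status == "Unlabeled":
        # Only the labeling objects are imported, so no labeling spec applies.
        return None
    if dataset_type == "free format":
        # Labeling content cannot be imported for free-format datasets.
        raise ValueError("free-format datasets cannot import labeling content")
    if import_mode == "Path":
        return "labeling file specifications for OBS directories"
    return "manifest file specifications"


print(check_import_settings("image classification", "Path", "Labeled"))
```

<p>Running a check of this kind before starting an import makes an unsupported combination, such as a free-format dataset with the Labeled status, visible before the import fails.</p>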
</div>
<div class="section" id="EN-US_TOPIC_0000002079182993__section10985154617246"><h4 class="sectiontitle">Importing a Table Dataset from OBS</h4><p id="EN-US_TOPIC_0000002079182993__p20293164012269">ModelArts allows you to import table data (CSV files) from OBS.</p>
<p id="EN-US_TOPIC_0000002079182993__p26722932414">Note the following when importing table data:</p>
<ul id="EN-US_TOPIC_0000002079182993__ul8292194010164"><li id="EN-US_TOPIC_0000002079182993__li142921040131618">The import succeeds only if the schema of the data source is the same as the schema specified during dataset creation. The schema defines the column names and types of a table and cannot be changed after the dataset is created.</li><li id="EN-US_TOPIC_0000002079182993__li16292104031610">When a CSV file is imported from OBS, data types are not validated, but the number of columns must match the dataset schema. A value whose format is invalid is set to null. For details, see <a href="dataprepare-modelarts-0006.html#EN-US_TOPIC_0000002079104369__table5251155510463">Table 3</a>.</li><li id="EN-US_TOPIC_0000002079182993__li12922401163">You must select the directory where the CSV files are stored. The number of columns in each CSV file must match the dataset schema. The schema of the CSV files can be obtained automatically.</li></ul>
<pre class="screen" id="EN-US_TOPIC_0000002079182993__screen181713011576">├─dataset-import-example
│      table_import_1.csv
│      table_import_2.csv
│      table_import_3.csv
│      table_import_4.csv</pre>
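<p>The CSV rules above can be checked locally before uploading files to OBS. The following is a minimal sketch, not the ModelArts implementation: the schema, the sample data, and the function names are assumptions made for illustration, and the CSV files are assumed to have no header row.</p>

```python
import csv
import io

# Hypothetical schema: column names and types fixed at dataset creation.
SCHEMA = [("id", int), ("name", str), ("score", float)]


def parse_row(row, schema):
    """Parse one CSV row, raising if the column count differs from the schema."""
    if len(row) != len(schema):
        raise ValueError(f"expected {len(schema)} columns, got {len(row)}")
    parsed = []
    for raw, (_name, col_type) in zip(row, schema):
        try:
            parsed.append(col_type(raw))
        except ValueError:
            # Models the documented behavior: invalid values become null.
            parsed.append(None)
    return parsed


def load_csv(text, schema):
    """Parse CSV text against the schema, skipping empty lines."""
    reader = csv.reader(io.StringIO(text))
    return [parse_row(row, schema) for row in reader if row]


sample = "1,alice,9.5\n2,bob,not-a-number\n"
rows = load_csv(sample, SCHEMA)
print(rows)  # [[1, 'alice', 9.5], [2, 'bob', None]]
```

<p>In this sketch, a row with the wrong number of columns is rejected, while a value that cannot be converted to the column type is kept as a null value, which mirrors the two rules described above.</p>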
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dataprepare-modelarts-0010.html">Importing Data from OBS</a></div>
</div>
</div>