When you import a dataset from data stored in OBS, the storage directory structure and the file names must comply with the ModelArts specifications.
Only the following dataset types support the OBS path import mode: image classification, object detection, text classification, table, and sound classification.
To import data from an OBS directory, you must have read permission on that directory.
In the following example, Cat and Dog are label names.
dataset-import-example
├─Cat
│      10.jpg
│      11.jpg
│      12.jpg
│
└─Dog
        1.jpg
        2.jpg
        3.jpg
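The directory-as-label convention above can be checked before upload with a short script. The following is a minimal sketch, not part of ModelArts: it assumes the dataset has been copied to a local directory, and the function name `labels_from_directories` and the set of accepted image extensions are hypothetical choices made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def labels_from_directories(root):
    """Map each image path (relative to root) to the name of its parent directory.

    The parent directory name is treated as the label, as in the
    Cat/Dog example above.
    """
    root = Path(root)
    mapping = {}
    for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image in sorted(label_dir.iterdir()):
            if image.is_file() and image.suffix.lower() in IMAGE_SUFFIXES:
                mapping[image.relative_to(root).as_posix()] = label_dir.name
    return mapping
```

For the example tree, the result would map `Cat/10.jpg` to `Cat` and `Dog/1.jpg` to `Dog`. Files that are not images are ignored in this sketch.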
In the following example, import-dir-1 and import-dir-2 are the imported subdirectories:
dataset-import-example
├─import-dir-1
│      10.jpg
│      10.txt
│      11.jpg
│      11.txt
│      12.jpg
│      12.txt
└─import-dir-2
        1.jpg
        1.txt
        2.jpg
        2.txt
The following shows a label file for a single label, for example, the 1.txt file:
Cat
The following shows a label file for multiple labels, for example, the 1.txt file:
Cat
Dog
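The pairing of each image with a same-named .txt label file can be illustrated with a small reader. This is a sketch under stated assumptions: the files are available in a local directory, only `.jpg` images are considered, and label names contain no whitespace, so that splitting the label file on whitespace yields one entry per label. The function name `read_image_labels` is hypothetical.

```python
from pathlib import Path


def read_image_labels(directory):
    """Return {image file name: [labels]} for every .jpg with a same-named .txt file."""
    directory = Path(directory)
    result = {}
    for image in sorted(directory.glob("*.jpg")):
        label_file = image.with_suffix(".txt")
        if not label_file.exists():
            continue  # image without a label file; skipped in this sketch
        text = label_file.read_text(encoding="utf-8")
        # Assumption: labels are separated by whitespace (spaces or line breaks)
        # and a label name itself contains no whitespace.
        result[image.name] = text.split()
    return result
```

With the label files shown above, `1.jpg` would be associated with `["Cat"]` in the single-label case and with `["Cat", "Dog"]` in the multi-label case.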
The label files for object detection must be in PASCAL VOC format. For details about the format, see Table 6.
Example:
├─dataset-import-example
│      IMG_20180919_114732.jpg
│      IMG_20180919_114732.xml
│      IMG_20180919_114745.jpg
│      IMG_20180919_114745.xml
│      IMG_20180919_114945.jpg
│      IMG_20180919_114945.xml
A label file example is as follows:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<annotation>
    <folder>NA</folder>
    <filename>bike_1_1593531469339.png</filename>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>554</width>
        <height>606</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>Dog</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <occluded>0</occluded>
        <bndbox>
            <xmin>279</xmin>
            <ymin>52</ymin>
            <xmax>474</xmax>
            <ymax>278</ymax>
        </bndbox>
    </object>
    <object>
        <name>Cat</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <occluded>0</occluded>
        <bndbox>
            <xmin>279</xmin>
            <ymin>198</ymin>
            <xmax>456</xmax>
            <ymax>421</ymax>
        </bndbox>
    </object>
</annotation>
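To check a label file like the one above before import, it can be read with the Python standard library. The following sketch extracts the image size and the bounding box of each object. It only reads the elements that appear in the example; the function name `parse_voc` and the shape of the returned dictionary are illustrative choices, not a ModelArts API.

```python
import xml.etree.ElementTree as ET


def parse_voc(xml_text):
    """Extract the file name, image size, and object boxes from a PASCAL VOC annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        objects.append({
            "name": obj.findtext("name"),
            # Box order used in this sketch: (xmin, ymin, xmax, ymax)
            "box": tuple(int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")),
        })
    return {
        "filename": root.findtext("filename"),
        "size": (width, height),
        "objects": objects,
    }
```

A simple sanity check on the result is that every box satisfies `xmin < xmax <= width` and `ymin < ymax <= height`.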
Text classification supports two import modes.
It touches good and responds quickly. I don't know how it performs in the future. positive
Three months ago, I bought a very good phone and replaced my old one with it. It can operate longer between charges. positive
Why does my phone heat up if I charge it for a while? The volume button stuck after being pressed down. negative
It's a gift for Father's Day. The logistics is fast and I received it in 24 hours. I like the earphones because the bass sounds feel good and they would not fall off. positive
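When the text and its label share one line, as in the sample above, each line can be split into a text part and a label part. The sketch below assumes that the label is the last whitespace-separated token on the line and that a label contains no whitespace. The exact separator is not stated here, so this is an assumption rather than the documented format. The function name `parse_labeled_text` is hypothetical.

```python
def parse_labeled_text(content):
    """Split each non-empty line into (text, label).

    Assumption: the label is the last whitespace-separated token on the line.
    """
    samples = []
    for line in content.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        text, label = line.rstrip().rsplit(None, 1)
        samples.append((text, label))
    return samples
```

Because the split is on any whitespace, the same code handles a tab or a space between the text and the label.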
For example, the content of labeled object COMMENTS_20180919_114745.txt is as follows:
It touches good and responds quickly. I don't know how it performs in the future.
Three months ago, I bought a very good phone and replaced my old one with it. It can operate longer between charges.
Why does my phone heat up if I charge it for a while? The volume button stuck after being pressed down.
It's a gift for Father's Day. The logistics is fast and I received it in 24 hours. I like the earphones because the bass sounds feel good and they would not fall off.
The content of label file COMMENTS_20180919_114745_result.txt is as follows:
positive
negative
negative
positive
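When labeled objects and labels are kept in separate files, the two files have to be read together. The following sketch assumes that the n-th non-empty line of the label file belongs to the n-th non-empty line of the labeled object file, which is how the two samples above line up. The function name `pair_texts_with_labels` is hypothetical.

```python
def pair_texts_with_labels(text_content, label_content):
    """Pair each text line with the label on the same line number of the label file."""
    texts = [line.strip() for line in text_content.splitlines() if line.strip()]
    labels = [line.strip() for line in label_content.splitlines() if line.strip()]
    if len(texts) != len(labels):
        # A count mismatch means the two files no longer correspond line by line.
        raise ValueError(
            f"{len(texts)} text lines but {len(labels)} label lines"
        )
    return list(zip(texts, labels))
```

Raising an error on a count mismatch is a deliberate choice in this sketch: silently truncating to the shorter file would hide a misaligned label file.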
Labeled objects and their label files must be stored in the same directory, with exactly one label file for each labeled object. For example, if the labeled object file is named COMMENTS_20180919_114745.txt, the label file must be named COMMENTS_20180919_114745_result.txt.
Example of data file storage:
├─dataset-import-example
│      COMMENTS_20180919_114732.txt
│      COMMENTS_20180919_114732_result.txt
│      COMMENTS_20180919_114745.txt
│      COMMENTS_20180919_114745_result.txt
│      COMMENTS_20180919_114945.txt
│      COMMENTS_20180919_114945_result.txt
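The naming rule for label files can be verified on a local copy of the directory before the data is uploaded to OBS. The sketch below derives the expected label file name by appending `_result` to the labeled object's base name and lists the labeled objects whose label file is missing. The helper names `label_file_for` and `find_missing_label_files` are hypothetical and not part of ModelArts.

```python
from pathlib import Path

RESULT_SUFFIX = "_result"  # suffix that marks a label file, per the naming rule above


def label_file_for(object_file):
    """Return the expected label file path for a labeled object file."""
    path = Path(object_file)
    return path.with_name(path.stem + RESULT_SUFFIX + path.suffix)


def find_missing_label_files(directory):
    """List the .txt labeled objects in directory that have no matching label file."""
    directory = Path(directory)
    missing = []
    for candidate in sorted(directory.glob("*.txt")):
        if candidate.stem.endswith(RESULT_SUFFIX):
            continue  # this is itself a label file
        if not label_file_for(candidate).exists():
            missing.append(candidate.name)
    return missing
```

An empty list from `find_missing_label_files` means that every labeled object in the directory has its label file next to it.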