Creating an Import Task
Function
This API is used to create a dataset import task to import samples and labels from the storage system to the dataset.
URI
POST /v2/{project_id}/datasets/{dataset_id}/import-tasks
Parameter | Mandatory | Type | Description |
---|---|---|---|
dataset_id | Yes | String | Dataset ID. |
project_id | Yes | String | Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |
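The URI template above can be filled in programmatically. The following is a minimal sketch, assuming a placeholder endpoint (`https://modelarts.example.com`) and placeholder IDs; the helper name `import_task_url` is hypothetical and not part of the API.

```python
from urllib.parse import quote


def import_task_url(endpoint: str, project_id: str, dataset_id: str) -> str:
    """Fill in POST /v2/{project_id}/datasets/{dataset_id}/import-tasks."""
    base = endpoint.rstrip("/")
    # Percent-encode the path parameters so unexpected characters cannot alter the path.
    return (
        f"{base}/v2/{quote(project_id, safe='')}"
        f"/datasets/{quote(dataset_id, safe='')}/import-tasks"
    )


# Placeholder endpoint and IDs, for illustration only.
print(import_task_url("https://modelarts.example.com", "my-project-id", "my-dataset-id"))
```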
Request Parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
annotation_format | No | String | Format of the labeling information. Currently, only object detection is supported. The options are as follows: |
data_source | No | DataSource object | Data source. |
difficult_only | No | Boolean | Whether to import only hard examples. The options are as follows: |
excluded_labels | No | Array of Label objects | Do not import samples containing the specified label. |
final_annotation | No | Boolean | Whether to import data to the final state. The options are as follows: |
import_annotations | No | Boolean | Whether to import labels. The options are as follows: |
import_folder | No | String | Name of the subdirectory in the dataset storage directory after import. You can specify the same subdirectory for multiple import tasks to avoid repeated import of the same samples. This field is invalid for table datasets. |
import_origin | No | String | Data source. The options are as follows: |
import_path | No | String | OBS path or manifest path to be imported. |
import_samples | No | Boolean | Whether to import samples. The options are as follows: |
import_type | No | String | Import mode. The options are as follows: |
included_labels | No | Array of Label objects | Import samples containing the specified label. |
label_format | No | LabelFormat object | Label format. This parameter is used only for text datasets. |
with_column_header | No | Boolean | Whether the first row in the file is a column name. This field is valid for the table dataset. The options are as follows: |
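Because every request parameter is optional, a client can send only the fields it needs. The following is a minimal sketch of assembling such a body; the helper `build_import_body` is hypothetical, and the `import_type` value `dir` is taken from the example requests in this document rather than from the options list.

```python
import json


def build_import_body(import_path, import_type="dir", import_annotations=None,
                      difficult_only=None, import_folder=None):
    """Assemble a request body, leaving out optional fields that were not set."""
    body = {"import_type": import_type, "import_path": import_path}
    optional = {
        "import_annotations": import_annotations,
        "difficult_only": difficult_only,
        "import_folder": import_folder,
    }
    # Only include optional fields the caller explicitly provided.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_import_body(
    "s3://test-obs/daoLu_images/cat-rabbit/",
    import_annotations=False,
    difficult_only=False,
)
print(json.dumps(body, indent=2))
```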
DataSource parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
data_path | No | String | Data source path. |
data_type | No | Integer | Data type. The options are as follows: |
schema_maps | No | Array of SchemaMap objects | Schema mapping information corresponding to the table data. |
source_info | No | SourceInfo object | Information required for importing a table data source. |
with_column_header | No | Boolean | Whether the first row in the file is a column name. This field is valid for the table dataset. The options are as follows: |
SchemaMap parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
dest_name | No | String | Name of the destination column. |
src_name | No | String | Name of the source column. |
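A schema mapping pairs each source column with a destination column. The sketch below shows one way to produce the `schema_maps` array from a plain dictionary; the helper `build_schema_maps` and the column names are hypothetical.

```python
def build_schema_maps(column_mapping):
    """Turn {source column: destination column} into a list of SchemaMap objects."""
    return [
        {"src_name": src, "dest_name": dest}
        for src, dest in column_mapping.items()
    ]


# Hypothetical column names, for illustration only.
schema_maps = build_schema_maps({"user_age": "age", "user_city": "city"})
print(schema_maps)
```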
SourceInfo parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
cluster_id | No | String | ID of an MRS cluster. |
cluster_mode | No | String | Running mode of an MRS cluster. The options are as follows: |
cluster_name | No | String | Name of an MRS cluster. |
database_name | No | String | Name of the database to which the table dataset is imported. |
input | No | String | HDFS path of a table dataset. |
ip | No | String | IP address of your GaussDB(DWS) cluster. |
port | No | String | Port number of your GaussDB(DWS) cluster. |
queue_name | No | String | DLI queue name of a table dataset. |
subnet_id | No | String | Subnet ID of an MRS cluster. |
table_name | No | String | Name of the table to which a table dataset is imported. |
user_name | No | String | Username, which is mandatory for GaussDB(DWS) data. |
user_password | No | String | User password, which is mandatory for GaussDB(DWS) data. |
vpc_id | No | String | ID of the VPC where an MRS cluster resides. |
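Although no SourceInfo field is marked mandatory, `user_name` and `user_password` are described as required for GaussDB(DWS) data. The sketch below enforces that on the client side; the helper `build_dws_source_info` is hypothetical, and the choice of which other fields accompany a GaussDB(DWS) import (`ip`, `port`, `database_name`, `table_name`) is an assumption based on the field descriptions.

```python
def build_dws_source_info(ip, port, database_name, table_name, user_name, user_password):
    """Build a SourceInfo object for GaussDB(DWS), checking the credentials up front."""
    if not user_name or not user_password:
        raise ValueError("user_name and user_password are mandatory for GaussDB(DWS) data")
    return {
        "ip": ip,
        "port": str(port),  # the port field is typed as String
        "database_name": database_name,
        "table_name": table_name,
        "user_name": user_name,
        "user_password": user_password,
    }


# Placeholder connection details, for illustration only.
source_info = build_dws_source_info("192.0.2.10", 8000, "demo_db", "demo_table", "demo_user", "demo_password")
print(source_info["port"])
```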
Label parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
attributes | No | Array of LabelAttribute objects | Multi-dimensional attribute of a label. For example, if the label is music, attributes such as style and artist may be included. |
name | No | String | Label name. |
property | No | LabelProperty object | Basic attribute key-value pair of a label, such as color and shortcut keys. |
type | No | Integer | Label type. The options are as follows: |
LabelAttribute parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
default_value | No | String | Default value of a label attribute. |
id | No | String | Label attribute ID. |
name | No | String | Label attribute name. |
type | No | String | Label attribute type. The options are as follows: |
values | No | Array of LabelAttributeValue objects | List of label attribute values. |
LabelAttributeValue parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
id | No | String | Label attribute value ID. |
value | No | String | Label attribute value. |
LabelProperty parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
@modelarts:color | No | String | Default attribute: Label color, which is a hexadecimal code of the color. By default, this parameter is left blank. Example: #FFFFF0. |
@modelarts:default_shape | No | String | Default attribute: Default shape of an object detection label (dedicated attribute). By default, this parameter is left blank. The options are as follows: |
@modelarts:from_type | No | String | Default attribute: Type of the head entity in the triplet relationship label. This attribute must be specified when a relationship label is created. This parameter is used only for the text triplet dataset. |
@modelarts:rename_to | No | String | Default attribute: The new name of the label. |
@modelarts:shortcut | No | String | Default attribute: Label shortcut key. By default, this parameter is left blank. For example: D. |
@modelarts:to_type | No | String | Default attribute: Type of the tail entity in the triplet relationship label. This attribute must be specified when a relationship label is created. This parameter is used only for the text triplet dataset. |
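A Label object carries its display settings in `property`, keyed by the `@modelarts:` attribute names. The following is a minimal sketch of building one; the helper `build_label` is hypothetical, and the six-digit hexadecimal check is an assumption inferred from the `#FFFFF0` example, not a documented validation rule.

```python
import re

# Assumption: colors are written as '#' followed by six hexadecimal digits.
_HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def build_label(name, color=None, shortcut=None):
    """Build a Label object with an optional color and shortcut key."""
    prop = {}
    if color is not None:
        if not _HEX_COLOR.match(color):
            raise ValueError(f"not a hexadecimal color code: {color!r}")
        prop["@modelarts:color"] = color
    if shortcut is not None:
        prop["@modelarts:shortcut"] = shortcut
    label = {"name": name}
    if prop:
        label["property"] = prop
    return label


label = build_label("cat", color="#FFFFF0", shortcut="D")
print(label)
```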
LabelFormat parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
label_type | No | String | Label type of text classification. The options are as follows: 0: The label is stored separately from the text, and the label file is identified by the fixed suffix _result. For example, if the text file is abc.txt, the label file is abc_result.txt. 1: Default value. Labels and texts are stored in the same file and separated by separators. Use text_sample_separator to specify the separator between the text and the labels, and text_label_separator to specify the separator between labels. |
text_label_separator | No | String | Separator between labels. By default, a comma (,) is used as the separator. The separator needs to be escaped. The separator can contain only one character, such as a letter, a digit, or any of the following special characters: !@#$%^&*_=\|?/':.;, |
text_sample_separator | No | String | Separator between the text and label. By default, the Tab key is used as the separator. The separator needs to be escaped. The separator can contain only one character, such as a letter, a digit, or any of the following special characters: !@#$%^&*_=\|?/':.;, |
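The two label_type layouts can be illustrated on the client side. The sketch below derives the label file name for label_type 0 and splits a line for label_type 1 using the default separators; both helpers are hypothetical, and splitting at the first occurrence of the sample separator is an assumption about how a line is interpreted.

```python
from pathlib import PurePosixPath


def label_file_for(text_file: str) -> str:
    """label_type 0: the label file adds the fixed suffix _result to the text file name."""
    path = PurePosixPath(text_file)
    return str(path.with_name(f"{path.stem}_result{path.suffix}"))


def split_sample(line: str, sample_sep: str = "\t", label_sep: str = ","):
    """label_type 1: split one line into its text and its list of labels."""
    # Assumption: the first sample separator marks the end of the text.
    text, _, labels = line.rstrip("\n").partition(sample_sep)
    return text, [label for label in labels.split(label_sep) if label]


print(label_file_for("abc.txt"))
print(split_sample("A friendly cat sat here\tcat,animal"))
```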
Response Parameters
Status code: 200
Parameter | Type | Description |
---|---|---|
task_id | String | ID of an import task. |
Example Requests
- Creating an Import Task (Importing Data from OBS)
{ "import_type" : "dir", "import_path" : "s3://test-obs/daoLu_images/cat-rabbit/", "included_tags" : [ ], "import_annotations" : false, "difficult_only" : false }
- Creating an Import Task (Importing Data from Manifest)
{ "import_type" : "manifest", "import_path" : "s3://test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V002/V002.manifest", "included_tags" : [ "cat", "rabbit", "Cat", "Rabbit" ], "import_annotations" : true, "difficult_only" : false }
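A request body like the ones above can be wrapped in an HTTP POST using only the Python standard library. This is a sketch under stated assumptions: the endpoint host, the IDs, and the token are placeholders, and the `X-Auth-Token` header is an assumed authentication scheme that this section does not itself specify. The code only constructs the request; sending it is left as a comment.

```python
import json
import urllib.request


def make_import_request(url: str, body: dict, token: str) -> urllib.request.Request:
    """Wrap a JSON body in a POST request for the import-tasks URI."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumed authentication header
        },
    )


body = {
    "import_type": "dir",
    "import_path": "s3://test-obs/daoLu_images/cat-rabbit/",
    "import_annotations": False,
    "difficult_only": False,
}
request = make_import_request(
    "https://modelarts.example.com/v2/my-project-id/datasets/my-dataset-id/import-tasks",
    body,
    "my-token",
)
# To send the request: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```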
Example Responses
Status code: 200
OK
{ "task_id" : "gfghHSokody6AJigS5A_m1dYqOw8vWCAznw1V28" }
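The only field in a successful response is `task_id`. The following sketch extracts it from the response text and fails loudly if it is absent; the helper `parse_task_id` is hypothetical.

```python
import json


def parse_task_id(response_text: str) -> str:
    """Return the task_id from a 200 response body, or raise if it is missing."""
    payload = json.loads(response_text)
    task_id = payload.get("task_id") if isinstance(payload, dict) else None
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response does not contain a task_id")
    return task_id


task_id = parse_task_id('{ "task_id" : "gfghHSokody6AJigS5A_m1dYqOw8vWCAznw1V28" }')
print(task_id)
```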
Status Codes
Status Code | Description |
---|---|
200 | OK |
401 | Unauthorized |
403 | Forbidden |
404 | Not Found |
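A client can translate these status codes into explicit failures. The sketch below is one possible approach; the helper `check_status` is hypothetical, and treating every code other than 200 as an error is an assumption.

```python
STATUS_DESCRIPTIONS = {
    200: "OK",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
}


def check_status(status_code: int) -> None:
    """Raise RuntimeError for any status code other than 200."""
    if status_code == 200:
        return
    description = STATUS_DESCRIPTIONS.get(status_code, "undocumented status code")
    raise RuntimeError(f"import task creation failed: {status_code} {description}")


check_status(200)  # returns without raising
```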
Error Codes
See Error Codes.