
ModelArts GA API 06052022 from R&D R&D has provided a right version of ModelArts GA API (06052022) Reviewed-by: Artem Goncharov <Artem.goncharov@gmail.com>
Querying the Dataset Version List
Function
This API is used to query the version list of a specific dataset.
URI
GET /v2/{project_id}/datasets/{dataset_id}/versions

Path parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
dataset_id | Yes | String | Dataset ID. |
project_id | Yes | String | Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |

Query parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
status | No | Integer | Status of a dataset version. The options are as follows: 0: creating; 1: running; 2: deleting; 3: deleted; 4: error. |
train_evaluate_ratio | No | String | Split ratio range used to filter versions. The numbers before and after the comma are the minimum and maximum split ratios, and only versions whose split ratios fall within the range are returned, for example, 0.0,1.0. Note: If this parameter is left blank or omitted, versions are not filtered by split ratio. |
version_format | No | Integer | Format of a dataset version. The options are as follows: 0: default format; 1: CarbonData (supported only by table datasets); 2: CSV. |
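
The optional filters above are passed in the query string. The following is a minimal sketch of how a client might assemble the request URL; the function name `build_versions_url`, its `ratio_range` argument, and the endpoint value are illustrative assumptions, not part of the API.

```python
from urllib.parse import urlencode


def build_versions_url(endpoint, project_id, dataset_id, status=None,
                       ratio_range=None, version_format=None):
    """Build the GET URL for the dataset version list.

    `endpoint` is a placeholder host name. `ratio_range` is a hypothetical
    (min, max) tuple that is serialized as "min,max" for train_evaluate_ratio.
    """
    url = f"https://{endpoint}/v2/{project_id}/datasets/{dataset_id}/versions"
    query = {}
    if status is not None:
        query["status"] = status
    if ratio_range is not None:
        low, high = ratio_range
        query["train_evaluate_ratio"] = f"{low},{high}"
    if version_format is not None:
        query["version_format"] = version_format
    # urlencode percent-encodes the comma in the ratio range as %2C.
    return f"{url}?{urlencode(query)}" if query else url
```

Leaving all three filters unset yields the bare URI, which matches the documented behavior of not filtering when a parameter is omitted.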
Request Parameters
None
Response Parameters
Status code: 200

Response body parameters

Parameter | Type | Description |
---|---|---|
total_number | Integer | Total number of dataset versions. |
versions | Array of DatasetVersion objects | Dataset version list. |

DatasetVersion

Parameter | Type | Description |
---|---|---|
add_sample_count | Integer | Number of added samples. |
annotated_sample_count | Integer | Number of labeled samples in the version. |
annotated_sub_sample_count | Integer | Number of labeled subsamples. |
clear_hard_property | Boolean | Whether to clear hard example properties during release. The options are as follows: true: clear hard example properties (default value); false: do not clear hard example properties. |
code | String | Status code of a preprocessing task such as rotation and cropping. |
create_time | Long | Time when a version is created. |
crop | Boolean | Whether to crop the image. This field is valid only for object detection datasets whose labeling boxes are rectangular. The options are as follows: true: crop the image; false: do not crop the image (default value). |
crop_path | String | Path for storing cropped files. |
crop_rotate_cache_path | String | Temporary directory for executing the rotation and cropping task. |
data_path | String | Path for storing data. |
data_statistics | Map<String,Object> | Sample statistics on a dataset, including the statistics on sample metadata in JSON format. |
data_validate | Boolean | Whether data is validated by the validation algorithm before release. The options are as follows: true: the data has been validated; false: the data has not been validated. |
deleted_sample_count | Integer | Number of deleted samples. |
deletion_stats | Map<String,Integer> | Deletion reason statistics. |
description | String | Description of a version. |
export_images | Boolean | Whether to export images to the version output directory during release. The options are as follows: true: export images to the version output directory; false: do not export images to the version output directory (default value). |
extract_serial_number | Boolean | Whether to parse the subsample number during release. This field is valid only for healthcare datasets. The options are as follows: true: parse the subsample number; false: do not parse the subsample number (default value). |
include_dataset_data | Boolean | Whether to include the source data of a dataset during release. The options are as follows: true: the source data of a dataset is included; false: the source data of a dataset is not included. |
is_current | Boolean | Whether this version is the one currently used by the dataset. The options are as follows: true: this version is currently used; false: this version is not currently used. |
label_stats | Array of LabelStats objects | Label statistics list of a released version. |
label_type | String | Label type of a released version. The options are as follows: multi: multi-label samples are included; single: all samples are single-labeled. |
manifest_cache_input_path | String | Input path for the manifest file cache during version release. |
manifest_path | String | Path for storing the manifest file of the released version. |
message | String | Task information recorded during release (for example, error information). |
modified_sample_count | Integer | Number of modified samples. |
previous_annotated_sample_count | Integer | Number of labeled samples in the parent version. |
previous_total_sample_count | Integer | Total number of samples in the parent version. |
previous_version_id | String | Parent version ID. |
processor_task_id | String | ID of a preprocessing task such as rotation and cropping. |
processor_task_status | Integer | Status of a preprocessing task such as rotation and cropping. The options are as follows: 0: initialized; 1: running; 2: completed; 3: failed; 4: stopped; 5: timeout; 6: deletion failed; 7: stop failed. |
remove_sample_usage | Boolean | Whether to clear the existing usage information of a dataset during release. The options are as follows: true: clear the existing usage information of a dataset (default value); false: do not clear the existing usage information of a dataset. |
rotate | Boolean | Whether to rotate the image. The options are as follows: true: rotate the image; false: do not rotate the image (default value). |
rotate_path | String | Path for storing the rotated file. |
sample_state | String | Sample status. The options are as follows: ALL: labeled; NONE: unlabeled; UNCHECK: pending acceptance; ACCEPTED: accepted; REJECTED: rejected; UNREVIEWED: pending review; REVIEWED: reviewed; WORKFORCE_SAMPLED: sampled; WORKFORCE_SAMPLED_UNCHECK: sampling unchecked; WORKFORCE_SAMPLED_CHECKED: sampling checked; WORKFORCE_SAMPLED_ACCEPTED: sampling accepted; WORKFORCE_SAMPLED_REJECTED: sampling rejected; AUTO_ANNOTATION: to be confirmed. |
status | Integer | Status of a dataset version. The options are as follows: 0: creating; 1: running; 2: deleting; 3: deleted; 4: error. |
tags | Array of strings | Key identifier list of the dataset. The labeling type is used as the default label when the labeling task releases a version. For example, ["Image","Object detection"]. |
task_type | Integer | Labeling task type of the released version, which is the same as the dataset type. |
total_sample_count | Integer | Total number of samples in the version. |
total_sub_sample_count | Integer | Total number of subsamples generated from the parent samples. |
train_evaluate_sample_ratio | String | Ratio used to split samples into training and validation sets during version release. The default value is 1.00, indicating that all labeled samples are placed in the training set. |
update_time | Long | Time when a version is updated. |
version_format | String | Format of a dataset version. The options are as follows: Default: default format; CarbonData: CarbonData (supported only by table datasets); CSV: CSV. |
version_id | String | Dataset version ID. |
version_name | String | Dataset version name. |
with_column_header | Boolean | Whether the first row in the released CSV file is a column name. This field is valid only for table datasets. The options are as follows: true: the first row in the released CSV file is a column name; false: the first row in the released CSV file is not a column name. |
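
Several DatasetVersion fields are numeric codes or raw timestamps, so a client may want to translate them before displaying them. The sketch below shows one way to do that. The helper name `describe_version` is hypothetical, and treating `create_time` as milliseconds since the Unix epoch is an assumption based on the magnitude of the values in the example response, not something this page states.

```python
from datetime import datetime, timezone

# Code-to-name mappings copied from the status and processor_task_status rows.
VERSION_STATUS = {0: "creating", 1: "running", 2: "deleting",
                  3: "deleted", 4: "error"}
PROCESSOR_TASK_STATUS = {0: "initialized", 1: "running", 2: "completed",
                         3: "failed", 4: "stopped", 5: "timeout",
                         6: "deletion failed", 7: "stop failed"}


def describe_version(version):
    """Return a small human-readable summary of one DatasetVersion dict."""
    created_ms = version.get("create_time")
    created = None
    if created_ms is not None:
        # Assumption: the Long value is a Unix timestamp in milliseconds.
        created = datetime.fromtimestamp(created_ms / 1000, tz=timezone.utc)
    task_code = version.get("processor_task_status")
    return {
        "name": version.get("version_name"),
        "status": VERSION_STATUS.get(version.get("status"), "unknown"),
        "processor_task": PROCESSOR_TASK_STATUS.get(task_code),
        "created_utc": created.isoformat() if created else None,
    }
```

Fields missing from a response are reported as `None` (or `"unknown"` for the status) instead of raising, because the example response on this page omits many of the documented fields.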

LabelStats

Parameter | Type | Description |
---|---|---|
attributes | Array of LabelAttribute objects | Multi-dimensional attribute of a label. For example, if the label is music, attributes such as style and artist may be included. |
count | Integer | Number of labels. |
name | String | Label name. |
property | LabelProperty object | Basic attribute key-value pair of a label, such as color and shortcut keys. |
sample_count | Integer | Number of samples containing the label. |
type | Integer | Label type. The options are as follows: 0: image classification; 1: object detection; 100: text classification; 101: named entity recognition; 102: text triplet relationship; 103: text triplet entity; 200: speech classification; 201: speech content; 202: speech paragraph labeling; 600: video classification. |

LabelAttribute

Parameter | Type | Description |
---|---|---|
default_value | String | Default value of a label attribute. |
id | String | Label attribute ID. |
name | String | Label attribute name. |
type | String | Label attribute type. The options are as follows: text: text; select: single-choice drop-down list. |
values | Array of LabelAttributeValue objects | List of label attribute values. |

LabelAttributeValue

Parameter | Type | Description |
---|---|---|
id | String | Label attribute value ID. |
value | String | Label attribute value. |

LabelProperty

Parameter | Type | Description |
---|---|---|
@modelarts:color | String | Default attribute: Label color, which is a hexadecimal code of the color. By default, this parameter is left blank. Example: #FFFFF0. |
@modelarts:default_shape | String | Default attribute: Default shape of an object detection label (dedicated attribute). By default, this parameter is left blank. The options are as follows: bndbox: rectangle; polygon: polygon; circle: circle; line: straight line; dashed: dotted line; point: point; polyline: polyline. |
@modelarts:from_type | String | Default attribute: Type of the head entity in the triplet relationship label. This attribute must be specified when a relationship label is created. This parameter is used only for the text triplet dataset. |
@modelarts:rename_to | String | Default attribute: The new name of the label. |
@modelarts:shortcut | String | Default attribute: Label shortcut key. By default, this parameter is left blank. For example: D. |
@modelarts:to_type | String | Default attribute: Type of the tail entity in the triplet relationship label. This attribute must be specified when a relationship label is created. This parameter is used only for the text triplet dataset. |

Example Requests
Querying the Version List of a Specific Dataset
GET https://{endpoint}/v2/{project_id}/datasets/{dataset_id}/versions
Example Responses
Status code: 200
OK
{
  "total_number" : 3,
  "versions" : [ {
    "version_id" : "54IXbeJhfttGpL46lbv",
    "version_name" : "V003",
    "version_format" : "Default",
    "previous_version_id" : "eSOKEQaXhKzxN00WKoV",
    "status" : 1,
    "create_time" : 1605930512183,
    "total_sample_count" : 10,
    "annotated_sample_count" : 10,
    "total_sub_sample_count" : 0,
    "annotated_sub_sample_count" : 0,
    "manifest_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V003/V003.manifest",
    "data_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V003/data/",
    "is_current" : true,
    "train_evaluate_sample_ratio" : "0.8",
    "remove_sample_usage" : false,
    "export_images" : false,
    "description" : "",
    "task_type" : 0,
    "extract_serial_number" : false
  }, {
    "version_id" : "eSOKEQaXhKzxN00WKoV",
    "version_name" : "V002",
    "version_format" : "Default",
    "previous_version_id" : "vlGvUqOcxxGPIB0ugeE",
    "status" : 1,
    "create_time" : 1605691027084,
    "total_sample_count" : 10,
    "annotated_sample_count" : 10,
    "total_sub_sample_count" : 0,
    "annotated_sub_sample_count" : 0,
    "manifest_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V002/V002.manifest",
    "data_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V002/data/",
    "is_current" : false,
    "train_evaluate_sample_ratio" : "0.9999",
    "remove_sample_usage" : false,
    "export_images" : false,
    "description" : "",
    "task_type" : 0,
    "extract_serial_number" : false
  }, {
    "version_id" : "vlGvUqOcxxGPIB0ugeE",
    "version_name" : "V001",
    "version_format" : "Default",
    "status" : 1,
    "create_time" : 1605690687346,
    "total_sample_count" : 10,
    "annotated_sample_count" : 10,
    "total_sub_sample_count" : 0,
    "annotated_sub_sample_count" : 0,
    "manifest_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V001/V001.manifest",
    "data_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V001/data/",
    "is_current" : false,
    "train_evaluate_sample_ratio" : "0.99",
    "remove_sample_usage" : false,
    "export_images" : false,
    "description" : "",
    "task_type" : 0,
    "extract_serial_number" : false
  } ]
}
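
In the example response, each version points to its parent through `previous_version_id`, and exactly one version has `is_current` set to true. As an illustration, the following sketch walks that chain from the current version back to the oldest ancestor. The function name `version_lineage` is hypothetical, and `sample` is a trimmed copy of the response above that keeps only the fields the function reads.

```python
import json


def version_lineage(payload):
    """Return version names from the current version back to its oldest ancestor."""
    by_id = {v["version_id"]: v for v in payload.get("versions", [])}
    current = next((v for v in by_id.values() if v.get("is_current")), None)
    lineage = []
    seen = set()  # guards against an unexpected cycle in parent links
    while current is not None and current["version_id"] not in seen:
        seen.add(current["version_id"])
        lineage.append(current["version_name"])
        # A version without previous_version_id has no parent in the list.
        current = by_id.get(current.get("previous_version_id"))
    return lineage


# Trimmed copy of the example response (most fields omitted).
sample = json.loads("""
{
  "total_number": 3,
  "versions": [
    {"version_id": "54IXbeJhfttGpL46lbv", "version_name": "V003",
     "previous_version_id": "eSOKEQaXhKzxN00WKoV", "is_current": true},
    {"version_id": "eSOKEQaXhKzxN00WKoV", "version_name": "V002",
     "previous_version_id": "vlGvUqOcxxGPIB0ugeE", "is_current": false},
    {"version_id": "vlGvUqOcxxGPIB0ugeE", "version_name": "V001",
     "is_current": false}
  ]
}
""")

print(version_lineage(sample))
```

The walk stops when a parent ID is missing from the returned list, which can happen if the query filters exclude some versions.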
Status Codes

Status Code | Description |
---|---|
200 | OK |
401 | Unauthorized |
403 | Forbidden |
404 | Not Found |

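
A client typically treats only 200 as success and reports the other codes as errors. The sketch below is one possible way to do that; the exception class `DatasetVersionQueryError` and the function `check_status` are illustrative names, and the handling of codes not listed in the table is an assumption.

```python
# Status codes and descriptions copied from the table above.
STATUS_MEANINGS = {200: "OK", 401: "Unauthorized",
                   403: "Forbidden", 404: "Not Found"}


class DatasetVersionQueryError(Exception):
    """Raised when the version list query does not return 200 (hypothetical)."""


def check_status(status_code):
    """Return the description for 200, otherwise raise with the code and meaning."""
    meaning = STATUS_MEANINGS.get(status_code, "Unexpected status")
    if status_code != 200:
        raise DatasetVersionQueryError(f"{status_code} {meaning}")
    return meaning
```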
Error Codes
See Error Codes.