ModelArts GA API 06052022 from R&D (#14)

R&D has provided the correct version of the ModelArts GA API (06052022).

Reviewed-by: Artem Goncharov <Artem.goncharov@gmail.com>

2022-05-23 16:26:34 +00:00

Querying the Dataset Version List

Function

This API is used to query the version list of a specific dataset.

URI

GET /v2/{project_id}/datasets/{dataset_id}/versions

**Table 1** Path Parameters
Parameter	Mandatory	Type	Description
dataset_id	Yes	String	Dataset ID.
project_id	Yes	String	Project ID. For details about how to obtain the project ID, see Obtaining a Project ID.

**Table 2** Query Parameters
Parameter	Mandatory	Type	Description
status	No	Integer	Status of a dataset version. The options are as follows: - 0: creating - 1: running - 2: deleting - 3: deleted - 4: error
train_evaluate_ratio	No	String	Split ratio range used to filter versions. The numbers before and after the comma indicate the minimum and maximum split ratios, and only the versions whose split ratios fall within this range are returned, for example, 0.0,1.0. Note: If this parameter is omitted or left blank, versions are not filtered by split ratio.
version_format	No	Integer	Format of a dataset version. The options are as follows: - 0: default format - 1: CarbonData (supported only by table datasets) - 2: CSV
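
The path and query parameters above can be combined into a request URL. The following is a minimal sketch, not an official client: the helper name `build_versions_url` and its argument names are hypothetical, and the endpoint value is a placeholder that you must supply. The comma in `train_evaluate_ratio` is left unescaped so that the value matches the `0.0,1.0` form shown in Table 2.

```python
from urllib.parse import urlencode


def build_versions_url(endpoint, project_id, dataset_id,
                       status=None, ratio_range=None, version_format=None):
    """Build the GET URL for querying the dataset version list.

    ratio_range is a (minimum, maximum) tuple that is serialized as
    "minimum,maximum" for the train_evaluate_ratio query parameter.
    """
    url = f"https://{endpoint}/v2/{project_id}/datasets/{dataset_id}/versions"
    params = {}
    if status is not None:
        params["status"] = status
    if ratio_range is not None:
        low, high = ratio_range
        params["train_evaluate_ratio"] = f"{low},{high}"
    if version_format is not None:
        params["version_format"] = version_format
    # Optional parameters that were not set are simply omitted from the URL.
    return f"{url}?{urlencode(params, safe=',')}" if params else url
```

For example, `build_versions_url(endpoint, project_id, dataset_id, status=1, ratio_range=(0.0, 1.0))` requests only the versions in status 1 whose split ratio lies between 0.0 and 1.0.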

Request Parameters

None

Response Parameters

Status code: 200

**Table 3** Response body parameters
Parameter	Type	Description
total_number	Integer	Total number of dataset versions.
versions	Array of DatasetVersion objects	Dataset version list.

**Table 4** DatasetVersion
Parameter	Type	Description
add_sample_count	Integer	Number of added samples.
annotated_sample_count	Integer	Number of labeled samples in the version.
annotated_sub_sample_count	Integer	Number of labeled subsamples.
clear_hard_property	Boolean	Whether to clear hard example properties during release. The options are as follows: - true: Clear hard example properties. (Default value) - false: Do not clear hard example properties.
code	String	Status code of a preprocessing task such as rotation and cropping.
create_time	Long	Time when a version is created.
crop	Boolean	Whether to crop the image. This field is valid only for the object detection dataset whose labeling box is in the rectangle shape. The options are as follows: - true: Crop the image. - false: Do not crop the image. (Default value)
crop_path	String	Path for storing cropped files.
crop_rotate_cache_path	String	Temporary directory for executing the rotation and cropping task.
data_path	String	Path for storing data.
data_statistics	Map<String,Object>	Sample statistics on a dataset, including the statistics on sample metadata in JSON format.
data_validate	Boolean	Whether data is validated by the validation algorithm before release. The options are as follows: - true: The data has been validated. - false: The data has not been validated.
deleted_sample_count	Integer	Number of deleted samples.
deletion_stats	Map<String,Integer>	Deletion reason statistics.
description	String	Description of a version.
export_images	Boolean	Whether to export images to the version output directory during release. The options are as follows: - true: Export images to the version output directory. - false: Do not export images to the version output directory. (Default value)
extract_serial_number	Boolean	Whether to parse the subsample number during release. The field is valid for the healthcare dataset. The options are as follows: - true: Parse the subsample number. - false: Do not parse the subsample number. (Default value)
include_dataset_data	Boolean	Whether to include the source data of a dataset during release. The options are as follows: - true: The source data of a dataset is included. - false: The source data of a dataset is not included.
is_current	Boolean	Whether this version is the current version of the dataset. The options are as follows: - true: This version is the current version. - false: This version is not the current version.
label_stats	Array of LabelStats objects	Label statistics list of a released version.
label_type	String	Label type of a released version. The options are as follows: - multi: Multi-label samples are included. - single: All samples are single-labeled.
manifest_cache_input_path	String	Input path for the manifest file cache during version release.
manifest_path	String	Path for storing the manifest file with the released version.
message	String	Task information recorded during release (for example, error information).
modified_sample_count	Integer	Number of modified samples.
previous_annotated_sample_count	Integer	Number of labeled samples of parent versions.
previous_total_sample_count	Integer	Total number of samples of parent versions.
previous_version_id	String	Parent version ID.
processor_task_id	String	ID of a preprocessing task such as rotation and cropping.
processor_task_status	Integer	Status of a preprocessing task such as rotation and cropping. The options are as follows: - 0: initialized - 1: running - 2: completed - 3: failed - 4: stopped - 5: timeout - 6: deletion failed - 7: stop failed
remove_sample_usage	Boolean	Whether to clear the existing usage information of a dataset during release. The options are as follows: - true: Clear the existing usage information of a dataset. (Default value) - false: Do not clear the existing usage information of a dataset.
rotate	Boolean	Whether to rotate the image. The options are as follows: - true: Rotate the image. - false: Do not rotate the image. (Default value)
rotate_path	String	Path for storing the rotated file.
sample_state	String	Sample status. The options are as follows: - ALL: labeled - NONE: unlabeled - UNCHECK: pending acceptance - ACCEPTED: accepted - REJECTED: rejected - UNREVIEWED: pending review - REVIEWED: reviewed - WORKFORCE_SAMPLED: sampled - WORKFORCE_SAMPLED_UNCHECK: sampling unchecked - WORKFORCE_SAMPLED_CHECKED: sampling checked - WORKFORCE_SAMPLED_ACCEPTED: sampling accepted - WORKFORCE_SAMPLED_REJECTED: sampling rejected - AUTO_ANNOTATION: to be confirmed
status	Integer	Status of a dataset version. The options are as follows: - 0: creating - 1: running - 2: deleting - 3: deleted - 4: error
tags	Array of strings	Key identifier list of the dataset. The labeling type is used as the default label when the labeling task releases a version. For example, ["Image","Object detection"].
task_type	Integer	Labeling task type of the released version, which is the same as the dataset type.
total_sample_count	Integer	Total number of version samples.
total_sub_sample_count	Integer	Total number of subsamples generated from the parent samples.
train_evaluate_sample_ratio	String	Ratio used to split samples between the training set and the validation set during version release. The default value is 1.00, indicating that all labeled samples are assigned to the training set.
update_time	Long	Time when a version is updated.
version_format	String	Format of a dataset version. The options are as follows: - Default: default format - CarbonData: CarbonData (supported only by table datasets) - CSV: CSV
version_id	String	Dataset version ID.
version_name	String	Dataset version name.
with_column_header	Boolean	Whether the first row in the released CSV file is a column name. This field is valid for the table dataset. The options are as follows: - true: The first row in the released CSV file is a column name. - false: The first row in the released CSV file is not a column name.
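
Several DatasetVersion fields are encoded values rather than display text. The sketch below shows one way to turn a single version object into a readable summary. It assumes that `create_time` is a Unix timestamp in milliseconds, which is consistent with the example values in this section but is not stated in the table. The function name `summarize_version` is hypothetical.

```python
from datetime import datetime, timezone

# Status codes of a dataset version, as listed in Table 4.
VERSION_STATUS = {
    0: "creating",
    1: "running",
    2: "deleting",
    3: "deleted",
    4: "error",
}


def summarize_version(version):
    """Return a readable summary of one DatasetVersion object (a dict)."""
    # Assumption: create_time is in milliseconds since the Unix epoch.
    created = datetime.fromtimestamp(version["create_time"] / 1000, tz=timezone.utc)
    return {
        "name": version.get("version_name"),
        "status": VERSION_STATUS.get(version.get("status"), "unknown"),
        "created": created.isoformat(timespec="seconds"),
        # The ratio is returned as a string; 1.00 is the documented default.
        "train_ratio": float(version.get("train_evaluate_sample_ratio", "1.00")),
    }
```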

**Table 5** LabelStats
Parameter	Type	Description
attributes	Array of LabelAttribute objects	Multi-dimensional attribute of a label. For example, if the label is music, attributes such as style and artist may be included.
count	Integer	Number of labels.
name	String	Label name.
property	LabelProperty object	Basic attribute key-value pair of a label, such as color and shortcut keys.
sample_count	Integer	Number of samples containing the label.
type	Integer	Label type. The options are as follows: - 0: image classification - 1: object detection - 100: text classification - 101: named entity recognition - 102: text triplet relationship - 103: text triplet entity - 200: speech classification - 201: speech content - 202: speech paragraph labeling - 600: video classification

**Table 6** LabelAttribute
Parameter	Type	Description
default_value	String	Default value of a label attribute.
id	String	Label attribute ID.
name	String	Label attribute name.
type	String	Label attribute type. The options are as follows: - text: text - select: single-choice drop-down list
values	Array of LabelAttributeValue objects	List of label attribute values.

**Table 7** LabelAttributeValue
Parameter	Type	Description
id	String	Label attribute value ID.
value	String	Label attribute value.

**Table 8** LabelProperty
Parameter	Type	Description
@modelarts:color	String	Default attribute: Label color, which is a hexadecimal code of the color. By default, this parameter is left blank. Example: #FFFFF0.
@modelarts:default_shape	String	Default attribute: Default shape of an object detection label (dedicated attribute). By default, this parameter is left blank. The options are as follows: - bndbox: rectangle - polygon: polygon - circle: circle - line: straight line - dashed: dotted line - point: point - polyline: polyline
@modelarts:from_type	String	Default attribute: Type of the head entity in the triplet relationship label. This attribute must be specified when a relationship label is created. This parameter is used only for the text triplet dataset.
@modelarts:rename_to	String	Default attribute: The new name of the label.
@modelarts:shortcut	String	Default attribute: Label shortcut key. By default, this parameter is left blank. For example: D.
@modelarts:to_type	String	Default attribute: Type of the tail entity in the triplet relationship label. This attribute must be specified when a relationship label is created. This parameter is used only for the text triplet dataset.
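
The nesting of LabelStats, LabelAttribute, LabelAttributeValue, and LabelProperty (Tables 5 to 8) can be easier to follow in code. The following sketch flattens one `label_stats` entry into a small summary. The function name `describe_label` is hypothetical, and the sketch assumes that any of the nested fields may be missing from a response.

```python
# Label type codes, as listed in Table 5.
LABEL_TYPE = {
    0: "image classification",
    1: "object detection",
    100: "text classification",
    101: "named entity recognition",
    102: "text triplet relationship",
    103: "text triplet entity",
    200: "speech classification",
    201: "speech content",
    202: "speech paragraph labeling",
    600: "video classification",
}


def describe_label(stat):
    """Flatten one LabelStats object (a dict) into a readable summary."""
    prop = stat.get("property") or {}
    attributes = {}
    for attribute in stat.get("attributes") or []:
        # Each LabelAttribute carries a list of LabelAttributeValue objects.
        values = [item["value"] for item in attribute.get("values") or []]
        attributes[attribute["name"]] = values
    return {
        "name": stat.get("name"),
        "type": LABEL_TYPE.get(stat.get("type"), "unknown"),
        "count": stat.get("count", 0),
        "color": prop.get("@modelarts:color"),
        "shortcut": prop.get("@modelarts:shortcut"),
        "attributes": attributes,
    }
```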

Example Requests

Querying the Version List of a Specific Dataset

GET https://{endpoint}/v2/{project_id}/datasets/{dataset_id}/versions

Example Responses

Status code: 200

{
  "total_number" : 3,
  "versions" : [ {
    "version_id" : "54IXbeJhfttGpL46lbv",
    "version_name" : "V003",
    "version_format" : "Default",
    "previous_version_id" : "eSOKEQaXhKzxN00WKoV",
    "status" : 1,
    "create_time" : 1605930512183,
    "total_sample_count" : 10,
    "annotated_sample_count" : 10,
    "total_sub_sample_count" : 0,
    "annotated_sub_sample_count" : 0,
    "manifest_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V003/V003.manifest",
    "data_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V003/data/",
    "is_current" : true,
    "train_evaluate_sample_ratio" : "0.8",
    "remove_sample_usage" : false,
    "export_images" : false,
    "description" : "",
    "task_type" : 0,
    "extract_serial_number" : false
  }, {
    "version_id" : "eSOKEQaXhKzxN00WKoV",
    "version_name" : "V002",
    "version_format" : "Default",
    "previous_version_id" : "vlGvUqOcxxGPIB0ugeE",
    "status" : 1,
    "create_time" : 1605691027084,
    "total_sample_count" : 10,
    "annotated_sample_count" : 10,
    "total_sub_sample_count" : 0,
    "annotated_sub_sample_count" : 0,
    "manifest_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V002/V002.manifest",
    "data_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V002/data/",
    "is_current" : false,
    "train_evaluate_sample_ratio" : "0.9999",
    "remove_sample_usage" : false,
    "export_images" : false,
    "description" : "",
    "task_type" : 0,
    "extract_serial_number" : false
  }, {
    "version_id" : "vlGvUqOcxxGPIB0ugeE",
    "version_name" : "V001",
    "version_format" : "Default",
    "status" : 1,
    "create_time" : 1605690687346,
    "total_sample_count" : 10,
    "annotated_sample_count" : 10,
    "total_sub_sample_count" : 0,
    "annotated_sub_sample_count" : 0,
    "manifest_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V001/V001.manifest",
    "data_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/V001/data/",
    "is_current" : false,
    "train_evaluate_sample_ratio" : "0.99",
    "remove_sample_usage" : false,
    "export_images" : false,
    "description" : "",
    "task_type" : 0,
    "extract_serial_number" : false
  } ]
}
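
In the example response, each version points to its parent through `previous_version_id`, and one version is marked with `is_current`. The sketch below shows one way to find the current version and walk back through its parents. `SAMPLE_RESPONSE` is a trimmed copy of the response above that keeps only the fields the code reads, and the function names are hypothetical.

```python
import json

# Trimmed copy of the example response (only the fields used below).
SAMPLE_RESPONSE = """
{
  "total_number": 3,
  "versions": [
    {"version_id": "54IXbeJhfttGpL46lbv", "version_name": "V003",
     "previous_version_id": "eSOKEQaXhKzxN00WKoV", "is_current": true},
    {"version_id": "eSOKEQaXhKzxN00WKoV", "version_name": "V002",
     "previous_version_id": "vlGvUqOcxxGPIB0ugeE", "is_current": false},
    {"version_id": "vlGvUqOcxxGPIB0ugeE", "version_name": "V001",
     "is_current": false}
  ]
}
"""


def current_version(body):
    """Return the version flagged as current, or None if there is none."""
    return next((v for v in body["versions"] if v.get("is_current")), None)


def version_lineage(body):
    """List version names from the current version back to the oldest parent."""
    by_id = {v["version_id"]: v for v in body["versions"]}
    lineage, seen = [], set()
    node = current_version(body)
    # The seen set guards against a malformed response with a parent cycle.
    while node is not None and node["version_id"] not in seen:
        seen.add(node["version_id"])
        lineage.append(node["version_name"])
        node = by_id.get(node.get("previous_version_id"))
    return lineage


body = json.loads(SAMPLE_RESPONSE)
print(current_version(body)["version_name"])
print(version_lineage(body))
```

A version without `previous_version_id`, such as V001 in the example, ends the walk.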

Status Codes

Status Code	Description
200	OK
401	Unauthorized
403	Forbidden
404	Not Found
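
A client can map these status codes to readable errors. The following sketch uses only the Python standard library. The function names are hypothetical, and because this section does not describe authentication, any required request headers are passed in by the caller rather than assumed here.

```python
import json
import urllib.error
import urllib.request

# Status codes documented for this API.
STATUS_MEANING = {
    200: "OK",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
}


def explain_status(code):
    """Return the documented description of a status code."""
    return STATUS_MEANING.get(code, f"Unexpected status {code}")


def fetch_versions(url, headers=None, timeout=30):
    """Send the GET request and return the decoded JSON response body.

    Raises RuntimeError with the documented description on an HTTP error.
    """
    request = urllib.request.Request(url, headers=headers or {}, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        raise RuntimeError(f"{err.code} {explain_status(err.code)}") from err
```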

Error Codes

See Error Codes.

Parent topic: Dataset Version Management
