This API is used to query details about a dataset.
You can debug this API through automatic authentication in API Explorer or use the SDK sample code generated by API Explorer.
GET /v2/{project_id}/datasets/{dataset_id}
Parameter | Mandatory | Type | Description |
---|---|---|---|
dataset_id | Yes | String | Dataset ID. |
project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Obtaining a Project ID and Name. |
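The following is a minimal sketch of how the two mandatory path parameters fit into the request URI. The helper name `build_dataset_url` and the host `modelarts.example.com` are hypothetical and used only for illustration; replace the host with the actual endpoint of your region.

```python
from urllib.parse import quote


def build_dataset_url(endpoint: str, project_id: str, dataset_id: str) -> str:
    """Return the URL for GET /v2/{project_id}/datasets/{dataset_id}."""
    if not project_id or not dataset_id:
        raise ValueError("project_id and dataset_id are both mandatory")
    # Percent-encode each path segment so unexpected characters cannot alter the path.
    return (
        f"https://{endpoint}/v2/"
        f"{quote(project_id, safe='')}/datasets/{quote(dataset_id, safe='')}"
    )


# Hypothetical host and project ID, for illustration only.
url = build_dataset_url("modelarts.example.com", "proj123", "gfghHSokody6AJigS5A")
```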
Parameter | Mandatory | Type | Description |
---|---|---|---|
check_running_task | No | Boolean | Whether to detect tasks (including initialization tasks) that are running in a dataset. Options: |
running_task_type | No | Integer | Type of the running tasks (including initialization tasks) to be detected. The options are as follows: |
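Both query parameters are optional, so a client can omit them entirely. The sketch below shows one way to build the query string. It assumes that Boolean values are sent as lowercase `true`/`false`; the helper name `build_query` is hypothetical, and the value `1` passed for `running_task_type` is only a placeholder because the valid values are not listed here.

```python
from typing import Optional
from urllib.parse import urlencode


def build_query(check_running_task: Optional[bool] = None,
                running_task_type: Optional[int] = None) -> str:
    """Build the optional query string; omitted parameters are left out."""
    params = {}
    if check_running_task is not None:
        # Assumption: Booleans are serialized as lowercase "true"/"false".
        params["check_running_task"] = "true" if check_running_task else "false"
    if running_task_type is not None:
        params["running_task_type"] = str(running_task_type)
    return urlencode(params)


# The value 1 is a placeholder, not a documented option.
query = build_query(check_running_task=True, running_task_type=1)
```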
None
Status code: 200
Parameter | Type | Description |
---|---|---|
annotated_sample_count | Integer | Number of labeled samples in a dataset. |
annotated_sub_sample_count | Integer | Number of labeled subsamples. |
content_labeling | Boolean | Whether to enable content labeling for the speech paragraph labeling dataset. This function is enabled by default. |
create_time | Long | Time when a dataset is created. |
current_version_id | String | Current version ID of a dataset. |
current_version_name | String | Current version name of a dataset. |
data_format | String | Data format. |
data_sources | Array of DataSource objects | Data source list. |
data_statistics | Map<String,Object> | Sample statistics on a dataset, including the statistics on sample metadata. |
data_update_time | Long | Time when a sample and a label are updated. |
dataset_format | Integer | Dataset format. Options: |
dataset_id | String | Dataset ID. |
dataset_name | String | Dataset name. |
dataset_tags | Array of strings | Key identifier list of a dataset, for example, ["Image","Object detection"]. |
dataset_type | Integer | Dataset type. Options: |
dataset_version_count | Integer | Number of dataset versions. |
deleted_sample_count | Integer | Number of deleted samples. |
deletion_stats | Map<String,Integer> | Deletion reason statistics. |
description | String | Dataset description. |
enterprise_project_id | String | Enterprise project ID. |
exist_running_task | Boolean | Whether the dataset contains running (including initialization) tasks. Options: |
exist_workforce_task | Boolean | Whether the dataset contains team labeling tasks. Options: |
feature_supports | Array of strings | List of features supported by the dataset. Currently, only the value 0 is supported, indicating that the OBS file size is limited. |
import_data | Boolean | Whether to import data. Options: |
import_task_id | String | ID of an import task. |
inner_annotation_path | String | Path for storing the labeling result of a dataset. |
inner_data_path | String | Path for storing the internal data of a dataset. |
inner_log_path | String | Path for storing internal logs of a dataset. |
inner_task_path | String | Path for storing internal tasks of a dataset. |
inner_temp_path | String | Path for storing internal temporary files of a dataset. |
inner_work_path | String | Output directory of a dataset. |
label_task_count | Integer | Number of labeling tasks. |
labels | Array of Label objects | Dataset label list. |
loading_sample_count | Integer | Number of loading samples. |
managed | Boolean | Whether a dataset is hosted. Options: |
next_version_num | Integer | Sequence number of the next version of a dataset. |
running_tasks_id | Array of strings | ID list of running (including initialization) tasks. |
schema | Array of Field objects | Schema list. |
status | Integer | Dataset status. Options: |
third_path | String | Third-party path. |
total_sample_count | Integer | Total number of dataset samples. |
total_sub_sample_count | Integer | Total number of subsamples generated from the parent samples. For example, the total number of key frame images extracted from the video labeling dataset is that of subsamples. |
unconfirmed_sample_count | Integer | Number of auto labeling samples to be confirmed. |
update_time | Long | Time when a dataset is updated. |
versions | Array of DatasetVersion objects | Dataset version information. Currently, only the current version information of a dataset is recorded. |
work_path | String | Output dataset path, which is used to store output files such as label files. The path is an OBS path in the format of /Bucket name/File path. For example: /obs-bucket. |
work_path_type | Integer | Type of the dataset output path. Options: |
workforce_descriptor | WorkforceDescriptor object | Team labeling information. |
workforce_task_count | Integer | Number of team labeling tasks of a dataset. |
workspace_id | String | Workspace ID. If no workspace is created, the default value is 0. If a workspace is created and used, use the actual value. |
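The sample counts and Long timestamps above can be turned into more readable values on the client side. The following sketch assumes that Long time fields such as `create_time` are milliseconds since the Unix epoch, which matches the magnitude of the value in the example response (1605690595404) but is not stated in the table. The helper names are hypothetical.

```python
from datetime import datetime, timezone


def labeling_progress(dataset: dict) -> float:
    """Return the fraction of labeled samples, or 0.0 for an empty dataset."""
    total = dataset.get("total_sample_count", 0)
    if total == 0:
        return 0.0
    return dataset.get("annotated_sample_count", 0) / total


def to_datetime(epoch_ms: int) -> datetime:
    """Convert a Long timestamp to a timezone-aware UTC datetime."""
    # Assumption: the API reports milliseconds since the Unix epoch.
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)


dataset = {"total_sample_count": 10, "annotated_sample_count": 10,
           "create_time": 1605690595404}
progress = labeling_progress(dataset)
created = to_datetime(dataset["create_time"])
```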
Parameter | Type | Description |
---|---|---|
data_path | String | Data source path. |
data_type | Integer | Data type. Options: |
schema_maps | Array of SchemaMap objects | Schema mapping information corresponding to the table data. |
source_info | SourceInfo object | Information required for importing a table data source. |
with_column_header | Boolean | Whether the first row in the file is a column name. This field is valid for the table dataset. Options: |
Parameter | Type | Description |
---|---|---|
dest_name | String | Name of the destination column. |
src_name | String | Name of the source column. |
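As an illustration of how a client might use `src_name` and `dest_name`, the sketch below renames the keys of one table row according to a list of SchemaMap objects. This is an assumed client-side interpretation of the mapping, not a description of how the service applies it. The helper name `apply_schema_maps` and the column names are hypothetical.

```python
def apply_schema_maps(row: dict, schema_maps: list) -> dict:
    """Rename row keys from src_name to dest_name; unmapped keys are kept as-is."""
    rename = {m["src_name"]: m["dest_name"] for m in schema_maps}
    return {rename.get(key, key): value for key, value in row.items()}


# Hypothetical column names, for illustration only.
schema_maps = [{"src_name": "col_a", "dest_name": "age"}]
mapped = apply_schema_maps({"col_a": 42, "col_b": "x"}, schema_maps)
```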
Parameter | Type | Description |
---|---|---|
cluster_id | String | ID of an MRS cluster. |
cluster_mode | String | Running mode of an MRS cluster. Options: |
cluster_name | String | Name of an MRS cluster. |
database_name | String | Name of the database to which the table dataset is imported. |
input | String | HDFS path of a table dataset. |
ip | String | IP address of your GaussDB(DWS) cluster. |
port | String | Port number of your GaussDB(DWS) cluster. |
queue_name | String | DLI queue name of a table dataset. |
subnet_id | String | Subnet ID of an MRS cluster. |
table_name | String | Name of the table to which a table dataset is imported. |
user_name | String | Username, which is mandatory for GaussDB(DWS) data. |
user_password | String | User password, which is mandatory for GaussDB(DWS) data. |
vpc_id | String | ID of the VPC where an MRS cluster resides. |
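Because a SourceInfo object can carry `user_password`, a client may want to mask that field before writing a response to logs. The following is a minimal sketch of such a precaution. The helper name `redact_source_info` and the sample values are hypothetical, and whether the service actually returns the password in responses is not stated here.

```python
def redact_source_info(source_info: dict) -> dict:
    """Return a copy of a SourceInfo object with user_password masked."""
    redacted = dict(source_info)  # shallow copy; the input is left unchanged
    if "user_password" in redacted:
        redacted["user_password"] = "***"
    return redacted


# Hypothetical values, for illustration only.
info = {"ip": "192.0.2.10", "port": "8000", "user_name": "dbadmin",
        "user_password": "secret"}
safe_to_log = redact_source_info(info)
```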
Parameter | Type | Description |
---|---|---|
attributes | Array of LabelAttribute objects | Multi-dimensional attribute of a label. For example, if the label is music, attributes such as style and artist may be included. |
name | String | Label name. |
property | LabelProperty object | Basic attribute key-value pair of a label, such as color and shortcut keys. |
type | Integer | Label type. Options: |
Parameter | Type | Description |
---|---|---|
description | String | Schema description. |
name | String | Schema name. |
schema_id | Integer | Schema ID. |
type | String | Schema value type. |
Parameter | Type | Description |
---|---|---|
add_sample_count | Integer | Number of added samples. |
analysis_cache_path | String | Cache path for feature analysis. |
analysis_status | Integer | Status of a feature analysis task. Options: |
analysis_task_id | String | ID of a feature analysis task. |
annotated_sample_count | Integer | Number of labeled samples in a version. |
annotated_sub_sample_count | Integer | Number of labeled subsamples. |
clear_hard_property | Boolean | Whether to clear hard example properties during release. Options: |
code | String | Status code of a preprocessing task such as rotation and cropping. |
create_time | Long | Time when a version is created. |
crop | Boolean | Whether to crop the image. This field is valid only for the object detection dataset whose labeling box is in the rectangle shape. Options: |
crop_path | String | Path for storing cropped files. |
crop_rotate_cache_path | String | Temporary directory for executing the rotation and cropping task. |
data_analysis | Map<String,Object> | Feature analysis result in JSON format. |
data_path | String | Path for storing data. |
data_statistics | Map<String,Object> | Sample statistics on a dataset, including the statistics on sample metadata in JSON format. |
data_validate | Boolean | Whether data is validated by the validation algorithm before release. Options: |
deleted_sample_count | Integer | Number of deleted samples. |
deletion_stats | Map<String,Integer> | Deletion reason statistics. |
description | String | Description of a version. |
export_images | Boolean | Whether to export images to the version output directory during release. Options: |
extract_serial_number | Boolean | Whether to parse the subsample number during release. The field is valid for the healthcare dataset. Options: |
include_dataset_data | Boolean | Whether to include the source data of a dataset during release. Options: |
is_current | Boolean | Whether this version is the current version of the dataset. Options: |
label_stats | Array of LabelStats objects | Label statistics list of a released version. |
label_type | String | Label type of a released version. Options: |
manifest_cache_input_path | String | Input path for the manifest file cache during version release. |
manifest_path | String | Path for storing the manifest file with the released version. |
message | String | Task information recorded during release (for example, error information). |
modified_sample_count | Integer | Number of modified samples. |
previous_annotated_sample_count | Integer | Number of labeled samples of parent versions. |
previous_total_sample_count | Integer | Total number of samples of parent versions. |
previous_version_id | String | Parent version ID. |
processor_task_id | String | ID of a preprocessing task such as rotation and cropping. |
processor_task_status | Integer | Status of a preprocessing task such as rotation and cropping. Options: |
remove_sample_usage | Boolean | Whether to clear the existing usage information of a dataset during release. Options: |
rotate | Boolean | Whether to rotate the image. Options: |
rotate_path | String | Path for storing the rotated file. |
sample_state | String | Sample status. Options: |
start_processor_task | Boolean | Whether to start a data analysis task during release. Options: |
status | Integer | Status of a dataset version. Options: |
tags | Array of strings | Key identifier list of the dataset. The labeling type is used as the default label when the labeling task releases a version. For example, ["Image","Object detection"]. |
task_type | Integer | Labeling task type of the released version, which is the same as the dataset type. |
total_sample_count | Integer | Total number of version samples. |
total_sub_sample_count | Integer | Total number of subsamples generated from the parent samples. |
train_evaluate_sample_ratio | String | Ratio for splitting training and validation sets during version release. The default value is 1.00, indicating that all samples in the released version are used for training. |
update_time | Long | Time when a version is updated. |
version_format | String | Format of a dataset version. Options: |
version_id | String | Dataset version ID. |
version_name | String | Dataset version name. |
with_column_header | Boolean | Whether the first row in the released CSV file is a column name. This field is valid for the table dataset. Options: |
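Two fields of a DatasetVersion object often need a small amount of client-side handling: `is_current` identifies the version in use, and `train_evaluate_sample_ratio` is returned as a String. The sketch below shows one possible way to read both. The helper names and the sample version IDs are hypothetical, and falling back to 1.00 when the ratio is absent is an assumption based on the documented default.

```python
from typing import Optional


def current_version(versions: list) -> Optional[dict]:
    """Return the first version whose is_current flag is true, if any."""
    for version in versions:
        if version.get("is_current"):
            return version
    return None


def training_ratio(version: dict) -> float:
    """Parse train_evaluate_sample_ratio, which the API returns as a String."""
    # Assumption: a missing value means the documented default of 1.00.
    return float(version.get("train_evaluate_sample_ratio", "1.00"))


# Hypothetical version entries, for illustration only.
versions = [
    {"version_id": "v-old", "is_current": False},
    {"version_id": "v-new", "is_current": True,
     "train_evaluate_sample_ratio": "0.80"},
]
active = current_version(versions)
```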
Parameter | Type | Description |
---|---|---|
attributes | Array of LabelAttribute objects | Multi-dimensional attribute of a label. For example, if the label is music, attributes such as style and artist may be included. |
count | Integer | Number of labels. |
name | String | Label name. |
property | LabelProperty object | Basic attribute key-value pair of a label, such as color and shortcut keys. |
sample_count | Integer | Number of samples containing the label. |
type | Integer | Label type. Options: |
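The `label_stats` list of a released version can be summarized per label. The following sketch builds a mapping from label name to `sample_count`. The helper name `samples_per_label` and the counts are hypothetical, and treating a missing `sample_count` as 0 is an assumption.

```python
def samples_per_label(label_stats: list) -> dict:
    """Map each label name to the number of samples that contain it."""
    # Assumption: a LabelStats object without sample_count counts as 0.
    return {stat["name"]: stat.get("sample_count", 0) for stat in label_stats}


# Hypothetical counts, for illustration only.
label_stats = [
    {"name": "Rabbits", "type": 0, "count": 6, "sample_count": 6},
    {"name": "Bees", "type": 0, "count": 4, "sample_count": 4},
]
summary = samples_per_label(label_stats)
```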
Parameter | Type | Description |
---|---|---|
default_value | String | Default value of a label attribute. |
id | String | Label attribute ID. |
name | String | Label attribute name. |
type | String | Label attribute type. Options: |
values | Array of LabelAttributeValue objects | List of label attribute values. |
Parameter | Type | Description |
---|---|---|
id | String | Label attribute value ID. |
value | String | Label attribute value. |
Parameter | Type | Description |
---|---|---|
@modelarts:color | String | Default attribute: Label color, which is a hexadecimal code of the color. By default, this parameter is left blank. Example: #FFFFF0. |
@modelarts:default_shape | String | Default attribute: Default shape of an object detection label (dedicated attribute). By default, this parameter is left blank. Options: |
@modelarts:from_type | String | Default attribute: Type of the head entity in the triplet relationship label. This attribute must be specified when a relationship label is created. This parameter is used only for the text triplet dataset. |
@modelarts:rename_to | String | Default attribute: The new name of the label. |
@modelarts:shortcut | String | Default attribute: Label shortcut key. By default, this parameter is left blank. For example: D. |
@modelarts:to_type | String | Default attribute: Type of the tail entity in the triplet relationship label. This attribute must be specified when a relationship label is created. This parameter is used only for the text triplet dataset. |
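Because the LabelProperty keys start with `@modelarts:`, they must be read as dictionary keys rather than attribute names. The sketch below reads `@modelarts:color` from a Label object and checks its shape. It assumes a six-digit hexadecimal code prefixed with `#`, based only on the examples #FFFFF0 and #3399ff; the helper name `label_color` is hypothetical.

```python
import re
from typing import Optional

# Assumption: colors are six hexadecimal digits prefixed with "#".
_HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def label_color(label: dict) -> Optional[str]:
    """Return the label color if present and well-formed, else None."""
    properties = label.get("property") or {}
    color = properties.get("@modelarts:color")
    if color and _HEX_COLOR.match(color):
        return color
    return None


color = label_color({"name": "Rabbits", "type": 0,
                     "property": {"@modelarts:color": "#3399ff"}})
```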
Parameter | Type | Description |
---|---|---|
current_task_id | String | ID of a team labeling task. |
current_task_name | String | Name of a team labeling task. |
reject_num | Integer | Number of rejected samples. |
repetition | Integer | Number of persons who label each sample. The minimum value is 1. |
is_synchronize_auto_labeling_data | Boolean | Whether to synchronously update auto labeling data. Options: |
is_synchronize_data | Boolean | Whether to synchronize updated data, such as uploading files, synchronizing data sources, and assigning imported unlabeled files to team members. Options: |
workers | Array of Worker objects | List of labeling team members. |
workforce_id | String | ID of a labeling team. |
workforce_name | String | Name of a labeling team. |
Parameter | Type | Description |
---|---|---|
create_time | Long | Creation time. |
description | String | Labeling team member description. The value contains 0 to 256 characters and does not support the following special characters: ^!<>=&"' |
email | String | Email address of a labeling team member. |
role | Integer | Role. Options: |
status | Integer | Current login status of a labeling team member. Options: |
update_time | Long | Update time. |
worker_id | String | ID of a labeling team member. |
workforce_id | String | ID of a labeling team. |
Querying Details About a Dataset
GET https://{endpoint}/v2/{project_id}/datasets/{dataset_id}
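The request above can be prepared with the Python standard library as follows. This is a sketch under stated assumptions: the `X-Auth-Token` header is assumed to be the authentication mechanism (authentication is not described in this section), and the host, project ID, and token value are placeholders. The request object is only constructed here; sending it with `urllib.request.urlopen` requires a real endpoint and token.

```python
import urllib.request


def make_request(url: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) the GET request for querying a dataset."""
    return urllib.request.Request(
        url,
        headers={
            # Assumption: token-based authentication through X-Auth-Token.
            "X-Auth-Token": token,
            "Content-Type": "application/json",
        },
        method="GET",
    )


# Placeholder host, project ID, and token, for illustration only.
request = make_request(
    "https://modelarts.example.com/v2/proj123/datasets/gfghHSokody6AJigS5A",
    "placeholder-token",
)
```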
Status code: 200
OK
{
  "dataset_id" : "gfghHSokody6AJigS5A",
  "dataset_name" : "dataset-f9e8",
  "dataset_type" : 0,
  "data_format" : "Default",
  "next_version_num" : 4,
  "status" : 1,
  "data_sources" : [ {
    "data_type" : 0,
    "data_path" : "/test-obs/classify/input/animals/"
  } ],
  "create_time" : 1605690595404,
  "update_time" : 1605690595404,
  "description" : "",
  "current_version_id" : "54IXbeJhfttGpL46lbv",
  "current_version_name" : "V003",
  "total_sample_count" : 10,
  "annotated_sample_count" : 10,
  "unconfirmed_sample_count" : 0,
  "work_path" : "/test-obs/classify/output/",
  "inner_work_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/",
  "inner_annotation_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/annotation/",
  "inner_data_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/data/",
  "inner_log_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/logs/",
  "inner_temp_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/temp/",
  "inner_task_path" : "/test-obs/classify/output/dataset-f9e8-gfghHSokody6AJigS5A/task/",
  "work_path_type" : 0,
  "workspace_id" : "0",
  "enterprise_project_id" : "0",
  "workforce_task_count" : 0,
  "feature_supports" : [ "0" ],
  "managed" : false,
  "import_data" : false,
  "label_task_count" : 1,
  "dataset_format" : 0,
  "dataset_version_count" : 3,
  "content_labeling" : true,
  "labels" : [ {
    "name" : "Rabbits",
    "type" : 0,
    "property" : {
      "@modelarts:color" : "#3399ff"
    }
  }, {
    "name" : "Bees",
    "type" : 0,
    "property" : {
      "@modelarts:color" : "#3399ff"
    }
  } ]
}
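A response body like the one above can be parsed with the standard `json` module. The sketch below uses a shortened copy of the example response and extracts the label names and the first data source path. The helper names are hypothetical.

```python
import json

# Shortened copy of the example response body.
SAMPLE_BODY = """
{
  "dataset_id": "gfghHSokody6AJigS5A",
  "dataset_name": "dataset-f9e8",
  "data_sources": [{"data_type": 0, "data_path": "/test-obs/classify/input/animals/"}],
  "labels": [
    {"name": "Rabbits", "type": 0, "property": {"@modelarts:color": "#3399ff"}},
    {"name": "Bees", "type": 0, "property": {"@modelarts:color": "#3399ff"}}
  ]
}
"""


def label_names(body: str) -> list:
    """Return the names of all labels in a dataset details response."""
    data = json.loads(body)
    return [label["name"] for label in data.get("labels", [])]


def first_data_path(body: str):
    """Return the path of the first data source, or None if there is none."""
    sources = json.loads(body).get("data_sources", [])
    return sources[0]["data_path"] if sources else None


names = label_names(SAMPLE_BODY)
path = first_data_path(SAMPLE_BODY)
```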
Status Code | Description |
---|---|
200 | OK |
401 | Unauthorized |
403 | Forbidden |
404 | Not Found |
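A client can translate these status codes into a simple success check. The following sketch raises an exception for any code other than 200. The names `check_status` and `DatasetQueryError` are hypothetical, and codes outside the table are reported as unexpected rather than interpreted.

```python
# Descriptions taken from the status code table.
STATUS_DESCRIPTIONS = {
    200: "OK",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
}


class DatasetQueryError(Exception):
    """Raised when the dataset query does not return status code 200."""


def check_status(status_code: int) -> None:
    """Return silently on 200; raise DatasetQueryError otherwise."""
    if status_code == 200:
        return
    description = STATUS_DESCRIPTIONS.get(status_code, "Unexpected status code")
    raise DatasetQueryError(f"{status_code} {description}")
```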
See Error Codes.