Obtaining Details About a Training Job
Sample Code
In a ModelArts notebook, you do not need to enter authentication parameters for session authentication. For details about session authentication in other development environments, see Session Authentication.
- Method 1: Use the specified job_id and version_id.
    from modelarts.session import Session
    from modelarts.estimator import Estimator

    session = Session()
    estimator = Estimator(modelarts_session=session, job_id="182626", version_id="278813")
    job_info = estimator.get_job_info()
- Method 2: Use the training job created in Creating a Training Job.
    job_info = job_instance.get_job_info()
- Method 3: Use the training job version object returned in Querying the List of Training Job Versions.
    job_info = job_version_instance_list[0].get_job_info()
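Whichever method is used, the result can be checked before its fields are read. The following is a minimal sketch, assuming that get_job_info() returns a Python dict keyed by the documented response parameters (is_success, error_code, error_msg). The helper name ensure_success and the sample dictionary are hypothetical and are not part of the ModelArts SDK.

```python
def ensure_success(job_info):
    """Return job_info unchanged if the call succeeded, else raise RuntimeError.

    Assumes job_info is a dict with the documented is_success key, plus
    error_code and error_msg when the call failed.
    """
    if job_info.get("is_success"):
        return job_info
    code = job_info.get("error_code", "unknown")
    message = job_info.get("error_msg", "no error message returned")
    raise RuntimeError(f"get_job_info failed [{code}]: {message}")


# Hypothetical response, for illustration only.
sample_ok = {"is_success": True, "job_id": 182626, "job_name": "demo-job"}
print(ensure_success(sample_ok)["job_name"])
```

In real use, the dictionary would come from one of the three methods above, for example ensure_success(estimator.get_job_info()).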
Parameter Description
Estimator initialization parameters

Parameter | Mandatory | Type | Description |
---|---|---|---|
modelarts_session | Yes | Object | Session object. For details about the initialization method, see Session Authentication. |
job_id | Yes | String | ID of a training job. Obtain job_id using the training job object generated in Creating a Training Job, for example, job_instance.job_id, or from the response in Obtaining Training Jobs. |
version_id | Yes | String | Version ID of a training job. Obtain version_id using the training job object generated in Creating a Training Job, for example, job_instance.version_id, or from the response in Obtaining Training Jobs. |
get_job_info response parameters

Parameter | Type | Description |
---|---|---|
error_msg | String | Error message when the API call fails. This parameter is not included when the API call succeeds. |
error_code | String | Error code when the API call fails. For details, see Error Codes in ModelArts API Reference. This parameter is not included when the API call succeeds. |
is_success | Boolean | Whether the API call succeeds |
job_id | Long | Training job ID |
job_name | String | Training job name |
job_desc | String | Description of a training job |
version_id | Long | Version ID of a training job |
version_name | String | Version name of a training job |
pre_version_id | Long | ID of the previous version of a training job |
engine_type | Short | Engine type of a training job. The mapping between engine_type and engine_name is as follows: |
engine_name | String | Name of the engine selected for a training job. Currently, the following engines are supported: |
engine_id | Long | ID of the engine selected for a training job |
engine_version | String | Version of the engine selected for a training job |
status | Integer | Status of a training job. For details about job statuses, see Job Statuses. |
app_url | String | Code directory of a training job |
boot_file_url | String | Boot file of a training job |
create_time | Long | Time when a training job is created |
parameter | JSON Array | Running parameters of a training job, as a collection of label-value pairs. When a job uses a custom image, these parameters are container environment variables. |
duration | Long | Training job running duration, in milliseconds |
spec_id | Long | ID of the resource specifications selected for a training job |
core | String | Number of cores of the resource specifications |
cpu | String | CPU memory of the resource specifications |
gpu_num | Integer | Number of GPUs of the resource specifications |
gpu_type | String | GPU type of the resource specifications |
worker_server_num | Integer | Number of workers in a training job |
data_url | String | Dataset of a training job |
train_url | String | OBS path to the training job output file |
dataset_version_id | String | Dataset version ID of a training job |
dataset_id | String | Dataset ID of a training job |
data_source | JSON Array | Datasets of a training job |
model_id | Long | Model ID of a training job |
model_metric_list | JSON Array | Model metrics of a training job |
system_metric_list | JSON Array | System monitoring metrics of a training job |
user_image_url | String | SWR URL of the custom image used by a training job |
user_command | String | Boot command used to start the container of the custom image of a training job |
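Two of the response fields usually need light post-processing: duration is reported in milliseconds, and parameter is a collection of label-value pairs. The sketch below shows one way to make both easier to read. It assumes each parameter entry is a dict with "label" and "value" keys, which this page does not spell out. The helper names and the sample values are hypothetical.

```python
from datetime import timedelta


def format_duration(duration_ms):
    """Convert a millisecond duration into an H:MM:SS string (whole seconds)."""
    return str(timedelta(seconds=int(duration_ms) // 1000))


def parameters_to_dict(parameter):
    """Flatten the parameter list into a plain dict.

    Assumes entries look like {"label": ..., "value": ...}; entries
    without a "label" key are skipped.
    """
    return {p["label"]: p.get("value") for p in (parameter or []) if "label" in p}


# Hypothetical excerpt of a response, for illustration only.
sample = {
    "duration": 3723000,
    "parameter": [{"label": "learning_rate", "value": "0.01"}],
}
print(format_duration(sample["duration"]))       # prints 1:02:03
print(parameters_to_dict(sample["parameter"]))   # prints {'learning_rate': '0.01'}
```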
data_source parameters

Parameter | Type | Description |
---|---|---|
dataset_id | String | Dataset ID of a training job |
dataset_version | String | Dataset version ID of a training job |
type | String | Dataset type. The value can be obs (data from OBS is used) or dataset (data from a specified dataset is used). |
data_url | String | OBS bucket path |
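The type field determines which of the other data_source fields are meaningful: data_url for obs, and dataset_id with dataset_version for dataset. A minimal sketch of that branching follows, assuming each data_source entry is a dict with the keys listed above. The function name and the sample entries are hypothetical.

```python
def describe_data_source(entry):
    """Return a short description of one data_source entry based on its type."""
    source_type = entry.get("type")
    if source_type == "obs":
        # Data is read directly from an OBS path.
        return f"OBS path {entry.get('data_url')}"
    if source_type == "dataset":
        # Data comes from a specified dataset and version.
        return f"dataset {entry.get('dataset_id')} (version {entry.get('dataset_version')})"
    return f"unrecognized data source type: {source_type!r}"


# Hypothetical entries, for illustration only.
sources = [
    {"type": "obs", "data_url": "/example-bucket/data/"},
    {"type": "dataset", "dataset_id": "example-id", "dataset_version": "example-version"},
]
for source in sources:
    print(describe_data_source(source))
```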
model_metric_list parameters

Parameter | Type | Description |
---|---|---|
metric | JSON Array | Validation metrics of a class of a training job |
total_metric | JSON Array | All validation metrics of a training job |
system_metric_list parameters

Parameter | Type | Description |
---|---|---|
cpuUsage | JSON Array | CPU usage of a training job |
memUsage | JSON Array | Memory usage of a training job |
gpuUtil | JSON Array | GPU usage of a training job |
metric parameters

Parameter | Type | Description |
---|---|---|
metric_values | JSON Array | Validation metrics of a class of a training job |
reserved_data | JSON Array | Reserved parameter |
metric_meta | JSON Array | A class of a training job, including the class ID and name |
metric_values parameters

Parameter | Type | Description |
---|---|---|
recall | JSON Array | Recall of a class of a training job |
precision | JSON Array | Precision of a class of a training job |
accuracy | JSON Array | Accuracy of a class of a training job |
total_metric parameters

Parameter | Type | Description |
---|---|---|
total_metric_meta | JSON Array | Reserved parameter |
total_reserved_data | JSON Array | Reserved parameter |
total_metric_values | JSON Array | All validation metrics of a training job |
total_metric_values parameters

Parameter | Type | Description |
---|---|---|
f1_score | Float | F1 score of a training job |
recall | Float | Total recall of a training job |
precision | Float | Total precision of a training job |
accuracy | Float | Total accuracy of a training job |