Algorithms and Their Running Parameters

This section describes the built-in algorithms supported by ModelArts and the running parameters supported by each algorithm. You can set running parameters for a training job as required.

yolo_v3

Table 1 Algorithm description

Name: yolo_v3
Usage: Object detection and localization
Engine Type: MXNet, MXNet-1.2.1-python2.7
Precision: 81.7% (mAP)
  mAP measures the quality of an object detection algorithm. For each object class, precision and recall are computed at multiple confidence thresholds, which yields a P-R curve; the average precision (AP) for a class is the area under that curve, and mAP is the mean AP over all classes (see the sketch after this table).
Training Dataset: Camvid
Data Format: shape: [H>=224, W>=224, C>=1]; type: int8
Running Parameter: lr=0.0001; mom=0.9; wd=0.0005
  For more available running parameters, see Table 2.
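
As a rough illustration of how the AP behind mAP is obtained, the following sketch integrates a P-R curve numerically. The precision and recall arrays are made-up values, not ModelArts output.

```python
import numpy as np

# Hypothetical precision/recall points for one class, measured at
# decreasing confidence thresholds (illustrative values only).
recall = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
precision = np.array([1.0, 0.95, 0.9, 0.8, 0.6, 0.4])

# AP for the class is the area under its P-R curve;
# mAP is the mean of the per-class AP values.
ap = np.trapz(precision, recall)
print("AP for this class: %.3f" % ap)
```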

Table 2 Running parameters

lr: Learning rate (see the optimizer sketch after this table). Default value: 0.0001
mom: Momentum of the training network. Default value: 0.9
wd: L2 weight decay coefficient for the parameters. Default value: 0.0005
num_classes: Total number of image classes used in training. If you add images of other classes, this value is identified and matched automatically; you do not need to set it manually. Default value: None
split_spec: Split ratio of the training set to the validation set. Default value: 0.8
batch_size: Number of training images used in each parameter update. Default value: 16
eval_frequence: Frequency for validating the model. By default, validation is performed after every epoch. Default value: 1
num_epoch: Number of training epochs. Default value: 10
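
The lr, mom, and wd defaults above correspond to a standard SGD-with-momentum update with L2 weight decay. A minimal sketch of an equivalent optimizer in MXNet (the engine used by yolo_v3); this is an illustration only, not the code ModelArts runs internally.

```python
import mxnet as mx

# SGD with momentum and L2 weight decay, using the defaults from Table 2.
optimizer = mx.optimizer.SGD(learning_rate=0.0001,  # lr
                             momentum=0.9,          # mom
                             wd=0.0005)             # wd (L2 weight decay)
```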

Table 3 Running parameters that can be added

num_examples: Total number of images used for training. For example, if the dataset contains 1,000 images and the split ratio is 0.8, 800 images are used for training. Recommended value: 16551
disp_batches: The loss and training speed of the model are displayed every N batches. Recommended value: 20
warm_up_epochs: Number of epochs over which the warm-up strategy ramps up to the target learning rate. Recommended value: 0
lr_steps: Epochs at which the learning rate decays in the multi-factor strategy. By default, the learning rate decays to 0.1 times its current value at the 10th and 15th epochs (see the sketch after this table). Recommended value: 10,15
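
lr_steps is given in epochs, whereas MXNet learning-rate schedulers count iterations. A sketch of the conversion, using the values from Tables 2 and 3 (16551 examples, split ratio 0.8, batch size 16); the actual ModelArts implementation may differ.

```python
import math

import mxnet as mx

num_examples = 16551   # total images (num_examples, Table 3)
split_spec = 0.8       # training share (split_spec, Table 2)
batch_size = 16        # batch_size (Table 2)
lr_steps = [10, 15]    # epochs at which the learning rate decays (lr_steps)

# Convert epoch boundaries into iteration counts.
batches_per_epoch = int(math.ceil(num_examples * split_spec / float(batch_size)))
step_iters = [epoch * batches_per_epoch for epoch in lr_steps]

# Decay the learning rate to 0.1x of its current value at the converted steps.
scheduler = mx.lr_scheduler.MultiFactorScheduler(step=step_iters, factor=0.1)
```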

retinanet_resnet_v1_50

Table 4 Algorithm description

Name: retinanet_resnet_v1_50
Usage: Object detection and localization
Engine Type: TensorFlow, TF-1.8.0-python2.7
Precision: 83.15% (mAP)
  mAP measures the quality of an object detection algorithm. For each object class, precision and recall are computed at multiple confidence thresholds, which yields a P-R curve; the average precision (AP) for a class is the area under that curve, and mAP is the mean AP over all classes.
Training Dataset: Pascal VOC2007, detection of 20 classes of objects
Data Format: shape: [H, W, C>=1]; type: int8
Running Parameter: For more available running parameters, see Table 5.

Table 5 Running parameters

split_spec: Split ratio of the training set to the validation set (see the parsing sketch after this table). Default value: train:0.8,eval:0.2
num_gpus: Number of GPUs used. Default value: 1
batch_size: Number of images processed in each iteration (standalone). To preserve algorithm precision, you are advised to keep the default value. Default value: 32
eval_batch_size: Number of images read in each step during validation (standalone). Default value: 32
learning_rate_strategy: Learning rate strategy. The value ranges from 0 to 1, for example, 0.001. Default value: 0.002
evaluate_every_n_epochs: A validation pass is performed after every N training epochs. Default value: 1
save_interval_secs: Interval, in seconds, for saving the model. If training runs longer than 2,000,000s, the model is saved every 2,000,000s by default; if it runs for less than 2,000,000s, the model is saved when training completes. Default value: 2000000
max_epoches: Maximum number of training epochs. Default value: 100
log_every_n_steps: Logs are printed every N steps. By default, logs are printed every 10 steps. Default value: 10
save_summaries_steps: Interval, in steps, at which summary information (including model gradient updates and training parameters) is saved. By default, summaries are saved every 5 steps. Default value: 5
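
The split_spec value for this algorithm is a comma-separated list of name:ratio pairs. A small sketch of how such a string could be parsed; parse_split_spec is a hypothetical helper, not a ModelArts API.

```python
def parse_split_spec(spec):
    """Parse a split_spec string such as 'train:0.8,eval:0.2' into a dict."""
    ratios = {}
    for part in spec.split(","):
        name, value = part.split(":")
        ratios[name.strip()] = float(value)
    return ratios

print(parse_split_spec("train:0.8,eval:0.2"))  # {'train': 0.8, 'eval': 0.2}
```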

inception_v3

Table 6 Algorithm description

Name: inception_v3
Usage: Image Classification
Engine Type: TensorFlow, TF-1.8.0-python2.7
Precision: 78.00% (top1), 93.90% (top5)
  • top1: a prediction is counted as correct only if the class with the highest predicted probability is the correct class.
  • top5: a prediction is counted as correct if the correct class is among the five classes with the highest predicted probabilities.
  (See the sketch after this table.)
Training Dataset: ImageNet, classification of 1,000 image classes
Data Format: shape: [H, W, C>=1]; type: int8
Running Parameter: batch_size=32; split_spec=train:0.8,eval:0.2
  For more available running parameters, see Table 7.
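
A short numpy sketch of how top-1 and top-5 accuracy can be computed from predicted class probabilities; the arrays are random placeholders for illustration.

```python
import numpy as np

# probs: one row of predicted class probabilities per image (illustrative).
probs = np.random.rand(8, 1000)          # 8 images, 1,000 ImageNet classes
labels = np.random.randint(0, 1000, 8)   # ground-truth class indices

top5 = np.argsort(probs, axis=1)[:, -5:]      # 5 highest-probability classes
top1_acc = np.mean(top5[:, -1] == labels)     # highest-probability class only
top5_acc = np.mean([labels[i] in top5[i] for i in range(len(labels))])
print("top1=%.2f top5=%.2f" % (top1_acc, top5_acc))
```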

Table 7 Running parameters

split_spec: Split ratio of the training set to the validation set. Default value: train:0.8,eval:0.2
num_gpus: Number of GPUs used. Default value: 1
batch_size: Number of images processed in each iteration (standalone). To preserve algorithm precision, you are advised to keep the default value. Default value: 32
eval_batch_size: Number of images read in each step during validation (standalone). Default value: 32
learning_rate_strategy: Learning rate strategy. For example, 10:0.001,20:0.0001 means that the learning rate for epochs 0 to 10 is 0.001 and for epochs 10 to 20 is 0.0001 (see the sketch after this table). Default value: 0.002
evaluate_every_n_epochs: A validation pass is performed after every N training epochs. Default value: 1
save_interval_secs: Interval, in seconds, for saving the model. If training runs longer than 2,000,000s, the model is saved every 2,000,000s by default; if it runs for less than 2,000,000s, the model is saved when training completes. Default value: 2000000
max_epoches: Maximum number of training epochs. Default value: 100
log_every_n_steps: Logs are printed every N steps. By default, logs are printed every 10 steps. Default value: 10
save_summaries_steps: Interval, in steps, at which summary information (including model gradient updates and training parameters) is saved. By default, summaries are saved every 5 steps. Default value: 5
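
The epoch:rate form of learning_rate_strategy maps epoch ranges to learning rates. A sketch of that mapping; lr_for_epoch is a hypothetical helper, not part of ModelArts.

```python
def lr_for_epoch(strategy, epoch):
    """Return the learning rate for `epoch` given a strategy string such as
    '10:0.001,20:0.0001' (roughly: epochs 0-10 use 0.001, epochs 10-20 use 0.0001)."""
    for part in strategy.split(","):
        boundary, rate = part.split(":")
        if epoch < int(boundary):
            return float(rate)
    return float(rate)  # keep the last rate after the final boundary

print(lr_for_epoch("10:0.001,20:0.0001", 5))   # 0.001
print(lr_for_epoch("10:0.001,20:0.0001", 15))  # 0.0001
```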

Table 8 Running parameters that can be added

weight_decay: L2 regularization weight decay. Recommended value: 0.00004
optimizer: Optimizer. The options are dymomentumw, sgd, adam, and momentum. Recommended value: momentum
momentum: Optimizer momentum parameter. Recommended value: 0.9
patience: If, after N training epochs, the precision (mAP for object detection, accuracy for image classification) has not improved over the previous best by more than decay_min_delta, the learning rate decays to one tenth of its current value. The default value of N is 8 (see the sketch after this table). Recommended value: 8
decay_patience: If, after M additional epochs beyond the preceding patience, the precision (mAP for object detection, accuracy for image classification) still has not improved by more than decay_min_delta, training is terminated early. The default value of M is 1. Recommended value: 1
decay_min_delta: Minimum improvement in precision (mAP for object detection, accuracy for image classification) that counts as an increase. An improvement greater than 0.001 is treated as an increase; anything smaller is not. Recommended value: 0.001
image_size: Size of the input image. If this parameter is set to None, the default image size is used. Recommended value: None
lr_warmup_strategy: Warm-up strategy (linear or exponential). Recommended value: linear
num_readers: Number of threads for reading data. Recommended value: 64
fp16: Whether to use FP16 for training. Recommended value: FALSE
max_lr: Maximum learning rate for the dymomentum and dymomentumw optimizers, or when use_lr_schedule is used. Recommended value: 6.4
min_lr: Minimum learning rate for the dymomentum and dymomentumw optimizers, or when use_lr_schedule is used. Recommended value: 0.005
warmup: Proportion of warm-up steps in the total training steps. This parameter is valid when use_lr_schedule is lcd or poly. Recommended value: 0.1
cooldown: Minimum learning rate during warm-up. Recommended value: 0.05
max_mom: Maximum momentum. This parameter is valid for dynamic momentum. Recommended value: 0.98
min_mom: Minimum momentum. This parameter is valid for dynamic momentum. Recommended value: 0.85
use_lars: Whether to use LARS. Recommended value: FALSE
use_nesterov: Whether to use Nesterov momentum. Recommended value: TRUE
preprocess_threads: Number of threads for image preprocessing. Recommended value: 12
use_lr_schedule: Learning rate schedule ('lcd': linear_cosine_decay, 'poly': polynomial_decay). Recommended value: None
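
Taken together, patience, decay_patience, and decay_min_delta describe a reduce-on-plateau policy with early stopping. A simplified sketch of that logic (not the exact ModelArts implementation):

```python
class PlateauPolicy(object):
    """Simplified sketch of the patience/decay_patience/decay_min_delta logic."""

    def __init__(self, patience=8, decay_patience=1, min_delta=0.001):
        self.patience = patience
        self.decay_patience = decay_patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def update(self, metric, lr):
        """metric: this epoch's mAP or accuracy. Returns (new_lr, stop_training)."""
        if metric - self.best > self.min_delta:      # counts as an improvement
            self.best = metric
            self.bad_epochs = 0
            return lr, False
        self.bad_epochs += 1
        if self.bad_epochs >= self.patience + self.decay_patience:
            return lr, True                          # stop training early
        if self.bad_epochs >= self.patience:
            return lr * 0.1, False                   # decay lr to one tenth
        return lr, False

policy = PlateauPolicy()
lr = 0.002
for epoch_map in [0.70, 0.71, 0.71]:   # illustrative per-epoch precision values
    lr, stop = policy.update(epoch_map, lr)
```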

darknet_53

Table 9 Algorithm description

Name: darknet_53
Usage: Image Classification
Engine Type: MXNet, MXNet-1.2.1-python2.7
Precision: 78.56% (top1), 94.43% (top5)
  • top1: a prediction is counted as correct only if the class with the highest predicted probability is the correct class.
  • top5: a prediction is counted as correct if the correct class is among the five classes with the highest predicted probabilities.
Training Dataset: ImageNet, classification of 1,000 image classes
Data Format: shape: [H>=224, W>=224, C>=1]; type: int8 (see the shape check sketch after this table)
Running Parameter: split_spec=0.8; batch_size=4
  For more available running parameters, see Table 10.
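
A quick sketch of checking whether an image satisfies the [H>=224, W>=224, C>=1] constraint before it is added to a training set; the arrays are placeholders.

```python
import numpy as np

def meets_shape_constraint(image):
    """image: numpy array of shape [H, W, C]. Checks [H>=224, W>=224, C>=1]."""
    h, w, c = image.shape
    return h >= 224 and w >= 224 and c >= 1

print(meets_shape_constraint(np.zeros((256, 320, 3), dtype=np.int8)))  # True
print(meets_shape_constraint(np.zeros((200, 320, 3), dtype=np.int8)))  # False
```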

Table 10 Running parameters

split_spec: Split ratio of the training set to the validation set. Default value: 0.8
batch_size: Number of input images used in each parameter update. Default value: 4
lr: Learning rate for parameter updates. Default value: 0.0001
save_frequency: Interval for saving the model, in epochs; the model is saved every N epochs (see the sketch after this table). Default value: 1
num_classes: Total number of image classes used in training. Default value: None
num_epoch: Number of training epochs. Default value: 10
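
save_frequency means the model is exported every N epochs. A generic sketch of that loop; train_one_epoch and save_checkpoint are placeholders, not ModelArts APIs.

```python
def train_one_epoch(epoch):
    """Placeholder for one epoch of training."""
    print("training epoch %d" % epoch)

def save_checkpoint(epoch):
    """Placeholder for exporting the model."""
    print("saving checkpoint at epoch %d" % epoch)

num_epoch = 10       # num_epoch (Table 10)
save_frequency = 1   # save_frequency (Table 10)

for epoch in range(1, num_epoch + 1):
    train_one_epoch(epoch)
    if epoch % save_frequency == 0:   # save the model every N epochs
        save_checkpoint(epoch)
```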

SegNet_VGG_BN_16

Table 11 Algorithm description

Name: SegNet_VGG_BN_16
Usage: Image semantic segmentation
Engine Type: MXNet, MXNet-1.2.1-python2.7
Precision: 89% (pixel acc)
  pixel acc is the ratio of correctly classified pixels to the total number of pixels (see the sketch after this table).
Training Dataset: Camvid
Data Format: shape: [H=360, W=480, C=3]; type: int8
Running Parameter: deploy_on_terminal=False
  For more available running parameters, see Table 12.
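
pixel acc can be computed directly from predicted and ground-truth label maps. A small numpy sketch with random placeholder data, using the 11 classes from Table 12.

```python
import numpy as np

# Illustrative 360x480 label maps with 11 classes (Table 12: num_classes=11).
ground_truth = np.random.randint(0, 11, size=(360, 480))
prediction = np.random.randint(0, 11, size=(360, 480))

# pixel acc: ratio of correctly classified pixels to total pixels.
pixel_acc = np.mean(prediction == ground_truth)
print("pixel acc: %.2f%%" % (100 * pixel_acc))
```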

Table 12 Running parameters

lr: Learning rate for parameter updates. Default value: 0.0001
mom: Momentum of the training network. Default value: 0.9
wd: Weight decay coefficient. Default value: 0.0005
num_classes: Total number of image classes used in training. You do not need to add 1 here. Default value: 11
batch_size: Number of training images used in each parameter update. Default value: 8
num_epoch: Number of training epochs. Default value: 15
save_frequency: Interval for saving the model, in epochs; the model is saved every N epochs. Default value: 1
num_examples: Total number of images used for training, that is, the number of files listed in train.txt. Default value: 2953

ResNet_v2_50

Table 13 Algorithm description

Name: ResNet_v2_50
Usage: Image Classification
Engine Type: MXNet, MXNet-1.2.1-python2.7
Precision: 75.55% (top1), 92.6% (top5)
  • top1: a prediction is counted as correct only if the class with the highest predicted probability is the correct class.
  • top5: a prediction is counted as correct if the correct class is among the five classes with the highest predicted probabilities.
Training Dataset: ImageNet, classification of 1,000 image classes
Data Format: shape: [H>=32, W>=32, C>=1]; type: int8
Running Parameter: split_spec=0.8; batch_size=4
  The available running parameters are the same as those for the darknet_53 algorithm. For details, see Table 10.

ResNet_v1_50

Table 14 Algorithm description

Name: ResNet_v1_50
Usage: Image Classification
Engine Type: TensorFlow, TF-1.8.0-python2.7
Precision: 74.2% (top1), 91.7% (top5)
  • top1: a prediction is counted as correct only if the class with the highest predicted probability is the correct class.
  • top5: a prediction is counted as correct if the correct class is among the five classes with the highest predicted probabilities.
Training Dataset: ImageNet, classification of 1,000 image classes
Data Format: shape: [H>=600, W<=1024, C>=1]; type: int8
Running Parameter: batch_size=32; split_spec=train:0.8,eval:0.2
  The available running parameters are the same as those for the inception_v3 algorithm. For details, see Table 7.

Faster_RCNN_ResNet_v2_101

To achieve satisfactory results with this algorithm, you need to fine-tune its parameters; otherwise, the results may not meet expectations. Alternatively, you can use one of the other algorithms.

Table 15 Algorithm description

Name: Faster_RCNN_ResNet_v2_101
Usage: Object detection and localization
Engine Type: MXNet, MXNet-1.2.1-python2.7
Precision: 80.05% (mAP)
  mAP measures the quality of an object detection algorithm. For each object class, precision and recall are computed at multiple confidence thresholds, which yields a P-R curve; the average precision (AP) for a class is the area under that curve, and mAP is the mean AP over all classes.
Training Dataset: PASCAL VOC2007, PASCAL VOC2012
Data Format: shape: [H, W, C=3]; type: int8
Running Parameter: lr=0.0001; eval_frequence=1
  For more available running parameters, see Table 16.

Table 16 Running parameters

num_classes: Total number of image classes used in training. Add 1 to this value to account for the background class (see the sketch after this table). Default value: None
eval_frequence: Frequency for validating the model. By default, validation is performed after every epoch. Default value: 1
lr: Learning rate. Default value: 0.0001
mom: Momentum of the training network. Default value: 0.9
wd: L2 weight decay coefficient for the parameters. Default value: 0.0005
split_spec: Split ratio of the training set to the validation set. Default value: 0.8
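
Because one class is reserved for the background, num_classes counts the background in addition to the object classes. A one-line illustration with a hypothetical class list.

```python
# Hypothetical object classes for a detection task.
object_classes = ["person", "car", "bicycle"]

# num_classes must include the background class, so add 1.
num_classes = len(object_classes) + 1
print(num_classes)  # 4
```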

Faster_RCNN_ResNet_v1_50

Table 17 Algorithm description

Name: Faster_RCNN_ResNet_v1_50
Usage: Object detection and localization
Engine Type: TensorFlow, TF-1.8.0-python2.7
Precision: 73.6% (mAP)
  mAP measures the quality of an object detection algorithm. For each object class, precision and recall are computed at multiple confidence thresholds, which yields a P-R curve; the average precision (AP) for a class is the area under that curve, and mAP is the mean AP over all classes.
Training Dataset: Pascal VOC2007, detection of 20 classes of objects
Data Format: shape: [H>=600, W<=1024, C>=1]; type: int8
Running Parameter: For details about the parameters and default values, see Table 5.