Specifications for Importing a Manifest File

The manifest file defines the mapping between labeled objects and their content. In manifest import mode, a dataset is imported using a manifest file, which can be read from OBS. When importing a manifest file from OBS, ensure that you have permission to access the directory where the manifest file is stored.

Compiling a manifest file involves a number of requirements. Manifest import can be used to import new data from OBS, but it is generally used to migrate ModelArts data between regions or accounts. If you have labeled data in one region using ModelArts, you can obtain the manifest file of the published dataset from its output path and then use that file to import the dataset into ModelArts in another region or under another account. The imported data carries its labeling information and does not need to be labeled again, which improves development efficiency.

A manifest file that contains both the original file information and the labeling information can be used in labeling, training, and inference scenarios. A manifest file that contains only the original file information can be used in inference scenarios or to generate an unlabeled dataset. The manifest file must meet the following requirements:

Image Classification

{
    "source":"s3://path/to/image1.jpg",
    "usage":"TRAIN",
    "hard":"true",
    "hard-coefficient":0.8,
    "id":"0162005993f8065ef47eefb59d1e4970",
    "annotation": [
        {
            "type": "modelarts/image_classification",
            "name": "cat",
            "property": {
                "color":"white",
                "kind":"Persian cat"            
            },
            "hard":"true",
            "hard-coefficient":0.8,
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"        
        },
        {
            "type": "modelarts/image_classification",
            "name":"animal",
            "annotated-by":"modelarts/active-learning",
            "confidence": 0.8,
            "creation-time":"2019-01-23 11:30:30"        
        }],
    "inference-loc":"/path/to/inference-output"
}
Table 1 Parameters

source (mandatory): URI of the object to be labeled. For details about data source types and examples, see Table 2.

usage (optional): Left blank by default. Possible values are as follows:

  • TRAIN: The object is used for training.
  • EVAL: The object is used for evaluation.
  • TEST: The object is used for testing.
  • INFERENCE: The object is used for inference.

If the value is left blank, you decide how to use the object.

id (optional): Sample ID exported from the system. You do not need to set this parameter when importing a sample.

annotation (optional): If left blank, the object is not labeled. The value of annotation is a list of objects. For details about their parameters, see Table 3.

inference-loc (optional): Available when the file is generated by an inference service, indicating the location of the inference result file.

Table 2 Data source types

  • OBS: "source":"s3://path-to-jpg"
  • Content: "source":"content://I love machine learning"

Table 3 annotation objects

type (mandatory): Label type. Possible values are as follows:

  • image_classification: image classification
  • text_classification: text classification
  • text_entity: named entity recognition
  • object_detection: object detection
  • audio_classification: sound classification
  • audio_content: speech labeling
  • audio_segmentation: speech paragraph labeling

name (mandatory for classification types, optional for other types): Label name. This example uses the image classification type.

id (mandatory for triplets, optional for other types): Label ID. The entity label ID of a triplet is in E+number format, for example, E1 and E2. The relationship label ID of a triplet is in R+number format, for example, R1 and R2.

property (optional): Labeling property. In this example, there are two properties: color and kind.

hard (optional): Whether the sample is a hard example. true indicates a hard example; false indicates it is not.

annotated-by (optional): Labeling source. The default value is human, indicating manual labeling.

creation-time (optional): Time when the labeling information was written, not the time when the manifest file was generated.

confidence (optional): Confidence score of machine labeling. The value ranges from 0 to 1.
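
Assuming that each sample in a manifest file is stored as one JSON object per line, the following Python sketch builds an image classification entry using the fields from Table 1 and Table 3 and appends it to a manifest file. The file name dataset.manifest, the OBS path, and the label value are placeholders.

import json
from datetime import datetime

def make_classification_sample(obs_uri, label, usage="TRAIN"):
    """Build one manifest entry (fields from Table 1 and Table 3)."""
    return {
        "source": obs_uri,                       # OBS URI of the image
        "usage": usage,                          # TRAIN / EVAL / TEST / INFERENCE
        "annotation": [
            {
                "type": "modelarts/image_classification",
                "name": label,                   # label name, mandatory for classification
                "annotated-by": "human",
                "creation-time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            }
        ],
    }

# Append one sample per line to a manifest file (illustrative path and label).
with open("dataset.manifest", "a", encoding="utf-8") as f:
    sample = make_classification_sample("s3://path/to/image1.jpg", "cat")
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")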

Image Segmentation

{
    "annotation": [{
        "annotation-format": "PASCAL VOC",
        "type": "modelarts/image_segmentation",
        "annotation-loc": "s3://path/to/annotation/image1.xml",
        "creation-time": "2020-12-16 21:36:27",
        "annotated-by": "human"
    }],
    "usage": "train",
    "source": "s3://path/to/image1.jpg",
    "id": "16d196c19bf61994d7deccafa435398c",
    "sample-type": 0
}
Table 4 PASCAL VOC format parameters

folder (mandatory): Directory where the data source is located.

filename (mandatory): Name of the file to be labeled.

size (mandatory): Image pixel dimensions.

  • width: image width. This parameter is mandatory.
  • height: image height. This parameter is mandatory.
  • depth: number of image channels. This parameter is mandatory.

segmented (mandatory): Whether the image is segmented.

mask_source (optional): Path of the segmentation mask.

object (mandatory): Object detection information. One object{} block is generated for each labeled object.

  • name: type of the labeled content. This parameter is mandatory.
  • pose: shooting angle of the labeled content. This parameter is mandatory.
  • truncated: whether the labeled content is truncated (0 indicates that it is not truncated). This parameter is mandatory.
  • occluded: whether the labeled content is occluded (0 indicates that it is not occluded). This parameter is mandatory.
  • difficult: whether the labeled object is difficult to identify (0 indicates that it is easy to identify). This parameter is mandatory.
  • confidence: confidence score of the labeled object, ranging from 0 to 1. This parameter is optional.
  • bndbox: bounding box type. This parameter is mandatory. For details about the possible values, see Table 5.
  • mask_color: label color, represented as an RGB value. This parameter is mandatory.
Table 5 Bounding box types

polygon (Polygon): coordinates of points

  <x1>100</x1>
  <y1>100</y1>
  <x2>200</x2>
  <y2>100</y2>
  <x3>250</x3>
  <y3>150</y3>
  <x4>200</x4>
  <y4>200</y4>
  <x5>100</x5>
  <y5>200</y5>
  <x6>50</x6>
  <y6>150</y6>
  <x7>100</x7>
  <y7>100</y7>

Example:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<annotation>
    <folder>NA</folder>
    <filename>image_0006.jpg</filename>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>230</width>
        <height>300</height>
        <depth>3</depth>
    </size>
    <segmented>1</segmented>
    <mask_source>obs://xianao/out/dataset-8153-Jmf5ylLjRmSacj9KevS/annotation/V001/segmentationClassRaw/image_0006.png</mask_source>
    <object>
        <name>bike</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <mask_color>193,243,53</mask_color>
        <occluded>0</occluded>
        <polygon>
            <x1>71</x1>
            <y1>48</y1>
            <x2>75</x2>
            <y2>73</y2>
            <x3>49</x3>
            <y3>69</y3>
            <x4>68</x4>
            <y4>92</y4>
            <x5>90</x5>
            <y5>101</y5>
            <x6>45</x6>
            <y6>110</y6>
            <x7>71</x7>
            <y7>48</y7>
        </polygon>
    </object>
</annotation>
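
As a reading aid, the following Python sketch parses a PASCAL VOC segmentation annotation like the one above with the standard xml.etree.ElementTree module, collecting the mask path, label color, and polygon points of each object. The local file name image_0006.xml is an assumption; in practice the file is obtained from the path in annotation-loc.

import xml.etree.ElementTree as ET

# Parse a PASCAL VOC segmentation annotation (illustrative local file name).
root = ET.parse("image_0006.xml").getroot()

mask_source = root.findtext("mask_source")
for obj in root.findall("object"):
    name = obj.findtext("name")
    mask_color = obj.findtext("mask_color")          # "R,G,B"
    polygon = obj.find("polygon")
    # Collect (x, y) pairs in order: x1/y1, x2/y2, ...
    points = []
    i = 1
    while polygon is not None and polygon.find(f"x{i}") is not None:
        x = int(polygon.findtext(f"x{i}"))
        y = int(polygon.findtext(f"y{i}"))
        points.append((x, y))
        i += 1
    print(name, mask_color, points, mask_source)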

Text Classification

{
    "source": "content://I like this product ",
    "id":"XGDVGS",
    "annotation": [
        {
            "type": "modelarts/text_classification",
            "name": " positive",
            "annotated-by": "human",
            "creation-time": "2019-01-23 11:30:30"        
        } ]
}

The text following content:// in source is the text to be labeled. The other parameters are the same as those described in Image Classification. For details, see Table 1.

Named Entity Recognition

{
    "source":"content://Michael Jordan is the most famous basketball player in the world.",
    "usage":"TRAIN",
    "annotation":[
        {
            "type":"modelarts/text_entity",
            "name":"Person",
            "property":{
                "@modelarts:start_index":0,
                "@modelarts:end_index":14
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/text_entity",
            "name":"Category",
            "property":{
                "@modelarts:start_index":34,
                "@modelarts:end_index":44
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        }
    ]
}

The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.

Table 6 describes the property parameters. For example, if you want to extract Michael from "source":"content://Michael Jordan", the value of start_index is 0 and that of end_index is 7.

Table 6 property parameters

@modelarts:start_index (Integer): Start position of the text. The value starts from 0, and the character at start_index is included.

@modelarts:end_index (Integer): End position of the text. The character at end_index is not included.
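
The index semantics can be checked with a short Python sketch: strip the content:// prefix from source and slice the text with start_index (inclusive) and end_index (exclusive). The sample below reuses the named entity example above.

# Extract the labeled entity from a text sample using the index semantics
# in Table 6: start_index is inclusive, end_index is exclusive.
sample = {
    "source": "content://Michael Jordan is the most famous basketball player in the world.",
    "annotation": [
        {
            "type": "modelarts/text_entity",
            "name": "Person",
            "property": {"@modelarts:start_index": 0, "@modelarts:end_index": 14},
        }
    ],
}

text = sample["source"][len("content://"):]      # strip the content:// prefix
for ann in sample["annotation"]:
    prop = ann["property"]
    span = text[prop["@modelarts:start_index"]:prop["@modelarts:end_index"]]
    print(ann["name"], "->", span)                # Person -> Michael Jordan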

Text Triplet

{
    "source":"content://"Three Body" is a series of long science fiction novels created by Liu Cix.",
    "usage":"TRAIN",
    "annotation":[
        {
            "type":"modelarts/text_entity",
            "name":"Person",
            "id":"E1",
            "property":{
                "@modelarts:start_index":67,
                "@modelarts:end_index":74
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/text_entity",
            "name":"Book",
            "id":"E2",
            "property":{
                "@modelarts:start_index":0,
                "@modelarts:end_index":12
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/text_triplet",
            "name":"Author",
            "id":"R1",
            "property":{
                "@modelarts:from":"E1",
                "@modelarts:to":"E2"
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/text_triplet",
            "name":"Works",
            "id":"R2",
            "property":{
                "@modelarts:from":"E2",
                "@modelarts:to":"E1"
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        }
    ]
}

The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.

Table 7 describes the property parameters. @modelarts:start_index and @modelarts:end_index have the same meaning as in named entity recognition. In this example, with source set to content://"Three Body" is a series of long science fiction novels created by Liu Cix., Liu Cix is a Person entity, "Three Body" is a Book entity, the person is the author of the book, and the book is a work of the person.

Table 7 property parameters

@modelarts:start_index (Integer): Start position of the entity text. The value starts from 0, and the character at start_index is included.

@modelarts:end_index (Integer): End position of the entity text. The character at end_index is not included.

@modelarts:from (String): ID of the entity at which the triplet relationship starts.

@modelarts:to (String): ID of the entity to which the triplet relationship points.
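
The following Python sketch illustrates how the triplet annotations fit together: entity IDs (E1, E2) are mapped to their text spans, and each text_triplet annotation is resolved through @modelarts:from and @modelarts:to. The file name triplet.manifest is a placeholder, and the sample is assumed to be stored as one JSON object per line.

import json

# Resolve the triplet relationships in the example above.
with open("triplet.manifest", encoding="utf-8") as f:       # illustrative file name
    sample = json.loads(f.readline())

text = sample["source"][len("content://"):]
entities = {}
relations = []
for ann in sample["annotation"]:
    prop = ann["property"]
    if ann["type"] == "modelarts/text_entity":
        entities[ann["id"]] = text[prop["@modelarts:start_index"]:prop["@modelarts:end_index"]]
    elif ann["type"] == "modelarts/text_triplet":
        relations.append((prop["@modelarts:from"], ann["name"], prop["@modelarts:to"]))

for src, rel, dst in relations:
    print(entities[src], f"--{rel}-->", entities[dst])       # e.g. Liu Cix --Author--> "Three Body"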

Object Detection

{
    "source":"s3://path/to/image1.jpg",
    "usage":"TRAIN",
    "hard":"true",
    "hard-coefficient":0.8,
    "annotation": [
        {
            "type":"modelarts/object_detection",
            "annotation-loc": "s3://path/to/annotation1.xml",
            "annotation-format":"PASCAL VOC",
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"                
        }]
}
Table 8 PASCAL VOC format parameters

folder (mandatory): Directory where the data source is located.

filename (mandatory): Name of the file to be labeled.

size (mandatory): Image pixel dimensions.

  • width: image width. This parameter is mandatory.
  • height: image height. This parameter is mandatory.
  • depth: number of image channels. This parameter is mandatory.

segmented (mandatory): Whether the image is segmented.

object (mandatory): Object detection information. One object{} block is generated for each labeled object.

  • name: type of the labeled content. This parameter is mandatory.
  • pose: shooting angle of the labeled content. This parameter is mandatory.
  • truncated: whether the labeled content is truncated (0 indicates that it is not truncated). This parameter is mandatory.
  • occluded: whether the labeled content is occluded (0 indicates that it is not occluded). This parameter is mandatory.
  • difficult: whether the labeled object is difficult to identify (0 indicates that it is easy to identify). This parameter is mandatory.
  • confidence: confidence score of the labeled object, ranging from 0 to 1. This parameter is optional.
  • bndbox: bounding box type. This parameter is mandatory. For details about the possible values, see Table 9.
Table 9 Bounding box types

point (Point): coordinates of a point

  <x>100</x>
  <y>100</y>

line (Line): coordinates of points

  <x1>100</x1>
  <y1>100</y1>
  <x2>200</x2>
  <y2>200</y2>

bndbox (Rectangle): coordinates of the upper left and lower right points

  <xmin>100</xmin>
  <ymin>100</ymin>
  <xmax>200</xmax>
  <ymax>200</ymax>

polygon (Polygon): coordinates of points

  <x1>100</x1>
  <y1>100</y1>
  <x2>200</x2>
  <y2>100</y2>
  <x3>250</x3>
  <y3>150</y3>
  <x4>200</x4>
  <y4>200</y4>
  <x5>100</x5>
  <y5>200</y5>
  <x6>50</x6>
  <y6>150</y6>

circle (Circle): center coordinates and radius

  <cx>100</cx>
  <cy>100</cy>
  <r>50</r>

Example:
<annotation>
   <folder>test_data</folder>
   <filename>260730932.jpg</filename>
   <size>
       <width>767</width>
       <height>959</height>
       <depth>3</depth>
   </size>
   <segmented>0</segmented>
   <object>
       <name>point</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <point>
           <x1>456</x1>
           <y1>596</y1>
       </point>
   </object>
   <object>
       <name>line</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <line>
           <x1>133</x1>
           <y1>651</y1>
           <x2>229</x2>
           <y2>561</y2>
       </line>
   </object>
   <object>
       <name>bag</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <bndbox>
           <xmin>108</xmin>
           <ymin>101</ymin>
           <xmax>251</xmax>
           <ymax>238</ymax>
       </bndbox>
   </object>
   <object>
       <name>boots</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <hard-coefficient>0.8</hard-coefficient>
       <polygon>
           <x1>373</x1>
           <y1>264</y1>
           <x2>500</x2>
           <y2>198</y2>
           <x3>437</x3>
           <y3>76</y3>
           <x4>310</x4>
           <y4>142</y4>
       </polygon>
   </object>
   <object>
       <name>circle</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <circle>
           <cx>405</cx>
           <cy>170</cy>
           <r>100</r>
       </circle>
   </object>
</annotation>
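
For reference, a Python sketch using xml.etree.ElementTree can read an annotation file like the example above and extract whichever shape from Table 9 each object uses. The local file name annotation1.xml is a placeholder for the file referenced by annotation-loc.

import xml.etree.ElementTree as ET

# Extract each object's shape (bndbox, point, line, polygon, or circle).
SHAPES = ("bndbox", "point", "line", "polygon", "circle")

root = ET.parse("annotation1.xml").getroot()     # illustrative local file name
for obj in root.findall("object"):
    name = obj.findtext("name")
    for shape in SHAPES:
        node = obj.find(shape)
        if node is not None:
            coords = {child.tag: float(child.text) for child in node}
            print(name, shape, coords)
            break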

Sound Classification

{
"source":
"s3://path/to/pets.wav", 
    "annotation": [
        {
            "type": "modelarts/audio_classification",
            "name":"cat",    
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        } 
    ]
}

The parameters such as source, usage, and annotation are the same as those described in Image Classification. For details, see Table 1.

Speech Labeling

{
    "source":"s3://path/to/audio1.wav",
    "annotation":[
        {
            "type":"modelarts/audio_content",
            "property":{
                "@modelarts:content":"Today is a good day."
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        }
    ]
}

Speech Paragraph Labeling

{
    "source":"s3://path/to/audio1.wav",
    "usage":"TRAIN",
    "annotation":[
        {
            "type":"modelarts/audio_segmentation",
            "property":{
                "@modelarts:start_time":"00:01:10.123",
                "@modelarts:end_time":"00:01:15.456",
                "@modelarts:source":"Tom",
                "@modelarts:content":"How are you?"
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        },
        {
            "type":"modelarts/audio_segmentation",
            "property":{
                "@modelarts:start_time":"00:01:22.754",
                "@modelarts:end_time":"00:01:24.145",
                "@modelarts:source":"Jerry",
                "@modelarts:content":"I'm fine, thank you."
            },
            "annotated-by":"human",
            "creation-time":"2019-01-23 11:30:30"
        }
    ]
}
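
The @modelarts:start_time and @modelarts:end_time values use the hh:mm:ss.SSS format. A small Python helper, shown below as a sketch, converts them to seconds, for example to compute the length of a labeled segment.

# Convert the @modelarts:start_time / @modelarts:end_time strings
# ("hh:mm:ss.SSS") of a speech segment into seconds.
def to_seconds(timestamp: str) -> float:
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

start = to_seconds("00:01:10.123")
end = to_seconds("00:01:15.456")
print(f"segment length: {end - start:.3f} s")    # segment length: 5.333 s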

Video Labeling

{
	"annotation": [{
		"annotation-format": "PASCAL VOC",
		"type": "modelarts/object_detection",
		"annotation-loc": "s3://path/to/annotation1_t1.473722.xml",
		"creation-time": "2020-10-09 14:08:24",
		"annotated-by": "human"
	}],
	"usage": "train",
	"property": {
		"@modelarts:parent_duration": 8,
		"@modelarts:parent_source": "s3://path/to/annotation1.mp4",
		"@modelarts:time_in_video": 1.473722
	},
	"source": "s3://input/path/to/annotation1_t1.473722.jpg",
	"id": "43d88677c1e9a971eeb692a80534b5d5",
	"sample-type": 0
}
Table 11 property parameters

@modelarts:parent_duration (Double): Duration of the labeled video, in seconds.

@modelarts:time_in_video (Double): Timestamp of the labeled video frame, in seconds.

@modelarts:parent_source (String): OBS path of the labeled video.
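
Assuming the manifest stores one JSON object per line, the following Python sketch groups labeled frame samples by the video they were extracted from, using @modelarts:parent_source and @modelarts:time_in_video from Table 11. The file name video.manifest is a placeholder.

import json
from collections import defaultdict

# Group labeled frame samples by their parent video (illustrative manifest name).
frames_by_video = defaultdict(list)
with open("video.manifest", encoding="utf-8") as f:
    for line in f:
        sample = json.loads(line)
        prop = sample["property"]
        frames_by_video[prop["@modelarts:parent_source"]].append(
            (prop["@modelarts:time_in_video"], sample["source"])
        )

for video, frames in frames_by_video.items():
    for t, frame in sorted(frames):
        print(f"{video} @ {t:.6f}s -> {frame}")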

Table 12 PASCAL VOC format parameters

folder (mandatory): Directory where the data source is located.

filename (mandatory): Name of the file to be labeled.

size (mandatory): Image pixel dimensions.

  • width: image width. This parameter is mandatory.
  • height: image height. This parameter is mandatory.
  • depth: number of image channels. This parameter is mandatory.

segmented (mandatory): Whether the image is segmented.

object (mandatory): Object detection information. One object{} block is generated for each labeled object.

  • name: type of the labeled content. This parameter is mandatory.
  • pose: shooting angle of the labeled content. This parameter is mandatory.
  • truncated: whether the labeled content is truncated (0 indicates that it is not truncated). This parameter is mandatory.
  • occluded: whether the labeled content is occluded (0 indicates that it is not occluded). This parameter is mandatory.
  • difficult: whether the labeled object is difficult to identify (0 indicates that it is easy to identify). This parameter is mandatory.
  • confidence: confidence score of the labeled object, ranging from 0 to 1. This parameter is optional.
  • bndbox: bounding box type. This parameter is mandatory. For details about the possible values, see Table 13.
Table 13 Bounding box types

point (Point): coordinates of a point

  <x>100</x>
  <y>100</y>

line (Line): coordinates of points

  <x1>100</x1>
  <y1>100</y1>
  <x2>200</x2>
  <y2>200</y2>

bndbox (Rectangle): coordinates of the upper left and lower right points

  <xmin>100</xmin>
  <ymin>100</ymin>
  <xmax>200</xmax>
  <ymax>200</ymax>

polygon (Polygon): coordinates of points

  <x1>100</x1>
  <y1>100</y1>
  <x2>200</x2>
  <y2>100</y2>
  <x3>250</x3>
  <y3>150</y3>
  <x4>200</x4>
  <y4>200</y4>
  <x5>100</x5>
  <y5>200</y5>
  <x6>50</x6>
  <y6>150</y6>

circle (Circle): center coordinates and radius

  <cx>100</cx>
  <cy>100</cy>
  <r>50</r>

Example:
<annotation>
   <folder>test_data</folder>
   <filename>260730932_t1.473722.jpg</filename>
   <size>
       <width>767</width>
       <height>959</height>
       <depth>3</depth>
   </size>
   <segmented>0</segmented>
   <object>
       <name>point</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <point>
           <x1>456</x1>
           <y1>596</y1>
       </point>
   </object>
   <object>
       <name>line</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <line>
           <x1>133</x1>
           <y1>651</y1>
           <x2>229</x2>
           <y2>561</y2>
       </line>
   </object>
   <object>
       <name>bag</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <bndbox>
           <xmin>108</xmin>
           <ymin>101</ymin>
           <xmax>251</xmax>
           <ymax>238</ymax>
       </bndbox>
   </object>
   <object>
       <name>boots</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <hard-coefficient>0.8</hard-coefficient>
       <polygon>
           <x1>373</x1>
           <y1>264</y1>
           <x2>500</x2>
           <y2>198</y2>
           <x3>437</x3>
           <y3>76</y3>
           <x4>310</x4>
           <y4>142</y4>
       </polygon>
   </object>
   <object>
       <name>circle</name>
       <pose>Unspecified</pose>
       <truncated>0</truncated>
       <occluded>0</occluded>
       <difficult>0</difficult>
       <circle>
           <cx>405</cx>
           <cy>170</cy>
           <r>100</r>
       </circle>
   </object>
</annotation>