Images or target bounding boxes are analyzed based on image features, such as blur and brightness, and the results are plotted as curves to help you process datasets.
You can also select multiple versions of a dataset to view their curves for comparison and analysis.
Alternatively, you can click a dataset name to go to the dataset page and then click the Data Features tab.
Version: Select a published version of the dataset.
Version: Select the versions to be compared from the drop-down list. You can also select only one version.
Type: Select the type to be analyzed. The value can be all, train, eval, or inference.
Data Feature Metric: Select metrics to be displayed from the drop-down list. For details, see Supported Data Feature Metrics.
The selected versions and metrics are then displayed on the page. The charts help you understand the data distribution for better data processing.
After data feature analysis is complete, you can click Task History on the right of the Data Features tab to view historical analysis tasks and their statuses in the displayed dialog box.
Metric | Description | Explanation
---|---|---
Resolution | Image resolution. The image area is used as the statistical value. | Metric analysis results are used to check whether there are offset points. If an offset point exists, you can resize or delete it.
Aspect Ratio | The aspect ratio is the proportional relationship between an image's width and height. | The chart of this metric is in normal distribution and is generally used to compare the difference between the training set and the dataset used in the real scenario.
Brightness | Brightness is the perception elicited by the luminance of a visual target. A larger value indicates higher image brightness. | The chart of this metric is in normal distribution. You can determine whether the brightness of the entire dataset is high or low based on the distribution center and adjust it based on your application scenario. For example, if the application scenario is at night, the brightness should be lower.
Saturation | Color saturation of an image. A larger value indicates that the colors of the entire image are easier to distinguish. | The chart of this metric is in normal distribution and is generally used to compare the difference between the training set and the dataset used in the real scenario.
Blur Score (Clarity) | Image clarity, calculated using the Laplace operator. A larger value indicates clearer edges and higher clarity. | You can determine whether the clarity meets the requirements based on the application scenario. For example, if data is collected from HD cameras, the clarity must be higher. You can sharpen or blur the dataset and add noise to adjust the clarity. (A computation sketch follows this table.)
Colorfulness | Horizontal coordinate: colorfulness of an image. A larger value indicates richer colors. Vertical coordinate: number of images. | Colorfulness reflects the visual richness of an image and is generally used to compare the difference between the training set and the dataset used in the real scenario.
Bounding Box Number | Horizontal coordinate: number of bounding boxes in an image. Vertical coordinate: number of images. | It is difficult for a model to detect a large number of bounding boxes in an image. Therefore, more images containing many bounding boxes are required for training.
Std of Bounding Boxes Area Per Image | Horizontal coordinate: standard deviation of the bounding box areas in an image. If an image has only one bounding box, the standard deviation is 0. A larger standard deviation indicates greater variation in bounding box size within an image. Vertical coordinate: number of images. | It is difficult for a model to detect a large number of bounding boxes with different sizes in an image. You can add data for training based on scenarios or delete data if such scenarios do not exist.
Aspect Ratio of Bounding Boxes | Horizontal coordinate: aspect ratio of the target bounding boxes. Vertical coordinate: number of bounding boxes in all images. | The chart of this metric is generally in Poisson distribution, which is closely related to application scenarios. It is mainly used to compare the differences between the training set and the validation set. For example, if the bounding boxes in the training set are rectangular, the result will be significantly affected if those in the validation set are close to squares.
Area Ratio of Bounding Boxes | Horizontal coordinate: area ratio of the target bounding boxes, that is, the ratio of the bounding box area to the entire image area. A larger value indicates that the object occupies a larger proportion of the image. Vertical coordinate: number of bounding boxes in all images. | The metric is used to determine the distribution of the anchors used in the model. If the target bounding boxes are large, set the anchors to large values.
Marginalization Value of Bounding Boxes | Horizontal coordinate: marginalization degree, that is, the ratio of the distance between the center of the target bounding box and the center of the image to the total distance of the image. A larger value indicates that the object is closer to the edge. (The total distance of an image is the distance from the image center to the point where a ray starting at the image center and passing through the bounding box center intersects the image border.) Vertical coordinate: number of bounding boxes in all images. | Generally, the chart of this metric is in normal distribution. The metric is used to determine whether an object is at the edge of an image. If part of an object is at the edge of an image, you can add data or choose not to label the object. (See the bounding-box sketch after this table.)
Overlap Score of Bounding Boxes | Horizontal coordinate: overlap degree, that is, the portion of a single bounding box that is overlapped by other bounding boxes. The value ranges from 0 to 1. A larger value indicates that more of the box is overlapped by other boxes. Vertical coordinate: number of bounding boxes in all images. | The metric is used to determine the overlap degree of objects to be detected. Overlapped objects are difficult to detect. You can add data or choose not to label some objects based on your needs.
Brightness of Bounding Boxes | Horizontal coordinate: brightness of the image within the target bounding box. A larger value indicates a brighter image. Vertical coordinate: number of bounding boxes in all images. | Generally, the chart of this metric is in normal distribution. The metric is used to determine the brightness of the objects to be detected. In some special scenarios, the brightness of an object is low and may not meet the requirements.
Blur Score of Bounding Boxes (Clarity of Bounding Boxes) | Horizontal coordinate: clarity of the image within the target bounding box. A larger value indicates higher clarity. Vertical coordinate: number of bounding boxes in all images. | The metric is used to determine whether the objects to be detected are blurred. For example, a moving object may become blurred during collection and its data needs to be collected again.
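The exact formulas behind these metrics are not published, but the image-level ones can be approximated with standard tools. The sketch below is a minimal Python/OpenCV example, assuming 8-bit BGR input and common approximations: mean HSV value for brightness, mean HSV saturation for saturation, variance of the Laplacian for the blur score, and the Hasler–Süsstrunk formula for colorfulness. The file name `sample.jpg` is only a placeholder.

```python
import cv2
import numpy as np

def image_feature_metrics(image_path):
    """Approximate per-image metrics: resolution, aspect ratio, brightness,
    saturation, blur score, and colorfulness (common approximations, not the
    service's exact formulas)."""
    img = cv2.imread(image_path)  # BGR, uint8
    if img is None:
        raise ValueError(f"Cannot read image: {image_path}")

    h, w = img.shape[:2]
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Hasler & Suesstrunk colorfulness: opponent channels rg and yb.
    b, g, r = cv2.split(img.astype(np.float32))
    rg, yb = r - g, 0.5 * (r + g) - b
    colorfulness = np.sqrt(rg.std() ** 2 + yb.std() ** 2) \
        + 0.3 * np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)

    return {
        "resolution": h * w,                      # area used as the statistical value
        "aspect_ratio": w / h,                    # width-to-height ratio
        "brightness": float(hsv[..., 2].mean()),  # mean V channel, 0-255
        "saturation": float(hsv[..., 1].mean()),  # mean S channel, 0-255
        # Blur score: variance of the Laplacian; larger = sharper edges.
        "blur_score": float(cv2.Laplacian(gray, cv2.CV_64F).var()),
        "colorfulness": float(colorfulness),
    }

print(image_feature_metrics("sample.jpg"))
```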
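The bounding-box metrics are simple geometry over the label data. The following sketch is likewise an assumption-level approximation rather than the service's implementation: it computes the area ratio, the per-image standard deviation of box areas, the marginalization value as described above (distance from the image centre to the box centre divided by the distance from the image centre to the border along the same ray), and a pixel-mask estimate of the overlap score. Boxes are assumed to be (xmin, ymin, xmax, ymax) in pixels.

```python
import numpy as np

def area_ratio(box, img_w, img_h):
    """Ratio of bounding-box area to image area; box = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return (xmax - xmin) * (ymax - ymin) / (img_w * img_h)

def box_area_std(boxes):
    """Standard deviation of box areas in one image (0 if there is a single box)."""
    areas = [(x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in boxes]
    return float(np.std(areas)) if areas else 0.0

def marginalization(box, img_w, img_h):
    """Distance from the image centre to the box centre, divided by the distance
    from the image centre to the border along the same ray (0 = centred, ~1 = at the edge)."""
    xmin, ymin, xmax, ymax = box
    cx, cy = img_w / 2, img_h / 2                    # image centre
    bx, by = (xmin + xmax) / 2, (ymin + ymax) / 2    # box centre
    dx, dy = bx - cx, by - cy
    if dx == 0 and dy == 0:
        return 0.0
    # Smallest positive t at which (cx + t*dx, cy + t*dy) reaches the image border;
    # the box centre sits at t = 1, so the ratio of distances is 1 / t_border.
    ts = []
    if dx != 0:
        ts.append(((img_w if dx > 0 else 0) - cx) / dx)
    if dy != 0:
        ts.append(((img_h if dy > 0 else 0) - cy) / dy)
    return 1.0 / min(ts)

def overlap_score(box, other_boxes):
    """Fraction of `box` covered by the union of `other_boxes`, estimated on a pixel mask."""
    xmin, ymin, xmax, ymax = (int(round(v)) for v in box)
    mask = np.zeros((ymax - ymin, xmax - xmin), dtype=bool)
    for ox0, oy0, ox1, oy1 in other_boxes:
        x0, y0 = max(int(ox0), xmin), max(int(oy0), ymin)
        x1, y1 = min(int(ox1), xmax), min(int(oy1), ymax)
        if x1 > x0 and y1 > y0:
            mask[y0 - ymin:y1 - ymin, x0 - xmin:x1 - xmin] = True
    return float(mask.mean()) if mask.size else 0.0

# Example: a 40x40 box near the right edge of a 200x100 image gives marginalization 0.7.
print(marginalization((150, 40, 190, 80), 200, 100))
```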