
ModelArts GA UMN 06052022 from R&D R&D provided a new version of the ModelArts User Manual in May 2022. Depends-On: #11 Reviewed-by: Artem Goncharov <Artem.goncharov@gmail.com>
What Are the Requirements for Training Data When You Create a Predictive Analytics Project in ExeML?
Requirements on Datasets
- Data files cannot be stored in the root directory of an OBS bucket.
- File names in a dataset can contain only letters, digits, hyphens (-), and underscores (_), and the file name extension must be CSV.
- Files must be saved in CSV format. Use newline characters (\n or LF) to separate lines and commas (,) to separate columns. The file content cannot contain Chinese characters. Column content cannot contain special characters such as commas (,) or newline characters, and quotation marks are not supported. It is recommended that column content consist of letters and digits only.
- Every row of the training data must have the same number of columns, and the total number of data records must be greater than or equal to 100.
- The training columns cannot contain data in timestamp format (such as yy-mm-dd or yyyy-mm-dd).
- If you select continuous values for a label column, ensure that the column contains only digits and the training data has at least 25 different values.
- The training data CSV file cannot contain a table header. If it does, the training fails.
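
The requirements above can be checked locally before the data is uploaded. The following is a minimal Python sketch, not part of ModelArts: the function name `check_training_csv`, its parameters, and the error messages are hypothetical. It assumes that the OBS location is given as an object key with `/` as the folder separator, that "quotation marks" means the double quote character, that "only digits" means any value Python can parse as a number, and that the 25 different values are counted in the label column. A header row cannot be detected reliably, so it is only caught indirectly when a continuous label column contains non-numeric text.

```python
import re

# Thresholds taken from the requirements above.
MIN_RECORDS = 100
MIN_DISTINCT_LABELS = 25

NAME_RE = re.compile(r"[A-Za-z0-9_-]+\.csv", re.IGNORECASE)
DATE_RE = re.compile(r"(\d{2}|\d{4})-\d{1,2}-\d{1,2}")  # yy-mm-dd or yyyy-mm-dd
CJK_RE = re.compile(r"[\u4e00-\u9fff]")  # covers common Chinese characters only


def _is_number(cell):
    """Assumption: a 'digits only' cell is one that float() can parse."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def check_training_csv(object_key, text, label_col=-1, continuous_label=False):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # Assumption: a key without a folder part sits in the bucket root.
    folder, _, name = object_key.rpartition("/")
    if not folder:
        problems.append("file is stored in the root directory of the bucket")
    if not NAME_RE.fullmatch(name):
        problems.append("file name must use letters, digits, '-', '_' and end in .csv")

    if "\r" in text:
        problems.append("lines must be separated by LF only")
    if '"' in text:
        problems.append("quotation marks are not supported")
    if CJK_RE.search(text):
        problems.append("file content cannot contain Chinese characters")

    # Plain split on commas: quoting is not supported, so no CSV parser is needed.
    rows = [line.split(",") for line in text.split("\n") if line]
    if len(rows) < MIN_RECORDS:
        problems.append(f"found {len(rows)} records, at least {MIN_RECORDS} required")

    widths = {len(row) for row in rows}
    if len(widths) > 1:
        problems.append("rows have different numbers of columns")
    if any(DATE_RE.fullmatch(cell.strip()) for row in rows for cell in row):
        problems.append("timestamp-formatted data is not allowed")

    # The label checks only make sense when every row has the same width.
    if continuous_label and len(widths) == 1:
        width = widths.pop()
        if not -width <= label_col < width:
            problems.append("label column index is out of range")
        else:
            labels = [row[label_col].strip() for row in rows]
            if not all(_is_number(value) for value in labels):
                problems.append("continuous label column must contain only digits (header row?)")
            if len(set(labels)) < MIN_DISTINCT_LABELS:
                problems.append(f"label column needs at least {MIN_DISTINCT_LABELS} different values")

    return problems
```

For example, `check_training_csv("datasets/train.csv", open("train.csv", encoding="utf-8", newline="").read(), continuous_label=True)` returns a list of messages to fix before creating the project. Passing `newline=""` keeps any CR characters in the file visible to the line separator check.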