Su, Xiaomeng fdd43c552e dli_umn_20240808
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
2024-08-09 11:00:57 +00:00


How Do I Dump Data to OBS and Create an OBS Partitioned Table?

This example dumps car_info data to OBS in the parquet encoding format (currently the only supported format), using the day field as the partition field. For more information, see "File System Sink Stream (Recommended)" in the Data Lake Insight SQL Syntax Reference.

create sink stream car_infos (
  carId string,
  carOwner string,
  average_speed double,
  day string
  ) partitioned by (day)
  with (
    type = "filesystem",
    file.path = "obs://obs-sink/car_infos",
    encode = "parquet",
    ak = "{{myAk}}",
    sk = "{{mySk}}"
);

The data is stored in OBS in the following directory structure: obs://obs-sink/car_infos/day=xx/part-x-x.
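As a rough illustration of how rows might reach the sink defined above, the following sketch writes to car_infos from a source stream. The source stream name car_info_source is hypothetical, and its columns are assumed to match the sink's columns; adapt both to your actual job.

```sql
-- Assumption: car_info_source is a hypothetical source stream that is
-- already defined in the same job and exposes the same four columns.
insert into car_infos
select carId, carOwner, average_speed, day
from car_info_source;
```

Each distinct value of day is expected to land in its own day=xx directory under obs://obs-sink/car_infos.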

After the data is generated, run the following SQL statements to create an OBS partitioned table for subsequent batch processing:

  1. Create an OBS partitioned table.
    create table car_infos (
      carId string,
      carOwner string,
      average_speed double
    )
      partitioned by (day string)
      stored as parquet
      location 'obs://obs-sink/car_infos';
    
  2. Restore partition information from the associated OBS path.
    alter table car_infos recover partitions;
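
Once the partitions have been recovered, the table can be checked and queried. The statements below are a minimal sketch rather than part of the original procedure. The date literal is a placeholder, because the actual format of the day values depends on the data written by the sink.

```sql
-- List the partitions that were restored from the OBS path.
show partitions car_infos;

-- Filter on the partition column so that only the matching day=xx
-- directory needs to be read. '2024-08-08' is a placeholder value.
select carId, carOwner, average_speed
from car_infos
where day = '2024-08-08';
```

If show partitions returns no rows, check that the table location matches the file.path used by the sink stream.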