Preparing Data on OBS

Scenarios

Before you use the SQL on OBS feature to query OBS data, complete the following preparations:

  1. You have stored the ORC data on OBS.

    For example, an ORC table was created using the Hive or Spark component, and its ORC data is stored on OBS.

    This section assumes two ORC data files, product_info.0 and product_info.1, stored in the demo.db/product_info_orc/ directory of the OBS bucket mybucket. Their original data is shown in Original Data.

  2. If your data files are already on OBS, perform the steps in Obtaining the OBS Path of Original Data and Setting Read Permission.

    This section uses the ORC format as an example to describe how to import data. The method for importing data in CarbonData format is similar.

Original Data

Assume that you have stored the two ORC data files on OBS and their original data is as follows:

Obtaining the OBS Path of Original Data and Setting Read Permission

  1. Log in to the OBS management console.

    Click Service List and choose Object Storage Service to open the OBS management console.

  2. Obtain the OBS path for storing source data files.

    After the source data files are uploaded to an OBS bucket, a globally unique access path is generated. You need to specify the OBS paths of source data files when creating a foreign table.

    For details about how to view an OBS path, see "OBS Console Operation Guide > Managing Objects > Accessing an Object Using Its Object URL" in the Object Storage Service User Guide.

    For example, the OBS paths are as follows:

    https://obs.xxx.com/mybucket/demo.db/product_info_orc/product_info.0
    https://obs.xxx.com/mybucket/demo.db/product_info_orc/product_info.1
    

  3. Grant the user read permission on the OBS bucket.

    The user who runs SQL on OBS queries must have read permission on the OBS bucket where the source data files are located. You can configure the bucket ACL to grant read permission to a specific user.

    For details, see "OBS Console Operation Guide > Permission Control > Configuring a Bucket ACL" in the Object Storage Service User Guide.
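
The object URLs obtained in step 2 contain both the bucket name and the object path, which is useful to know when you later specify the data location and check which bucket needs read permission. The following is a minimal Python sketch, not part of the product, that splits the example URLs into a bucket and an object key. It assumes the path-style layout shown in the example (bucket as the first path segment); obs.xxx.com is the placeholder host from the example, the helper name split_obs_url is hypothetical, and URLs that carry the bucket in the host name would need different handling.

```python
import posixpath
from urllib.parse import urlparse


def split_obs_url(url):
    """Split a path-style object URL into (bucket, object_key).

    Hypothetical helper: assumes the bucket is the first path segment,
    as in https://obs.xxx.com/mybucket/demo.db/... from the example.
    """
    path = urlparse(url).path.lstrip("/")
    bucket, _, key = path.partition("/")
    if not bucket or not key:
        raise ValueError(f"not a path-style object URL: {url}")
    return bucket, key


# Example URLs from step 2 (obs.xxx.com is a placeholder host).
urls = [
    "https://obs.xxx.com/mybucket/demo.db/product_info_orc/product_info.0",
    "https://obs.xxx.com/mybucket/demo.db/product_info_orc/product_info.1",
]

parts = [split_obs_url(u) for u in urls]

# Buckets that the querying user needs read permission on.
buckets = {bucket for bucket, _ in parts}

# Directories that contain the data files.
folders = {posixpath.dirname(key) + "/" for _, key in parts}

print(buckets)
print(folders)
```

For the two example files, both URLs resolve to the same bucket and the same directory, which matches the demo.db/product_info_orc/ directory of mybucket described in Scenarios.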