doc-exports/docs/dli/umn/dli_03_0086.html
Su, Xiaomeng 12dd64efc7 dli_umn_20240430
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
2024-05-15 11:56:22 +00:00

How Do I Merge Small Files?

If SQL execution generates a large number of small files, subsequent jobs and queries on the table take longer to run. In this case, merge the small files into fewer, larger ones.

  1. Set the following configuration item:

    spark.sql.shuffle.partitions = Number of partitions (that is, the number of files to be generated by the merge)

  2. Execute the following SQL statement:

    INSERT OVERWRITE TABLE tablename
    SELECT * FROM tablename DISTRIBUTE BY rand()
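
The following is a minimal sketch of one way to choose the partition count for step 1; it is not part of the product. It assumes you know the approximate total size of the table data and aims for a target file size of 128 MiB, which is an illustrative value only. The helper names merge_partition_count and merge_statements are hypothetical. The SET line uses standard Spark SQL syntax; whether your environment accepts it as a statement or expects the item to be entered in the job settings is an assumption to verify.

```python
import math

# Assumption: 128 MiB is used as an illustrative target size per merged file.
DEFAULT_TARGET_FILE_BYTES = 128 * 1024 * 1024


def merge_partition_count(total_bytes: int,
                          target_file_bytes: int = DEFAULT_TARGET_FILE_BYTES) -> int:
    """Estimate the value for spark.sql.shuffle.partitions.

    Each shuffle partition is expected to produce roughly one output file,
    so the count is the total data size divided by the target file size,
    rounded up, and never less than 1.
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    return max(1, math.ceil(total_bytes / target_file_bytes))


def merge_statements(table: str, partitions: int) -> list[str]:
    """Build the two statements from the steps above for a given table."""
    return [
        f"SET spark.sql.shuffle.partitions = {partitions}",
        f"INSERT OVERWRITE TABLE {table} SELECT * FROM {table} DISTRIBUTE BY rand()",
    ]


if __name__ == "__main__":
    # Example: a table holding about 10 GiB of data.
    partitions = merge_partition_count(10 * 1024 ** 3)
    for statement in merge_statements("tablename", partitions):
        print(statement)
```

DISTRIBUTE BY rand() spreads rows evenly across the partitions, so the merged files come out at roughly equal sizes. A smaller partition count gives fewer, larger files.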