Improving Read Performance Using Client Metadata Cache
Scenario
Improve HDFS read performance by having the client cache the metadata for block locations.
Procedure
Navigation path for setting parameters:
On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > HDFS > Configurations, select All Configurations, and enter the parameter name in the search box.
Parameter | Description | Default Value |
---|---|---|
dfs.client.metadata.cache.enabled | Specifies whether the client caches the metadata for block locations. To enable the cache, set this parameter to true and also set the dfs.client.metadata.cache.pattern parameter. | false |
dfs.client.metadata.cache.pattern | Regular expression that matches the paths of the files to be cached. Only the block location metadata of matching files is cached, and it stays cached until it expires. This parameter takes effect only when dfs.client.metadata.cache.enabled is set to true. Example: /test.* matches all files whose paths start with /test, so the metadata of those files is cached. | - |
dfs.client.metadata.cache.expiry.sec | Duration for which metadata is cached. A cache entry becomes invalid once it has been cached for longer than this duration, even if it is used frequently during that time. The suffixes s, m, and h indicate seconds, minutes, and hours, respectively. NOTE: If this parameter is set to 0s, the cache function is disabled. | 60s |
dfs.client.metadata.cache.max.entries | Maximum number of non-expired entries that can be cached at a time. | 65536 |
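
The way the four parameters interact can be illustrated with a small Python model. This is only a sketch and not HDFS client code: the names BlockLocationCache, parse_duration, and fetch are hypothetical, the pattern is assumed to be matched against the whole path, and dropping the oldest entry when the limit is exceeded is an assumed eviction policy that the table above does not specify.

```python
import re
import time
from collections import OrderedDict


def parse_duration(value):
    """Convert a string such as '60s', '5m' or '1h' to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]


class BlockLocationCache:
    """Toy model of a client-side block location cache (hypothetical)."""

    def __init__(self, enabled, pattern, expiry, max_entries,
                 clock=time.monotonic):
        self.expiry_sec = parse_duration(expiry)
        # Caching needs the switch on, a pattern, and a non-zero expiry.
        self.enabled = bool(enabled and pattern and self.expiry_sec > 0)
        self.pattern = re.compile(pattern) if pattern else None
        self.max_entries = max_entries
        self.clock = clock
        self.entries = OrderedDict()  # path -> (insert_time, locations)

    def get(self, path, fetch):
        """Return locations for path, calling fetch(path) on a cache miss."""
        # Assumption: the pattern must match the whole path.
        if not self.enabled or not self.pattern.fullmatch(path):
            return fetch(path)
        now = self.clock()
        hit = self.entries.get(path)
        # Age is counted from insertion, so frequent reads do not extend it.
        if hit is not None and now - hit[0] <= self.expiry_sec:
            return hit[1]
        locations = fetch(path)
        self.entries.pop(path, None)
        self.entries[path] = (now, locations)
        # Assumption: the oldest entry is dropped once the limit is exceeded.
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)
        return locations
```

In this model, a cache created with BlockLocationCache(True, "/test.*", "60s", 65536) serves repeated lookups of /test/a from memory for up to 60 seconds, while a path such as /tmp/b always goes to fetch because it does not match the pattern. Passing "0s" as the expiry leaves the cache disabled, mirroring the note for dfs.client.metadata.cache.expiry.sec.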