Description
The alarm module checks ClickHouse read and write operations every 60 seconds. This alarm is generated when the check detects an EIO or EROFS error.
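The following is a minimal sketch of what such a periodic check could look like, not the alarm module's actual implementation. It assumes that EIO and EROFS refer to the standard operating system error numbers, and the function names `classify_io_error` and `probe_directory` are hypothetical.

```python
import errno
import os
import tempfile

# The two error numbers named in the alarm description.
DISK_IO_ERRNOS = {errno.EIO: "EIO", errno.EROFS: "EROFS"}


def classify_io_error(exc):
    """Return "EIO" or "EROFS" for a matching OSError, otherwise None."""
    return DISK_IO_ERRNOS.get(exc.errno)


def probe_directory(directory):
    """Write a small file, flush it to disk, and read it back.

    Returns "EIO" or "EROFS" if the probe fails with one of those errors,
    and None if the probe succeeds. Other errors are re-raised because they
    are outside the conditions this alarm covers.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"io-probe")
            handle.flush()
            os.fsync(handle.fileno())
            handle.seek(0)
            handle.read()
    except OSError as exc:
        name = classify_io_error(exc)
        if name is None:
            raise
        return name
    return None
```

A scheduler could call `probe_directory` on each ClickHouse data directory once every 60 seconds and raise the alarm when the result is not None.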
Attribute
| Alarm ID | Alarm Severity | Auto Clear |
| --- | --- | --- |
| 45428 | Major (default) | No |
Parameters
| Name | Meaning |
| --- | --- |
| Source | Specifies the cluster for which the alarm is generated. |
| ServiceName | Specifies the service for which the alarm is generated. |
| RoleName | Specifies the role for which the alarm is generated. |
| HostName | Specifies the host for which the alarm is generated. |
Impact on the System
- ClickHouse fails to read and write data. The INSERT, SELECT, and CREATE operations on the local tables may be abnormal. Distributed tables are not affected.
- Services are affected, and I/Os fail.
Possible Causes
The disk is aged or has bad sectors.
Procedure
1. On FusionInsight Manager, choose O&M > Alarm > Alarms > ALM-45428 ClickHouse Disk I/O Exception. In Location, check the role name and the IP address of the host for which the alarm is generated.
2. Use PuTTY to log in as user root to the node where the fault occurred.
3. Run the df -h command to check the mount directories and find the disk mounted to the faulty directory.
4. Run the smartctl -a /dev/sd* command to check the disk.
   - If SMART Health Status: OK is displayed, as shown in the following figure, the disk is healthy. In this case, go to 7.
   - If the number following Elements in grown defect list is not 0, as shown in the following figure, the disk may have bad sectors. If SMART Health Status: FAILURE is displayed, the disk is in the sub-health state. In either case, go to 5.
5. Rectify the fault by following the instructions provided in "Hard Disk Mounted to the ClickHouse Partition Directory Is Faulty".
6. After the fault is rectified, manually clear the alarm on FusionInsight Manager and check whether the alarm is generated again during the periodic check.
   - If yes, go to 7.
   - If no, no further action is required.
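The disk check in the steps above can be sketched as a small script that reads the text output of df -h and smartctl -a. This is an illustrative sketch only: the function names are hypothetical, the parsing assumes the default column layout of df and the two smartctl fields quoted in the procedure, and treating a non-zero defect count as a fault even when the health status is OK is an assumption rather than something the procedure states.

```python
import re


def device_for_mount(df_output, mount_point):
    """Return the Filesystem column of the df row mounted on mount_point.

    Assumes mount points contain no spaces.
    """
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 6 and fields[-1] == mount_point:
            return fields[0]
    return None


def next_step(smartctl_output):
    """Map smartctl -a text to the step number the procedure points to."""
    health = re.search(r"SMART Health Status:\s*(\S+)", smartctl_output)
    defects = re.search(r"Elements in grown defect list:\s*(\d+)", smartctl_output)
    if health and health.group(1) == "FAILURE":
        return 5  # disk is in the sub-health state
    if defects and int(defects.group(1)) != 0:
        return 5  # disk may have bad sectors
    if health and health.group(1) == "OK":
        return 7  # disk looks healthy; collect fault information
    return None  # neither field found; inspect the output manually
```

For example, given df -h output that contains a row ending in a hypothetical mount directory such as /srv/BigData/data1, `device_for_mount` returns the device name to pass to smartctl, and `next_step` then indicates whether to continue with disk replacement or log collection.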
Collect the fault information.
7. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
8. Expand the Service drop-down list, and select ClickHouse for the target cluster.
9. Choose the corresponding host from the host list.
10. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click Download.
11. Contact O&M personnel and provide the collected logs.
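The log collection window described above is the alarm generation time minus and plus 1 hour. As a small worked example, the hypothetical helper below computes the Start Date and End Date values, which is convenient when the alarm time is close to midnight.

```python
from datetime import datetime, timedelta


def log_window(alarm_time, margin=timedelta(hours=1)):
    """Return (start, end) covering one margin before and after alarm_time."""
    return alarm_time - margin, alarm_time + margin


# Example: an alarm generated at 00:30 needs logs starting on the previous day.
start, end = log_window(datetime(2024, 1, 1, 0, 30))
print(start, end)
```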
Alarm Clearing
If the alarm has no impact, manually clear the alarm.