This section describes how you can configure atop and kdump on Linux ECSs for performance analysis.
The method for configuring atop varies with the OS version.
atop
kdump
atop is a performance monitor for Linux that reports, at regular intervals, the activity of all processes and the resources they consume. It shows system-level activity of the CPU, memory, disks, and network, together with the resource usage of each process. It also logs system and process activity daily and saves the logs to disk for long-term analysis.
# wget https://www.atoptool.nl/download/atop-2.6.0-1.el8.x86_64.rpm
Modify the following parameters, save the modification, and exit:
LOGINTERVAL=15
LOGGENERATIONS=28
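The same change can be made without an interactive editor. The following is a minimal sketch, assuming the configuration file uses plain KEY=VALUE lines as shown above. The helper name set_param is hypothetical, and the demonstration runs on a scratch file with made-up starting values rather than on the live configuration.

```shell
#!/bin/sh
# Sketch: set KEY=VALUE in a defaults file, replacing an existing
# assignment or appending a new one. The file path is supplied by the caller.
set_param() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        # Replace the existing assignment in place (GNU sed).
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Demonstration on a scratch file; the starting values are made up.
conf=$(mktemp)
printf 'LOGINTERVAL=600\nLOGGENERATIONS=7\n' > "$conf"
set_param "$conf" LOGINTERVAL 15
set_param "$conf" LOGGENERATIONS 28
cat "$conf"
```

To apply it for real, pass the path of the atop configuration file on the ECS instead of the scratch file.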
# systemctl status atop
atop.service - Atop advanced performance monitor
   Loaded: loaded (/usr/lib/systemd/system/atop.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2021-06-19 14:46:10 CST; 8s ago
     Docs: man:atop(1)
  Process: 6391 ExecStartPost=/usr/bin/find ${LOGPATH} -name atop_* -mtime +${LOGGENERATIONS} -exec rm -v {} ; (code=exited, status=0/SUCCESS)
  Process: 6388 ExecStartPre=/bin/sh -c test -n "$LOGGENERATIONS" -a "$LOGGENERATIONS" -eq "$LOGGENERATIONS" (code=exited, status=0/SUCCESS)
  Process: 6387 ExecStartPre=/bin/sh -c test -n "$LOGINTERVAL" -a "$LOGINTERVAL" -eq "$LOGINTERVAL" (code=exited, status=0/SUCCESS)
 Main PID: 6390 (atop)
    Tasks: 1 (limit: 23716)
   Memory: 4.1M
   CGroup: /system.slice/atop.service
           └─6390 /usr/bin/atop -w /var/log/atop/atop_20210619 15

Jun 19 14:46:10 ecs-centos8 systemd[1]: atop.service: Succeeded.
Jun 19 14:46:10 ecs-centos8 systemd[1]: Stopped Atop advanced performance monitor.
Jun 19 14:46:10 ecs-centos8 systemd[1]: Starting Atop advanced performance monitor...
Jun 19 14:46:10 ecs-centos8 systemd[1]: Started Atop advanced performance monitor.
# wget https://www.atoptool.nl/download/atop-2.6.0-1.el7.x86_64.rpm
Upload the atop-2.6.0-1.el7.x86_64.rpm package to the target ECS.
Modify the following parameters, save the modification, and exit:
LOGINTERVAL=15
LOGGENERATIONS=28
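To get a feel for what these two values imply, the arithmetic can be sketched as follows. It assumes that LOGINTERVAL is the sampling interval in seconds and that LOGGENERATIONS is the number of days of logs to keep; the total is only an approximation of how many samples are retained.

```shell
#!/bin/sh
# Sketch: how many samples the settings above produce.
LOGINTERVAL=15      # seconds between samples
LOGGENERATIONS=28   # days of daily log files kept

samples_per_day=$((86400 / LOGINTERVAL))
samples_kept=$((samples_per_day * LOGGENERATIONS))

echo "samples per day: $samples_per_day"
echo "samples kept (approx.): $samples_kept"
```

A shorter interval gives finer detail but larger daily log files, so it is worth checking the free space under the log directory after a few days of operation.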
# systemctl status atop
atop will sample system performance data based on the specified interval and save the data to the /var/log/atop/ directory.
atop.service - Atop advanced performance monitor
   Loaded: loaded (/usr/lib/systemd/system/atop.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2021-06-19 11:49:47 CST; 2h 27min ago
     Docs: man:atop(1)
  Process: 8231 ExecStartPost=/usr/bin/find ${LOGPATH} -name atop_* -mtime +${LOGGENERATIONS} -exec rm -v {} ; (code=exited, status=0/SUCCESS)
  Process: 8225 ExecStartPre=/bin/sh -c test -n "$LOGGENERATIONS" -a "$LOGGENERATIONS" -eq "$LOGGENERATIONS" (code=exited, status=0/SUCCESS)
  Process: 8223 ExecStartPre=/bin/sh -c test -n "$LOGINTERVAL" -a "$LOGINTERVAL" -eq "$LOGINTERVAL" (code=exited, status=0/SUCCESS)
 Main PID: 8229 (atop)
   CGroup: /system.slice/atop.service
           └─8229 /usr/bin/atop -w /var/log/atop/atop_20210619 15

Jun 19 11:49:47 ecs-centos7 systemd[1]: Stopped Atop advanced performance monitor.
Jun 19 11:49:47 ecs-centos7 systemd[1]: Starting Atop advanced performance monitor...
Jun 19 11:49:47 ecs-centos7 systemd[1]: Started Atop advanced performance monitor.
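The daily log files written to /var/log/atop/ can be replayed later with atop -r. The following sketch builds the log path for a given date from the atop_YYYYMMDD naming visible in the service's command line and reports whether a log for today is available. The helper name atop_log_path is hypothetical.

```shell
#!/bin/sh
# Sketch: locate the atop log for a given date (YYYYMMDD).
atop_log_path() {
    printf '/var/log/atop/atop_%s\n' "$1"
}

log=$(atop_log_path "$(date +%Y%m%d)")
if command -v atop >/dev/null 2>&1 && [ -r "$log" ]; then
    echo "replay with: atop -r $log"
else
    echo "no readable log at $log (or atop is not installed)"
fi
```

While replaying a log, press t to move to the next sample and T to go back to the previous one.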
# wget https://www.atoptool.nl/download/atop-2.6.0-1.src.rpm
# rpmbuild -bb atop-2.6.0.spec
# cd /usr/src/packages/RPMS/x86_64
# rpm -ivh atop-2.6.0-1.x86_64.rpm
Modify the following parameters, save the modification, and exit:
LOGINTERVAL=15
LOGGENERATIONS=28
# systemctl status atop
atop.service - Atop advanced performance monitor
   Loaded: loaded (/usr/lib/systemd/system/atop.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2021-06-19 16:50:01 CST; 6s ago
     Docs: man:atop(1)
  Process: 2242 ExecStartPost=/usr/bin/find ${LOGPATH} -name atop_* -mtime +${LOGGENERATIONS} -exec rm -v {} ; (code=exited, status=0/SUCCESS)
  Process: 2240 ExecStartPre=/bin/sh -c test -n "$LOGGENERATIONS" -a "$LOGGENERATIONS" -eq "$LOGGENERATIONS" (code=exited, status=0/SUCCESS)
  Process: 2239 ExecStartPre=/bin/sh -c test -n "$LOGINTERVAL" -a "$LOGINTERVAL" -eq "$LOGINTERVAL" (code=exited, status=0/SUCCESS)
 Main PID: 2241 (atop)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/atop.service
           └─2241 /usr/bin/atop -w /var/log/atop/atop_20210619 15

Jun 19 16:50:01 ecs-suse15 systemd[1]: Starting Atop advanced performance monitor...
Jun 19 16:50:01 ecs-suse15 systemd[1]: Started Atop advanced performance monitor.
If the version is 220 or later, go to the next step.
Otherwise, delete parameter --now from the Makefile of atop.
# vi atop-2.6.0/Makefile
Delete parameter --now following the systemctl command.
then    /bin/systemctl disable atop 2> /dev/null; \
        /bin/systemctl disable atopacct 2> /dev/null; \
        /bin/systemctl daemon-reload; \
        /bin/systemctl enable atopacct; \
        /bin/systemctl enable atop; \
        /bin/systemctl enable atop-rotate.timer; \
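The decision about --now depends on the systemd version, which systemctl --version prints on its first line. The check can be sketched as follows. The function names systemd_version and supports_now are hypothetical, and the 220 threshold is the one stated in this section.

```shell
#!/bin/sh
# Sketch: decide whether --now must be removed from the atop Makefile.

# Extract the numeric version from the first line of `systemctl --version`
# output, for example "systemd 219" -> 219.
systemd_version() {
    printf '%s\n' "$1" | awk 'NR == 1 { sub(/[^0-9].*/, "", $2); print $2 }'
}

# Succeeds when the version is 220 or later.
supports_now() {
    ver=$(systemd_version "$1")
    [ -n "$ver" ] && [ "$ver" -ge 220 ]
}

if command -v systemctl >/dev/null 2>&1; then
    out=$(systemctl --version 2>/dev/null)
    if supports_now "$out"; then
        echo "systemd $(systemd_version "$out"): keep --now"
    else
        echo "systemd $(systemd_version "$out"): remove --now from the Makefile"
    fi
fi
```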
# make systemdinstall
Make the following modifications, save the file, and exit.
LOGOPTS=""
LOGINTERVAL=15
LOGGENERATIONS=28
LOGPATH=/var/log/atop
# systemctl status atop
atop.service - Atop advanced performance monitor
   Loaded: loaded (/lib/systemd/system/atop.service; enabled)
   Active: active (running) since Sun 2021-07-25 19:29:40 CST; 4s ago
     Docs: man:atop(1)
  Process: 5192 ExecStartPost=/usr/bin/find ${LOGPATH} -name atop_* -mtime +${LOGGENERATIONS} -exec rm -v {} ; (code=exited, status=0/SUCCESS)
  Process: 5189 ExecStartPre=/bin/sh -c test -n "$LOGGENERATIONS" -a "$LOGGENERATIONS" -eq "$LOGGENERATIONS" (code=exited, status=0/SUCCESS)
  Process: 5188 ExecStartPre=/bin/sh -c test -n "$LOGINTERVAL" -a "$LOGINTERVAL" -eq "$LOGINTERVAL" (code=exited, status=0/SUCCESS)
 Main PID: 5191 (atop)
   CGroup: /system.slice/atop.service
           └─5191 /usr/bin/atop -w /var/log/atop/atop_20210725 15

Jul 25 19:29:40 atop systemd[1]: Starting Atop advanced performance monitor...
Jul 25 19:29:40 atop systemd[1]: Started Atop advanced performance monitor.
The method for configuring kdump described in this section applies to KVM ECSs running EulerOS or CentOS 7.x. For details, see Documentation for kdump.
kdump is a Linux kernel feature that creates a crash dump when the kernel crashes. When a crash occurs, kdump boots a second Linux kernel and uses it to export an image of RAM. This image, known as vmcore, can be used to debug the crash and determine its cause.
If it is not installed, run the following command to install it:
# yum install -y kexec-tools
Check whether the parameters are configured.
# grep crashkernel /proc/cmdline
If the command returns any output, the crashkernel parameter has been configured.
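For use in a script, the same check can report the configured value instead of the whole command line. This is a minimal sketch; the function name crashkernel_value is hypothetical.

```shell
#!/bin/sh
# Sketch: print the value of the crashkernel= boot parameter, if any.
crashkernel_value() {
    # Split the command line into one parameter per line, then keep
    # only what follows "crashkernel=".
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p'
}

cmdline=$(cat /proc/cmdline 2>/dev/null)
value=$(crashkernel_value "$cmdline")
if [ -n "$value" ]; then
    echo "crashkernel is set to: $value"
else
    echo "crashkernel is not set"
fi
```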
GRUB_TIMEOUT=5
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel00/root rd.lvm.lv=rhel00/swap rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
Locate parameter GRUB_CMDLINE_LINUX and add crashkernel=auto to its value, inside the quotation marks.
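The edit can also be scripted. The following sketch inserts crashkernel=auto at the start of the GRUB_CMDLINE_LINUX value and leaves the file unchanged if the option is already there. The function name add_crashkernel is hypothetical, and the demonstration works on a scratch file with made-up content.

```shell
#!/bin/sh
# Sketch: add crashkernel=auto to GRUB_CMDLINE_LINUX in the given file.
add_crashkernel() {
    file=$1
    # Nothing to do if a crashkernel= option is already on the line.
    if grep -q '^GRUB_CMDLINE_LINUX=.*crashkernel=' "$file"; then
        return 0
    fi
    # Insert the option right after the opening quotation mark (GNU sed).
    sed -i 's|^GRUB_CMDLINE_LINUX="|&crashkernel=auto |' "$file"
}

# Demonstration on a scratch file rather than the live GRUB defaults.
grub=$(mktemp)
printf 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX="rhgb quiet"\n' > "$grub"
add_crashkernel "$grub"
add_crashkernel "$grub"   # the second call changes nothing
grep '^GRUB_CMDLINE_LINUX=' "$grub"
```

To apply it for real, pass the path of the GRUB defaults file, which is commonly /etc/default/grub on CentOS 7 (an assumption, since this section does not name the file). The change normally appears in /proc/cmdline only after the GRUB configuration has been regenerated and the system has been restarted.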
path /var/crash
By default, the file is saved in the /var/crash directory.
path /home/kdump
There must be enough space in the specified path for storing the vmcore file. It is recommended that the available space be greater than or equal to the RAM size. You can also store the vmcore file on a shared device such as SAN or NFS.
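The space recommendation can be checked before a crash ever happens. The sketch below compares the free space in the dump path with the RAM size reported in /proc/meminfo. The function name enough_space is hypothetical, and the dump path defaults to /var/crash unless another path is passed as the first argument.

```shell
#!/bin/sh
# Sketch: check that the dump path has at least as much free space as RAM.

# $1 = available space in KiB, $2 = RAM size in KiB
enough_space() {
    [ "$1" -ge "$2" ]
}

dump_path=${1:-/var/crash}
mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo 2>/dev/null)
avail_kb=$(df -Pk "$dump_path" 2>/dev/null | awk 'NR == 2 { print $4 }')

if [ -z "$mem_kb" ] || [ -z "$avail_kb" ]; then
    echo "cannot determine RAM size or free space for $dump_path"
elif enough_space "$avail_kb" "$mem_kb"; then
    echo "$dump_path: ${avail_kb} KiB free, RAM is ${mem_kb} KiB: OK"
else
    echo "$dump_path: ${avail_kb} KiB free, RAM is ${mem_kb} KiB: too small"
fi
```

Because the core_collector settings compress the dump and leave out some page types, the actual vmcore is often smaller than RAM, so this check is conservative.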
Add the following content to file /etc/kdump.conf. If the content already exists, skip this step.
core_collector makedumpfile -d 31 -c
where
-c indicates compressing the vmcore file.
-d indicates leaving out irrelevant data. The value following -d is the sum of the values of the page types to be left out. Generally, the value is 31, which leaves out all of the following page types. You can adjust the value if needed.
zero pages = 1
cache pages = 2
cache private = 4
user pages = 8
free pages = 16
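The value passed to -d is the sum of the values of the page types to leave out, as this short calculation shows. The variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: derive the makedumpfile dump level from the page types to leave out.
zero_pages=1
cache_pages=2
cache_private=4
user_pages=8
free_pages=16

dump_level=$((zero_pages + cache_pages + cache_private + user_pages + free_pages))
echo "all five types left out: -d $dump_level"

# To keep user pages in the dump, omit their value from the sum.
keep_user=$((dump_level - user_pages))
echo "user pages kept:         -d $keep_user"
```

Keeping more page types makes the vmcore file larger, so the free space in the dump path should be checked again after lowering the value.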
Some kernel parameters control when kdump will be triggered. It is recommended that you set all the parameters as follows:
kernel.hardlockup_panic=1
kernel.panic=5
kernel.panic_on_oops=1
kernel.softlockup_panic=1
kernel.unknown_nmi_panic=1
kernel.nmi_watchdog=1
kernel.panic_on_io_nmi=1
kernel.panic_on_warn=1
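To see which of these parameters differ from the running kernel, the current values can be read from /proc/sys, where each dot in a parameter name corresponds to a directory separator. The following read-only sketch checks the first six recommended values; the list can be extended as needed. The function name sysctl_path is hypothetical.

```shell
#!/bin/sh
# Sketch: compare recommended kernel parameters with the running values.

# Map a sysctl key such as kernel.panic to its /proc/sys path.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

for setting in kernel.hardlockup_panic=1 kernel.panic=5 \
               kernel.panic_on_oops=1 kernel.softlockup_panic=1 \
               kernel.unknown_nmi_panic=1 kernel.nmi_watchdog=1; do
    key=${setting%%=*}
    want=${setting#*=}
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
        have=$(cat "$path")
        [ "$have" = "$want" ] || echo "$key is $have, recommended $want"
    else
        echo "$key is not available on this kernel"
    fi
done
```

The script only reports differences and changes nothing.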
# cat /proc/cmdline |grep crashkernel
BOOT_IMAGE=/boot/vmlinuz-3.10.0-514.44.5.10.h142.x86_64 root=UUID=6407d6ac-c761-43cc-a9dd-1383de3fc995 ro crash_kexec_post_notifiers softlockup_panic=1 panic=3 reserve_kbox_mem=16M nmi_watchdog=1 rd.shell=0 fsck.mode=auto fsck.repair=yes net.ifnames=0 spectre_v2=off nopti noibrs noibpb crashkernel=auto LANG=en_US.UTF-8
kernel.hardlockup_panic = 1
kernel.hung_task_panic = 0
kernel.panic = 5
kernel.panic_on_io_nmi = 0
kernel.panic_on_oops = 1
kernel.panic_on_stackoverflow = 0
kernel.panic_on_unrecovered_nmi = 0
kernel.panic_on_warn = 0
kernel.softlockup_panic = 1
kernel.unknown_nmi_panic = 1
vm.panic_on_oom = 0
# grep core_collector /etc/kdump.conf |grep -v ^"#"
core_collector makedumpfile -l --message-level 1 -d 31
# grep path /etc/kdump.conf |grep -v ^"#"
path /var/crash
# systemctl status kdump
● kdump.service - Crash recovery kernel arming
   Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
   Active: active (exited) since Tue 2019-04-09 19:30:24 CST; 8min ago
  Process: 495 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
 Main PID: 495 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/system-hostos.slice/kdump.service
# echo c > /proc/sysrq-trigger
After the command is executed, the kernel crashes immediately, kdump is triggered, and the system restarts. The generated vmcore file is saved to the directory specified by the path parameter in /etc/kdump.conf.
# ll /var/crash/
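When several dumps have accumulated, the newest vmcore file can be located with a short script. This sketch assumes that each dump is stored in its own subdirectory of the dump path and that the file is named vmcore. The function name latest_vmcore is hypothetical, and the dump path defaults to /var/crash unless another path is passed as the first argument.

```shell
#!/bin/sh
# Sketch: find the most recently modified vmcore file under a directory.
latest_vmcore() {
    # List all files named vmcore, newest first, and keep the first one.
    find "$1" -type f -name vmcore -exec ls -t {} + 2>/dev/null | head -n 1
}

dump_path=${1:-/var/crash}
core=$(latest_vmcore "$dump_path")
if [ -n "$core" ]; then
    ls -lh "$core"
else
    echo "no vmcore found under $dump_path"
fi
```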