kdump on arm64 - AmpereComputing/ampere-lts-kernel---DEPRECATED GitHub Wiki

Refer to: https://www.kernel.org/doc/Documentation/admin-guide/kdump/kdump.rst

Why is kdump so error-prone?

  • To quote Bowen's comment: kdump involves a long chain, and if any link fails, kdump fails:
Primary kernel (reserves memory for crashkernel) -> kexec tool (and its config) -> kdump trigger (panic, etc.) -> secondary kernel (a.k.a. capture kernel) -> kdump initrd image -> makedumpfile tool -> vmcore file -> crash utility
  • arm64 kdump (crashkernel) support is still being improved and updated. Use the latest version of the user-space tools if possible, and report issues to us or to the upstream mailing list.
  • The secondary kernel usually boots in a limited environment (e.g., after a kernel panic), where firmware, hardware and kernel state differ from those of the primary kernel (see ref [4]).

Reserve crashkernel memory for Altra

There are two ways to reserve crashkernel memory:

  1. crashkernel=size. The system kernel places the crashkernel region automatically at run time.
  2. crashkernel=size@offset. The user specifies the offset. If any memory in [offset, offset+size] is already in use, the reservation may fail.
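
The two forms can be illustrated with a minimal Python sketch. It is not the kernel's parser: the helper names (`parse_size`, `parse_simple_crashkernel`, `overlaps`) are hypothetical, and only decimal values with K/M/G suffixes are handled, on the assumption that these mirror what the kernel accepts for this option.

```python
import re

# Binary suffixes handled by this sketch (assumed to match the kernel's K/M/G).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a string such as '512M' or '4G' to a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_simple_crashkernel(value):
    """Split 'size' or 'size@offset' into (size_bytes, offset_bytes or None)."""
    size_text, sep, offset_text = value.partition("@")
    size = parse_size(size_text)
    offset = parse_size(offset_text) if sep else None
    return size, offset


def overlaps(offset, size, used_start, used_end):
    """True if [offset, offset + size) intersects the busy range [used_start, used_end)."""
    return offset < used_end and used_start < offset + size
```

With form 2, a caller could run `overlaps()` against known busy ranges to see why a fixed offset might be rejected; with form 1 the offset is `None` and placement is left to the kernel.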

In case 1), the arm64 kernel by default reserves memory in the region below 4GB (a.k.a. the low region). If there is not enough space in the low region, the reservation fails.

On Altra Mt.Jade, reserving more than 256MB in the low region is very likely to fail, because Altra by design has limited system RAM below the 4GB address boundary. Additional kernel patches (not fully upstreamed as of 04/28/2022) are needed to allow reserving crashkernel memory above 4GB.

We backported the patches from "[PATCH v14 00/11] support reserving crashkernel above 4G on arm64 kdump" so that the 5.4 and 5.10 LTS kernels support reservation above 4GB.

Update on 2022/04/28: The above patchset has been updated to PATCH v21 0/5 (https://lkml.org/lkml/2022/2/26/350), which we backported to the 5.15 longterm kernel. This patchset requires the user-space kexec-tools patch "arm64: support more than one crash kernel regions". Please update your kexec-tools to the latest version (/sbin/kexec and /sbin/vmcore-dmesg).

Recommended kernel option:

crashkernel=512M-12G:128M,12G-64G:256M,64G-128G:512M,128G-:768M crashkernel=16M,low

This option tries to reserve 16MB below 4GB, and 768MB above 4GB if system RAM is 128GB or more. From dmesg:

[    0.000000] Reserving 16MB of low memory at 3052MB for crashkernel (low RAM limit: 4096MB) //Note, offset and size may be different
[    0.000000] Reserving 768MB of memory at 67370240MB for crashkernel (System RAM: 523009MB) //Note, offset and size may be different
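
How the range syntax in the recommended option maps system RAM to a reservation size can be sketched as follows. This is an illustrative Python model, not kernel code: `to_bytes` and `select_crashkernel_size` are hypothetical names, and the sketch assumes each range's start is inclusive and its end exclusive, as described in the upstream kdump documentation.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def to_bytes(text):
    """Convert a string such as '128M' or '64G' to a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def select_crashkernel_size(spec, system_ram):
    """Return the reservation size in bytes for system_ram, or 0 if no range matches.

    spec is the value after 'crashkernel=', e.g. '512M-12G:128M,128G-:768M'.
    """
    for entry in spec.split(","):
        range_text, _, size_text = entry.partition(":")
        start_text, _, end_text = range_text.partition("-")
        start = to_bytes(start_text)
        end = to_bytes(end_text) if end_text else None  # open-ended range
        if system_ram >= start and (end is None or system_ram < end):
            return to_bytes(size_text)
    return 0


SPEC = "512M-12G:128M,12G-64G:256M,64G-128G:512M,128G-:768M"
size_mb = select_crashkernel_size(SPEC, 523009 << 20) >> 20  # RAM taken from the dmesg line
```

For the 523009MB system in the log above, the last range (128G-) applies, which is consistent with the 768MB reservation that dmesg reports.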

Why reserve crashkernel memory in low region

The secondary kernel must be able to boot and complete kdump without errors. Two factors determine how much memory must be reserved in the low region:

  • Whether the secondary kernel uses a 32-bit DMA device that is NOT behind an IOMMU
  • The swiotlb buffer size in the secondary kernel
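
As a rough guide to the numbers involved, the kernel's own default for the low region (quoted in reference [1]) is the larger of 256MB and the swiotlb size plus 8MB. The Python sketch below only restates that formula; `default_low_size` is a hypothetical helper, and the 64MB default swiotlb size is an assumption about the kernel's built-in default when `swiotlb=` is not given.

```python
MIB = 1 << 20

# Assumption: the swiotlb default is 64 MiB when not overridden with swiotlb=.
ASSUMED_DEFAULT_SWIOTLB = 64 * MIB


def default_low_size(swiotlb_bytes=ASSUMED_DEFAULT_SWIOTLB):
    """Mirror 'max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)' from [1]."""
    return max(swiotlb_bytes + 8 * MIB, 256 * MIB)


low_default_mb = default_low_size() // MIB          # default swiotlb size
low_large_mb = default_low_size(512 * MIB) // MIB   # e.g. a 512 MiB swiotlb buffer
```

Under these assumptions the 256MB floor dominates unless the swiotlb buffer is larger than 248MB, which is why an explicit, smaller crashkernel=Y,low value is needed on systems with little RAM below 4GB.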

If crashkernel memory in the low region is not needed, set the options as below:

crashkernel=4G,high crashkernel=0,low

Optional patch:

CentOS 8.3 kexec-tools (2.0.22) uses the kexec_file_load() syscall by default. The following two patches add kexec_file crash dump support to the kernel; without them, kexec cannot load the crash kernel.

0008-libfdt-include-fdt_addresses.c.patch
0009-arm64-kexec_file-add-crash-dump-support.patch

If your kexec-tools does not use kexec_file_load() to load the crash kernel, ignore these two patches.

To support kexec_file_load(), also enable 'CONFIG_KEXEC_FILE'.
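
One way to confirm the option in a kernel build is to scan the .config-style text for it. The Python sketch below is a hypothetical helper (`kconfig_enabled` is not part of any tool); where the config text comes from, for example a file under /boot or the build tree, is an assumption left to the caller.

```python
def kconfig_enabled(config_text, option):
    """Return True if option is set to 'y' in kernel .config-style text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as '# CONFIG_FOO is not set' and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == option:
            return value == "y"
    return False


sample_on = "CONFIG_KEXEC=y\nCONFIG_KEXEC_FILE=y\n"
sample_off = "CONFIG_KEXEC=y\n# CONFIG_KEXEC_FILE is not set\n"
```

Calling `kconfig_enabled(sample_on, "CONFIG_KEXEC_FILE")` returns True, while the `sample_off` text, where the option appears only in a "is not set" comment, returns False.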

Kernel option details:

  1. With the option "crashkernel=512M-12G:128M,12G-64G:256M,64G-128G:512M,128G-:768M" alone, the default applies: "Kernel will allocate at least 256M memory below 4G automatically if crashkernel=Y,low is not specified." [1] On some platforms, this automatic allocation may fail with: "Cannot reserve 256MB crashkernel low memory, please try smaller size".

  2. If the above error occurs, try allocating a smaller size in the low region, e.g.: "crashkernel=512M-12G:128M,12G-64G:256M,64G-128G:512M,128G-:768M crashkernel=16M,low". The size depends on how much low-region memory the crash kernel needs, e.g., for 32-bit DMA buffers.
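
Whether a boot ended up in case 1 or case 2 can be checked from the kernel log. The Python sketch below matches only the message formats quoted on this page; other kernel versions may word them differently. The function name and the "low"/"high" labels are hypothetical, and treating any reservation line without the word "low" as the high region is an assumption.

```python
import re

# Patterns taken from the dmesg lines and the error message quoted on this page.
RESERVED = re.compile(r"Reserving (\d+)MB of (low )?memory at (\d+)MB for crashkernel")
FAILED = re.compile(r"Cannot reserve (\d+)MB crashkernel low memory")


def summarize_crashkernel_log(log_text):
    """Collect reserved sizes (in MB) and note a failed low-memory reservation."""
    summary = {"low_mb": None, "high_mb": None, "low_failed": False}
    for line in log_text.splitlines():
        ok = RESERVED.search(line)
        if ok:
            key = "low_mb" if ok.group(2) else "high_mb"
            summary[key] = int(ok.group(1))
        elif FAILED.search(line):
            summary["low_failed"] = True
    return summary


good_log = (
    "[    0.000000] Reserving 16MB of low memory at 3052MB for crashkernel (low RAM limit: 4096MB)\n"
    "[    0.000000] Reserving 768MB of memory at 67370240MB for crashkernel (System RAM: 523009MB)\n"
)
bad_log = "Cannot reserve 256MB crashkernel low memory, please try smaller size\n"
```

If `low_failed` is set in the summary, retrying with an explicit, smaller crashkernel=Y,low value as in step 2 is the next thing to try.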

Reference:

[1]: The kernel by default reserves 256MB in the low region:

	/* crashkernel=Y,low */
	ret = parse_crashkernel_low(boot_command_line, low_mem_limit, &low_size, &base);
	if (ret) {
		/*
		 * two parts from kernel/dma/swiotlb.c:
		 * -swiotlb size: user-specified with swiotlb= or default.
		 *
		 * -swiotlb overflow buffer: now hardcoded to 32k. We round it
		 * to 8M for other buffers that may need to stay low too. Also
		 * make sure we allocate enough extra low memory so that we
		 * don't run out of DMA buffers for 32-bit devices.
		 */
		low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);

[2]: LTS 5.4 kernel patches: https://github.com/AmpereComputing/ampere-lts-kernel/issues/102

[3]: LTS 5.10 kernel patches: https://github.com/AmpereComputing/ampere-lts-kernel/issues/94

[4]: AGDI triggered kdump fails because of SDEI: https://github.com/AmpereComputing/ampere-lts-kernel/issues/160