Feature: kdump configuration - cockpit-project/cockpit GitHub Wiki

Scope

Initial scope:

  • enable/disable/restart/status of kdump
  • basic configuration options: path and NFS/SSH targets; a compression option would be nice to have
  • ability to test the config (the echo c > /proc/sysrq-trigger stuff)
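
To make the configuration options above concrete, here is a minimal sketch of how a chosen dump target and compression setting could be turned into /etc/kdump.conf text. The function name `render_kdump_conf` and its parameters are hypothetical and not part of Cockpit. The directives (`path`, `nfs`, `ssh`, `core_collector`) and the makedumpfile flags are standard kdump ones, but the exact defaults (message level 1, dump level 31) are assumptions that vary by distribution.

```python
from typing import List, Optional

# Compression flags understood by makedumpfile: -c (zlib), -l (lzo), -p (snappy).
COMPRESSION_FLAGS = {"zlib": "-c", "lzo": "-l", "snappy": "-p"}


def render_kdump_conf(path: str = "/var/crash",
                      nfs: Optional[str] = None,
                      ssh: Optional[str] = None,
                      compression: Optional[str] = None) -> str:
    """Return the text of a minimal kdump.conf for a single dump target.

    Hypothetical helper: with neither nfs nor ssh set, the dump goes to
    `path` on the local filesystem.
    """
    if nfs and ssh:
        raise ValueError("choose either an NFS or an SSH target, not both")
    if compression is not None and compression not in COMPRESSION_FLAGS:
        raise ValueError(f"unknown compression: {compression!r}")

    lines: List[str] = []
    if nfs:
        lines.append(f"nfs {nfs}")   # e.g. "server.example.com:/export/crash"
    if ssh:
        lines.append(f"ssh {ssh}")   # e.g. "user@server.example.com"
    lines.append(f"path {path}")

    collector = ["makedumpfile"]
    if compression:
        collector.append(COMPRESSION_FLAGS[compression])
    if ssh:
        # makedumpfile needs the flattened format when streaming over SSH.
        collector.append("-F")
    # Assumed defaults: message level 1, dump level 31 (skip most unused pages).
    collector += ["--message-level", "1", "-d", "31"]
    lines.append("core_collector " + " ".join(collector))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_kdump_conf(ssh="user@backup.example.com", compression="lzo"))
```

Keeping the rendering a pure function means the UI could preview the resulting file before writing it and restarting the kdump service.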

For later:

  • Ability to reboot the box from this view, but only after a change to the crashkernel setting has been made. Maybe a red box or similar indicating "this won't change until a reboot occurs" - Ben/Charles
  • Output of 'systemctl status kdump' or 'service kdump status' regardless of the state of the configurations - Ben
  • rsaw's (Ryan Sawhill) xsos tool has a great view of kdump configurations from a sosreport (xsos -k); maybe this could be useful inspiration? - Ben/Charles
  • The ability to update kexec-tools from this view and restart kdump services - Ben/Charles
  • enablement/disablement of sysctl triggers (maybe some sort of check box list for panic triggers?) - Ben/Charles
  • /etc/kdump.conf configuration changes - Ben/Charles
  • is Crash outside the scope for now? - Andreas
  • is support for "Red Hat Support Tool" to upload issues to Customer Portal part of the scope? - Andreas
  • abrt-vmcore support? - Andreas
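
As an illustration of the "this won't change until a reboot occurs" idea above, the following sketch compares the crashkernel= value the running kernel was booted with against the value the admin has configured. Both function names are hypothetical. The running command line would typically be read from /proc/cmdline, and how the configured value is obtained (bootloader config, for example) is left open here.

```python
from typing import Optional


def crashkernel_setting(cmdline: str) -> Optional[str]:
    """Extract the crashkernel= value from a kernel command line string.

    Keeps the last occurrence, on the assumption that the kernel does the
    same when the parameter is given more than once.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            value = token.split("=", 1)[1]
    return value


def reboot_needed(running_cmdline: str, configured: Optional[str]) -> bool:
    """True when the configured reservation differs from the running one."""
    return crashkernel_setting(running_cmdline) != configured


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        running = f.read()
    # "256M" stands in for whatever value the admin just configured.
    print(reboot_needed(running, "256M"))
```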

Notes

User Stories

  • Robert is a developer who works at a small software company and got tossed the "sysadmin-hat". They have a couple of internal build servers. One of the servers keeps crashing at what seems to be very random times. This is very frustrating for everyone, so Robert wants to figure out what's wrong with it.

Workflows

Robert logs in to the system with Cockpit. He goes to the Crash section. It seems kdump crash dumping is currently turned off, so he turns it on. He sets the dump location to one of their other servers. Then he runs a quick test to ensure that it creates a dump correctly. A week later, the server crashes again. He logs in to the other server and downloads the crash dump to his laptop. He examines the report more closely and finds that a driver is the cause of the crash. He disables the driver for now and reports the issue to his OS provider. They ask him to attach the crash report to an issue. He does this, and a couple of days later they release a fix. He applies that update to the server and the crashes are gone.

Wireframes

[Image: mockup]

Prior art

Feedback