Reset vs Virtual Machines (Max Payload Size) - tenstorrent/tt-kmd GitHub Wiki

Each PCIe device has a Max Payload Size (MPS) setting. A device must not generate packets with a payload larger than its MPS, and it must reject inbound packets with a payload larger than its MPS as Malformed TLPs. Malformed TLP errors may trigger a system reboot. All devices that communicate using DMA (including the root complex) must therefore agree on MPS. The specified power-on value for MPS is 128 bytes, but the system firmware and the Linux kernel may increase MPS for performance.
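As an illustration of what the MPS setting looks like at the register level, here is a minimal sketch of decoding and encoding the field. It relies on the standard PCIe layout, where Max_Payload_Size occupies bits 7:5 of the Device Control register and an encoding of `n` means `128 << n` bytes. The helper names are made up for this example and are not KMD functions.

```python
# Max_Payload_Size lives in bits 7:5 of the PCIe Device Control register
# (offset 0x08 within the PCI Express capability).
DEVCTL_MPS_SHIFT = 5
DEVCTL_MPS_MASK = 0x7 << DEVCTL_MPS_SHIFT  # 0x00E0


def mps_bytes_from_devctl(devctl: int) -> int:
    """Decode the MPS field of a Device Control value into bytes."""
    code = (devctl & DEVCTL_MPS_MASK) >> DEVCTL_MPS_SHIFT
    if code > 5:
        # Encodings 110b and 111b are reserved.
        raise ValueError(f"reserved MPS encoding: {code}")
    return 128 << code


def devctl_with_mps(devctl: int, mps_bytes: int) -> int:
    """Return devctl with its MPS field replaced; other bits are preserved."""
    if mps_bytes not in (128, 256, 512, 1024, 2048, 4096):
        raise ValueError(f"unsupported MPS: {mps_bytes}")
    code = mps_bytes.bit_length() - 8  # 128 -> 0, 256 -> 1, ..., 4096 -> 5
    return (devctl & ~DEVCTL_MPS_MASK) | (code << DEVCTL_MPS_SHIFT)
```

A field value of zero therefore corresponds to the 128-byte power-on default.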

KVM/QEMU virtualizes the guest’s view of MPS. A guest cannot write the hardware MPS, and its reads return a cached value that may not match the current hardware setting.

When we reset the device by power cycling it from within a VM, its MPS returns to the power-on default, which may not match the root complex (RC) MPS. When the device then performs a large DMA read, the RC splits the response according to its own MPS, which now exceeds the device MPS, and the device rejects the oversized packets as Malformed TLPs.
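The failure can be worked through with a small model. This is only a sketch: the function names are invented for the example, the sizes are arbitrary, and the split considers the MPS limit alone (real hardware also honors the Read Completion Boundary).

```python
from typing import List


def split_completion(total_bytes: int, rc_mps: int) -> List[int]:
    """Payload sizes of the completions the RC returns for one read.

    Simplification: only the RC's MPS limit is modeled.
    """
    sizes = []
    remaining = total_bytes
    while remaining > 0:
        chunk = min(remaining, rc_mps)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


def oversized_completions(total_bytes: int, rc_mps: int, device_mps: int) -> List[int]:
    """Completions whose payload exceeds the device MPS (Malformed TLPs)."""
    return [s for s in split_completion(total_bytes, rc_mps) if s > device_mps]


# Before the reset both sides agree on 256 bytes, so nothing is rejected.
before_reset = oversized_completions(1024, rc_mps=256, device_mps=256)

# After the reset the device is back at 128 bytes while the RC still uses 256,
# so every 256-byte completion is too large for the device.
after_reset = oversized_completions(1024, rc_mps=256, device_mps=128)
```

In this model `before_reset` is empty, while `after_reset` contains all four 256-byte completions of the 1024-byte read.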

To resolve this, KMD will save MPS before a reset and restore it afterwards, using access paths that bypass the hypervisor’s virtualized view of the register. In particular, MPS can be accessed not only through config space but also through the PCIe controller’s DBI interface.
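The save-and-restore sequence can be sketched as follows. This is a hypothetical model, not KMD code: `FakeDevice` stands in for the hardware, and its `read_devctl` and `write_devctl` methods stand in for whatever hypervisor-bypassing accessor (such as DBI) is actually used. Only the MPS field is modeled; other Device Control fields and their reset defaults are ignored.

```python
DEVCTL_MPS_MASK = 0x00E0  # Max_Payload_Size, bits 7:5 of Device Control


class FakeDevice:
    """Stand-in for hardware: a Device Control register whose MPS field
    returns to 000b (128 bytes) on reset."""

    def __init__(self, devctl: int) -> None:
        self.devctl = devctl

    def read_devctl(self) -> int:
        return self.devctl

    def write_devctl(self, value: int) -> None:
        self.devctl = value & 0xFFFF

    def reset(self) -> None:
        # Simplification: only the MPS field is cleared by this model.
        self.devctl &= ~DEVCTL_MPS_MASK


def reset_preserving_mps(dev: FakeDevice) -> None:
    """Reset the device, then put back the MPS value it had beforehand."""
    saved_mps = dev.read_devctl() & DEVCTL_MPS_MASK
    dev.reset()
    after = dev.read_devctl()
    # Replace only the MPS field so other bits keep their post-reset state.
    dev.write_devctl((after & ~DEVCTL_MPS_MASK) | saved_mps)
```

The restore step is a read-modify-write, so it changes nothing in the register except the MPS field.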

On Wormhole, we can enable DBI using scratch registers and then access it via NOC loopback.

On Blackhole, there is a permanently mapped DBI aperture (outbound TLB) that can be accessed via NOC loopback.