supervisor_failover - openconfig/featureprofiles GitHub Wiki
Verify that containers and volumes persist across a control processor switchover (failover).
- Build the test container as described below.
- Pass the tarball of the container to the test as an argument.
The test container is available in the feature profile repository under
internal/cntrsrv.
Start by entering that directory and running the following commands:
$ cd internal/cntrsrv
$ go mod vendor
$ CGO_ENABLED=0 go build .
$ docker build -f build/Dockerfile.local -t cntrsrv_image:latest .
At this point you will have a container image built for the test container.
$ docker images
REPOSITORY      TAG      IMAGE ID       CREATED         SIZE
cntrsrv_image   latest   8d786a6eebc8   3 minutes ago   21.4MB
Now export the container to a tarball.
$ docker save -o /tmp/cntrsrv.tar cntrsrv_image:latest
This is the tarball that will be used during tests.
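Before handing the tarball to the test, a quick sanity check that the file is a readable archive can save a failed run. The Go sketch below is illustrative only and is not part of the featureprofiles test. `hasManifest` and `sampleTar` are hypothetical helpers, and the check assumes the archive carries a top-level `manifest.json` entry, as archives written by `docker save` normally do.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"io"
)

// hasManifest reports whether the tar stream contains a top-level
// manifest.json entry (assumed to be present in `docker save` output).
func hasManifest(r io.Reader) (bool, error) {
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return false, nil // reached the end without finding it
		}
		if err != nil {
			return false, err // not a readable tar archive
		}
		if hdr.Name == "manifest.json" {
			return true, nil
		}
	}
}

// sampleTar builds a tiny in-memory archive with the given entry names.
// It stands in for /tmp/cntrsrv.tar so the sketch can run without Docker.
func sampleTar(names ...string) (*bytes.Buffer, error) {
	buf := new(bytes.Buffer)
	tw := tar.NewWriter(buf)
	for _, n := range names {
		body := []byte("{}")
		hdr := &tar.Header{
			Name:     n,
			Typeflag: tar.TypeReg,
			Mode:     0o644,
			Size:     int64(len(body)),
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(body); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf, nil
}
```

In practice one would open `/tmp/cntrsrv.tar` with `os.Open` and pass the file to `hasManifest`; `sampleTar` exists only to exercise the check.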
- Load Image: Using gnoi.Containerz.Deploy, load a container image onto the device.
- Verify Load: Verify the image exists on the device using gnoi.Containerz.List.
- Trigger Failover: Identify the standby control processor using gNMI and trigger a switchover using gnoi.System.SwitchControlProcessor.
- Verify Persistence: After the switchover, verify the loaded image is still available on the new primary control processor.
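The "identify the standby control processor" step can be sketched as a small selection over component roles. This is a minimal Go illustration, assuming the roles were already read over gNMI from the OpenConfig `redundant-role` state leaf of each controller card (values `PRIMARY` and `SECONDARY`). `findStandby` and the component names used with it are hypothetical and not taken from the test source.

```go
package main

import (
	"errors"
	"sort"
)

// Redundant-role values as modelled in OpenConfig platform state
// (/components/component/state/redundant-role).
const (
	rolePrimary   = "PRIMARY"
	roleSecondary = "SECONDARY"
)

// findStandby returns the name of the control processor whose role is
// SECONDARY. roles maps component name to role and is assumed to have been
// filled from a gNMI Get or Subscribe beforehand.
func findStandby(roles map[string]string) (string, error) {
	var standby []string
	primaries := 0
	for name, role := range roles {
		switch role {
		case roleSecondary:
			standby = append(standby, name)
		case rolePrimary:
			primaries++
		}
	}
	if primaries != 1 {
		return "", errors.New("expected exactly one PRIMARY control processor")
	}
	if len(standby) == 0 {
		return "", errors.New("no SECONDARY control processor found")
	}
	sort.Strings(standby) // deterministic choice if several standbys exist
	return standby[0], nil
}
```

The returned name is the component that the switchover request would target. Failing early when no standby exists avoids triggering a switchover on a device that cannot complete one.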

- Setup:
  - Using gnoi.Containerz.CreateVolume, create a volume.
  - Using gnoi.Containerz.Deploy, load a container image.
  - Using gnoi.Containerz.Start, start a container that mounts the created volume.
- Verify Setup: Verify the container is in a RUNNING state and the volume exists.
- Trigger Failover: Identify the standby control processor using gNMI. Trigger a switchover using gnoi.System.SwitchControlProcessor.
- Verify Recovery: After the switchover, verify that the container is still RUNNING and the volume still exists using gnoi.Containerz.
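Because the device may be briefly unreachable during a switchover, the recovery check is usually written as a bounded poll rather than a single query. The Go sketch below shows one way to do that under stated assumptions: `statusFunc` is a hypothetical hook that a real test would back with Containerz list RPCs, and the attempt count and interval are placeholders rather than values from the test.

```go
package main

import (
	"errors"
	"time"
)

// statusFunc is a hypothetical hook returning the container state and
// whether the volume exists, e.g. backed by Containerz list RPCs.
type statusFunc func() (state string, volumeExists bool, err error)

// waitForRecovery polls check until the container is RUNNING and the volume
// exists, or until attempts are exhausted. Errors from check are treated as
// transient, since the device may refuse connections during the switchover.
func waitForRecovery(check statusFunc, attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		state, vol, err := check()
		if err == nil && state == "RUNNING" && vol {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(interval) // back off before the next poll
		}
	}
	return errors.New("container or volume did not recover in time")
}
```

A call such as `waitForRecovery(check, 30, 10*time.Second)` would allow up to roughly five minutes for recovery; the right budget depends on the platform.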

- Load and Remove Image: Load a container image, then remove it using gnoi.Containerz.Deploy with the image_delete option.
- Verify Removal: Verify the image no longer exists on the device.
- Trigger Failover: Trigger a control processor switchover.
- Verify Persistence of Removal: After the switchover, verify the image does not exist on the new primary control processor.

- Start and Remove Container: Start a container, then remove it using gnoi.Containerz.Remove.
- Verify Removal: Verify the container no longer exists.
- Trigger Failover: Trigger a control processor switchover.
- Verify Persistence of Removal: After the switchover, verify the container does not exist on the new primary control processor.
- Load Image: Load a container image onto the device.
- First Failover: Trigger a control processor switchover to the standby.
- Verify Persistence: After the first switchover, verify the image is still available on the new primary.
- Second Failover: Trigger another control processor switchover, returning to the original primary.
- Verify Final Persistence: After the second switchover, verify the image is still available.
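The double-failover sequence above can be expressed as a loop over two switchovers with the same check after each. The Go sketch below is a simplified model, not the test's implementation: the `device` interface and `fakeDevice` are hypothetical stand-ins for calls that a real test would make through gnoi.System.SwitchControlProcessor and a Containerz image listing.

```go
package main

import "errors"

// device is a hypothetical abstraction over the calls the test makes.
type device interface {
	Primary() string
	Switchover() error
	HasImage(name string) (bool, error)
}

// checkDoubleFailover performs two switchovers, confirms the image is
// present after each one, and confirms the original primary is active again.
func checkDoubleFailover(d device, image string) error {
	original := d.Primary()
	for i := 0; i < 2; i++ {
		before := d.Primary()
		if err := d.Switchover(); err != nil {
			return err
		}
		if d.Primary() == before {
			return errors.New("primary did not change after switchover")
		}
		ok, err := d.HasImage(image)
		if err != nil {
			return err
		}
		if !ok {
			return errors.New("image missing after switchover")
		}
	}
	if d.Primary() != original {
		return errors.New("second switchover did not return to the original primary")
	}
	return nil
}

// fakeDevice is an in-memory device with two supervisors, each with its own
// image store, used only to exercise the sketch.
type fakeDevice struct {
	active int
	images [2]map[string]bool
}

func (f *fakeDevice) Primary() string                 { return []string{"sup0", "sup1"}[f.active] }
func (f *fakeDevice) Switchover() error               { f.active = 1 - f.active; return nil }
func (f *fakeDevice) HasImage(n string) (bool, error) { return f.images[f.active][n], nil }
```

Modelling a separate image store per supervisor makes the failure mode visible: if the image was never synchronized to the standby, the check fails right after the first switchover.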

- Setup:
  - Using gnoi.Containerz.CreateVolume, create a volume.
  - Using gnoi.Containerz.Deploy, load a container image.
  - Using gnoi.Containerz.Start, start a container that mounts the created volume.
- Verify Setup: Verify the container is in a RUNNING state and the volume exists.
- Cold Reboot: Trigger a cold reboot using gnoi.System.Reboot.
- Verify Recovery: After the cold reboot, verify that the container is still RUNNING and the volume still exists using gnoi.Containerz.
{}
The YAML below defines the RPCs intended to be covered by this test.
rpcs:
  gnoi:
    containerz.Containerz.Deploy:
    containerz.Containerz.StartContainer:
    containerz.Containerz.ListContainer:
    containerz.Containerz.CreateVolume:
    containerz.Containerz.ListVolume:
    system.System.SwitchControlProcessor:
    system.System.Reboot:
  gnmi:
    gNMI.Get:
    gNMI.Subscribe: