KubeVirt VM Outage Scenario - Kraken

Detailed implementation of the KubeVirt VM Outage Scenario in Kraken

KubeVirt VM Outage Scenario in Kraken

The kubevirt_vm_outage scenario in Kraken enables users to simulate VM-level disruptions by deleting a Virtual Machine Instance (VMI) to test resilience and recovery capabilities.

Implementation

This scenario is implemented in Kraken’s core repository, with the following key functionality:

  1. Finding and validating the target VMI
  2. Deleting the VMI using the KubeVirt API
  3. Monitoring the recovery process
  4. Implementing fallback recovery if needed

Usage

You can use this scenario in your Kraken configuration file as follows:

scenarios:
  - name: "kubevirt vm outage"
    scenario: kubevirt_vm_outage
    parameters:
      vm_name: <my-application-vm>
      namespace: <vm-workloads>
      timeout: 60

Detailed Parameters

ParameterDescriptionRequiredDefaultExample Values
vm_nameThe name of the VMI to deleteYesN/A“database-vm”, “web-server-vm”
namespaceThe namespace where the VMI is locatedNo“default”“openshift-cnv”, “vm-workloads”
timeoutHow long to wait (in seconds) for VMI to become running before attempting recoveryNo6030, 120, 300

Execution Flow

When executed, the scenario follows this process:

  1. Initialization: Validates KubeVirt is installed and configures the KubeVirt client
  2. VMI Validation: Checks if the target VMI exists and is in Running state
  3. State Preservation: Saves the initial state of the VMI
  4. Chaos Injection: Deletes the VMI using the KubeVirt API
  5. Wait for Running: Waits for VMI to become running again, up to the timeout specified
  6. Recovery Monitoring: Checks if the VMI is automatically restored
  7. Manual Recovery: If automatic recovery doesn’t occur, manually recreates the VMI
  8. Validation: Confirms the VMI is running correctly

Sample Configuration

Here’s an example configuration for using the kubevirt_vm_outage scenario:

scenarios:
  - name: "kubevirt outage test"
    scenario: kubevirt_vm_outage
    parameters:
      vm_name: my-vm
      namespace: kubevirt
      duration: 60

For multiple VMs in different namespaces:

scenarios:
  - name: "kubevirt outage test - app VM"
    scenario: kubevirt_vm_outage
    parameters:
      vm_name: app-vm
      namespace: application
      duration: 120
  
  - name: "kubevirt outage test - database VM"
    scenario: kubevirt_vm_outage
    parameters:
      vm_name: db-vm
      namespace: database
      duration: 180

Combining with Other Scenarios

For more comprehensive testing, you can combine this scenario with other Kraken scenarios in the list of chaos_scenarios in the config file:

kraken:
    kubeconfig_path: ~/.kube/config                     # Path to kubeconfig
    ...
    chaos_scenarios:
        - hog_scenarios:
            - scenarios/kube/cpu-hog.yml
        -  kubevirt_vm_outage:
               - scenarios/kubevirt/kubevirt-vm-outage.yaml
Last modified August 12, 2025: ordering krkn pages (0b80282)