KubeVirt VM Outage Scenario - Kraken
Detailed implementation of the KubeVirt VM Outage Scenario in Kraken
KubeVirt VM Outage Scenario in Kraken
The kubevirt_vm_outage
scenario in Kraken enables users to simulate VM-level disruptions by deleting a Virtual Machine Instance (VMI) to test resilience and recovery capabilities.
Implementation
This scenario is implemented in Kraken’s core repository, with the following key functionality:
- Finding and validating the target VMI
- Deleting the VMI using the KubeVirt API
- Monitoring the recovery process
- Implementing fallback recovery if needed
Usage
You can use this scenario in your Kraken configuration file as follows:
scenarios:
- name: "kubevirt vm outage"
scenario: kubevirt_vm_outage
parameters:
vm_name: <my-application-vm>
namespace: <vm-workloads>
timeout: 60
Detailed Parameters
Parameter | Description | Required | Default | Example Values |
---|---|---|---|---|
vm_name | The name of the VMI to delete | Yes | N/A | “database-vm”, “web-server-vm” |
namespace | The namespace where the VMI is located | No | “default” | “openshift-cnv”, “vm-workloads” |
timeout | How long to wait (in seconds) for VMI to become running before attempting recovery | No | 60 | 30, 120, 300 |
Execution Flow
When executed, the scenario follows this process:
- Initialization: Validates KubeVirt is installed and configures the KubeVirt client
- VMI Validation: Checks if the target VMI exists and is in Running state
- State Preservation: Saves the initial state of the VMI
- Chaos Injection: Deletes the VMI using the KubeVirt API
- Wait for Running: Waits for VMI to become running again, up to the timeout specified
- Recovery Monitoring: Checks if the VMI is automatically restored
- Manual Recovery: If automatic recovery doesn’t occur, manually recreates the VMI
- Validation: Confirms the VMI is running correctly
Sample Configuration
Here’s an example configuration for using the kubevirt_vm_outage
scenario:
scenarios:
- name: "kubevirt outage test"
scenario: kubevirt_vm_outage
parameters:
vm_name: my-vm
namespace: kubevirt
duration: 60
For multiple VMs in different namespaces:
scenarios:
- name: "kubevirt outage test - app VM"
scenario: kubevirt_vm_outage
parameters:
vm_name: app-vm
namespace: application
duration: 120
- name: "kubevirt outage test - database VM"
scenario: kubevirt_vm_outage
parameters:
vm_name: db-vm
namespace: database
duration: 180
Combining with Other Scenarios
For more comprehensive testing, you can combine this scenario with other Kraken scenarios in the list of chaos_scenarios in the config file:
kraken:
kubeconfig_path: ~/.kube/config # Path to kubeconfig
...
chaos_scenarios:
- hog_scenarios:
- scenarios/kube/cpu-hog.yml
- kubevirt_vm_outage:
- scenarios/kubevirt/kubevirt-vm-outage.yaml