# VMI Network Chaos

## How to Run VMI Network Chaos Scenarios
Choose your preferred method to run VMI network chaos scenarios:
Example scenario file: `virt_network_chaos.yaml`

### Configuration
```yaml
- id: vmi_network_chaos
  image: "quay.io/krkn-chaos/krkn-network-chaos:latest"
  wait_duration: 300
  test_duration: 120
  label_selector: ""
  service_account: ""
  taints: []
  namespace: "my-namespace"
  instance_count: 1
  execution: serial
  target: ".*"
  interfaces: []
  ingress: true
  egress: true
  latency: "100ms"
  loss: "10"
  bandwidth: "100mbit"
```
For the common module settings, please refer to the documentation.
- `target`: regex to match VMI names within the namespace (e.g. `"<vmi-name-prefix>-.*"` or `".*"` for all)
- `namespace`: namespace containing the target VMIs (required; also supports regex to match multiple namespaces)
- `interfaces`: list of tap interface names to target. Leave empty to auto-detect the tap device in the virt-launcher network namespace
- `ingress`: shape incoming traffic to the VM
- `egress`: shape outgoing traffic from the VM
- `latency`: artificial network latency added to packets (e.g. `"100ms"`, `"500ms"`)
- `loss`: percentage of packets to drop (e.g. `"10"` for 10%, `"50"` for 50%)
- `bandwidth`: maximum throughput cap (e.g. `"100mbit"`, `"1gbit"`, `"500kbit"`)
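Under the hood, these settings correspond to Linux `tc` traffic-control operations applied to the tap device inside the virt-launcher network namespace. As an illustration only — the actual injection logic lives in the `krkn-network-chaos` image, and the exact qdisc layout sketched here (a `tbf` rate limiter with a `netem` child) is an assumption — a Python helper that assembles the conceptually equivalent `tc` commands:

```python
def build_tc_commands(interface, latency="", loss="", bandwidth=""):
    """Assemble illustrative `tc` commands for one tap interface.

    NOTE: hypothetical sketch -- the real krkn-network-chaos workload
    may structure its qdiscs differently.
    """
    netem_opts = []
    if latency:
        netem_opts += ["delay", latency]      # e.g. "100ms"
    if loss:
        netem_opts += ["loss", f"{loss}%"]    # e.g. "10" -> "10%"

    cmds = []
    if bandwidth:
        # Rate-limit with a token bucket filter (one common approach),
        # then attach netem as a child qdisc if delay/loss is requested.
        cmds.append(f"tc qdisc add dev {interface} root handle 1: "
                    f"tbf rate {bandwidth} burst 32kbit latency 400ms")
        if netem_opts:
            cmds.append(f"tc qdisc add dev {interface} parent 1:1 "
                        f"netem {' '.join(netem_opts)}")
    elif netem_opts:
        cmds.append(f"tc qdisc add dev {interface} root netem "
                    f"{' '.join(netem_opts)}")
    return cmds

for cmd in build_tc_commands("tap0", latency="100ms", loss="10"):
    print(cmd)
```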
Note
At least one of `latency`, `loss`, or `bandwidth` should be set. Setting all three simultaneously compounds the degradation.

### Catastrophic Configurations
The following combinations produce the most impactful chaos:
Complete network degradation (maximum chaos):
```yaml
latency: "2000ms"
loss: "50"
bandwidth: "1mbit"
```
Combines severe latency with heavy packet loss and near-complete bandwidth exhaustion.
DNS blackout via latency (cascading failures):
```yaml
latency: "5000ms"
loss: "0"
bandwidth: ""
```
5-second latency causes DNS timeouts across every service in the VM, producing cascading failures without a hard cut.
Bandwidth starvation:
```yaml
latency: ""
loss: "0"
bandwidth: "100kbit"
```
Throttles the VMI to 100 kbit/s — enough to keep connections alive but too slow for most application traffic.
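A back-of-envelope calculation shows why the combined profile is so crippling for TCP traffic. Using the classic Mathis throughput bound (rate ≈ 1.22 · MSS / (RTT · √p)) — admittedly far outside the model's validity range at 50% loss, but directionally useful — and assuming the configured delay is applied to both directions (so added RTT ≈ 2 × latency):

```python
import math

def tcp_throughput_mbit(mss_bytes, rtt_s, loss_fraction):
    """Rough upper bound on steady-state TCP throughput (Mathis model):
    rate ~ 1.22 * MSS / (RTT * sqrt(p)), ignoring retransmission timeouts."""
    bps = (mss_bytes * 8 * 1.22) / (rtt_s * math.sqrt(loss_fraction))
    return bps / 1e6

# "Maximum chaos" profile: 2000ms added latency per direction -> ~4s RTT,
# 50% loss. The result is a tiny fraction of even the 1mbit cap.
print(tcp_throughput_mbit(1460, 4.0, 0.50))
```

In other words, with those settings the latency and loss alone starve TCP long before the bandwidth cap is reached.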
### Usage
To enable VMI network chaos scenarios, edit the Kraken config file: in the `kraken -> chaos_scenarios` section of the YAML structure, add a new element named `network_chaos_ng_scenarios` to the list, pointing to the scenario YAML file.
```yaml
kraken:
  ...
  chaos_scenarios:
    - network_chaos_ng_scenarios:
        - scenarios/openshift/virt_network_chaos.yaml
```
Note
You can specify multiple scenario files of the same type by adding additional paths to the list:
```yaml
kraken:
  chaos_scenarios:
    - network_chaos_ng_scenarios:
        - scenarios/openshift/virt_network_chaos.yaml
        - scenarios/openshift/virt_network_chaos_2.yaml
```
You can also combine multiple different scenario types in the same config.yaml file:
```yaml
kraken:
  chaos_scenarios:
    - network_chaos_ng_scenarios:
        - scenarios/openshift/virt_network_chaos.yaml
    - pod_disruption_scenarios:
        - scenarios/pod-kill.yaml
```
### Run

```bash
python run_kraken.py --config config/config.yaml
```
### Run

```bash
$ podman run --name=<container_name> --net=host --pull=always --env-host=true -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
$ podman logs -f <container_name or container_id> # Streams Kraken logs
$ podman inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs exit code, which can be considered pass/fail for the scenario
```
```bash
$ docker run $(./get_docker_params.sh) --name=<container_name> --net=host --pull=always -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
```

OR

```bash
$ docker run -e <VARIABLE>=<value> --net=host --pull=always -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
$ docker logs -f <container_name or container_id> # Streams Kraken logs
$ docker inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs exit code, which can be considered pass/fail for the scenario
```
TIP: Because the container runs with a non-root user, ensure the kube config is globally readable before mounting it in the container. You can achieve this with the following commands:
```bash
kubectl config view --flatten > ~/kubeconfig && chmod 444 ~/kubeconfig && docker run $(./get_docker_params.sh) --name=<container_name> --net=host --pull=always -v ~/kubeconfig:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
```
### Supported parameters
The following environment variables can be set on the host running the container to tweak the scenario/faults being injected:
Example:

```bash
export <parameter_name>=<value>
```
See the list of variables that apply to all scenarios here; these can be used/set in addition to the scenario-specific variables below.
| Parameter | Description | Type | Default |
|---|---|---|---|
| TOTAL_CHAOS_DURATION | Chaos duration in seconds | number | 120 |
| NAMESPACE | Namespace containing the target VMIs (required) | string | |
| VMI_NAME | Regex to match VMI names (e.g. virt-server-.* or .* for all) | string | .* |
| LABEL_SELECTOR | Label selector to filter VMIs (e.g. app=myapp) | string | "" |
| INSTANCE_COUNT | Maximum number of VMIs to target | number | 1 |
| EXECUTION | Execution mode: serial or parallel | enum | serial |
| INGRESS | Shape incoming traffic to the VM | boolean | true |
| EGRESS | Shape outgoing traffic from the VM | boolean | true |
| INTERFACES | Comma-separated tap interface names (empty to auto-detect) | string | "" |
| LATENCY | Artificial latency added to packets (e.g. 100ms, 500ms) | string | "" |
| LOSS | Packet loss percentage (e.g. 10 for 10%) | string | "" |
| BANDWIDTH | Maximum throughput cap (e.g. 100mbit, 1gbit) | string | "" |
| WAIT_DURATION | Seconds to wait before running the next scenario in the same file | number | 300 |
| IMAGE | Network chaos injection workload image | string | quay.io/krkn-chaos/krkn-network-chaos:latest |
| TAINTS | List of taints for which tolerations are created (e.g. ["node-role.kubernetes.io/master:NoSchedule"]) | string | [] |
| SERVICE_ACCOUNT | Optional service account for the scenario workload | string | "" |
NOTE: When using a custom metrics profile or alerts profile with CAPTURE_METRICS or ENABLE_ALERTS enabled, mount the profiles from the host running the container (via podman/docker) at /home/krkn/kraken/config/metrics-aggregated.yaml and /home/krkn/kraken/config/alerts. For example:
```bash
$ podman run --name=<container_name> --net=host --pull=always --env-host=true -v <path-to-custom-metrics-profile>:/home/krkn/kraken/config/metrics-aggregated.yaml -v <path-to-custom-alerts-profile>:/home/krkn/kraken/config/alerts -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
```
```bash
krknctl run vmi-network-chaos [--<parameter> <value>]
```
You can also set any of the global variables listed here.
### VMI Network Chaos Parameters
| Argument | Type | Description | Required | Default Value |
|---|---|---|---|---|
| --chaos-duration | number | Chaos duration in seconds | false | 120 |
| --namespace | string | Namespace containing the target VMIs | true | |
| --target | string | Regex to match VMI names (e.g. <vmi-name-prefix>-.* or .* for all) | false | .* |
| --label-selector | string | Label selector to filter VMIs (e.g. app=myapp) | false | |
| --instance-count | number | Maximum number of VMIs to target | false | 1 |
| --execution | enum | Execution mode: parallel or serial | false | serial |
| --ingress | boolean | Shape incoming traffic to the VM | false | true |
| --egress | boolean | Shape outgoing traffic from the VM | false | true |
| --interfaces | string | Comma-separated tap interface names (empty to auto-detect) | false | |
| --latency | string | Artificial latency added to packets (e.g. 100ms, 500ms) | false | |
| --loss | string | Packet loss percentage (e.g. 10 for 10%) | false | |
| --bandwidth | string | Maximum throughput cap (e.g. 100mbit, 1gbit, 500kbit) | false | |
| --image | string | Network chaos injection workload image | false | quay.io/krkn-chaos/krkn-network-chaos:latest |
| --taints | string | Comma-separated taints for which tolerations are created (e.g. node-role.kubernetes.io/master:NoSchedule) | false | |
| --service-account | string | Optional service account for the scenario workload | false | |
| --wait-duration | number | Seconds to wait before running the next scenario in the same file | false | 300 |
### Parameter Format Details
**VMI Selection:**

- `--namespace`: required; supports regex to match multiple namespaces (e.g. `virt-density-.*`)
- `--target`: regex matched against VMI names (e.g. `<vmi-name-prefix>-.*` targets all VMIs whose name starts with that prefix)
- `--label-selector`: Kubernetes label selector in `key=value` format
- Use `--instance-count` to limit how many matching VMIs are targeted
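The selection rules above can be sketched as follows. This is a hypothetical helper for illustration — the real krkn implementation queries the cluster and may differ in detail — showing how a name regex, an optional `key=value` label selector, and the instance cap combine:

```python
import re

def select_vmis(vmis, target=".*", label_selector="", instance_count=1):
    """Pick VMIs by name regex and optional label selector, capped at
    instance_count. `vmis` is a list of (name, labels_dict) tuples.
    Hypothetical sketch of the documented selection rules."""
    wanted = {}
    if label_selector:
        key, _, value = label_selector.partition("=")
        wanted = {key: value}
    pattern = re.compile(target)
    matched = [name for name, labels in vmis
               if pattern.fullmatch(name)
               and all(labels.get(k) == v for k, v in wanted.items())]
    return matched[:instance_count]

vmis = [("virt-server-0", {"app": "db"}),
        ("virt-server-1", {"app": "web"}),
        ("other-vmi", {"app": "web"})]
print(select_vmis(vmis, target="virt-server-.*", instance_count=2))
```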
**Traffic Shaping Values:**

- `--latency`: any value accepted by Linux `tc netem delay` (e.g. `100ms`, `1s`, `500ms`)
- `--loss`: integer percentage without the `%` symbol (e.g. `10` = 10%)
- `--bandwidth`: any value accepted by Linux `tc` HTB rate (e.g. `100mbit`, `1gbit`, `500kbit`)
- At least one of `--latency`, `--loss`, or `--bandwidth` should be set
Interface Detection:
- Leave
--interfacesempty to let the scenario auto-detect the tap device inside the virt-launcher network namespace - Specify explicitly (e.g.
tap0) only if auto-detection fails or you want to target a specific interface
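Conceptually, the auto-detection amounts to picking the `tap*` devices from the interface listing of the virt-launcher network namespace. A minimal sketch, assuming (hypothetically) that the tap device is identified purely by its name prefix:

```python
def detect_tap_interfaces(interfaces, explicit=None):
    """Return the tap devices to shape: the explicit list if given,
    otherwise every interface whose name starts with 'tap'.
    Hypothetical sketch of the documented auto-detection behavior."""
    if explicit:
        return list(explicit)
    return [name for name in interfaces if name.startswith("tap")]

# Typical interface listing inside a virt-launcher network namespace
# (illustrative names only).
ns_ifaces = ["lo", "eth0", "tap0"]
print(detect_tap_interfaces(ns_ifaces))
```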
### Example Commands
Add latency and packet loss to all VMIs in a namespace:
```bash
krknctl run vmi-network-chaos \
  --namespace <namespace> \
  --target ".*" \
  --latency 100ms \
  --loss 10 \
  --chaos-duration 120
```
Bandwidth cap on a specific VMI:
```bash
krknctl run vmi-network-chaos \
  --namespace <namespace> \
  --target "<vmi-name>" \
  --bandwidth 1mbit \
  --ingress true \
  --egress true \
  --chaos-duration 300
```
Catastrophic combined degradation:
```bash
krknctl run vmi-network-chaos \
  --namespace <namespace> \
  --target "<vmi-name-prefix>-.*" \
  --instance-count 3 \
  --execution parallel \
  --latency 2000ms \
  --loss 50 \
  --bandwidth 1mbit \
  --chaos-duration 180
```
DNS blackout simulation (high latency, no packet drop):
```bash
krknctl run vmi-network-chaos \
  --namespace <namespace> \
  --target ".*" \
  --latency 5000ms \
  --chaos-duration 60
```
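When verifying the injected latency from inside a VM (e.g. with `ping`), keep in mind that with both `--ingress` and `--egress` enabled the configured delay can be applied to each direction independently, so the observed round-trip increase is roughly double the configured value. A small sanity-check helper, assuming (hypothetically) per-direction delay, which depends on how the actual qdiscs are placed:

```python
def expected_added_rtt_ms(latency_ms, ingress=True, egress=True):
    """Expected added round-trip time when a netem delay is installed
    independently on each enabled direction (hypothetical assumption)."""
    return latency_ms * (int(ingress) + int(egress))

# First example command above: --latency 100ms with both directions on.
print(expected_added_rtt_ms(100))
```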