VMI Network Chaos

    Injects network degradation into a KubeVirt Virtual Machine Instance (VMI) by shaping traffic on the VM's tap interface inside the virt-launcher network namespace. Supports configurable bandwidth limiting, latency injection, and packet loss. Unlike node or pod network chaos, this scenario targets the tap device that connects QEMU to the bridge, so only the specific VMI is affected without disrupting OVN's BFD heartbeats or other workloads on the same node.

    How to Run VMI Network Chaos Scenarios

    Choose your preferred method to run VMI network chaos scenarios:

    Example scenario file: virt_network_chaos.yaml

    Configuration

    - id: vmi_network_chaos
      image: "quay.io/krkn-chaos/krkn-network-chaos:latest"
      wait_duration: 300
      test_duration: 120
      label_selector: ""
      service_account: ""
      taints: []
      namespace: "my-namespace"
      instance_count: 1
      execution: serial
      target: ".*"
      interfaces: []
      ingress: true
      egress: true
      latency: "100ms"
      loss: "10"
      bandwidth: "100mbit"
    

    For the common module settings, please refer to the documentation.

    • target: regex to match VMI names within the namespace (e.g. "<vmi-name-prefix>-.*" or ".*" for all)
    • namespace: namespace containing the target VMIs (required; also supports regex to match multiple namespaces)
    • interfaces: list of tap interface names to target. Leave empty to auto-detect the tap device in the virt-launcher network namespace
    • ingress: shape incoming traffic to the VM
    • egress: shape outgoing traffic from the VM
    • latency: artificial network latency added to packets (e.g. "100ms", "500ms")
    • loss: percentage of packets to drop (e.g. "10" for 10%, "50" for 50%)
    • bandwidth: maximum throughput cap (e.g. "100mbit", "1gbit", "500kbit")
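    As a quick sanity check before writing a scenario file, the three traffic-shaping fields can be validated with a few regular expressions. This is a minimal sketch assuming the value formats described above (tc-style durations such as 100ms, integer loss percentages without a % sign, tc-style rates such as 100mbit); the scenario itself may accept other tc-valid forms:

```python
import re

# Assumed formats, per the parameter descriptions above; tc itself
# accepts more forms (e.g. fractional or microsecond values).
LATENCY_RE = re.compile(r"^\d+(ms|s)$")              # e.g. "100ms", "1s"
LOSS_RE = re.compile(r"^\d+$")                       # integer percent, no "%"
BANDWIDTH_RE = re.compile(r"^\d+(kbit|mbit|gbit)$")  # e.g. "100mbit", "1gbit"

def validate_shaping(latency: str, loss: str, bandwidth: str) -> list:
    """Return a list of problems; an empty list means the values look usable."""
    problems = []
    if latency and not LATENCY_RE.match(latency):
        problems.append(f"latency {latency!r} does not look like a tc netem delay")
    if loss:
        if not LOSS_RE.match(loss):
            problems.append(f"loss {loss!r} should be an integer percentage without '%'")
        elif not 0 <= int(loss) <= 100:
            problems.append(f"loss {loss!r} must be between 0 and 100")
    if bandwidth and not BANDWIDTH_RE.match(bandwidth):
        problems.append(f"bandwidth {bandwidth!r} does not look like a tc rate")
    if not (latency or loss or bandwidth):
        problems.append("set at least one of latency, loss, or bandwidth")
    return problems

print(validate_shaping("100ms", "10", "100mbit"))  # []
print(validate_shaping("fast", "10%", ""))         # two problems reported
```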

    Catastrophic Configurations

    The following combinations produce the most impactful chaos:

    Complete network degradation (maximum chaos):

      latency: "2000ms"
      loss: "50"
      bandwidth: "1mbit"
    

    Combines severe latency with heavy packet loss and near-complete bandwidth exhaustion.

    DNS blackout via latency (cascading failures):

      latency: "5000ms"
      loss: "0"
      bandwidth: ""
    

    5-second latency causes DNS timeouts across every service in the VM, producing cascading failures without a hard cut.
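    Why this works: the default glibc resolver waits 5 seconds per attempt, so a round trip inflated to 5 seconds or more (10 seconds when both ingress and egress are shaped) means every UDP DNS query times out before its answer arrives. A rough sketch of the arithmetic, assuming that 5-second default:

```python
# Assumption: the glibc stub resolver's per-attempt timeout of 5 s
# (the resolv.conf "timeout" option's default).
RESOLVER_TIMEOUT_S = 5.0

def dns_query_times_out(one_way_delay_s: float,
                        shape_ingress: bool, shape_egress: bool) -> bool:
    """A query fails when the inflated round trip reaches the resolver timeout."""
    rtt = one_way_delay_s * (int(shape_ingress) + int(shape_egress))
    return rtt >= RESOLVER_TIMEOUT_S

print(dns_query_times_out(5.0, True, True))   # True: 10 s RTT vs 5 s timeout
print(dns_query_times_out(0.1, True, True))   # False: 0.2 s RTT
```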

    Bandwidth starvation:

      latency: ""
      loss: "0"
      bandwidth: "100kbit"
    

    Throttles the VMI to 100 kbit/s — enough to keep connections alive but too slow for most application traffic.
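    The back-of-the-envelope arithmetic behind that claim (tc rates are decimal, so 100kbit = 100,000 bits per second; protocol overhead and queueing are ignored here):

```python
def transfer_seconds(payload_bytes: int, rate_bits_per_s: int) -> float:
    """Time to push a payload through a fixed-rate link, ignoring overhead."""
    return payload_bytes * 8 / rate_bits_per_s

rate = 100 * 1000  # the "100kbit" cap, in bits per second
print(transfer_seconds(1_000_000, rate))  # a 1 MB response takes 80.0 seconds
print(transfer_seconds(500, rate))        # a small keepalive: 0.04 seconds
```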

    Usage

    To enable VMI network chaos scenarios, edit the Kraken config file: in the kraken -> chaos_scenarios section of the YAML structure, add a new element named network_chaos_ng_scenarios to the list, then add the desired scenario pointing to the scenario YAML file.

    kraken:
        ...
        chaos_scenarios:
            - network_chaos_ng_scenarios:
                - scenarios/openshift/virt_network_chaos.yaml
    

    Run

    python run_kraken.py --config config/config.yaml
    

    Run

    $ podman run --name=<container_name> --net=host --pull=always --env-host=true -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
    $ podman logs -f <container_name or container_id> # Streams Kraken logs
    $ podman inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs the exit code, which can be treated as pass/fail for the scenario
    
    $ docker run $(./get_docker_params.sh) --name=<container_name> --net=host --pull=always -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
    OR
    $ docker run -e <VARIABLE>=<value> --net=host --pull=always -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
    $ docker logs -f <container_name or container_id> # Streams Kraken logs
    $ docker inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs the exit code, which can be treated as pass/fail for the scenario
    

    TIP: Because the container runs with a non-root user, ensure the kube config is globally readable before mounting it in the container. You can achieve this with the following commands:

    kubectl config view --flatten > ~/kubeconfig && chmod 444 ~/kubeconfig && docker run $(./get_docker_params.sh) --name=<container_name> --net=host --pull=always -v ~/kubeconfig:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
    

    Supported parameters

    The following environment variables can be set on the host running the container to tweak the scenario/faults being injected:

    ex.) export <parameter_name>=<value>

    See the list of variables that apply to all scenarios here; these can be used/set in addition to the scenario-specific variables below.

    • TOTAL_CHAOS_DURATION: Chaos duration in seconds (number, default: 120)
    • NAMESPACE: Namespace containing the target VMIs (string, required)
    • VMI_NAME: Regex to match VMI names, e.g. virt-server-.* or .* for all (string, default: .*)
    • LABEL_SELECTOR: Label selector to filter VMIs, e.g. app=myapp (string, default: "")
    • INSTANCE_COUNT: Maximum number of VMIs to target (number, default: 1)
    • EXECUTION: Execution mode: serial or parallel (enum, default: serial)
    • INGRESS: Shape incoming traffic to the VM (boolean, default: true)
    • EGRESS: Shape outgoing traffic from the VM (boolean, default: true)
    • INTERFACES: Comma-separated tap interface names; empty to auto-detect (string, default: "")
    • LATENCY: Artificial latency added to packets, e.g. 100ms, 500ms (string, default: "")
    • LOSS: Packet loss percentage, e.g. 10 for 10% (string, default: "")
    • BANDWIDTH: Maximum throughput cap, e.g. 100mbit, 1gbit (string, default: "")
    • WAIT_DURATION: Seconds to wait before running the next scenario in the same file (number, default: 300)
    • IMAGE: Network chaos injection workload image (string, default: quay.io/krkn-chaos/krkn-network-chaos:latest)
    • TAINTS: List of taints for which tolerations are created, e.g. ["node-role.kubernetes.io/master:NoSchedule"] (string, default: [])
    • SERVICE_ACCOUNT: Optional service account for the scenario workload (string, default: "")

    NOTE: When using a custom metrics profile or alerts profile with CAPTURE_METRICS or ENABLE_ALERTS enabled, mount the profiles from the host on which the container is run (via podman/docker) at /home/krkn/kraken/config/metrics-aggregated.yaml and /home/krkn/kraken/config/alerts respectively. For example:

    $ podman run --name=<container_name> --net=host --pull=always --env-host=true -v <path-to-custom-metrics-profile>:/home/krkn/kraken/config/metrics-aggregated.yaml -v <path-to-custom-alerts-profile>:/home/krkn/kraken/config/alerts -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:vmi-network-chaos
    
    krknctl run vmi-network-chaos [--<parameter> <value>]
    

    You can also set any global variable listed here.

    VMI Network Chaos Parameters

    • --chaos-duration: Chaos duration in seconds (number, default: 120)
    • --namespace: Namespace containing the target VMIs (string, required)
    • --target: Regex to match VMI names, e.g. <vmi-name-prefix>-.* or .* for all (string, default: .*)
    • --label-selector: Label selector to filter VMIs, e.g. app=myapp (string, optional)
    • --instance-count: Maximum number of VMIs to target (number, default: 1)
    • --execution: Execution mode: parallel or serial (enum, default: serial)
    • --ingress: Shape incoming traffic to the VM (boolean, default: true)
    • --egress: Shape outgoing traffic from the VM (boolean, default: true)
    • --interfaces: Comma-separated tap interface names; empty to auto-detect (string, optional)
    • --latency: Artificial latency added to packets, e.g. 100ms, 500ms (string, optional)
    • --loss: Packet loss percentage, e.g. 10 for 10% (string, optional)
    • --bandwidth: Maximum throughput cap, e.g. 100mbit, 1gbit, 500kbit (string, optional)
    • --image: Network chaos injection workload image (string, default: quay.io/krkn-chaos/krkn-network-chaos:latest)
    • --taints: Comma-separated taints for which tolerations are created, e.g. node-role.kubernetes.io/master:NoSchedule (string, optional)
    • --service-account: Optional service account for the scenario workload (string, optional)
    • --wait-duration: Seconds to wait before running the next scenario in the same file (number, default: 300)

    Parameter Format Details

    VMI Selection:

    • --namespace: required; supports regex to match multiple namespaces (e.g. virt-density-.*)
    • --target: regex matched against VMI names (e.g. <vmi-name-prefix>-.* targets all VMIs whose name starts with that prefix)
    • --label-selector: Kubernetes label selector in key=value format
    • Use --instance-count to limit how many matching VMIs are targeted

    Traffic Shaping Values:

    • --latency: any value accepted by Linux tc netem delay (e.g. 100ms, 1s, 500ms)
    • --loss: integer percentage without the % symbol (e.g. 10 = 10%)
    • --bandwidth: any value accepted by Linux tc HTB rate (e.g. 100mbit, 1gbit, 500kbit)
    • At least one of --latency, --loss, or --bandwidth should be set
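    Conceptually, these three values map onto a tc netem qdisc (latency and loss) and an HTB rate class (bandwidth) on the tap device. The helper below is a hypothetical sketch that only assembles illustrative command strings following those tc semantics; the actual invocation inside the krkn-network-chaos image may differ:

```python
def build_tc_commands(interface, latency="", loss="", bandwidth=""):
    """Assemble illustrative tc commands for the given shaping values.

    Mirrors the tc netem / HTB semantics described above; it does not
    claim to reproduce the exact commands the scenario runs.
    """
    commands = []
    if bandwidth:
        # HTB enforces the throughput cap on a default class.
        commands.append(f"tc qdisc add dev {interface} root handle 1: htb default 10")
        commands.append(f"tc class add dev {interface} parent 1: classid 1:10 htb rate {bandwidth}")
    netem_opts = []
    if latency:
        netem_opts.append(f"delay {latency}")
    if loss:
        netem_opts.append(f"loss {loss}%")
    if netem_opts:
        # netem adds latency/loss; chain it under the HTB class if one exists.
        parent = "parent 1:10" if bandwidth else "root"
        commands.append(f"tc qdisc add dev {interface} {parent} netem " + " ".join(netem_opts))
    return commands

for cmd in build_tc_commands("tap0", latency="100ms", loss="10", bandwidth="100mbit"):
    print(cmd)
```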

    Interface Detection:

    • Leave --interfaces empty to let the scenario auto-detect the tap device inside the virt-launcher network namespace
    • Specify explicitly (e.g. tap0) only if auto-detection fails or you want to target a specific interface

    Example Commands

    Add latency and packet loss to all VMIs in a namespace:

    krknctl run vmi-network-chaos \
      --namespace <namespace> \
      --target ".*" \
      --latency 100ms \
      --loss 10 \
      --chaos-duration 120
    

    Bandwidth cap on a specific VMI:

    krknctl run vmi-network-chaos \
      --namespace <namespace> \
      --target "<vmi-name>" \
      --bandwidth 1mbit \
      --ingress true \
      --egress true \
      --chaos-duration 300
    

    Catastrophic combined degradation:

    krknctl run vmi-network-chaos \
      --namespace <namespace> \
      --target "<vmi-name-prefix>-.*" \
      --instance-count 3 \
      --execution parallel \
      --latency 2000ms \
      --loss 50 \
      --bandwidth 1mbit \
      --chaos-duration 180
    

    DNS blackout simulation (high latency, no packet drop):

    krknctl run vmi-network-chaos \
      --namespace <namespace> \
      --target ".*" \
      --latency 5000ms \
      --chaos-duration 60