Network Chaos Scenario using Krkn-Hub

This scenario introduces network latency, packet loss, and bandwidth restriction in the egress traffic of a node's interface using tc and Netem. For more information, refer to the following documentation.

Run

If you enable Cerberus to monitor the cluster and pass/fail the scenario post chaos, refer to the docs. Make sure to start it before injecting the chaos, and set the CERBERUS_ENABLED environment variable so that the chaos injection container connects to it automatically.

$ podman run --name=<container_name> --net=host --env-host=true -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:network-chaos
$ podman logs -f <container_name or container_id> # Streams Kraken logs
$ podman inspect <container_name or container_id> --format "{{.State.ExitCode}}" # Outputs the exit code, which can be considered as pass/fail for the scenario

$ docker run -e <VARIABLE>=<value> --net=host -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:network-chaos
$ docker logs -f <container_name or container_id> # Streams Kraken logs
$ docker inspect <container_name or container_id> --format "{{.State.ExitCode}}" # Outputs the exit code, which can be considered as pass/fail for the scenario
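
The exit code check above can be wrapped in a small helper. This is a minimal sketch, assuming exit code 0 means the scenario passed and any other value means it failed; the function name verdict_from_exit_code and the container name in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Map a container exit code to a pass/fail verdict.
# Assumption: 0 means the scenario passed, anything else is a failure.
verdict_from_exit_code() {
  local code="$1"
  if [ "$code" -eq 0 ]; then
    echo "PASS"
  else
    echo "FAIL (exit code $code)"
  fi
}

# Hypothetical usage with a container named "network-chaos-run".
# "podman wait" blocks until the container exits and prints its exit code:
#   code=$(podman wait network-chaos-run)
#   verdict_from_exit_code "$code"
```

Waiting for the container to exit first matters because inspecting a container that is still running does not yet give the final exit code.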

Supported parameters

The following environment variables can be set on the host running the container to tweak the scenario/faults being injected:

example: export <parameter_name>=<value>

See the list of variables that apply to all scenarios here; they can be set in addition to these scenario-specific variables.
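
With podman, --env-host=true forwards the exported variables to the container. The docker command has no such flag in the Run section and uses -e instead, so each variable has to be passed explicitly. The following is a minimal sketch of a helper that builds those flags; the function name docker_env_flags is hypothetical, and it relies on docker's "-e NAME" form (no value), which copies the value from the host environment.

```shell
#!/usr/bin/env bash
# Print "-e NAME" for every listed variable that is exported in this shell.
# Variables that are not set are skipped, so the container keeps its defaults.
docker_env_flags() {
  flags=""
  for name in "$@"; do
    if printenv "$name" >/dev/null 2>&1; then
      flags="$flags -e $name"
    fi
  done
  # Strip the leading space left by the first append
  echo "${flags# }"
}

# Hypothetical usage:
#   export DURATION=120
#   docker run $(docker_env_flags DURATION NODE_NAME EGRESS) --net=host \
#     -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:network-chaos
```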

Egress Scenarios

| Parameter | Description | Default |
|---|---|---|
| DURATION | Duration in seconds during which network chaos will be applied. | 300 |
| NODE_NAME | Node name to inject faults into when targeting a specific node; multiple node names can be set, separated by commas. | "" |
| LABEL_SELECTOR | When NODE_NAME is not specified, a node with a matching label_selector is selected for running. | node-role.kubernetes.io/master |
| INSTANCE_COUNT | Targeted instance count matching the label selector. | 1 |
| INTERFACES | List of interfaces on which to apply the network restriction. | [] |
| EXECUTION | Execute the egress options together as a single scenario (parallel) or as separate scenarios (serial). | parallel |
| EGRESS | Dictionary of values to set network latency (latency: 50ms), packet loss (loss: 0.02), and bandwidth restriction (bandwidth: 100mbit). | {bandwidth: 100mbit} |

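
As an illustration of the egress parameters, the following sketch exports a set of values before starting the container. All values are illustrative assumptions: the worker label and the ens5 interface name are placeholders that should be replaced with values from the target cluster.

```shell
#!/usr/bin/env bash
# Egress settings; every value here is an example, not a recommendation.
export DURATION=120
export LABEL_SELECTOR="node-role.kubernetes.io/worker"   # assumed node label
export INSTANCE_COUNT=1
export INTERFACES="[ens5]"                               # assumed interface name
export EXECUTION=serial                                  # run each egress option separately
export EGRESS="{latency: 50ms, loss: 0.02, bandwidth: 100mbit}"

echo "EGRESS=$EGRESS"

# Then start the container as shown in the Run section, for example:
#   podman run --name=<container_name> --net=host --env-host=true \
#     -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:network-chaos
```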
Ingress Scenarios

| Parameter | Description | Default |
|---|---|---|
| DURATION | Duration in seconds during which network chaos will be applied. | 300 |
| TARGET_NODE_AND_INTERFACE | Dictionary with node name(s) as keys and a list of each node's interfaces to test as values. For example: {ip-10-0-216-2.us-west-2.compute.internal: [ens5]} | "" |
| LABEL_SELECTOR | When NODE_NAME is not specified, a node with a matching label_selector is selected for running. | node-role.kubernetes.io/master |
| INSTANCE_COUNT | Targeted instance count matching the label selector. | 1 |
| EXECUTION | Specifies whether filters are applied to interfaces one at a time (serial) or all at once (parallel). | parallel |
| NETWORK_PARAMS | latency, loss, and bandwidth are the three supported network parameters to alter for the chaos test. For example: {latency: 50ms, loss: '0.02'} | "" |
| WAIT_DURATION | Wait duration in seconds; ensure that it is at least about twice the test duration (test_duration). | 300 |
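
The ingress parameters can be set the same way. This sketch reuses the node and interface names from the table's example and derives WAIT_DURATION from DURATION, assuming that test_duration in the table corresponds to the DURATION variable. How the container is told to run the ingress rather than the egress variant is not covered by the parameters listed here.

```shell
#!/usr/bin/env bash
# Ingress settings; node and interface names are taken from the table's example.
export DURATION=120
# Keep WAIT_DURATION at about twice DURATION
# (assumption: test_duration in the table refers to DURATION).
export WAIT_DURATION=$((DURATION * 2))
export TARGET_NODE_AND_INTERFACE="{ip-10-0-216-2.us-west-2.compute.internal: [ens5]}"
export NETWORK_PARAMS="{latency: 50ms, loss: '0.02'}"
export EXECUTION=serial

echo "DURATION=$DURATION WAIT_DURATION=$WAIT_DURATION"
```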

For example, to mount a custom metrics profile and a custom alerts profile from the host into the container:

$ podman run --name=<container_name> --net=host --env-host=true -v <path-to-custom-metrics-profile>:/home/krkn/kraken/config/metrics-aggregated.yaml -v <path-to-custom-alerts-profile>:/home/krkn/kraken/config/alerts -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:network-chaos