Pod Network Scenarios
Pod outage
This scenario blocks the ingress and/or egress traffic of pods matching the given labels for a specified duration, so you can observe how the service, and other services that depend on it, behave during the downtime. The results help with planning, for example improving timeouts or tuning alerts.
With the current network policies, it is not possible to explicitly block ports that an existing network policy rule allows. This chaos scenario addresses this by using OVS flow rules to block ports related to the pod. It supports OpenShiftSDN and OVNKubernetes based networks.
1 - Pod Scenarios using Krkn
Sample scenario config (using a plugin)
- id: pod_network_outage
  config:
    namespace: openshift-console   # Required - Namespace of the pod to which the filter needs to be applied
    direction:                     # Optional - List of directions to apply filters
      - ingress                    # Blocks ingress traffic. Default: both egress and ingress
    ingress_ports:                 # Optional - List of ports to block traffic on
      - 8443                       # Blocks 8443. Default: [], i.e. all ports
    label_selector: 'component=ui' # Blocks access to openshift console
Pod Network shaping
Scenario to introduce network latency, packet loss, and bandwidth restriction in the Pod’s network interface. The purpose of this scenario is to observe faults caused by random variations in the network.
Sample scenario config for egress traffic shaping (using plugin)
- id: pod_egress_shaping
  config:
    namespace: openshift-console   # Required - Namespace of the pod to which the filter needs to be applied
    label_selector: 'component=ui' # Applies traffic shaping to access openshift console
    network_params:
      latency: 500ms               # Adds 500ms latency to egress traffic from the pod
Sample scenario config for ingress traffic shaping (using plugin)
- id: pod_ingress_shaping
  config:
    namespace: openshift-console   # Required - Namespace of the pod to which the filter needs to be applied
    label_selector: 'component=ui' # Applies traffic shaping to access openshift console
    network_params:
      latency: 500ms               # Adds 500ms latency to ingress traffic to the pod
Steps
- Pick the pods to introduce the network anomaly either from label_selector or pod_name.
- Identify the pod interface name on the node.
- Set traffic shaping config on pod’s interface using tc and netem.
- Wait for the duration time.
- Remove traffic shaping config on pod’s interface.
- Remove the job that spawned the pod.
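The shaping and cleanup steps above can be sketched with tc and netem as follows. This is a minimal dry-run sketch, not Krkn's implementation: the function name shaping_commands and the interface name veth1234 are hypothetical, and the function only prints the commands instead of running them, since applying them requires root on the node. It assumes a plain netem delay on the interface's root qdisc; ingress shaping in practice may need additional redirection that is not shown here.

```shell
#!/bin/sh
# Dry-run sketch: print the tc/netem commands for one shaping cycle.
# shaping_commands is a hypothetical helper, not part of Krkn.
shaping_commands() {
    iface="$1"     # host-side interface of the pod (assumed to be known already)
    latency="$2"   # netem delay, e.g. 500ms
    duration="$3"  # seconds to keep the shaping in place

    # Set the traffic shaping config on the pod's interface.
    echo "tc qdisc add dev ${iface} root netem delay ${latency}"
    # Wait for the test duration.
    echo "sleep ${duration}"
    # Remove the traffic shaping config again.
    echo "tc qdisc del dev ${iface} root netem"
}

# Example: 500ms latency for 120 seconds on a hypothetical veth1234.
shaping_commands veth1234 500ms 120
```

Printing the commands first makes it easy to review what would be applied before running anything on a node.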
2 - Pod Network Chaos Scenarios using Krkn-hub
This scenario runs network chaos at the pod level on a Kubernetes/OpenShift cluster.
Run
If you enable Cerberus to monitor the cluster and pass/fail the scenario post chaos, refer to the docs. Make sure to start it before injecting the chaos, and set the CERBERUS_ENABLED environment variable for the chaos injection container to autoconnect.
$ podman run --name=<container_name> --net=host --env-host=true -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:pod-network-chaos
$ podman logs -f <container_name or container_id> # Streams Kraken logs
$ podman inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs exit code which can be considered as pass/fail for the scenario
$ docker run $(./get_docker_params.sh) --name=<container_name> --net=host -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:pod-network-chaos
OR
$ docker run -e <VARIABLE>=<value> --name=<container_name> --net=host -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:pod-network-chaos
$ docker logs -f <container_name or container_id> # Streams Kraken logs
$ docker inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs exit code which can be considered as pass/fail for the scenario
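To show how the exit code can be turned into a pass/fail result, here is a small sketch. The helper name scenario_verdict is hypothetical, and the sketch assumes the usual convention that exit code 0 means the scenario passed and any other value means it failed.

```shell
#!/bin/sh
# Sketch: map a container exit code to a pass/fail verdict.
# Assumption: 0 means pass, anything else means fail.
scenario_verdict() {
    if [ "$1" -eq 0 ]; then
        echo "PASS"
    else
        echo "FAIL (exit code $1)"
    fi
}

# Example calls with literal exit codes.
scenario_verdict 0
scenario_verdict 2
```

In practice the argument would come from the inspect command shown above, run after the container has exited.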
Tip
Because the container runs with a non-root user, ensure the kube config is globally readable before mounting it in the container. You can achieve this with the following commands:
kubectl config view --flatten > ~/kubeconfig && chmod 444 ~/kubeconfig && docker run $(./get_docker_params.sh) --name=<container_name> --net=host -v ~/kubeconfig:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:<scenario>
Supported parameters
The following environment variables can be set on the host running the container to tweak the scenario/faults being injected:
example:
export <parameter_name>=<value>
See the list of variables that apply to all scenarios here; they can be set in addition to these scenario-specific variables.
Parameter | Description | Default |
---|---|---|
NAMESPACE | Required - Namespace of the pod to which the filter needs to be applied | "" |
LABEL_SELECTOR | Label of the pod(s) to target | "" |
POD_NAME | When LABEL_SELECTOR is not specified, the pod matching this name is selected for the chaos scenario | "" |
INSTANCE_COUNT | Number of pods matching the label selector to act on | 1 |
TRAFFIC_TYPE | List of directions to apply filters - egress/ingress (needs to be a list) | [ingress, egress] |
INGRESS_PORTS | Ingress ports to block (needs to be a list) | [], i.e. all ports |
EGRESS_PORTS | Egress ports to block (needs to be a list) | [], i.e. all ports |
WAIT_DURATION | Ensure that it is at least about twice TEST_DURATION | 300 |
TEST_DURATION | Duration of the test run | 120 |
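As an illustration of how these variables fit together, the sketch below exports values that mirror the pod outage sample (block ingress on port 8443 for pods labelled component=ui) and checks the duration guidance before a run. The check_durations helper is hypothetical, and the sketch assumes list values are written in the same bracketed form as the defaults in the table.

```shell
#!/bin/sh
# Illustrative values only; adjust them to the target workload.
export NAMESPACE="openshift-console"
export LABEL_SELECTOR="component=ui"
export TRAFFIC_TYPE="[ingress]"    # assumed list notation, as in the table defaults
export INGRESS_PORTS="[8443]"      # assumed list notation, as in the table defaults
export TEST_DURATION=120
export WAIT_DURATION=300

# Hypothetical pre-flight check: succeeds when the wait duration ($1)
# is at least twice the test duration ($2).
check_durations() {
    [ "$1" -ge $(( 2 * $2 )) ]
}

if check_durations "$WAIT_DURATION" "$TEST_DURATION"; then
    echo "durations ok"
else
    echo "WAIT_DURATION should be at least about twice TEST_DURATION" >&2
fi
```

With podman, variables exported this way reach the container through --env-host=true, as in the run command shown earlier.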
Note
If you use a custom metrics profile or alerts profile when CAPTURE_METRICS or ENABLE_ALERTS is enabled, mount the profile from the host on which the container is run using podman/docker under /home/krkn/kraken/config/metrics-aggregated.yaml and /home/krkn/kraken/config/alerts respectively. For example:
$ podman run --name=<container_name> --net=host --env-host=true -v <path-to-custom-metrics-profile>:/home/krkn/kraken/config/metrics-aggregated.yaml -v <path-to-custom-alerts-profile>:/home/krkn/kraken/config/alerts -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:pod-network-chaos