Krkn recently replaced PowerfulSeal with its own internal pod scenarios using a plugin system. This scenario disrupts the pods matching the label in the specified namespace on a Kubernetes/OpenShift cluster.
1 - Pod Scenarios using Krkn
Example Config
To enable a pod scenario, add the path to its scenario file under plugin_scenarios in the chaos_scenarios section of the Krkn config:
kraken:
  chaos_scenarios:
    - plugin_scenarios:
        - path/to/scenario.yaml
You can then create the scenario file with the following contents:
# yaml-language-server: $schema=../plugin.schema.json
- id: kill-pods
  config:
    namespace_pattern: ^kube-system$
    label_selector: k8s-app=kube-scheduler
    krkn_pod_recovery_time: 120
Please adjust the schema reference to point to the schema file. This file will give you code completion and documentation for the available options in your IDE.
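Before running the scenario, it can help to preview which pods the config above would select. The following is a minimal sketch and not part of Krkn: `preview_targets` is a hypothetical helper that lists namespaces with `kubectl`, filters them with `grep -E` (only an approximation of the regex matching Krkn applies to `namespace_pattern`), and then lists pods by label selector.

```shell
# Hypothetical helper (not part of Krkn): list the pods a kill-pods
# scenario would likely target, given a namespace regex and a label selector.
preview_targets() {
  ns_pattern=$1
  label_selector=$2
  # `-o name` prints namespace/<name>; strip the prefix so the regex
  # anchors (^ and $) apply to the bare namespace name.
  kubectl get namespaces -o name |
    sed 's|^namespace/||' |
    grep -E -- "$ns_pattern" |
    while read -r ns; do
      # Print each matching pod as <namespace>/<pod-name>.
      kubectl get pods -n "$ns" -l "$label_selector" -o name |
        sed "s|^pod/|$ns/|"
    done
}

# Usage, with the values from the scenario file:
#   preview_targets '^kube-system$' 'k8s-app=kube-scheduler'
```

If the helper prints nothing, either no namespace matched the pattern or no pod carries the label, and the scenario would likely have nothing to disrupt.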
Pod Chaos Scenarios
The following are the components of Kubernetes/OpenShift for which a basic chaos scenario config exists today.
Component | Description | Working |
---|---|---|
Basic pod scenario | Kill a pod. | :heavy_check_mark: |
Etcd | Kills a single/multiple etcd replicas. | :heavy_check_mark: |
Kube ApiServer | Kills a single/multiple kube-apiserver replicas. | :heavy_check_mark: |
ApiServer | Kills a single/multiple apiserver replicas. | :heavy_check_mark: |
Prometheus | Kills a single/multiple prometheus replicas. | :heavy_check_mark: |
OpenShift System Pods | Kills random pods running in the OpenShift system namespaces. | :heavy_check_mark: |
2 - Pod Scenarios using Krkn-hub
This scenario disrupts the pods matching the label in the specified namespace on a Kubernetes/OpenShift cluster.
Run
To have Cerberus monitor the cluster and pass/fail the scenario after the chaos, refer to the docs. Make sure to start it before injecting the chaos, and set the CERBERUS_ENABLED environment variable so that the chaos injection container connects to it automatically.
$ podman run --name=<container_name> --net=host --env-host=true -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:pod-scenarios
$ podman logs -f <container_name or container_id> # Streams Kraken logs
$ podman inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs the exit code, which can be considered as pass/fail for the scenario
$ docker run $(./get_docker_params.sh) --name=<container_name> --net=host -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:pod-scenarios
OR
$ docker run -e <VARIABLE>=<value> --name=<container_name> --net=host -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:pod-scenarios
$ docker logs -f <container_name or container_id> # Streams Kraken logs
$ docker inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs the exit code, which can be considered as pass/fail for the scenario
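Because the container is started detached (`-d`), the exit code is only meaningful once the container has stopped. The following is a minimal sketch of waiting for the run and turning the exit code into a verdict. The helper names `scenario_verdict` and `wait_for_scenario` are hypothetical; the sketch relies on the `wait` subcommand of podman and docker, which blocks until the container stops and prints its exit code.

```shell
# Hypothetical helper: map a container exit code to a scenario verdict.
scenario_verdict() {
  if [ "$1" -eq 0 ]; then
    echo "PASS"
  else
    echo "FAIL (exit code $1)"
  fi
}

# Hypothetical helper: block until the chaos container stops, then report.
# $1 is the runtime (podman or docker), $2 the container name or ID.
wait_for_scenario() {
  # `wait` blocks until the container exits and prints its exit code.
  exit_code=$("$1" wait "$2")
  scenario_verdict "$exit_code"
}

# Usage:
#   wait_for_scenario podman <container_name>
```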
Tip
Because the container runs with a non-root user, ensure the kube config is globally readable before mounting it in the container. You can achieve this with the following commands:
$ kubectl config view --flatten > ~/kubeconfig && chmod 444 ~/kubeconfig && docker run $(./get_docker_params.sh) --name=<container_name> --net=host -v ~/kubeconfig:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:<scenario>
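To check the permission before mounting, a small test like the one below may be useful. This is a sketch only: `is_world_readable` is a hypothetical helper, and it inspects just the file's permission bits with `find -perm`, not SELinux labels or parent directory permissions.

```shell
# Hypothetical helper: succeed only if the file is readable by "others",
# which a non-root user inside the container needs for a mounted kubeconfig.
is_world_readable() {
  # `find -perm -004` prints the path only when the others-read bit is set.
  [ -n "$(find "$1" -maxdepth 0 -type f -perm -004 2>/dev/null)" ]
}

# Usage:
#   is_world_readable ~/kubeconfig || chmod 444 ~/kubeconfig
```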
Supported parameters
The following environment variables can be set on the host running the container to tweak the scenario/faults being injected:
example:
export <parameter_name>=<value>
See the list of variables that apply to all scenarios here; they can be set in addition to the scenario-specific variables below.
Parameter | Description | Default |
---|---|---|
NAMESPACE | Targeted namespace in the cluster (supports regex) | openshift-.* |
POD_LABEL | Label of the pod(s) to target | "" |
NAME_PATTERN | Regex pattern to match the pods in NAMESPACE when POD_LABEL is not specified | .* |
DISRUPTION_COUNT | Number of pods to disrupt | 1 |
KILL_TIMEOUT | Timeout, in seconds, to wait for the target pod(s) to be removed | 180 |
EXPECTED_RECOVERY_TIME | Fails the scenario if the disrupted pods do not recover within this timeout | 120 |
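With podman, `--env-host=true` passes the exported variables to the container; with docker, each one is passed with `-e`. The following is a minimal sketch of building those `-e` arguments from the variables in the table. `scenario_env_flags` is a hypothetical helper written for illustration; it does not reproduce what `get_docker_params.sh` does.

```shell
# Scenario-specific variables from the table above.
SCENARIO_VARS="NAMESPACE POD_LABEL NAME_PATTERN DISRUPTION_COUNT KILL_TIMEOUT EXPECTED_RECOVERY_TIME"

# Hypothetical helper: print one "-e NAME=value" line for each scenario
# variable that is set to a non-empty value in the current environment.
scenario_env_flags() {
  for name in $SCENARIO_VARS; do
    # Indirect lookup of the variable called $name; empty if unset.
    eval "value=\${$name-}"
    if [ -n "$value" ]; then
      printf '%s %s=%s\n' -e "$name" "$value"
    fi
  done
}

# Example: target one kube-scheduler pod and tighten the expected
# recovery time to 60 (example values only).
export NAMESPACE='^kube-system$'
export POD_LABEL='k8s-app=kube-scheduler'
export DISRUPTION_COUNT=1
export EXPECTED_RECOVERY_TIME=60
scenario_env_flags
```

The printed arguments could then take the place of the individual `-e <VARIABLE>=<value>` arguments in the `docker run` command. This simple form assumes the values contain no whitespace; variables left unset fall back to the defaults in the table.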
Note
Set the NAMESPACE environment variable to openshift-.* to pick and disrupt pods randomly in the OpenShift system namespaces. DAEMON_MODE can also be enabled to disrupt the pods every x seconds in the background to check reliability.
Note
If you use a custom metrics profile or alerts profile when CAPTURE_METRICS or ENABLE_ALERTS is enabled, mount the profiles from the host on which the container is run using podman/docker under /home/krkn/kraken/config/metrics-aggregated.yaml and /home/krkn/kraken/config/alerts.
$ podman run --name=<container_name> --net=host --env-host=true -v <path-to-custom-metrics-profile>:/home/krkn/kraken/config/metrics-aggregated.yaml -v <path-to-custom-alerts-profile>:/home/krkn/kraken/config/alerts -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:pod-scenarios
Demo
You can find a link to a demo of the scenario here