Application Outage Scenarios
Application outages
Rollback Scenario Support
Krkn supports rollback for Application outages. For more details, please refer to the Rollback Scenarios documentation.
Debugging steps in case of failures
Kraken creates a network policy that blocks ingress/egress traffic to create the outage. If a run fails before Kraken reverts the network policy, you can delete the policy manually with the following command to stop the outage:
$ oc delete networkpolicy/kraken-deny -n <targeted-namespace>
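To make the mechanism concrete, here is a minimal sketch of what a deny-style NetworkPolicy can look like, built as a Python dictionary. It is an illustration of the Kubernetes technique, not necessarily the exact manifest Kraken generates. The policy name `kraken-deny` is taken from the delete command above; the helper name `deny_policy` and the namespace and labels in the example are hypothetical.

```python
import json

ALLOWED_TYPES = {"Ingress", "Egress"}


def deny_policy(namespace, pod_selector=None, block=("Ingress", "Egress"), name="kraken-deny"):
    """Build a NetworkPolicy manifest that denies the listed traffic directions.

    Hypothetical helper for illustration; Kraken's own manifest may differ.
    """
    unknown = set(block) - ALLOWED_TYPES
    if unknown:
        raise ValueError(f"unsupported policy types: {sorted(unknown)}")
    # An empty podSelector selects every pod in the namespace.
    selector = {"matchLabels": dict(pod_selector)} if pod_selector else {}
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": selector,
            # Listing a policy type without any allow rule denies that direction.
            "policyTypes": list(block),
        },
    }


if __name__ == "__main__":
    # Example values only; kubectl and oc accept JSON manifests as well as YAML.
    print(json.dumps(deny_policy("my-app", {"app": "foo"}), indent=2))
```

Because such a policy contains no allow rules, removing it (as in the command above) is enough to restore traffic.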
How to Run Application Outage Scenarios
Choose your preferred method to run application outage scenarios:
Sample scenario config
application_outage:                         # Scenario to create an outage of an application by blocking traffic
  duration: 600                             # Duration in seconds after which the routes will be accessible
  namespace: <namespace-with-application>   # Namespace to target - all application routes will go inaccessible if pod selector is empty
  pod_selector: {app: foo}                  # Pods to target
  exclude_label: ""                         # Optional label selector to exclude pods. Supports dict, string, or list format
  block: [Ingress, Egress]                  # It can be Ingress or Egress or Ingress, Egress
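A scenario file like the one above can be sanity-checked before a run. The sketch below is a hypothetical pre-flight check based only on the comments in the sample config; it is not Krkn's own validation logic, and the function name `validate_scenario` is an assumption. It works on an already-parsed dictionary (for example, the result of loading the YAML file with a YAML parser).

```python
ALLOWED_BLOCK = {"Ingress", "Egress"}


def validate_scenario(config):
    """Return a list of problems found in an application_outage scenario dict.

    Hypothetical check derived from the sample config comments.
    """
    scenario = config.get("application_outage")
    if not isinstance(scenario, dict):
        return ["missing top-level 'application_outage' mapping"]

    problems = []
    duration = scenario.get("duration")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or duration <= 0:
        problems.append("duration must be a positive number of seconds")
    if not scenario.get("namespace"):
        problems.append("namespace is required")
    block = scenario.get("block")
    valid_block = (
        isinstance(block, list)
        and len(block) > 0
        and all(isinstance(item, str) and item in ALLOWED_BLOCK for item in block)
    )
    if not valid_block:
        problems.append("block must be a non-empty list drawn from Ingress, Egress")
    return problems


# Example values mirroring the sample config above.
sample = {
    "application_outage": {
        "duration": 600,
        "namespace": "my-app",
        "pod_selector": {"app": "foo"},
        "exclude_label": "",
        "block": ["Ingress", "Egress"],
    }
}
print(validate_scenario(sample))
```

An empty list means none of these checks failed; it does not guarantee that the scenario will run successfully.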
How to Use Plugin Name
Add the plugin name to the chaos_scenarios list in the config/config.yaml file:
kraken:
    kubeconfig_path: ~/.kube/config # Path to kubeconfig
    ..
    chaos_scenarios:
        - application_outages_scenarios:
            - scenarios/<scenario_name>.yaml
Run
python run_kraken.py --config config/config.yaml
This scenario disrupts traffic to the specified application so that you can understand the impact of the outage on dependent services and the user experience. Refer to the docs for more details.
Run
To have Cerberus monitor the cluster and pass/fail the scenario after the chaos, refer to the docs. Start Cerberus before injecting the chaos, and set the CERBERUS_ENABLED environment variable for the chaos injection container so that it connects automatically.
$ podman run --name=<container_name> --net=host --env-host=true --pull=always -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d containers.krkn-chaos.dev/krkn-chaos/krkn-hub:application-outages
$ podman logs -f <container_name or container_id> # Streams Kraken logs
$ podman inspect <container_name or container_id> --format "{{.State.ExitCode}}" # Outputs the exit code, which can be considered as pass/fail for the scenario
Note
--env-host: This option is not available with the remote Podman client, including Mac and Windows (excluding WSL2) machines. Without the --env-host option you’ll have to set each environment variable on the podman command line like -e <VARIABLE>=<value>
$ docker run $(./get_docker_params.sh) --name=<container_name> --net=host --pull=always -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d containers.krkn-chaos.dev/krkn-chaos/krkn-hub:application-outages
OR
$ docker run -e <VARIABLE>=<value> --net=host --pull=always -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d containers.krkn-chaos.dev/krkn-chaos/krkn-hub:application-outages
$ docker logs -f <container_name or container_id> # Streams Kraken logs
$ docker inspect <container_name or container_id> --format "{{.State.ExitCode}}" # Outputs the exit code, which can be considered as pass/fail for the scenario
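The exit-code check above can be scripted, for example in a CI job. The following is a minimal sketch that wraps the same `inspect` command for either runtime; the function names are hypothetical, and treating exit code 0 as a pass is an assumption based on the usual convention rather than something this page states. Run it only once the container has stopped.

```python
import subprocess


def parse_exit_code(output):
    """Convert the text printed by `inspect --format` into an integer."""
    return int(output.strip())


def scenario_passed(container, runtime="podman"):
    """Return True when the stopped container exited with code 0.

    Assumption: exit code 0 means the scenario passed.
    `runtime` can be "podman" or "docker"; both accept the same arguments here.
    """
    result = subprocess.run(
        [runtime, "inspect", container, "--format", "{{.State.ExitCode}}"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the inspect command itself fails
    )
    return parse_exit_code(result.stdout) == 0
```

For example, `scenario_passed("<container_name>", runtime="docker")` would inspect a container started with the docker commands above.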
Tip
Because the container runs with a non-root user, ensure the kube config is globally readable before mounting it in the container. You can achieve this with the following commands:
kubectl config view --flatten > ~/kubeconfig && chmod 444 ~/kubeconfig && docker run $(./get_docker_params.sh) --name=<container_name> --net=host -v ~/kubeconfig:/home/krkn/.kube/config:Z -d containers.krkn-chaos.dev/krkn-chaos/krkn-hub:<scenario>
Supported parameters
The following environment variables can be set on the host running the container to tweak the scenario/faults being injected:
Example if --env-host is used:
export <VARIABLE>=<value>
OR on the command line, for example:
-e <VARIABLE>=<value>
See the list of variables that apply to all scenarios here; they can be set in addition to these scenario-specific variables.
| Parameter | Description | Default |
|---|---|---|
| DURATION | Duration in seconds after which the routes will be accessible | 600 |
| NAMESPACE | Namespace to target - all application routes will go inaccessible if pod selector is empty (Required) | No default |
| POD_SELECTOR | Pods to target. For example “{app: foo}” | No default |
| EXCLUDE_LABEL | Pods to exclude after getting list of pods from POD_SELECTOR to target. For example “{app: foo}” | No default |
| BLOCK_TRAFFIC_TYPE | It can be Ingress or Egress or Ingress, Egress (needs to be a list) | [Ingress, Egress] |
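When `--env-host` is not available, every variable from the table has to be passed with `-e`. The sketch below assembles such a command line from a dictionary. The image name, mount path, and flags are copied from the commands on this page; the helper names, the container name `krkn-outage`, the example values, and the comma-joined rendering of multiple labels are assumptions made for illustration.

```python
import shlex

IMAGE = "containers.krkn-chaos.dev/krkn-chaos/krkn-hub:application-outages"


def format_selector(labels):
    """Render labels as "{key: value}", with a space after each colon.

    Joining several labels with ", " is an assumption; this page only
    shows a single key/value pair.
    """
    return "{" + ", ".join(f"{key}: {value}" for key, value in labels.items()) + "}"


def build_run_command(kubeconfig, env, runtime="podman", name="krkn-outage"):
    """Return the container run command as an argument list (hypothetical helper)."""
    if not env.get("NAMESPACE"):
        raise ValueError("NAMESPACE is required")
    cmd = [runtime, "run", f"--name={name}", "--net=host", "--pull=always"]
    for key, value in env.items():
        cmd += ["-e", f"{key}={value}"]
    cmd += ["-v", f"{kubeconfig}:/home/krkn/.kube/config:Z", "-d", IMAGE]
    return cmd


# Example values only.
env = {
    "NAMESPACE": "my-app",
    "DURATION": 300,
    "POD_SELECTOR": format_selector({"app": "foo"}),
}
print(shlex.join(build_run_command("/home/me/kubeconfig", env)))
```

Building the command as a list and quoting it with `shlex.join` keeps values such as `{app: foo}`, which contain spaces and braces, intact when the line is pasted into a shell.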
Note
Defining the NAMESPACE parameter is required for running this scenario, while the pod_selector is optional. When using a pod selector to target a particular application, make sure to define it using the following format, with a space between key and value: “{key: value}”.
Note
When using a custom metrics profile or alerts profile with CAPTURE_METRICS or ENABLE_ALERTS enabled, mount the profile from the host on which the container is run using podman/docker under /home/krkn/kraken/config/metrics-aggregated.yaml and /home/krkn/kraken/config/alerts.
$ podman run --name=<container_name> --net=host --env-host=true --pull=always -v <path-to-custom-metrics-profile>:/home/krkn/kraken/config/metrics-aggregated.yaml -v <path-to-custom-alerts-profile>:/home/krkn/kraken/config/alerts -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d containers.krkn-chaos.dev/krkn-chaos/krkn-hub:application-outages
krknctl run application-outages (optional: --<parameter>:<value>)
You can also set any global variable listed here.
Scenario specific parameters:
| Parameter | Description | Type | Required | Default |
|---|---|---|---|---|
| | Namespace to target - all application routes will go inaccessible if pod selector is empty | string | True | |
| | Set chaos duration (in sec) as desired | number | False | 600 |
| | Pods to target. For example “{app: foo}” | string | False | |
| | Pods to exclude after using pod-selector to target. For example “{app: foo}” | string | False | |
| | It can be [Ingress] or [Egress] or [Ingress, Egress] | string | False | “[Ingress, Egress]” |
To see all available scenario options
krknctl run application-outages --help
Demo
See a demo of this scenario: