This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Application Outage Scenarios

Application outages

Scenario to block the traffic ( Ingress/Egress ) of an application matching the labels for the specified duration of time to understand the behavior of the service/other services which depend on it during downtime. This helps with planning the requirements accordingly, be it improving the timeouts or tweaking the alerts etc.

You can add in your applications URL into the health checks section of the config to track the downtime of your application during this scenario

1 - Application Outage Scenarios using Krkn

Sample scenario config
application_outage:                                  # Scenario to create an outage of an application by blocking traffic
  duration: 600                                      # Duration in seconds after which the routes will be accessible
  namespace: <namespace-with-application>            # Namespace to target - all application routes will go inaccessible if pod selector is empty
  pod_selector: {app: foo}                            # Pods to target
  block: [Ingress, Egress]                           # It can be Ingress or Egress or Ingress, Egress
Debugging steps in case of failures

Kraken creates a network policy blocking the ingress/egress traffic to create an outage, in case of failures before reverting back the network policy, you can delete it manually by executing the following commands to stop the outage:

$ oc delete networkpolicy/kraken-deny -n <targeted-namespace>

2 - Application outage Scenario using Krkn-hub

This scenario disrupts the traffic to the specified application to be able to understand the impact of the outage on the dependent service/user experience. Refer docs for more details.

Run

If enabling Cerberus to monitor the cluster and pass/fail the scenario post chaos, refer docs. Make sure to start it before injecting the chaos and set CERBERUS_ENABLED environment variable for the chaos injection container to autoconnect.

$ podman run --name=<container_name> --net=host --env-host=true -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:application-outages
$ podman logs -f <container_name or container_id> # Streams Kraken logs
$ podman inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs exit code which can considered as pass/fail for the scenario
$ docker run $(./get_docker_params.sh) --name=<container_name> --net=host -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:application-outages
OR 
$ docker run -e <VARIABLE>=<value> --net=host -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:application-outages

$ docker logs -f <container_name or container_id> # Streams Kraken logs
$ docker inspect <container-name or container-id> --format "{{.State.ExitCode}}" # Outputs exit code which can considered as pass/fail for the scenario

Supported parameters

The following environment variables can be set on the host running the container to tweak the scenario/faults being injected:

Example if –env-host is used:

export <parameter_name>=<value>

OR on the command line like example:

-e <VARIABLE>=<value> 

See list of variables that apply to all scenarios here that can be used/set in addition to these scenario specific variables

ParameterDescriptionDefault
DURATIONDuration in seconds after which the routes will be accessible600
NAMESPACENamespace to target - all application routes will go inaccessible if pod selector is empty ( Required )No default
POD_SELECTORPods to target. For example “{app: foo}”No default
BLOCK_TRAFFIC_TYPEIt can be Ingress or Egress or Ingress, Egress ( needs to be a list )[Ingress, Egress]

For example:

$ podman run --name=<container_name> --net=host --env-host=true -v <path-to-custom-metrics-profile>:/home/krkn/kraken/config/metrics-aggregated.yaml -v <path-to-custom-alerts-profile>:/home/krkn/kraken/config/alerts -v <path-to-kube-config>:/home/krkn/.kube/config:Z -d quay.io/krkn-chaos/krkn-hub:application-outages

Demo

You can find a link to a demo of the scenario here