Developers Guide

Developers Guide Overview

1: Krkn-lib
2: Adding scenarios via plugin api
3: Adding New Scenario to Krkn-hub
4: Adding New Scenario to Krknctl
5: Testing your changes

This document describes how to develop and add to Krkn. Before you start, it is recommended that you read the following documents first:

Be sure to properly install Krkn. Then you can start to develop krkn. The following documents will help you get started:

Add k8s functionality to krkn-lib
Add a New Chaos Scenario using Plugin API: Adding a new scenario into krkn
Test your changes

NOTE: All base kubernetes functionality should be added into krkn-lib and called from krkn

Once a scenario is added to krkn, changes will also be needed in krkn-hub and krknctl. See the steps below for help editing krkn-hub and krknctl:

Add New Scenario to Krkn-hub and test your changes
Add New Scenario to Krknctl and test your changes

Questions?

For any questions or further guidance, feel free to reach out to us on the Kubernetes workspace in the #krkn channel. We’re happy to assist. Now, release the Krkn!

Follow Contribution Guide

Once you’re happy with your changes, follow the contribution guide to create your own branch and squash your commits.

1 - Krkn-lib

Krkn-lib contains the base kubernetes python functions

PyPI

krkn-lib

Krkn Chaos and resiliency testing tool Foundation Library

The library contains classes, models, and helper functions used in Krkn to interact with Kubernetes, OpenShift, and other external APIs. Its goal is to give developers the building blocks for new chaos scenarios and to increase the testability and modularity of the Krkn codebase.

Packages

The library is subdivided into several packages under src/krkn_lib:

ocp: Openshift Integration
k8s: Kubernetes Integration
elastic: Collection of ElasticSearch functions for posting telemetry
prometheus: Collection of prometheus functions for collecting metrics and alerts
telemetry:
- k8s: Kubernetes Telemetry collection and distribution
- ocp: Openshift Telemetry collection and distribution
models: Krkn shared data models
- k8s: Kubernetes objects model
- krkn: Krkn base models
- telemetry: Telemetry collection model
- elastic: Elastic model for data
utils: common functions

Documentation and Available Functions

The Library documentation of available functions is here. The documentation is automatically generated by Sphinx on top of the reStructuredText Docstring Format comments present in the code.

Installation

Git

Clone the repository

git clone https://github.com/krkn-chaos/krkn-lib
cd krkn-lib

Install the dependencies

krkn-lib uses Poetry for dependency management and packaging. To install the required packages, run:

$ pip install poetry
$ poetry install --no-interaction

Testing your changes

To learn how to configure and test your changes, see Testing your changes.

2 - Adding scenarios via plugin api

Scenario Plugin API:

This API enables seamless integration of Scenario Plugins for Krkn. Plugins are automatically detected and loaded by the plugin loader, provided they extend the AbstractPluginScenario abstract class, implement the required methods, and adhere to the specified naming conventions.

Plugin folder:

The plugin loader automatically loads plugins found in the krkn/scenario_plugins directory, relative to the Krkn root folder. Each plugin must reside in its own directory and can consist of one or more Python files. The entry point for each plugin is a Python class that extends the AbstractPluginScenario abstract class and implements its required methods.

`__init__.py` file

For the plugin to be found by the plugin API, there must be an __init__.py file in the plugin's base folder.

For example: __init__.py
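The folder layout described above can be sketched with a few lines of Python. This is only an illustration: the helper name scaffold_plugin and the plugin name "example" are hypothetical, and the file name follows the naming conventions described later in this section.

```python
import tempfile
from pathlib import Path


def scaffold_plugin(krkn_root: Path, name: str) -> Path:
    """Create krkn/scenario_plugins/<name>/ with the two files a plugin needs."""
    plugin_dir = krkn_root / "krkn" / "scenario_plugins" / name
    plugin_dir.mkdir(parents=True, exist_ok=True)
    # An empty __init__.py marks the folder as a Python package.
    (plugin_dir / "__init__.py").touch()
    # Module that will hold the class extending AbstractPluginScenario.
    (plugin_dir / f"{name}_scenario_plugin.py").touch()
    return plugin_dir


if __name__ == "__main__":
    # Use a temporary directory in place of a real Krkn checkout.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_plugin(Path(tmp), "example")
        for path in sorted(created.iterdir()):
            print(path.relative_to(tmp))
```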

`AbstractPluginScenario` abstract class:

This abstract class defines the contract between the plugin and krkn. It consists of two methods:

run(...)
get_scenario_types()

Most IDEs, such as PyCharm, can automatically suggest and implement the abstract methods defined in AbstractPluginScenario.

`run(...)`

    def run(
        self,
        run_uuid: str,
        scenario: str,
        krkn_config: dict[str, any],
        lib_telemetry: KrknTelemetryOpenshift,
        scenario_telemetry: ScenarioTelemetry,
    ) -> int:

This method represents the entry point of the plugin and the first method that will be executed.

Parameters:

run_uuid:
- the uuid of the chaos run generated by krkn for every single run.
scenario:
- the config file of the scenario that is currently executed
krkn_config:
- the full dictionary representation of the config.yaml
lib_telemetry:
- a composite object of all the krkn-lib objects and methods needed by a krkn plugin to run.
scenario_telemetry:
- the ScenarioTelemetry object of the scenario that is currently executed

Note

Helper functions for interactions in Krkn are part of krkn-lib. Please feel free to reuse and expand them as you see fit when adding a new scenario or expanding the capabilities of the current supported scenarios.

Return value:

Returns 0 if the scenario succeeds and 1 if it fails.

WARNING

All exceptions must be handled inside the run method and must not be propagated.
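The return-value and exception rules above can be sketched as follows. To keep the example self-contained, AbstractPluginScenario is replaced here by a local stand-in with untyped parameters; a real plugin would import the class from Krkn instead. The _inject_chaos helper and the "example_scenarios" type are hypothetical.

```python
import logging
from abc import ABC, abstractmethod


class AbstractPluginScenario(ABC):
    """Local stand-in for Krkn's abstract class (assumption, for illustration)."""

    @abstractmethod
    def run(self, run_uuid, scenario, krkn_config, lib_telemetry, scenario_telemetry) -> int:
        ...

    @abstractmethod
    def get_scenario_types(self) -> list[str]:
        ...


class ExampleScenarioPlugin(AbstractPluginScenario):
    def run(self, run_uuid, scenario, krkn_config, lib_telemetry, scenario_telemetry) -> int:
        try:
            self._inject_chaos(scenario)
            return 0  # scenario succeeded
        except Exception as exc:
            # Handle every exception here: nothing may propagate out of run().
            logging.error("scenario failed (run %s): %s", run_uuid, exc)
            return 1  # scenario failed

    def _inject_chaos(self, scenario: str) -> None:
        # Hypothetical helper standing in for the real chaos logic.
        if not scenario:
            raise ValueError("no scenario config file given")

    def get_scenario_types(self) -> list[str]:
        return ["example_scenarios"]
```

Catching the broad Exception class is deliberate in this sketch, since the contract is that run reports failure only through its return value.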

`get_scenario_types()`:

    def get_scenario_types(self) -> list[str]:

Returns the scenario types, as specified in config.yaml, that this plugin handles. For the plugin to be loaded, recognized, and executed, this method must be implemented and must return one or more strings matching the scenario_type strings set in the config.

DANGER

Multiple strings can map to a single ScenarioPlugin, but the same string cannot map to different plugins; an exception is thrown on scenario_type redefinition.

INFO

The scenario_type strings must be unique across all plugins; otherwise, an exception will be thrown.
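The uniqueness rule can be illustrated with a small registry. This is not the actual plugin loader code, only a sketch of the rule it enforces; the function name build_type_registry is hypothetical, and the class and type names are taken from the log output shown later in this section.

```python
def build_type_registry(plugins: dict[str, list[str]]) -> dict[str, str]:
    """Map each scenario_type to the single plugin class that declares it."""
    registry: dict[str, str] = {}
    for class_name, scenario_types in plugins.items():
        for scenario_type in scenario_types:
            owner = registry.get(scenario_type)
            if owner is not None and owner != class_name:
                # The same string may not map to two different plugins.
                raise ValueError(
                    f"scenario_type {scenario_type!r} redefined by "
                    f"{class_name} (already mapped to {owner})"
                )
            registry[scenario_type] = class_name
    return registry


if __name__ == "__main__":
    # Several types mapping to one plugin is allowed.
    print(build_type_registry({
        "ArcaflowScenarioPlugin": ["hog_scenarios", "arcaflow_scenario"],
        "ContainerScenarioPlugin": ["container_scenarios"],
    }))
```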

Naming conventions:

A key requirement for developing a plugin that will be properly loaded by the plugin loader is following the established naming conventions. These conventions are enforced to maintain a uniform and readable codebase, making it easier to onboard new developers from the community.

plugin folder:

the plugin folder must be placed in the krkn/scenario_plugins folder, relative to the krkn root folder
the plugin folder name cannot contain the words
- plugin
- scenario

plugin file name and class name:

the plugin file containing the main plugin class must be named in snake case and must have the suffix _scenario_plugin:
- example_scenario_plugin.py
the main plugin class must be named in capital camel case and must have the suffix ScenarioPlugin:
- ExampleScenarioPlugin
the file name must match the class name in the respective syntax:
- example_scenario_plugin.py -> ExampleScenarioPlugin
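As a rough self-check before running Krkn, the folder, file, and class rules above can be expressed in a few lines. The function names and regular expressions are assumptions written from the rules as stated here; the plugin loader remains the authority on what is accepted. The code uses str.removesuffix, so it needs Python 3.9 or later.

```python
import re


def expected_class_name(file_name: str) -> str:
    """Convert a snake case file name to the matching capital camel case class name."""
    stem = file_name.removesuffix(".py")
    return "".join(part.capitalize() for part in stem.split("_"))


def follows_conventions(folder: str, file_name: str, class_name: str) -> bool:
    # The folder name cannot contain the words "plugin" or "scenario".
    if "plugin" in folder or "scenario" in folder:
        return False
    # Snake case file name with a non-empty prefix and the _scenario_plugin suffix.
    if not re.fullmatch(r"[a-z0-9]+(_[a-z0-9]+)*_scenario_plugin\.py", file_name):
        return False
    # Capitalized class name with a non-empty prefix and the ScenarioPlugin suffix.
    if not re.fullmatch(r"[A-Z][A-Za-z0-9]*ScenarioPlugin", class_name):
        return False
    # File name and class name must match in their respective syntax.
    return expected_class_name(file_name) == class_name


if __name__ == "__main__":
    print(follows_conventions("example", "example_scenario_plugin.py", "ExampleScenarioPlugin"))
    print(follows_conventions("example", "example_scenario_plugin.py", "ExamplePluginScenario"))
```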

scenario type:

the scenario type must be unique across all scenarios.

logging:

If your new scenario does not adhere to the naming conventions, an error log will be generated in the Krkn standard output, providing details about the issue:

2024-10-03 18:06:31,136 [INFO] 📣 `ScenarioPluginFactory`: types from config.yaml mapped to respective classes for execution:
2024-10-03 18:06:31,136 [INFO]   ✅ type: application_outages_scenarios ➡️ `ApplicationOutageScenarioPlugin` 
2024-10-03 18:06:31,136 [INFO]   ✅ types: [hog_scenarios, arcaflow_scenario] ➡️ `ArcaflowScenarioPlugin` 
2024-10-03 18:06:31,136 [INFO]   ✅ type: container_scenarios ➡️ `ContainerScenarioPlugin` 
2024-10-03 18:06:31,136 [INFO]   ✅ type: managedcluster_scenarios ➡️ `ManagedClusterScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ types: [pod_disruption_scenarios, pod_network_scenario, vmware_node_scenarios, ibmcloud_node_scenarios] ➡️ `NativeScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: network_chaos_scenarios ➡️ `NetworkChaosScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: node_scenarios ➡️ `NodeActionsScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: pvc_scenarios ➡️ `PvcScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: service_disruption_scenarios ➡️ `ServiceDisruptionScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: service_hijacking_scenarios ➡️ `ServiceHijackingScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: cluster_shut_down_scenarios ➡️ `ShutDownScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: syn_flood_scenarios ➡️ `SynFloodScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: time_scenarios ➡️ `TimeActionsScenarioPlugin` 
2024-10-03 18:06:31,137 [INFO]   ✅ type: zone_outages_scenarios ➡️ `ZoneOutageScenarioPlugin`

2024-09-18 14:48:41,735 [INFO] Failed to load Scenario Plugins:

2024-09-18 14:48:41,735 [ERROR] ⛔ Class: ExamplePluginScenario Module: krkn.scenario_plugins.example.example_scenario_plugin
2024-09-18 14:48:41,735 [ERROR] ⚠️ scenario plugin class name must start with a capital letter, end with `ScenarioPlugin`, and cannot be just `ScenarioPlugin`.

INFO

If you’re trying to understand how the scenario types in the config.yaml are mapped to their corresponding plugins, this log will guide you! Each scenario plugin class mentioned can be found in the krkn/scenario_plugins folder: remove the ScenarioPlugin suffix from the class name and convert the rest from camel case to snake case. For example, the ShutDownScenarioPlugin class can be found in the krkn/scenario_plugins/shut_down folder.
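The class-to-folder conversion described above can be written as a short helper. The function name plugin_folder is hypothetical, and the krkn/scenario_plugins base path is taken from the module path shown in the log output.

```python
import re


def plugin_folder(class_name: str) -> str:
    """Return the folder where a scenario plugin class is expected to live."""
    # Drop the ScenarioPlugin suffix (requires Python 3.9 or later).
    base = class_name.removesuffix("ScenarioPlugin")
    # Insert "_" before every capital letter except the first, then lowercase.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", base).lower()
    return f"krkn/scenario_plugins/{snake}"


if __name__ == "__main__":
    print(plugin_folder("ShutDownScenarioPlugin"))
```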

ExampleScenarioPlugin

The ExampleScenarioPlugin class included in the tests folder can be used as a scaffolding for new plugins and it is considered part of the documentation.

Adding CI tests

Depending on the complexity of the new scenario, we would appreciate a CI test for it being added to our GitHub Actions workflow, which runs on each PR.
To add a test:

Add a test script in the CI tests folder
Add a scenario file that can run on a 1-node kind cluster
Add it to the list of tests to be run as part of the functional CI https://github.com/krkn-chaos/krkn/blob/9337052e7bf5c14ab38928792ea02cdf93da157c/.github/workflows/tests.yml#L79-L91

3 - Adding New Scenario to Krkn-hub

Adding/Editing a New Scenario to Krkn-hub

Create a folder with the scenario name under krkn-hub
Create a generic scenario template with environment variables
a. See scenario.yaml for example
b. Almost all parameters should be set using a variable (these will be set in the env.sh file or through command-line environment variables)
Add defaults for any environment variables in an “env.sh” file
a. See env.sh for example
Create a run.sh script to run the chaos scenario
a. See run.sh for example
b. edit line 16 with your scenario yaml template
c. edit line 17 and 23 with your yaml config location

Create Dockerfile template

a. See dockerfile template for example

b. Lines to edit

 i. 12: replace "application-outages" with your folder name

 ii. 14: replace "application-outages" with your folder name

 iii. 17: replace "application-outages" with your scenario name

 iv. 18: replace description with a description of your new scenario

Add service/scenario to docker-compose.yaml file following syntax of other services
Point the dockerfile parameter in your docker-compose to the Dockerfile file in your new folder
Add the folder name to the list of scenarios in build.sh
Update the krkn website and main README with new scenario type

NOTE:

If you added any main configuration variables or new sections be sure to update config.yaml.template
Similar to above, also add the default parameter values to env.sh
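The template-plus-defaults pattern from the steps above (a scenario template whose parameters are variables, with defaults supplied separately and overridable from the environment) can be illustrated as follows. krkn-hub implements this with shell scripts; the Python below only sketches the idea, and the variable names DURATION and NAMESPACE, their defaults, and the template content are all hypothetical.

```python
import os
from string import Template

# Hypothetical defaults, playing the role of env.sh.
DEFAULTS = {"DURATION": "60", "NAMESPACE": "default"}

# Hypothetical scenario template, playing the role of the scenario yaml template.
SCENARIO_TEMPLATE = Template(
    "example_scenario:\n"
    "  duration: $DURATION\n"
    "  namespace: $NAMESPACE\n"
)


def render_scenario(environ=None) -> str:
    """Fill the template, letting environment variables override the defaults."""
    env = os.environ if environ is None else environ
    values = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return SCENARIO_TEMPLATE.substitute(values)


if __name__ == "__main__":
    print(render_scenario({"NAMESPACE": "openshift-etcd"}))
```

Keeping every parameter behind a variable with a default means a scenario can run unmodified, while still being configurable at container start.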

4 - Adding New Scenario to Krknctl

Adding Scenario to Krknctl

Adding a New Scenario to Krknctl

For krknctl to find the parameters of the scenario it uses a krknctl input json file. Once this file is added to krkn-hub, krknctl will be able to find it along with the details of how to run the scenario.

Add KrknCtl Input Json

This file defines every environment variable set up for krkn-hub as a flag of the krknctl CLI command. There are several variable types you can use, each with its own required fields. See below for an example of each variable type.

An example krknctl-input.json file can be found here

Enum Type Required Key/Values

{
    "name": "<name>",
    "short_description":"<short-description>",
    "description":"<longer-description>",
    "variable":"<variable_name>", // this needs to match the environment variable in krkn-hub
    "type": "enum",
    "allowed_values": "<value>,<value>",
    "separator": ",",
    "default":"", // any default value
    "required":"<true_or_false>" // true or false if required to set when running
}

String Type Required Key/Values

{
    "name": "<name>",
    "short_description":"<short-description>",
    "description":"<longer-description>",
    "variable":"<variable_name>", // this needs to match the environment variable in krkn-hub
    "type": "string",
    "default": "", // any default value
    "required":"<true_or_false>" // true or false if required to set when running
}

Number Type Required Key/Values

{
    "name": "<name>",
    "short_description": "<short-description>",
    "description": "<longer-description>",
    "variable": "<variable_name>", // this needs to match the environment variable in krkn-hub
    "type": "number",  // options: enum, string, number, file, file_base64
    "default": "", // any default value
    "required": "<true_or_false>" // true or false if required to set when running
}

File Type Required Key/Values

{
    "name": "<name>",
    "short_description":"<short-description>",
    "description":"<longer-description>",
    "variable":"<variable_name>", // this needs to match the environment variable in krkn-hub
    "type":"file",
    "mount_path": "/home/krkn/<file_loc>", // file location to mount to; using /home/krkn as the base gives correct read/write permissions
    "required":"<true_or_false>" // true or false if required to set when running
}

File Base 64 Type Required Key/Values

{
    "name": "<name>",
    "short_description":"<short-description>",
    "description":"<longer-description>",
    "variable":"<variable_name>", // this needs to match the environment variable in krkn-hub
    "type":"file_base64",
    "required":"<true_or_false>" // true or false if required to set when running
}
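Before opening a PR, it can help to check that each entry carries the keys listed above for its type. The following is a minimal sketch, not part of krknctl: the key sets are copied from the examples in this section, the function name missing_keys is hypothetical, and the sample entry is invented for illustration. Note that the // comments shown in the examples are explanatory only and are not valid JSON, so the sample below omits them.

```python
import json

# Keys shown in every example above.
COMMON_KEYS = {"name", "short_description", "description", "variable", "type", "required"}

# Additional keys shown per type.
EXTRA_KEYS = {
    "enum": {"allowed_values", "separator", "default"},
    "string": {"default"},
    "number": {"default"},
    "file": {"mount_path"},
    "file_base64": set(),
}


def missing_keys(field: dict) -> set:
    """Return the keys that are absent from one input entry."""
    field_type = field.get("type")
    if field_type not in EXTRA_KEYS:
        raise ValueError(f"unknown type: {field_type!r}")
    return (COMMON_KEYS | EXTRA_KEYS[field_type]) - set(field)


if __name__ == "__main__":
    # Invented entry: a file field that forgot its mount_path.
    entry = json.loads("""
    {
        "name": "config-file",
        "short_description": "config",
        "description": "scenario configuration file",
        "variable": "CONFIG_FILE",
        "type": "file",
        "required": "true"
    }
    """)
    print(missing_keys(entry))
```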

5 - Testing your changes

This page explains how to configure a kind cluster for testing changes at every level of the krkn-chaos repositories, from krkn-lib (the lowest level) up through krknctl (the highest level and the easiest way to run scenarios).

Configure Kind Testing Environment

Install kind
Create a cluster using kind-config.yml under the krkn-lib base folder

kind create cluster --wait 300s --config=kind-config.yml

Install Elasticsearch and Prometheus

To run the full test suite, you need to have Elasticsearch and Prometheus properly configured on the cluster.

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add stable https://charts.helm.sh/stable
helm repo update

Prometheus

Deploy prometheus on your cluster

kubectl create namespace monitoring
helm install \
--wait --timeout 360s \
kind-prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set prometheus.service.nodePort=30000 \
--set prometheus.service.type=NodePort \
--set grafana.service.nodePort=31000 \
--set grafana.service.type=NodePort \
--set alertmanager.service.nodePort=32000 \
--set alertmanager.service.type=NodePort \
--set prometheus-node-exporter.service.nodePort=32001 \
--set prometheus-node-exporter.service.type=NodePort \
--set prometheus.prometheusSpec.maximumStartupDurationSeconds=300

ElasticSearch

Set the Elasticsearch environment variables

export ELASTIC_URL="https://localhost"
export ELASTIC_PORT="9091"
export ELASTIC_USER="elastic"
export ELASTIC_PASSWORD="test"
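A quick pre-flight check can catch a missing variable before the test suite starts. This sketch is not part of krkn-lib; the function name elastic_settings is hypothetical, and joining ELASTIC_URL and ELASTIC_PORT into a single endpoint string is an assumption about how the two values are combined.

```python
import os

REQUIRED = ("ELASTIC_URL", "ELASTIC_PORT", "ELASTIC_USER", "ELASTIC_PASSWORD")


def elastic_settings(environ=None) -> dict:
    """Collect the Elasticsearch settings, failing early if any is unset."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        # Assumption: URL and port are combined as <url>:<port>.
        "endpoint": f"{env['ELASTIC_URL']}:{env['ELASTIC_PORT']}",
        "user": env["ELASTIC_USER"],
        "password": env["ELASTIC_PASSWORD"],
    }


if __name__ == "__main__":
    try:
        print(elastic_settings()["endpoint"])
    except RuntimeError as err:
        print(err)
```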

Deploy elasticsearch on your cluster

helm install \
--wait --timeout 360s \
elasticsearch \
oci://registry-1.docker.io/bitnamicharts/elasticsearch \
--set master.masterOnly=false \
--set master.replicaCount=1 \
--set data.replicaCount=0 \
--set coordinating.replicaCount=0 \
--set ingest.replicaCount=0 \
--set service.type=NodePort \
--set service.nodePorts.restAPI=32766 \
--set security.elasticPassword=test \
--set security.enabled=true \
--set image.tag=7.17.23-debian-12-r0 \
--set security.tls.autoGenerated=true

Testing Changes in Krkn-lib

To be able to run all the tests in the krkn-lib suite, you’ll need to have prometheus and elastic properly configured. See above steps for details

Install poetry

Using a virtual environment, install Poetry and the krkn-lib requirements

$ pip install poetry
$ poetry install --no-interaction

Run tests

poetry run python3 -m coverage run -a -m unittest discover -v src/krkn_lib/tests/

Adding tests

If you add any new functions or functionality, be sure to add unit tests for them. We want to keep coverage above 80% in this repo, since it is our base functionality.
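As a minimal sketch of what such a test can look like, the example below pairs a helper with a unittest.TestCase, the style that the unittest discover command above picks up. The helper parse_label_selector is hypothetical and is not a krkn-lib function; in the real repo the test would live in its own module under src/krkn_lib/tests/.

```python
import unittest


def parse_label_selector(selector: str) -> dict:
    """Hypothetical helper: turn 'a=b,c=d' into {'a': 'b', 'c': 'd'}."""
    pairs = (item.split("=", 1) for item in selector.split(",") if item)
    return {key.strip(): value.strip() for key, value in pairs}


class ParseLabelSelectorTest(unittest.TestCase):
    def test_single_label(self):
        self.assertEqual(parse_label_selector("scenario=outage"), {"scenario": "outage"})

    def test_multiple_labels(self):
        self.assertEqual(
            parse_label_selector("scenario=outage, app=proxy"),
            {"scenario": "outage", "app": "proxy"},
        )

    def test_empty_selector(self):
        self.assertEqual(parse_label_selector(""), {})
```

Covering the empty and multi-value cases alongside the common one is what keeps the coverage figure meaningful rather than nominal.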

Testing Changes in Krkn

Configuring test Cluster

After creating a kind cluster with the steps above, create these test pods on your cluster

kubectl apply -f CI/templates/outage_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=outage --timeout=300s
kubectl apply -f CI/templates/container_scenario_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=container --timeout=300s
kubectl create namespace namespace-scenario
kubectl apply -f CI/templates/time_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=time-skew --timeout=300s
kubectl apply -f CI/templates/service_hijacking.yaml
kubectl wait --for=condition=ready pod -l "app.kubernetes.io/name=proxy" --timeout=300s

Install Requirements

$ python3.9 -m venv chaos
$ source chaos/bin/activate
$ pip install -r requirements.txt

Run Tests

Add prometheus configuration variables to the test config file

yq -i '.kraken.port="8081"' CI/config/common_test_config.yaml
yq -i '.kraken.signal_address="0.0.0.0"' CI/config/common_test_config.yaml
yq -i '.kraken.performance_monitoring="localhost:9090"' CI/config/common_test_config.yaml

Add tests to the list of functional tests to run

echo "test_service_hijacking" > ./CI/tests/functional_tests
echo "test_app_outages" >> ./CI/tests/functional_tests
echo "test_container"      >> ./CI/tests/functional_tests
echo "test_pod" >> ./CI/tests/functional_tests
echo "test_namespace"      >> ./CI/tests/functional_tests
echo "test_net_chaos"      >> ./CI/tests/functional_tests
echo "test_time"           >> ./CI/tests/functional_tests
echo "test_cpu_hog" >> ./CI/tests/functional_tests
echo "test_memory_hog" >> ./CI/tests/functional_tests
echo "test_io_hog" >> ./CI/tests/functional_tests

Run tests

./CI/run.sh

Results can be seen in ./CI/results.markdown

Adding Tests

Be sure that if you are adding any new scenario you are adding tests for it based on a 1 node kind cluster.

The tests live here

Testing Changes for Krkn-hub

Install Podman/Docker Compose

You can use either podman-compose or docker-compose for this step

NOTE: Podman might not work on Macs

pip3 install docker-compose

To get the latest podman-compose features we need, use this installation command:

pip3 install https://github.com/containers/podman-compose/archive/devel.tar.gz

Build Your Changes

Run build.sh to create Dockerfiles for each scenario
Edit the docker-compose.yaml file to point to your quay.io repository (optional; required if you want to push or are testing krknctl)

For example, change

image: containers.krkn-chaos.dev/krkn-chaos/krkn-hub:chaos-recommender

to

image: quay.io/<user>/krkn-hub:chaos-recommender

Build your image(s) from base krkn-hub directory
Builds all images in docker-compose file
```
docker-compose build
```
Builds single image defined by service/scenario name
```
docker-compose build <scenario_type>
```
OR
Builds all images in podman-compose file
```
podman-compose build
```
Builds single image defined by service/scenario name
```
podman-compose build <scenario_type>
```
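If you have many services to retarget, the optional image edit described above (pointing docker-compose.yaml at your own quay.io repository) can be scripted. This is only a sketch that works on the file's text; the function name retarget_images is hypothetical, and the pattern assumes image lines have the same shape as the example shown earlier.

```python
import re

# Matches image lines of the shape shown in the example above.
IMAGE_PATTERN = re.compile(
    r"(image:\s*)containers\.krkn-chaos\.dev/krkn-chaos/(krkn-hub:\S+)"
)


def retarget_images(compose_text: str, quay_user: str) -> str:
    """Point every krkn-hub image reference at quay.io/<quay_user>."""
    return IMAGE_PATTERN.sub(
        lambda m: f"{m.group(1)}quay.io/{quay_user}/{m.group(2)}",
        compose_text,
    )


if __name__ == "__main__":
    line = "    image: containers.krkn-chaos.dev/krkn-chaos/krkn-hub:chaos-recommender\n"
    print(retarget_images(line, "alice"), end="")
```

To apply it, read docker-compose.yaml, pass its contents through the function, and write the result back; review the diff before building.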

Push Images to your quay.io

Push all Images using docker-compose

docker-compose push

Push a single image using docker-compose

docker-compose push <scenario_type>

Push a single image using podman (images must be pushed one at a time)

podman-compose push <scenario_type>

podman push quay.io/<username>/krkn-hub:<scenario_type>

Run your new scenario

docker run -d -v <kube_config_path>:/root/.kube/config:Z quay.io/<username>/krkn-hub:<scenario_type>

podman run -d -v <kube_config_path>:/root/.kube/config:Z quay.io/<username>/krkn-hub:<scenario_type>

See krkn-hub documentation for each scenario to see all possible variables to use

Testing Changes in Krknctl

Once you’ve created a krknctl-input.json file using the steps here, test those changes using the steps below. You will need either podman or docker installed, as well as a quay account.

Build and Push to personal Quay

First, build your krkn-hub changes and push them to your own quay repository for testing

Run Krknctl with Personal Image

Once you have your images in quay, you are all set to configure krknctl to look for these new images. Edit the krknctl config file found here and set quay_org to your quay username.

With these updates to your config, build your personal krknctl binary and you’ll be all set to start testing your new scenario and config options.

If any krknctl code changes are required, you’ll have to make the changes and rebuild the krknctl binary each time to test them as well.