Developers Guide
Developers Guide Overview
This document describes how to develop and add to Krkn. Before you start, it is recommended that you read the following documents first:
- Krkn Main README
- List of all Supported Scenarios
Be sure to properly install Krkn. Then you can start to develop krkn. The following documents will help you get started:
- Add k8s functionality to krkn-lib
- Add a New Chaos Scenario using Plugin API: Adding a new scenario into krkn
- Test your changes
NOTE: All base kubernetes functionality should be added into krkn-lib and called from krkn
Once a scenario is added to krkn, changes will also be needed in krkn-hub and krknctl. See the steps below for help editing krkn-hub and krknctl.
Questions?
For any questions or further guidance, feel free to reach out to us on the Kubernetes workspace in the #krkn channel.
We’re happy to assist. Now, release the Krkn!
Follow Contribution Guide
Once you’re happy with your changes, follow the contribution guide on how to create your own branch and squash your commits.
1 - Krkn-lib
Krkn-lib contains the base kubernetes python functions

krkn-lib
Contents
The library contains classes, models, and helper functions used in Kraken to interact with
Kubernetes, OpenShift, and other external APIs.
The goal of this library is to give developers the building blocks to create new chaos
scenarios and to increase the testability and modularity of the Krkn codebase.
Packages
The library is subdivided into several packages under src/krkn_lib:
- ocp: OpenShift integration
- k8s: Kubernetes integration
- elastic: Collection of Elasticsearch functions for posting telemetry
- prometheus: Collection of Prometheus functions for collecting metrics and alerts
- telemetry:
  - k8s: Kubernetes telemetry collection and distribution
  - ocp: OpenShift telemetry collection and distribution
- models: Krkn shared data models
  - k8s: Kubernetes objects model
  - telemetry: Telemetry collection model
  - elastic: Elastic model for data
- utils: common functions
Documentation and Available Functions
The Library documentation of available functions is here.
The documentation is automatically generated by Sphinx on top of the reStructuredText Docstring Format comments present in the code.
Installation
Git
Clone the repository
git clone https://github.com/krkn-chaos/krkn-lib
cd krkn-lib
Install the dependencies
krkn-lib uses Poetry for dependency management and packaging. To install the required packages, use:
$ pip install poetry
$ poetry install --no-interaction
Testing your changes
To see how you can configure and test your changes see testing changes
2 - Adding scenarios via plugin api
Scenario Plugin API:
This API enables seamless integration of Scenario Plugins for Krkn. Plugins are automatically
detected and loaded by the plugin loader, provided they extend the AbstractPluginScenario
abstract class, implement the required methods, and adhere to the specified naming conventions.
Plugin folder:
The plugin loader automatically loads plugins found in the krkn/scenario_plugins directory,
relative to the Krkn root folder. Each plugin must reside in its own directory and can consist
of one or more Python files. The entry point for each plugin is a Python class that extends the
AbstractPluginScenario abstract class and implements its required methods.
__init__ file
For the plugin to be found by the plugin API, there must be an __init__ file in the plugin's base folder.
For example: __init__.py
AbstractPluginScenario abstract class:
This abstract class defines the contract between the plugin and krkn.
It consists of two methods:
- run(...)
- get_scenario_types()
Most IDEs (for example, IntelliJ PyCharm) can automatically suggest and implement the abstract methods defined in AbstractPluginScenario.
run(...)
def run(
    self,
    run_uuid: str,
    scenario: str,
    krkn_config: dict[str, any],
    lib_telemetry: KrknTelemetryOpenshift,
    scenario_telemetry: ScenarioTelemetry,
) -> int:
This method represents the entry point of the plugin and the first method
that will be executed.
Parameters:
- run_uuid: the UUID of the chaos run, generated by krkn for every single run.
- scenario: the config file of the scenario that is currently executed.
- krkn_config: the full dictionary representation of the config.yaml.
- lib_telemetry: a composite object of all the krkn-lib objects and methods needed by a krkn plugin to run.
- scenario_telemetry: the ScenarioTelemetry object of the scenario that is currently executed.
Note
Helper functions for interactions in Krkn are part of krkn-lib. Please feel free to reuse and expand them as you see fit when adding a new scenario or expanding the capabilities of the currently supported scenarios.
Return value:
Returns 0 if the scenario succeeds and 1 if it fails.
All exceptions must be handled inside the run method and must not be propagated.
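The contract above can be sketched as follows. This is only an illustration: the AbstractPluginScenario class below is a reduced stand-in written for this example (the real one lives in krkn), the telemetry parameters are typed as Any so the block runs without krkn-lib, and the _inject_failure helper and the example_scenarios type are hypothetical.

```python
import logging
from abc import ABC, abstractmethod
from typing import Any


class AbstractPluginScenario(ABC):
    """Stand-in for krkn's abstract class (assumption: reduced to two methods)."""

    @abstractmethod
    def run(
        self,
        run_uuid: str,
        scenario: str,
        krkn_config: dict[str, Any],
        lib_telemetry: Any,
        scenario_telemetry: Any,
    ) -> int:
        ...

    @abstractmethod
    def get_scenario_types(self) -> list[str]:
        ...


class ExampleScenarioPlugin(AbstractPluginScenario):
    def run(self, run_uuid, scenario, krkn_config, lib_telemetry, scenario_telemetry) -> int:
        try:
            self._inject_failure(scenario)
        except Exception as exc:
            # Every exception is handled here and turned into a return code.
            logging.error("scenario %s failed: %s", scenario, exc)
            return 1
        return 0

    def _inject_failure(self, scenario: str) -> None:
        # Hypothetical helper: a real plugin would act on the cluster here.
        if not scenario:
            raise ValueError("no scenario config file given")

    def get_scenario_types(self) -> list[str]:
        # Hypothetical scenario type name.
        return ["example_scenarios"]
```

Wrapping the whole body of run in a single try/except keeps the 0/1 return contract intact even when a helper raises unexpectedly.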
get_scenario_types()
def get_scenario_types(self) -> list[str]:
Indicates the scenario types specified in the config.yaml. For the plugin to be properly
loaded, recognized, and executed, this method must be implemented and must return one or more
strings matching the scenario_type strings set in the config.
Multiple strings can map to a single ScenarioPlugin, but the same string cannot map to different plugins.
The scenario_type strings must be unique across all plugins; otherwise, an exception is thrown for scenario_type redefinition.
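The uniqueness rule can be pictured with a small registry. This is a sketch of the idea only, not krkn's actual plugin loader: the ScenarioTypeRegistry class, its method names, and the choice of ValueError are all assumptions made for this example.

```python
class ScenarioTypeRegistry:
    """Hypothetical registry mapping scenario_type strings to plugin classes."""

    def __init__(self) -> None:
        self._plugins: dict[str, type] = {}

    def register(self, plugin_class: type, scenario_types: list[str]) -> None:
        for scenario_type in scenario_types:
            owner = self._plugins.get(scenario_type)
            if owner is not None and owner is not plugin_class:
                # The same string may not point to two different plugins.
                raise ValueError(
                    f"scenario_type {scenario_type!r} already mapped to {owner.__name__}"
                )
            self._plugins[scenario_type] = plugin_class

    def lookup(self, scenario_type: str) -> type:
        return self._plugins[scenario_type]
```

Several types can be registered for one class in a single call, which mirrors a plugin returning more than one string from get_scenario_types().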
Naming conventions:
A key requirement for developing a plugin that will be properly loaded
by the plugin loader is following the established naming conventions.
These conventions are enforced to maintain a uniform and readable codebase,
making it easier to onboard new developers from the community.
plugin folder:
- the plugin folder must be placed in the krkn/scenario_plugins folder, starting from the krkn root folder
- the plugin folder cannot contain the words
plugin file name and class name:
- the plugin file containing the main plugin class must be named in snake case and must have the suffix _scenario_plugin: example_scenario_plugin.py
- the main plugin class must be named in capital camel case and must have the suffix ScenarioPlugin
- the file name must match the class name in the respective syntax: example_scenario_plugin.py -> ExampleScenarioPlugin
scenario type:
- the scenario type must be unique across all scenarios.
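The file name and class name rules can be checked mechanically. The sketch below is not the check krkn itself performs; the function names are hypothetical and the regular expression is one reasonable reading of "snake case".

```python
import re

FILE_SUFFIX = "_scenario_plugin"
CLASS_SUFFIX = "ScenarioPlugin"


def expected_class_name(file_name: str) -> str:
    """Convert a snake case file name into its capital camel case class name."""
    stem = file_name[:-3] if file_name.endswith(".py") else file_name
    return "".join(part.capitalize() for part in stem.split("_"))


def follows_naming_convention(file_name: str, class_name: str) -> bool:
    """Return True when the file and class names satisfy the rules listed above."""
    stem = file_name[:-3] if file_name.endswith(".py") else file_name
    return (
        re.fullmatch(r"[a-z0-9]+(_[a-z0-9]+)*", stem) is not None
        and stem.endswith(FILE_SUFFIX)
        and class_name.endswith(CLASS_SUFFIX)
        and class_name != CLASS_SUFFIX
        and class_name == expected_class_name(file_name)
    )
```

For instance, follows_naming_convention("example_scenario_plugin.py", "ExamplePluginScenario") is False because the class does not end with the required suffix.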
logging:
If your new scenario does not adhere to the naming conventions, an error log will be generated in the Krkn standard output,
providing details about the issue:
2024-10-03 18:06:31,136 [INFO] 📣 `ScenarioPluginFactory`: types from config.yaml mapped to respective classes for execution:
2024-10-03 18:06:31,136 [INFO] ✅ type: application_outages_scenarios ➡️ `ApplicationOutageScenarioPlugin`
2024-10-03 18:06:31,136 [INFO] ✅ types: [hog_scenarios, arcaflow_scenario] ➡️ `ArcaflowScenarioPlugin`
2024-10-03 18:06:31,136 [INFO] ✅ type: container_scenarios ➡️ `ContainerScenarioPlugin`
2024-10-03 18:06:31,136 [INFO] ✅ type: managedcluster_scenarios ➡️ `ManagedClusterScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ types: [pod_disruption_scenarios, pod_network_scenario, vmware_node_scenarios, ibmcloud_node_scenarios] ➡️ `NativeScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: network_chaos_scenarios ➡️ `NetworkChaosScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: node_scenarios ➡️ `NodeActionsScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: pvc_scenarios ➡️ `PvcScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: service_disruption_scenarios ➡️ `ServiceDisruptionScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: service_hijacking_scenarios ➡️ `ServiceHijackingScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: cluster_shut_down_scenarios ➡️ `ShutDownScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: syn_flood_scenarios ➡️ `SynFloodScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: time_scenarios ➡️ `TimeActionsScenarioPlugin`
2024-10-03 18:06:31,137 [INFO] ✅ type: zone_outages_scenarios ➡️ `ZoneOutageScenarioPlugin`
2024-09-18 14:48:41,735 [INFO] Failed to load Scenario Plugins:
2024-09-18 14:48:41,735 [ERROR] ⛔ Class: ExamplePluginScenario Module: krkn.scenario_plugins.example.example_scenario_plugin
2024-09-18 14:48:41,735 [ERROR] ⚠️ scenario plugin class name must start with a capital letter, end with `ScenarioPlugin`, and cannot be just `ScenarioPlugin`.
If you’re trying to understand how the scenario types in the config.yaml are mapped to their corresponding plugins, this log will guide you! Each scenario plugin class mentioned can be found in the krkn/scenario_plugins folder: simply convert the camel case notation to snake case and remove the ScenarioPlugin suffix from the class name. For example, the ShutDownScenarioPlugin class can be found in the krkn/scenario_plugins/shut_down folder.
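The class-to-folder conversion described above could look like this. It is a sketch of the rule as stated, with a hypothetical function name; the default root is the krkn/scenario_plugins directory named in the plugin folder section, and a given plugin's real folder may not follow the rule exactly.

```python
import re

CLASS_SUFFIX = "ScenarioPlugin"


def plugin_folder(class_name: str, root: str = "krkn/scenario_plugins") -> str:
    """Map a plugin class name to the folder the naming rule predicts."""
    if not class_name.endswith(CLASS_SUFFIX) or class_name == CLASS_SUFFIX:
        raise ValueError(f"{class_name!r} is not a valid scenario plugin class name")
    base = class_name[: -len(CLASS_SUFFIX)]
    # Insert an underscore before every capital letter except the first one.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", base).lower()
    return f"{root}/{snake}"
```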
ExampleScenarioPlugin
The ExampleScenarioPlugin class included in the tests folder can be used as scaffolding for new plugins and is considered
part of the documentation.
Adding CI tests
Depending on the complexity of the new scenario, we would appreciate a CI test for it being added to our GitHub Action, which runs on each PR.
To add a test:
4 - Adding New Scenario to Krknctl
Adding a New Scenario to Krknctl
For krknctl to find the parameters of a scenario, it uses a krknctl input JSON file. Once this file is added to krkn-hub, krknctl will be able to find it along with the details of how to run the scenario.
This file defines every environment variable that is set up for krkn-hub as a flag of the krknctl CLI command. There are several types of variables you can use, each with its own required fields. See below for an example of each variable type.
An example krknctl-input.json file can be found here
Enum Type Required Key/Values
{
    "name": "<name>",
    "short_description": "<short-description>",
    "description": "<longer-description>",
    "variable": "<variable_name>", // this needs to match the environment variable in krkn-hub
    "type": "enum",
    "allowed_values": "<value>,<value>",
    "separator": ",",
    "default": "", // any default value
    "required": "<true_or_false>" // true or false if required to set when running
}
String Type Required Key/Values
{
    "name": "<name>",
    "short_description": "<short-description>",
    "description": "<longer-description>",
    "variable": "<variable_name>", // this needs to match the environment variable in krkn-hub
    "type": "string",
    "default": "", // any default value
    "required": "<true_or_false>" // true or false if required to set when running
}
Number Type Required Key/Values
{
    "name": "<name>",
    "short_description": "<short-description>",
    "description": "<longer-description>",
    "variable": "<variable_name>", // this needs to match the environment variable in krkn-hub
    "type": "number", // options: string, number, enum, file, file_base64
    "default": "", // any default value
    "required": "<true_or_false>" // true or false if required to set when running
}
File Type Required Key/Values
{
    "name": "<name>",
    "short_description": "<short-description>",
    "description": "<longer-description>",
    "variable": "<variable_name>", // this needs to match the environment variable in krkn-hub
    "type": "file",
    "mount_path": "/home/krkn/<file_loc>", // file location to mount to; using /home/krkn as the base gives correct read/write permissions
    "required": "<true_or_false>" // true or false if required to set when running
}
File Base 64 Type Required Key/Values
{
    "name": "<name>",
    "short_description": "<short-description>",
    "description": "<longer-description>",
    "variable": "<variable_name>", // this needs to match the environment variable in krkn-hub
    "type": "file_base64",
    "required": "<true_or_false>" // true or false if required to set when running
}
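Before pushing a new input file, it can help to check each entry against the keys listed per type above. The validator below is a hypothetical sketch, not part of krknctl: the key sets are copied from the five examples in this section, and the function name is invented. Note that Python's json module rejects // comments, so the explanatory comments shown above would have to be removed before parsing an entry this way.

```python
import json

# Keys that appear in every example above.
COMMON_KEYS = {"name", "short_description", "description", "variable", "type", "required"}

# Additional keys shown for each type in the examples above.
EXTRA_KEYS = {
    "enum": {"allowed_values", "separator", "default"},
    "string": {"default"},
    "number": {"default"},
    "file": {"mount_path"},
    "file_base64": set(),
}


def missing_keys(entry: dict) -> set[str]:
    """Return the keys an entry lacks for its declared type."""
    entry_type = entry.get("type")
    if entry_type not in EXTRA_KEYS:
        raise ValueError(f"unknown type: {entry_type!r}")
    return (COMMON_KEYS | EXTRA_KEYS[entry_type]) - set(entry)


# Placeholder values, used only to exercise the check.
entry = json.loads(
    '{"name": "namespace", "short_description": "Namespace",'
    ' "description": "Target namespace", "variable": "NAMESPACE",'
    ' "type": "string", "default": "", "required": "false"}'
)
print(missing_keys(entry))
```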
5 - Testing your changes
This page explains how to configure a kind cluster for testing changes at every level of the krkn-chaos repositories, from krkn-lib (the lowest level) up through krknctl (the highest-level repo and our easiest way to run Krkn).
Install kind
Create cluster using kind-config.yml under krkn-lib base folder
kind create cluster --wait 300s --config=kind-config.yml
Install Elasticsearch and Prometheus
To run the full test suite, you need Elasticsearch and Prometheus properly configured on the cluster.
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add stable https://charts.helm.sh/stable
helm repo update
Prometheus
Deploy prometheus on your cluster
kubectl create namespace monitoring
helm install \
--wait --timeout 360s \
kind-prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set prometheus.service.nodePort=30000 \
--set prometheus.service.type=NodePort \
--set grafana.service.nodePort=31000 \
--set grafana.service.type=NodePort \
--set alertmanager.service.nodePort=32000 \
--set alertmanager.service.type=NodePort \
--set prometheus-node-exporter.service.nodePort=32001 \
--set prometheus-node-exporter.service.type=NodePort \
--set prometheus.prometheusSpec.maximumStartupDurationSeconds=300
ElasticSearch
Set the Elasticsearch environment variables
export ELASTIC_URL="https://localhost"
export ELASTIC_PORT="9091"
export ELASTIC_USER="elastic"
export ELASTIC_PASSWORD="test"
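As an illustration of how test code might consume these four variables, the sketch below reads them into one object and fails early if any is unset. The ElasticSettings class and load_elastic_settings function are hypothetical names for this example and are not taken from krkn-lib.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass

REQUIRED_VARS = ("ELASTIC_URL", "ELASTIC_PORT", "ELASTIC_USER", "ELASTIC_PASSWORD")


@dataclass(frozen=True)
class ElasticSettings:
    url: str
    port: int
    user: str
    password: str

    @property
    def endpoint(self) -> str:
        # Base URL and port joined, e.g. for building request URLs.
        return f"{self.url}:{self.port}"


def load_elastic_settings(env: Mapping[str, str] = os.environ) -> ElasticSettings:
    """Build settings from the environment, naming every missing variable."""
    missing = [name for name in REQUIRED_VARS if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return ElasticSettings(
        url=env["ELASTIC_URL"],
        port=int(env["ELASTIC_PORT"]),
        user=env["ELASTIC_USER"],
        password=env["ELASTIC_PASSWORD"],
    )
```

Passing a plain dictionary instead of os.environ makes the loader easy to exercise in unit tests without touching the real environment.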
Deploy elasticsearch on your cluster
helm install \
--wait --timeout 360s \
elasticsearch \
oci://registry-1.docker.io/bitnamicharts/elasticsearch \
--set master.masterOnly=false \
--set master.replicaCount=1 \
--set data.replicaCount=0 \
--set coordinating.replicaCount=0 \
--set ingest.replicaCount=0 \
--set service.type=NodePort \
--set service.nodePorts.restAPI=32766 \
--set security.elasticPassword=test \
--set security.enabled=true \
--set image.tag=7.17.23-debian-12-r0 \
--set security.tls.autoGenerated=true
Testing Changes in Krkn-lib
To be able to run all the tests in the krkn-lib suite, you’ll need to have prometheus and elastic properly configured. See above steps for details
Install poetry
Using a virtual environment, install Poetry and the krkn-lib requirements
$ pip install poetry
$ poetry install --no-interaction
Run tests
poetry run python3 -m coverage run -a -m unittest discover -v src/krkn_lib/tests/
Adding tests
If you add any new functions or functionality, be sure to add unit tests for them. We want to keep coverage above 80% in this repo since it’s our base functionality.
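A new unit test can follow the standard unittest layout that the discover command above picks up. The example below is a generic sketch: parse_label_selector is a hypothetical helper invented for illustration and does not exist in krkn-lib.

```python
import unittest


def parse_label_selector(selector: str) -> dict[str, str]:
    """Hypothetical helper: turn 'a=b,c=d' into {'a': 'b', 'c': 'd'}."""
    labels: dict[str, str] = {}
    for pair in filter(None, selector.split(",")):
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"malformed label pair: {pair!r}")
        labels[key] = value.strip()
    return labels


class ParseLabelSelectorTest(unittest.TestCase):
    def test_single_label(self):
        self.assertEqual(parse_label_selector("scenario=outage"), {"scenario": "outage"})

    def test_multiple_labels(self):
        self.assertEqual(
            parse_label_selector("scenario=container, tier=test"),
            {"scenario": "container", "tier": "test"},
        )

    def test_rejects_pair_without_equals(self):
        with self.assertRaises(ValueError):
            parse_label_selector("scenario")
```

Covering the failure path as well as the happy path is what moves the coverage number, since error branches are otherwise never executed.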
Testing Changes in Krkn
Configuring test Cluster
After creating a kind cluster with the steps above, create these test pods on your cluster
kubectl apply -f CI/templates/outage_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=outage --timeout=300s
kubectl apply -f CI/templates/container_scenario_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=container --timeout=300s
kubectl create namespace namespace-scenario
kubectl apply -f CI/templates/time_pod.yaml
kubectl wait --for=condition=ready pod -l scenario=time-skew --timeout=300s
kubectl apply -f CI/templates/service_hijacking.yaml
kubectl wait --for=condition=ready pod -l "app.kubernetes.io/name=proxy" --timeout=300s
Install Requirements
$ python3.9 -m venv chaos
$ source chaos/bin/activate
$ pip install -r requirements.txt
Run Tests
- Add prometheus configuration variables to the test config file
yq -i '.kraken.port="8081"' CI/config/common_test_config.yaml
yq -i '.kraken.signal_address="0.0.0.0"' CI/config/common_test_config.yaml
yq -i '.kraken.performance_monitoring="localhost:9090"' CI/config/common_test_config.yaml
- Add tests to the list of functional tests to run
echo "test_service_hijacking" > ./CI/tests/functional_tests
echo "test_app_outages" >> ./CI/tests/functional_tests
echo "test_container" >> ./CI/tests/functional_tests
echo "test_pod" >> ./CI/tests/functional_tests
echo "test_namespace" >> ./CI/tests/functional_tests
echo "test_net_chaos" >> ./CI/tests/functional_tests
echo "test_time" >> ./CI/tests/functional_tests
echo "test_cpu_hog" >> ./CI/tests/functional_tests
echo "test_memory_hog" >> ./CI/tests/functional_tests
echo "test_io_hog" >> ./CI/tests/functional_tests
- Run tests
Results can be seen in ./CI/results.markdown
Adding Tests
If you add a new scenario, be sure to add tests for it that can run on a one-node kind cluster.
The tests live here
Testing Changes for Krkn-hub
Install Podman/Docker Compose
You can use either podman-compose or docker-compose for this step
NOTE: Podman might not work on Macs
pip3 install docker-compose
OR
To get the latest podman-compose features we need, use this installation command:
pip3 install https://github.com/containers/podman-compose/archive/devel.tar.gz
Build Your Changes
- Run build.sh to create Dockerfiles for each scenario
- Edit the docker-compose.yaml file to point to your quay.io repository (optional; required if you want to push or are testing krknctl)
ex.)
image: containers.krkn-chaos.dev/krkn-chaos/krkn-hub:chaos-recommender
change to >
image: quay.io/<user>/krkn-hub:chaos-recommender
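The image change shown above is a plain string rewrite: keep the image name and tag, replace the registry and organization. A small sketch of that rewrite, with a hypothetical function name and no claim about how build.sh or krkn-hub handle it:

```python
def retarget_image(image: str, quay_user: str) -> str:
    """Point an image reference at quay.io/<quay_user>, keeping name and tag."""
    repository, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        # No colon at all, or the last colon belongs to a registry port.
        raise ValueError(f"image reference has no tag: {image!r}")
    name = repository.rsplit("/", 1)[-1]
    return f"quay.io/{quay_user}/{name}:{tag}"


print(retarget_image(
    "containers.krkn-chaos.dev/krkn-chaos/krkn-hub:chaos-recommender", "<user>"
))
```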
Build your image(s) from base krkn-hub directory
Builds all images in docker-compose file
Builds single image defined by service/scenario name
docker-compose build <scenario_type>
OR
Builds all images in podman-compose file
Builds single image defined by service/scenario name
podman-compose build <scenario_type>
Push Images to your quay.io
Push all Images using docker-compose
Push a single image using docker-compose
docker-compose push <scenario_type>
OR
Single Image (have to go one by one to push images through podman)
podman-compose push <scenario_type>
OR
podman push quay.io/<username>/krkn-hub:<scenario_type>
Run your new scenario
docker run -d -v <kube_config_path>:/root/.kube/config:Z quay.io/<username>/krkn-hub:<scenario_type>
OR
podman run -d -v <kube_config_path>:/root/.kube/config:Z quay.io/<username>/krkn-hub:<scenario_type>
See krkn-hub documentation for each scenario to see all possible variables to use
Testing Changes in Krknctl
Once you’ve created a krknctl-input.json file using the steps here, test those changes using the steps below. You will need either podman or docker installed, as well as a quay account.
Build and Push to personal Quay
First, build your krkn-hub changes and push them to your own quay repository for testing
Run Krknctl with Personal Image
Once you have your images in quay, you are all set to configure krknctl to look for these new images. Edit the krknctl config file found here and set quay_org to your quay username.
With these updates to your config, build your personal krknctl binary and you’ll be all set to start testing your new scenario and config options.
If any krknctl code changes are required, you’ll have to make the changes and rebuild the krknctl binary each time to test them as well.