Configuration

Configure target clusters for chaos testing

This guide walks you through configuring target Kubernetes or OpenShift clusters where you want to run chaos engineering scenarios.

Overview

Before running chaos experiments, you need to add one or more target clusters to the Krkn Operator. Target clusters are the Kubernetes/OpenShift clusters where chaos scenarios will be executed. You can add multiple target clusters and manage them through the web console.


Accessing Cluster Configuration

Step 1: Open Admin Settings

Log in to the Krkn Operator Console and click on your profile in the top-right corner. Select Admin Settings from the dropdown menu.

Admin Settings Menu

Step 2: Navigate to Cluster Targets

In the Admin Settings page, click on the Cluster Targets tab in the left sidebar. This will show you a list of all configured target clusters (if any).


Adding a New Target Cluster

Step 3: Open the Add Target Dialog

Click the Add Target button in the top-right corner of the Cluster Targets page. This will open the “Add New Target” dialog.

Add New Target Dialog

Step 4: Enter Cluster Information

You’ll need to provide:

  1. Cluster Name (required): A friendly name to identify this cluster (e.g., “Production-US-East”, “Dev-Cluster”, “OpenShift-QA”)

  2. Authentication Type (required): Choose one of three authentication methods:

    • Kubeconfig - Full kubeconfig file (recommended)
    • Service Account Token - Token-based authentication
    • Username/Password - Basic authentication (for clusters that support it)

Authentication Methods

The Krkn Operator supports three different ways to authenticate to target clusters. Choose the method that best fits your cluster’s security configuration.

Method 1: Kubeconfig

This is the most common and recommended method. It uses a complete kubeconfig file to authenticate to the target cluster.

When to use:

  • You have direct access to the cluster’s kubeconfig file
  • You want to authenticate with certificates or tokens defined in the kubeconfig
  • The cluster supports standard Kubernetes authentication

How to configure:

  1. Select Kubeconfig as the Authentication Type
  2. Obtain the kubeconfig file for your target cluster:
    # For most Kubernetes clusters
    kubectl config view --flatten --minify > target-cluster.kubeconfig
    
    # For OpenShift clusters
    oc login https://api.cluster.example.com:6443
    oc config view --flatten > target-cluster.kubeconfig
    
  3. Open the kubeconfig file in a text editor and copy its entire contents
  4. Paste the kubeconfig content into the Kubeconfig text area in the dialog
  5. Click Create

Example kubeconfig content:

apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTi...
    server: https://api.cluster.example.com:6443
  name: my-cluster
contexts:
- context:
    cluster: my-cluster
    user: admin
  name: my-cluster-context
current-context: my-cluster-context
users:
- name: admin
  user:
    client-certificate-data: LS0tLS1CRUdJTi...
    client-key-data: LS0tLS1CRUdJTi...
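Before pasting the content into the dialog, a quick structural check can catch a truncated copy. A minimal sketch against a stub file (a stand-in for the real target-cluster.kubeconfig exported above):

```shell
# Stub with the same top-level layout as the example kubeconfig above
cat > /tmp/target-cluster.kubeconfig <<'EOF'
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://api.cluster.example.com:6443
  name: my-cluster
contexts:
- context:
    cluster: my-cluster
    user: admin
  name: my-cluster-context
current-context: my-cluster-context
users:
- name: admin
EOF

# Flag any missing top-level section the operator needs to authenticate
for key in clusters contexts current-context users; do
  grep -q "^${key}:" /tmp/target-cluster.kubeconfig || echo "MISSING: ${key}"
done
echo "check complete"
# -> check complete
```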

Method 2: Service Account Token

Use this method if you want to authenticate using a Kubernetes Service Account token.

When to use:

  • You want fine-grained RBAC control over what the operator can do
  • You’re following a zero-trust security model
  • You want to create a dedicated service account for chaos testing

How to configure:

  1. Create a service account in the target cluster with appropriate permissions:

    # Create service account
    kubectl create serviceaccount krkn-operator -n krkn-system
    
    # Bind the service account to the cluster-admin ClusterRole
    kubectl create clusterrolebinding krkn-operator-admin \
      --clusterrole=cluster-admin \
      --serviceaccount=krkn-system:krkn-operator
    
    # Get the service account token
    kubectl create token krkn-operator -n krkn-system --duration=8760h
    
  2. In the “Add New Target” dialog:

    • Enter a Cluster Name
    • Select Service Account Token as the Authentication Type
    • Enter the API Server URL (e.g., https://api.cluster.example.com:6443)
    • Paste the Service Account Token you generated
    • (Optional) Provide CA Certificate data if your cluster uses a self-signed or custom Certificate Authority
    • Click Create
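The imperative commands in step 1 can also be expressed declaratively, which is easier to keep in version control. A sketch using the same names as the commands above (the binding name krkn-operator-admin matches the earlier example):

```yaml
# Declarative equivalent of the kubectl commands above
apiVersion: v1
kind: ServiceAccount
metadata:
  name: krkn-operator
  namespace: krkn-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: krkn-operator-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: krkn-operator
  namespace: krkn-system
```

Apply it with `kubectl apply -f`, then generate the token with `kubectl create token` as shown above.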

About CA Certificate (Optional):

The CA Certificate field is optional and only needed in specific scenarios:

  • When to provide it: If your cluster uses a self-signed certificate or a custom/private Certificate Authority (CA) that is not trusted by default
  • When to skip it: If your cluster uses certificates from a public CA (like Let’s Encrypt, DigiCert, etc.) or standard cloud provider certificates
  • What it does: The CA certificate allows the Krkn Operator to verify the identity of your cluster’s API server and establish a secure TLS connection
  • How to get it: Extract the CA certificate from your cluster’s kubeconfig file (the certificate-authority-data field, base64-decoded) or from your cluster administrator

Example of extracting CA certificate from kubeconfig:

# Extract and decode CA certificate
kubectl config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > ca.crt
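If you want to see the mechanics without a live cluster, the same decode step works on any kubeconfig-shaped file. A self-contained sketch with a stub file (the certificate-authority-data here is just the base64 of the first PEM line, not a real certificate):

```shell
# Stub kubeconfig; a real one carries a full base64-encoded certificate
cat > /tmp/stub-kubeconfig <<'EOF'
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCg==
    server: https://api.cluster.example.com:6443
EOF

# Same extraction as above, using grep/awk instead of jsonpath
grep 'certificate-authority-data' /tmp/stub-kubeconfig | awk '{print $2}' | base64 -d
# -> -----BEGIN CERTIFICATE-----
```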

Method 3: Username/Password

Use basic authentication with a username and password. This method is only supported by clusters that have basic auth enabled.

When to use:

  • Your cluster supports basic authentication
  • You’re testing in a development environment
  • You have credentials for a user with appropriate permissions

How to configure:

  1. In the “Add New Target” dialog:
    • Enter a Cluster Name
    • Select Username/Password as the Authentication Type
    • Enter the API Server URL (e.g., https://api.cluster.example.com:6443)
    • Enter your Username
    • Enter your Password
    • (Optional) Provide CA Certificate data if your cluster uses a self-signed or custom Certificate Authority
    • Click Create

About CA Certificate (Optional):

Same as with token authentication, the CA Certificate is optional:

  • When needed: Only if your cluster uses self-signed certificates or a custom/private Certificate Authority
  • When to skip: If using public CA certificates or standard cloud provider setups
  • Purpose: Enables secure TLS verification when connecting to the cluster’s API server

Verifying Target Cluster

After adding a target cluster, the Krkn Operator will attempt to connect to it and verify the credentials.

Successful Configuration

If the cluster is configured correctly, you’ll see it appear in the Cluster Targets list with a green status indicator. You can now use this cluster as a target for chaos scenarios.

Troubleshooting Connection Issues

If the cluster connection fails, check the following:

| Issue | Possible Cause | Solution |
|---|---|---|
| Connection timeout | Incorrect API server URL | Verify the API server URL is correct and accessible from the operator |
| Authentication failed | Invalid credentials | Re-check your kubeconfig, token, or username/password |
| Certificate error | CA certificate mismatch | Provide the correct CA certificate for clusters with custom CAs |
| Permission denied | Insufficient RBAC permissions | Ensure the service account or user has cluster-admin or the necessary permissions |
| Network unreachable | Firewall or network policy | Ensure the Krkn Operator can reach the target cluster's API server |

You can view detailed error messages in the operator logs:

kubectl logs -n krkn-operator-system -l app.kubernetes.io/name=krkn-operator -c manager

Managing Target Clusters

Viewing Configured Clusters

Navigate to Admin Settings → Cluster Targets to see all configured target clusters. Each cluster shows:

  • Cluster name
  • Connection status
  • Last verified time
  • Authentication method used

Editing a Target Cluster

To modify an existing target cluster:

  1. Click the Edit button next to the cluster in the list
  2. Update the cluster name or authentication credentials
  3. Click Save

Removing a Target Cluster

To remove a target cluster:

  1. Click the Delete button next to the cluster in the list
  2. Confirm the deletion

Required Permissions

The service account or user used to connect to target clusters needs the following permissions:

Minimum RBAC Permissions

For most chaos scenarios, the operator needs cluster-admin privileges or at least these permissions:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: krkn-operator-target-access
rules:
# Pod chaos scenarios
- apiGroups: [""]
  resources: ["pods", "pods/log", "pods/exec"]
  verbs: ["get", "list", "watch", "create", "delete", "deletecollection"]

# Node chaos scenarios
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch", "update", "patch"]

# Deployment/StatefulSet/DaemonSet scenarios
- apiGroups: ["apps"]
  resources: ["deployments", "statefulsets", "daemonsets", "replicasets"]
  verbs: ["get", "list", "watch", "update", "patch", "delete"]

# Service and networking scenarios
- apiGroups: [""]
  resources: ["services", "endpoints"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]

- apiGroups: ["networking.k8s.io"]
  resources: ["networkpolicies"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]

# Namespace scenarios
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get", "list", "watch"]

# Job creation for scenario execution
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]

# Events for monitoring
- apiGroups: [""]
  resources: ["events"]
  verbs: ["get", "list", "watch"]
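A ClusterRole by itself grants nothing until it is bound to the identity the operator connects as. A sketch of the binding, assuming the krkn-operator service account from the token setup earlier (the binding name is illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: krkn-operator-target-access-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: krkn-operator-target-access
subjects:
- kind: ServiceAccount
  name: krkn-operator
  namespace: krkn-system
```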

Best Practices

  1. Use Dedicated Service Accounts: Create a dedicated service account in each target cluster specifically for chaos testing. This makes it easier to audit and control permissions.

  2. Rotate Credentials Regularly: Periodically rotate kubeconfig files and service account tokens to maintain security.

  3. Test Connectivity First: After adding a target cluster, run a simple non-destructive scenario to verify connectivity before running destructive chaos tests.

  4. Organize by Environment: Use clear naming conventions like prod-us-east-1, staging-eu-west, dev-local to easily identify clusters.

  5. Limit Production Access: Consider restricting production cluster access to specific users or requiring additional approval workflows.

  6. Monitor Operator Logs: Regularly check operator logs for authentication errors or connection issues.


ACM/OCM Integration (Advanced)

For organizations using Red Hat Advanced Cluster Management (ACM) or Open Cluster Management (OCM), the Krkn Operator provides seamless integration that automatically discovers and manages all ACM-controlled clusters as chaos testing targets.

What is ACM/OCM?

Advanced Cluster Management (ACM) and Open Cluster Management (OCM) are multi-cluster management platforms that allow you to manage multiple Kubernetes and OpenShift clusters from a single hub cluster. ACM/OCM provides:

  • Centralized cluster lifecycle management - Deploy, upgrade, and manage multiple clusters
  • Application deployment across clusters - Deploy applications to multiple clusters with policies
  • Governance and compliance - Apply security and compliance policies across your fleet
  • Observability - Monitor metrics, logs, and alerts from all managed clusters

How ACM Integration Works

When the ACM integration is enabled in the Krkn Operator, the krkn-operator-acm component automatically:

  1. Discovers all managed clusters registered with your ACM/OCM hub
  2. Imports them as chaos testing targets into the Krkn Operator console
  3. Keeps the cluster list synchronized as new clusters are added or removed from ACM
  4. Authenticates automatically using ACM’s ManagedServiceAccount resources—no manual credential management required

Benefits of ACM Integration

| Feature | Manual Configuration | ACM Integration |
|---|---|---|
| Cluster Discovery | Manual - add each cluster individually | Automatic - all ACM-managed clusters |
| Credential Management | Manual - maintain tokens/kubeconfig per cluster | Automatic - uses ManagedServiceAccount |
| Cluster Updates | Manual - update credentials when they change | Automatic - ACM handles rotation |
| New Clusters | Manual - must add explicitly | Automatic - discovered immediately |
| Security | Per-cluster authentication | Centralized ACM RBAC with fine-grained control |

Enabling ACM Integration

Step 1: Install with ACM Enabled

To enable ACM integration, install the Krkn Operator with the ACM component enabled via Helm:

helm install krkn-operator oci://quay.io/krkn-chaos/charts/krkn-operator \
  --version <VERSION> \
  --set acm.enabled=true \
  --namespace krkn-operator-system \
  --create-namespace

Or add it to your values.yaml:

acm:
  enabled: true
  replicaCount: 1

  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi

  logging:
    level: info
    format: json

For complete installation instructions and additional configuration options, see the Installation Guide.

Step 2: Verify ACM Component

After installation, verify that the ACM component is running:

kubectl get pods -n krkn-operator-system -l app.kubernetes.io/component=acm

# Expected output:
# NAME                                  READY   STATUS    RESTARTS   AGE
# krkn-operator-acm-xxxxxxxxx-xxxxx     1/1     Running   0          2m

Check the ACM component logs to see cluster discovery in action:

kubectl logs -n krkn-operator-system -l app.kubernetes.io/component=acm

# You should see logs like:
# INFO  Discovered 5 managed clusters from ACM
# INFO  Synced cluster: production-us-east
# INFO  Synced cluster: staging-eu-west

Configuring ManagedServiceAccounts (Fine-Grained Security)

One of the most powerful features of ACM integration is the ability to use ManagedServiceAccounts for authentication to target clusters. This provides fine-grained, per-cluster security control.

What are ManagedServiceAccounts?

ManagedServiceAccounts are a feature of OCM/ACM that allows the hub cluster to create and manage service accounts on spoke clusters. Instead of using a single highly-privileged service account (like open-cluster-management-agent-addon-application-manager), you can create dedicated service accounts with custom RBAC permissions for each cluster.

Configuring Per-Cluster Service Accounts

Navigate to Admin Settings → Provider Configuration → ACM to configure which ManagedServiceAccount to use for each cluster:

ACM Provider Configuration

For each managed cluster, you can:

  1. Select a ManagedServiceAccount: Choose from existing ManagedServiceAccounts created on that cluster
  2. Customize permissions per cluster: Each cluster can use a different service account with different RBAC permissions
  3. Apply the configuration: The Krkn Operator will use this service account for all chaos testing operations on that cluster

Why Use Custom ManagedServiceAccounts?

By default, ACM uses the open-cluster-management-agent-addon-application-manager service account, which has cluster-admin privileges on all spoke clusters. While convenient, this violates the principle of least privilege.

Using custom ManagedServiceAccounts provides:

Enhanced Security:

  • Least privilege access: Grant only the permissions needed for chaos testing (e.g., pod deletion, network policy creation) rather than full cluster-admin
  • Per-cluster customization: Production clusters can have more restrictive permissions than dev/test clusters
  • Audit trail: Each cluster has a dedicated service account, making it easier to track and audit chaos testing activities

Flexibility:

  • Environment-specific policies: Different permissions for prod, staging, and dev environments
  • Scenario-specific accounts: Create different service accounts for different types of chaos scenarios
  • Compliance: Meet security and compliance requirements by limiting operator privileges

Example: Creating a Custom ManagedServiceAccount

Create a ManagedServiceAccount with limited chaos testing permissions:

apiVersion: authentication.open-cluster-management.io/v1beta1
kind: ManagedServiceAccount
metadata:
  name: krkn-chaos-operator
  namespace: cluster-prod-us-east  # ManagedCluster namespace
spec:
  rotation: {}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: krkn-chaos-limited
rules:
# Pod chaos - read and delete only
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "delete"]

# Node chaos - read and cordon/drain only
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch", "update", "patch"]

# Network policies - create and delete
- apiGroups: ["networking.k8s.io"]
  resources: ["networkpolicies"]
  verbs: ["get", "list", "create", "delete"]

# No destructive operations on critical resources
# (no namespace deletion, no service account manipulation, etc.)
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: krkn-chaos-limited-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: krkn-chaos-limited
subjects:
- kind: ServiceAccount
  name: krkn-chaos-operator
  namespace: open-cluster-management-agent-addon

Apply this to the ACM hub cluster, and the ManagedServiceAccount will be created on the spoke cluster automatically. You can then select it in the Provider Configuration UI.


Automatic Cluster Synchronization

Once ACM integration is enabled and configured, the Krkn Operator automatically:

  • Syncs cluster list every 60 seconds (configurable)
  • Adds new clusters as they’re imported into ACM
  • Removes clusters that are deleted from ACM
  • Updates cluster status based on ACM health checks
  • Rotates credentials automatically when ManagedServiceAccount tokens are refreshed

You can view all ACM-discovered clusters in the Cluster Targets page. They will be marked with an ACM badge to distinguish them from manually configured clusters.


Troubleshooting ACM Integration

ACM Component Not Starting

If the ACM component fails to start, check:

# Check pod status
kubectl get pods -n krkn-operator-system -l app.kubernetes.io/component=acm

# View logs
kubectl logs -n krkn-operator-system -l app.kubernetes.io/component=acm

# Common issues:
# - ACM/OCM not installed on the hub cluster
# - Missing RBAC permissions for the operator to read ManagedCluster resources
# - Network policies blocking communication

No Clusters Discovered

If the ACM component is running but no clusters appear:

  1. Verify ACM is managing clusters:

    kubectl get managedclusters
    
  2. Check if clusters are in “Ready” state:

    kubectl get managedclusters -o wide
    
  3. Review ACM component logs for discovery errors:

    kubectl logs -n krkn-operator-system -l app.kubernetes.io/component=acm | grep -i error
    

ManagedServiceAccount Not Working

If a cluster shows authentication errors after configuring a ManagedServiceAccount:

  1. Verify the ManagedServiceAccount exists and is ready:

    kubectl get managedserviceaccount -n <cluster-namespace>
    
  2. Check the ManagedServiceAccount has proper RBAC permissions on the spoke cluster

  3. Ensure the ManagedServiceAccount token hasn’t expired
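One quick way to check expiry without cluster access is to decode the token's payload, since service-account tokens are JWTs. A sketch using a synthetic token (real tokens are base64url-encoded and may need `=` padding added before decoding; `-w0` assumes GNU base64):

```shell
# Synthetic stand-in with the same header.payload.signature layout as a real
# service-account JWT; only the payload matters for this check
TOKEN="header.$(printf '{"exp":1999999999}' | base64 -w0).signature"

# Extract the payload (second dot-separated field) and decode it; compare the
# exp claim (seconds since epoch) against `date +%s`
echo "$TOKEN" | cut -d. -f2 | base64 -d
# -> {"exp":1999999999}
```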

For more detailed troubleshooting, see the ACM Integration Troubleshooting Guide.


Next Steps

Now that you’ve configured your target clusters (manually or via ACM), you’re ready to run chaos scenarios.

Last modified March 16, 2026: Krkn operator documentation (#239) (9b67631)