
Cluster API Integration

Nstance implements a Cluster API (CAPI) infrastructure provider to enable:

  1. Integration with Cluster Autoscaler to drive scaling events for Nstance shard groups by adjusting the number of MachinePool replicas automatically.

  2. Manually scaling Nstance shard groups by adjusting the number of MachinePool replicas, e.g. via kubectl.

  3. Creating on-demand instances via Pod annotations, where the Nstance Operator creates the corresponding Machine resource and then assigns and syncs it to an Nstance Server. This approach was taken to give cluster administrators visibility into requested (Machine) vs created (Node) instances.

Nstance does not provision control planes; however, to satisfy CAPI contract requirements and enable the CAPI controllers to function correctly, Nstance creates the cluster-level CAPI resources (Cluster / NstanceCluster) itself.

Deployment Scenarios

Cluster API has the concept of a “management” cluster and a “workload” cluster; Nstance supports two deployment scenarios:

Self-managed clusters

The CAPI operator runs inside the same cluster it manages.

CAPI’s runningOnWorkloadCluster check detects this by finding a matching CAPI controller pod UID and switches to in-cluster credentials. The kubeconfig SA token is only used for the initial health probe and pod lookup.

External clusters

The CAPI operator manages instances on a different cluster.

The kubeconfig points to the management cluster (where CAPI runs), not the workload cluster. runningOnWorkloadCluster returns false (pod not found, 404) and CAPI uses the SA token for all workload cluster API calls.
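The self-managed vs external decision described above can be reduced to simple logic: look up the controller's own pod via the kubeconfig, treat a 404 as "external", and treat any other error as fatal. The sketch below illustrates that rule with hypothetical names (`selfManaged`, `errNotFound`); it is not CAPI's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes 404 from the pod GET.
var errNotFound = errors.New("pod not found (404)")

// selfManaged mirrors the runningOnWorkloadCluster decision described
// above: the controller GETs its own pod via the kubeconfig; a matching
// UID means the kubeconfig points at the cluster it is running in.
// Names here are illustrative, not CAPI's internals.
func selfManaged(lookupUID func() (string, error), ownUID string) (bool, error) {
	uid, err := lookupUID()
	if errors.Is(err, errNotFound) {
		return false, nil // pod not found: external workload cluster
	}
	if err != nil {
		return false, err // e.g. 403 Forbidden blocks the connection
	}
	return uid == ownUID, nil
}

func main() {
	// Self-managed: the lookup finds the controller's own pod UID.
	same, _ := selfManaged(func() (string, error) { return "abc-123", nil }, "abc-123")
	fmt.Println(same) // true

	// External: the pod does not exist in the target cluster.
	same, _ = selfManaged(func() (string, error) { return "", errNotFound }, "abc-123")
	fmt.Println(same) // false
}
```

Note how a non-404 error is surfaced rather than swallowed, matching the RBAC caveat discussed later: a 403 on the pod GET blocks the connection entirely.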

Custom Resource Types

Nstance defines four CAPI infrastructure provider CRDs:

| CRD | CAPI Contract | Purpose |
| --- | --- | --- |
| NstanceCluster | InfraCluster | Stub that satisfies CAPI’s cluster infrastructure ref requirement |
| NstanceMachinePool | InfraMachinePool | Maps an Nstance Group to a CAPI MachinePool, distributing replicas across zone shards |
| NstanceMachine | InfraMachine | Represents a single Nstance instance (used for on-demand nodes) |
| NstanceMachineTemplate | InfraMachineTemplate | Immutable template for stamping out Machine/NstanceMachine pairs |

In addition to the CAPI infrastructure provider CRDs, Nstance maintains its own NstanceShardGroup CRD, which the NstanceMachinePool effectively aggregates:

  • For example, if you have two shards both with a “test” group with 1 instance each, you will have two NstanceShardGroup resources with 1 instance in each, and these will map to a single NstanceMachinePool with 2 replicas.
  • This approach was taken to give cluster administrators visibility into aggregated vs distributed replica counts (rather than performing the (dis-)aggregation only at runtime).
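The aggregation in the bullets above amounts to summing the per-shard instance counts into a single replica count. A minimal sketch (the `aggregate` helper is hypothetical, not the operator's actual code):

```go
package main

import "fmt"

// aggregate sums per-shard group instance counts into the single
// NstanceMachinePool replica count described above. Illustrative
// sketch only.
func aggregate(shardInstances []int32) int32 {
	var total int32
	for _, n := range shardInstances {
		total += n
	}
	return total
}

func main() {
	// Two shards, each with a "test" group of 1 instance, aggregate
	// to a single NstanceMachinePool with 2 replicas.
	fmt.Println(aggregate([]int32{1, 1})) // 2
}
```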

CAPI Cluster Resource

CAPI requires a Cluster resource as the ownership root for MachinePools and Machines. The operator creates this automatically on startup via ensureCAPICluster in the leader manager. The cluster name is derived from operator config as <cluster_id>--<tenant_id>. Note that neither the cluster ID nor the tenant ID may contain consecutive hyphens.

All namespaced CAPI and Nstance CRD resources are created in the operator’s namespace (configurable via NSTANCE_NAMESPACE, defaults to the pod’s own namespace). CAPI’s core controllers (typically in capi-system) watch across all namespaces, so they do not need to be co-located.

Within ensureCAPICluster, three resources are created together:

  1. NstanceCluster — infrastructure ref target with a controlPlaneEndpoint (host and port parsed from the management cluster’s API server address). CAPI’s setPhase requires a valid endpoint (non-empty host, non-zero port) for the Cluster to reach “Provisioned” phase.

  2. CAPI Cluster — references the NstanceCluster via spec.infrastructureRef.

  3. Kubeconfig Secret (<cluster>-kubeconfig) — provides CAPI’s ClusterCache with credentials to connect to the “workload” cluster. Since Nstance does not provision control planes, this points at the management cluster itself.
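Step 1 above depends on extracting a valid host and port from the API server address. The following is a hedged sketch of what a parser like the `parseAPIServerEndpoint` helper (listed under Key Source Files) might look like; the defaulting behavior shown here is an assumption:

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// parseEndpoint splits an API server address into the host and port the
// NstanceCluster's controlPlaneEndpoint needs. CAPI's setPhase requires
// a non-empty host and non-zero port, so an https URL without an
// explicit port is assumed to default to 443. Sketch only; the real
// parseAPIServerEndpoint may differ.
func parseEndpoint(addr string) (host string, port int32, err error) {
	u, err := url.Parse(addr)
	if err != nil {
		return "", 0, err
	}
	host = u.Hostname()
	if host == "" {
		return "", 0, fmt.Errorf("no host in %q", addr)
	}
	switch p := u.Port(); {
	case p != "":
		n, err := strconv.Atoi(p)
		if err != nil {
			return "", 0, err
		}
		port = int32(n)
	case u.Scheme == "https":
		port = 443 // assumed default for https endpoints
	default:
		return "", 0, fmt.Errorf("no port in %q", addr)
	}
	return host, port, nil
}

func main() {
	h, p, _ := parseEndpoint("https://kubernetes.default.svc:443")
	fmt.Println(h, p) // kubernetes.default.svc 443

	h, p, _ = parseEndpoint("https://workload.example.com:6443")
	fmt.Println(h, p) // workload.example.com 6443
}
```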

NstanceCluster Controller

The NstanceClusterReconciler has a single job: mark the NstanceCluster as provisioned and ready. On each reconcile it sets:

  • status.initialization.provisioned = true
  • A Ready condition with status True

There is no cluster-level infrastructure to provision; Nstance handles everything at the pool/machine level. However, the CAPI controllers require this status to be set to provisioned in order to behave correctly.

Kubeconfig Secret

CAPI’s MachinePool controller requires a <cluster>-kubeconfig secret to connect to the workload cluster via its ClusterCache.

The operator’s behavior depends on whether NSTANCE_CAPI_ENDPOINT is set (see Deployment Scenarios above):

Self-managed (default)

When NSTANCE_CAPI_ENDPOINT is not set, the operator auto-manages the kubeconfig secret with short-lived tokens from a dedicated ServiceAccount:

  1. The operator calls the Kubernetes TokenRequest API against the nstance-capi-workload ServiceAccount (configurable via NSTANCE_CAPI_SERVICEACCOUNT env var).
  2. A 1-hour token is issued and embedded in a kubeconfig pointing at https://kubernetes.default.svc:443 (the in-cluster API server address, since CAPI controllers run as pods in the same cluster).
  3. The token expiry is stored in the secret’s nstance.dev/token-expiry annotation.
  4. On each leader start, the operator checks the annotation and refreshes the token if it expires within 10 minutes.

External cluster

When NSTANCE_CAPI_ENDPOINT is set (e.g. https://workload.example.com:6443), the operator uses the provided endpoint as the controlPlaneEndpoint on the NstanceCluster and skips kubeconfig secret management entirely. The administrator is responsible for creating and rotating the <cluster>-kubeconfig secret with credentials for the workload cluster.

The secret must be in the same namespace as the CAPI Cluster resource and carry the cluster.x-k8s.io/cluster-name label — CAPI’s ClusterCache discovers it by label and namespace match.
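Those discovery requirements can be checked mechanically. A minimal sketch, assuming the naming convention and matching rules exactly as stated above (the `matchesClusterCache` helper is hypothetical):

```go
package main

import "fmt"

// matchesClusterCache checks the requirements stated above: the secret
// lives in the Cluster's namespace, carries the
// cluster.x-k8s.io/cluster-name label, and follows the
// <cluster>-kubeconfig naming convention. Illustrative sketch only.
func matchesClusterCache(secretName, secretNS string, labels map[string]string, clusterName, clusterNS string) bool {
	return secretNS == clusterNS &&
		labels["cluster.x-k8s.io/cluster-name"] == clusterName &&
		secretName == clusterName+"-kubeconfig"
}

func main() {
	ok := matchesClusterCache(
		"prod-eu--acme-kubeconfig", "nstance-system",
		map[string]string{"cluster.x-k8s.io/cluster-name": "prod-eu--acme"},
		"prod-eu--acme", "nstance-system",
	)
	fmt.Println(ok) // true
}
```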

RBAC for the CAPI ServiceAccount

The nstance-capi-workload ServiceAccount is bound to the nstance-capi-workload ClusterRole with minimal permissions:

| Resource | Verbs | Reason |
| --- | --- | --- |
| nodes | get, list, watch | CAPI’s ClusterCache needs node access for node ref matching |
| pods | get | CAPI’s runningOnWorkloadCluster GETs its own pod via the kubeconfig to detect if the management and workload clusters are the same. A non-404 error (e.g. 403 Forbidden) blocks the ClusterCache connection entirely. |
| nonResourceURLs: ["/"] | get | CAPI’s health probe does GET / before establishing a ClusterCache connection. This is not covered by the standard system:discovery ClusterRole. |

These resources are deployed via the Helm chart (capi-workload-*.yaml templates) or the static manifests in config/rbac/capi-workload.yaml.

MachinePool Integration

The operator creates a CAPI MachinePool for each NstanceMachinePool, setting:

  • spec.clusterName — references the CAPI Cluster
  • spec.template.spec.infrastructureRef — references the NstanceMachinePool
  • spec.template.spec.bootstrap.dataSecretName — set to empty string (Nstance handles bootstrap independently)

Replica counts flow from the MachinePool through the NstanceMachinePool controller, which distributes them across NstanceShardGroup resources — one per zone shard. Each NstanceShardGroup controller calls UpsertGroup on its shard to reconcile server-side state.
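The distribution step above splits one replica count across the zone shards. A round-robin split is sketched below as an illustration of the shape of that flow; the actual distribution algorithm the controller uses is not specified here, so treat this as an assumption:

```go
package main

import "fmt"

// distribute splits the MachinePool replica count across zone shards
// round-robin, mirroring how replicas flow down to per-shard
// NstanceShardGroup resources. Illustrative sketch, not the
// controller's actual algorithm.
func distribute(replicas int32, shards []string) map[string]int32 {
	out := make(map[string]int32, len(shards))
	for i := int32(0); i < replicas; i++ {
		out[shards[i%int32(len(shards))]]++
	}
	return out
}

func main() {
	// 5 replicas over two shards: one shard absorbs the remainder.
	fmt.Println(distribute(5, []string{"zone-a", "zone-b"}))
	// map[zone-a:3 zone-b:2]
}
```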

MachinePool Phase

CAPI’s MachinePool controller computes the phase field using deprecated.v1beta1.readyReplicas, which it derives by matching spec.providerIDList entries against cluster Nodes with a matching node.spec.providerID. The phase reaches Running only when all provider IDs resolve to Ready Nodes.

If CAPI cannot match provider IDs to Nodes, the MachinePool phase will remain ScalingUp. For example, this happens in development because the tmux provider does not actually register Kubernetes Nodes. In this case it is cosmetic: the NstanceMachinePool and NstanceShardGroup controllers function correctly regardless of the MachinePool phase.
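The matching rule above can be sketched as a ready-replica count: each entry of `spec.providerIDList` must resolve to a Ready Node with the same `node.spec.providerID`. The provider-ID format and helper names below are illustrative, not CAPI's actual code:

```go
package main

import "fmt"

// node is a stand-in for the fields of a Kubernetes Node that matter
// for the phase computation described above.
type node struct {
	providerID string
	ready      bool
}

// readyReplicas counts how many providerIDList entries resolve to a
// Ready Node with a matching providerID. The phase reaches Running
// only when this equals the number of provider IDs.
func readyReplicas(providerIDList []string, nodes []node) int32 {
	byID := make(map[string]node, len(nodes))
	for _, n := range nodes {
		byID[n.providerID] = n
	}
	var ready int32
	for _, id := range providerIDList {
		if n, ok := byID[id]; ok && n.ready {
			ready++
		}
	}
	return ready
}

func main() {
	// Provider-ID strings here are made up for illustration.
	nodes := []node{{"nstance://shard-a/i-1", true}}
	ids := []string{"nstance://shard-a/i-1", "nstance://shard-b/i-2"}

	// Only one of two provider IDs resolves to a Ready Node, so the
	// pool would stay in ScalingUp rather than reaching Running.
	fmt.Println(readyReplicas(ids, nodes)) // 1
}
```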

Key Source Files

| File | Role |
| --- | --- |
| internal/operator/leader/manager.go | ensureCAPICluster, ensureKubeconfigSecret, parseAPIServerEndpoint |
| internal/operator/controller/nstancecluster_controller.go | NstanceCluster reconciler |
| internal/operator/controller/nstancemachinepool_controller.go | NstanceMachinePool reconciler, MachinePool creation |
| api/v1beta1/nstancecluster_types.go | NstanceCluster CRD types |
| api/v1beta1/nstancemachinepool_types.go | NstanceMachinePool CRD types |
| config/rbac/capi-workload.yaml | Dev RBAC manifests for nstance-capi-workload |
| deploy/helm/templates/capi-workload-serviceaccount.yaml | Helm chart SA template |
| deploy/helm/templates/capi-workload-clusterrole.yaml | Helm chart ClusterRole template |
| deploy/helm/templates/capi-workload-clusterrolebinding.yaml | Helm chart ClusterRoleBinding template |