Local Development Setup

Nstance Development Environment

This document describes the local development environment for Nstance, which simulates the full production architecture without requiring cloud infrastructure, real object storage services, or Kubernetes clusters.

There’s also Development with Kind, which explains how to run a dev environment using Kind to test nstance-operator against a real Kubernetes cluster instead of the mock dev-k8s server used in this document.

Architecture Overview

┌──────────────────────────────────────────────────────────────────────────────┐
│                           Development Environment                            │
│                                                                              │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐ │
│  │   dev-s3    │     │   server    │     │   dev-k8s   │     │  operator   │ │
│  │  (gofakes3) │     │  (nstance-  │     │  (fake k8s  │     │  (nstance-  │ │
│  │   :8989     │◄────│   server)   │────►│    API)     │◄────│  operator)  │ │
│  │             │     │             │     │   :6443     │     │             │ │
│  └─────────────┘     └──────┬──────┘     └──────┬──────┘     └─────────────┘ │
│                             │                   │                            │
│                             │                   │                            │
│                      ┌──────▼──────┐     ┌──────▼──────┐                     │
│                      │    tmux     │     │    Nodes    │                     │
│                      │  session:   │     │   (fake)    │                     │
│                      │  nstance-   │────►│  created by │                     │
│                      │ dev-agents  │     │tmux provider│                     │
│                      │             │     │             │                     │
│                      │ ┌─────────┐ │     └─────────────┘                     │
│                      │ │ agent 1 │ │                                         │
│                      │ └─────────┘ │                                         │
│                      │ ┌─────────┐ │                                         │
│                      │ │ agent 2 │ │                                         │
│                      │ └─────────┘ │                                         │
│                      └─────────────┘                                         │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

Components

1. dev-s3 (Fake S3 Server)

A fake S3 server using gofakes3 that stores files on the local filesystem.

  • Port: 8989
  • Storage: temp/dev-s3/
  • Bucket: dev

Used by nstance-server to store:

  • Configuration files
  • Secrets and encryption keys
  • CA certificates and keys
  • Instance state
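If you want to poke at the bucket directly with AWS-compatible tooling, the following environment points a client at dev-s3 (the credential values match the nstance-admin invocation later in this document; gofakes3 requires path-style addressing):

```shell
# Credentials and endpoint for dev-s3. These are dev-only values; any
# access key works for gofakes3, but these match the rest of this doc.
export AWS_ACCESS_KEY_ID=dev
export AWS_SECRET_ACCESS_KEY=dev
export AWS_ENDPOINT_URL=http://localhost:8989
export AWS_S3_USE_PATH_STYLE=true
# With the AWS CLI installed you could then run, for example:
#   aws s3 ls s3://dev/
```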

2. nstance-server

The main Nstance control plane server running with the tmux provider.

Ports:

Port   Service        Description
8990   Health         HTTP health checks
8991   Leader         Leader election health
8992   Registration   gRPC - Agent/Operator registration (TLS, no client cert)
8993   Operator       gRPC - Operator sync/drain (mTLS)
8994   Agent          gRPC - Agent communication (mTLS)

Tmux Provider Behavior:

  • Creates agents as tmux windows (running nstance-agent processes directly) instead of cloud VMs
  • Runs agents via Air for hot-reload of agent code
  • Creates fake Kubernetes Node resources as JSON files in dev-k8s temp dir when instances are created
  • Cleans up Node resource JSON files when instances are deleted

3. dev-k8s (Fake Kubernetes API)

A minimal fake Kubernetes API server that stores/reads resources to/from JSON files.

  • Port: 6443
  • Storage: temp/dev-k8s/

Supported Resources:

  • Core API (/api/v1): Secrets, ConfigMaps, Namespaces, Nodes, Pods
  • Nstance CRDs (infrastructure.cluster.x-k8s.io/v1beta1): NstanceCluster, NstanceMachine, NstanceMachinePool, NstanceMachineTemplate
  • Cluster API (cluster.x-k8s.io/v1beta2): Cluster, Machine, MachinePool
  • Coordination (coordination.k8s.io/v1): Lease

How it works:

  • Resources are stored as JSON files in temp/dev-k8s/{resource}/{namespace}/{name}.json
  • Cluster-scoped resources (Nodes) are stored in temp/dev-k8s/{resource}/{name}.json
  • No schema validation - accepts any valid JSON
  • Supports watch via file system notifications (fsnotify), so you can change resources simply by editing the corresponding JSON file
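The layout above can be sketched in a scratch directory (using a made-up instance ID; dev-k8s itself does the equivalent under temp/dev-k8s/):

```shell
# Recreate dev-k8s's on-disk layout: cluster-scoped resources (Nodes) live
# directly under {resource}/, namespaced ones under {resource}/{namespace}/.
# The instance ID "i-dev-001" is hypothetical.
root=$(mktemp -d)
mkdir -p "$root/nodes" "$root/configmaps/default"
cat > "$root/nodes/i-dev-001.json" << 'EOF'
{"apiVersion": "v1", "kind": "Node", "metadata": {"name": "i-dev-001"}}
EOF
ls "$root/nodes"   # prints i-dev-001.json
```

Because watches are driven by fsnotify, editing one of these files by hand is enough to trigger watchers, which makes it easy to simulate state changes during development.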

4. nstance-operator

The Kubernetes operator that syncs instance groups between nstance-server and Kubernetes.

Configuration:

  • Uses a generated kubeconfig pointing to dev-k8s
  • Connects to nstance-server on separate ports for registration (8992) and operations (8993)
  • Uses JSON content type instead of protobuf (dev-k8s limitation)

Startup Flow:

  1. Waits for nstance-server to be healthy
  2. Waits for dev-k8s to be healthy
  3. Creates CA ConfigMap from temp/dev-s3/cluster/ca.crt
  4. Generates registration nonce via nstance-admin CLI tool and writes to a Secret
  5. Creates operator config with shard endpoints
  6. Starts operator with Air for hot-reload
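The "waits for … to be healthy" steps can be sketched as a simple poll loop (the /healthz path here is an assumption; the real dev scripts may probe a different endpoint):

```shell
# Poll an HTTP endpoint until it responds, retrying up to $2 times,
# one second apart. Returns non-zero if it never becomes healthy.
wait_healthy() {
  url=$1; attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS "$url" > /dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
# e.g.: wait_healthy http://127.0.0.1:8990/healthz
```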

5. tmux Agent Session

Agents run in a dedicated tmux session for isolation from Overmind’s tmux session.

  • Session name: nstance-dev-agents
  • Window naming: nstance-agent-{instanceID}

Each agent window runs Air for hot-reload, enabling code changes to automatically restart agents.
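For a quick look at which agents exist without attaching, you can list the session’s windows (plain tmux usage, not a project-specific command):

```shell
# Print one window name per agent; fall back to a message when the
# session (or tmux itself) is not available yet.
tmux list-windows -t nstance-dev-agents -F '#{window_name}' 2>/dev/null \
  || echo "no nstance-dev-agents session yet"
```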

Quick Start

Prerequisites

make check  # Verifies: go, tmux, air, overmind

Starting the Environment

# Full stack: s3 + server + dev-k8s + operator (clean start recommended)
make clean-dev && make dev-tmux-k8s

# Server only: s3 + server (for admin CLI testing or running operator separately)
make clean-dev && make dev-tmux

This starts components via Overmind:

  • s3: dev-s3 fake object storage server
  • server: nstance-server with tmux dev provider (2 instances)
  • k8s: dev-k8s fake Kubernetes API (dev-tmux-k8s only)
  • operator: nstance-operator (dev-tmux-k8s only)

Viewing Logs

Overmind shows combined logs. To view specific component logs:

# In the Overmind terminal, press:
# Ctrl+C to stop all
# Or use overmind commands:
overmind connect server    # Connect to server process
overmind connect operator  # Connect to operator process

Viewing Agent Logs

Agents run in a separate tmux session:

tmux attach -t nstance-dev-agents
# Use Ctrl+B, N to switch between agent windows
# Use Ctrl+B, D to detach

Directory Structure

temp/
├── cache/                    # nstance-server cache (SQLite, etc.)
│   └── db/
│       └── nstance.db
├── dev-k8s/                  # dev-k8s resource storage
│   ├── configmaps/
│   │   └── default/
│   │       └── nstance-cluster-ca.json
│   ├── secrets/
│   │   └── default/
│   │       ├── nstance-operator-cert.json
│   │       ├── nstance-operator-key.json
│   │       └── nstance-operator-nonce.json
│   ├── nstancemachinepools/
│   │   └── default/
│   │       └── nstance-test.json
│   ├── machinepools/
│   │   └── default/
│   │       └── nstance-test.json
│   └── nodes/
│       └── {instanceID}.json   # Created by tmux provider
├── dev-s3/                   # dev-s3 file storage (object storage bucket contents)
│   ├── ca.crt                # Cluster CA certificate
│   ├── ca.key                # Cluster CA private key (encrypted)
│   ├── config/
│   │   └── config.jsonc      # Server configuration
│   └── secret/
│       └── ...               # Encrypted secrets
└── operator/                 # Operator runtime config
    ├── config.yaml           # Shard endpoints
    └── kubeconfig            # dev-k8s kubeconfig

Configuration

Server Configuration

The server reads configuration from examples/config-tmux.jsonc. Key settings for dev mode:

{
  "server": {
    "provider": {
      "kind": "tmux",     // Uses tmux provider (local agents in tmux)
      "region": "dev",
      "zone": "deva"
    },
    "bind": {
      "health_addr": "0.0.0.0:8990",
      "election_addr": "0.0.0.0:8991",
      "registration_addr": "0.0.0.0:8992",
      "operator_addr": "0.0.0.0:8993",
      "agent_addr": "0.0.0.0:8994"
    }
  }
}

Operator Configuration

Generated automatically by scripts/dev-operator.sh:

cluster_id: example-cluster
tenant: default
shards:
  dev:
    registration_addr: "127.0.0.1:8992"  # For initial mTLS registration
    operator_addr: "127.0.0.1:8993"      # For ongoing sync/drain operations

Note: Uses 127.0.0.1 instead of localhost to avoid IPv6 resolution issues on macOS.

How Registration Works

  1. Operator starts and loads CA certificate from nstance-cluster-ca ConfigMap
  2. Operator generates keypair and stores in nstance-operator-key Secret
  3. Operator loads nonce from nstance-operator-nonce Secret
  4. Operator connects to registration port (8992) with TLS (server auth only)
  5. Server issues certificate signed by cluster CA
  6. Operator stores certificate in nstance-operator-cert Secret
  7. Operator connects to operator port (8993) with mTLS for sync/drain
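Step 2 can be reproduced by hand with openssl, if you want to see what the keypair looks like (the operator does this in Go; the key type is Ed25519, per the "What Happens" section later in this document):

```shell
# Generate an Ed25519 keypair like the operator's, then print the first
# line of the public key PEM to show the pair is usable.
# Requires OpenSSL 1.1.1 or newer.
keyfile=$(mktemp)
openssl genpkey -algorithm ED25519 -out "$keyfile"
openssl pkey -in "$keyfile" -pubout | head -n 1
```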

How Instance Creation Works

  1. Reconciler decides to create an instance
  2. Tmux provider creates temp directory with identity files (nonce, CA cert)
  3. Tmux provider creates tmux window running air -c scripts/air/agent.toml
  4. Tmux provider creates fake Node JSON in temp/dev-k8s/nodes/
  5. Agent starts, registers with server using nonce
  6. Server issues client certificate to agent
  7. Agent connects to agent service (8994) with mTLS
  8. Operator can see the Node via dev-k8s and perform drain operations
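Steps 2-4 boil down to roughly the following (the instance ID is hypothetical, and the identity file name is an assumption; the window name format and air config path are from this document):

```shell
instance_id="i-dev-001"   # hypothetical instance ID
# Step 2: identity files the agent reads on boot (placeholder contents;
# the file name nonce.jwt is an assumption based on the operator Secret key)
identity_dir=$(mktemp -d)
printf 'fake-nonce' > "$identity_dir/nonce.jwt"
# Step 3: spawn the agent window (a no-op here unless the dev session exists)
tmux new-window -t nstance-dev-agents -n "nstance-agent-$instance_id" \
  'air -c scripts/air/agent.toml' 2>/dev/null \
  || echo "skipped: nstance-dev-agents session not running"
# Step 4: the fake Node JSON, as in the dev-k8s section above
echo "{\"kind\": \"Node\", \"metadata\": {\"name\": \"$instance_id\"}}"
```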

Troubleshooting

Operator can’t connect to server

Check that you’re using 127.0.0.1 instead of localhost in shard endpoints. macOS resolves localhost to IPv6 first, but the server binds to 0.0.0.0 (IPv4 only).

Operator can’t read secrets after creating them

This is a controller-runtime cache issue. The code now builds the TLS config directly from the in-memory PEM data instead of re-reading it from Kubernetes.

“unknown service” errors from operator

The operator might be connecting to the wrong port. Registration happens on 8992, but sync/drain operations use 8993.

Agents not appearing

Check the tmux session:

tmux attach -t nstance-dev-agents

If the session doesn’t exist, the tmux provider will create it on next instance creation.

dev-k8s doesn’t have a resource type

Add the resource to cmd/dev-k8s/handle_discovery.go. The CRUD handlers are generic and work with any resource.

Cleaning Up

# Clean all dev state
make clean-dev

# This removes:
# - temp/ directory (all dev state)
# - Kills tmux agent session

Using with a Real Kubernetes Cluster (kind)

Here we’ll cover how to run the Nstance operator locally, connecting it to a kind (Kubernetes in Docker) cluster and to a local Nstance server using the dev-tmux provider, for fast development iteration and debugging.

Start an Nstance Server

The first thing you’ll want to do is make sure your nstance-server is running alongside dev-s3, as startup generates a fresh CA certificate that the operator will need a copy of.

1. Start the nstance-server (in one terminal):

make clean-dev && make dev-tmux

This starts s3 + server only (no dev-k8s or operator), which is what you want since the operator will run separately against kind.

Start a Kubernetes Cluster

2. Create a kind configuration file at temp/kind-config.yaml:

mkdir -p temp
cat > temp/kind-config.yaml << 'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
EOF

3. Create and start the kind cluster:

kind create cluster --name nstance-dev --config temp/kind-config.yaml

Check that your current kubectl context is set to kind-nstance-dev:

kubectl config current-context

Verify that the cluster is running:

kubectl cluster-info

Prepare the Kubernetes Cluster

4. Install cert-manager (required by CAPI for webhook TLS certificates):

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml
kubectl wait --for=condition=Available deployment --all -n cert-manager --timeout=120s

5. Deploy the Cluster API (CAPI) components:

The Nstance operator is a CAPI infrastructure provider and requires the core CAPI CRDs, controllers, and validation webhooks:

CAPI_VERSION=$(curl -sL https://api.github.com/repos/kubernetes-sigs/cluster-api/releases/latest | jq -r .tag_name)
curl -sL "https://github.com/kubernetes-sigs/cluster-api/releases/download/$CAPI_VERSION/core-components.yaml" \
  | sed -E 's/\$\{[A-Za-z_][A-Za-z0-9_]*:=([^}]*)\}/\1/g' \
  | kubectl apply --server-side -f -
kubectl wait --for=condition=Available deployment --all -n capi-system --timeout=120s

6. Deploy the Nstance CRDs:

kubectl apply -k config/crd/

7. Deploy the Nstance Cluster CA certificate ConfigMap

The nstance-server you started in step 1 will generate a new CA certificate and upload it to the dev-s3 server, stored at ./temp/dev-s3/cluster/ca.crt.

Let’s create a new ConfigMap with it embedded, for the Nstance Operator to use:

kubectl create configmap nstance-cluster-ca \
  --from-file=ca.crt=temp/dev-s3/cluster/ca.crt

8. Export the Kind kubeconfig for the Operator:

mkdir -p temp/operator
kind get kubeconfig --name nstance-dev > temp/operator/kubeconfig

9. Create the operator config file (read from --config flag, not a ConfigMap):

The shard IDs and ports must match the dev-tmux server instances. By default, dev-tmux runs 2 server instances (server=2) with port scheme: base + (instance-1) * 10.

cat > temp/operator/config.yaml << 'EOF'
cluster_id: example-cluster
tenant: default
shards:
  dev-1:
    registration_addr: "127.0.0.1:8992"
    operator_addr: "127.0.0.1:8993"
  dev-2:
    registration_addr: "127.0.0.1:9002"
    operator_addr: "127.0.0.1:9003"
EOF
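The shard addresses above follow the stated scheme (base + (instance-1) * 10), which is easy to sanity-check:

```shell
# Derive per-instance ports from the base registration (8992) and
# operator (8993) ports used throughout this document.
for i in 1 2; do
  reg=$((8992 + (i - 1) * 10))
  op=$((8993 + (i - 1) * 10))
  echo "dev-$i: registration=127.0.0.1:$reg operator=127.0.0.1:$op"
done
# prints:
#   dev-1: registration=127.0.0.1:8992 operator=127.0.0.1:8993
#   dev-2: registration=127.0.0.1:9002 operator=127.0.0.1:9003
```

If you change the instance count on the server side, extend the shards map accordingly, since the operator only syncs with shards it knows about.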

10. Generate the registration nonce & store in a Kubernetes Secret:

Compile the nstance-admin binary and run it against the dev-s3 service started in step 1:

make nstance-admin
NONCE_JWT=$(AWS_ACCESS_KEY_ID=dev \
AWS_SECRET_ACCESS_KEY=dev \
AWS_ENDPOINT_URL=http://localhost:8989 \
AWS_S3_USE_PATH_STYLE=true \
NSTANCE_ENCRYPTION_KEY=thisisatest32bytekey123456789012 \
./bin/nstance-admin cluster nonce \
 --cluster-id example-cluster \
 --storage-bucket dev \
 --key-provider env \
 --output \
-)
kubectl create secret generic nstance-operator-nonce \
  --from-literal=nonce.jwt="$NONCE_JWT"

This generates a nonce that is valid for 3 hours by default; extend it by passing e.g. --expiry 24h to the nstance-admin cluster nonce command above.
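To check a nonce’s expiry without any extra tooling, you can base64-decode the JWT’s payload segment. The sample token below is a hypothetical, unsigned stand-in for $NONCE_JWT:

```shell
# Decode the second dot-separated segment of a JWT (base64url, unpadded).
jwt_payload() {
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
sample='eyJhbGciOiJub25lIn0.eyJleHAiOjE3MzU2ODk2MDB9.'
jwt_payload "$sample"   # prints {"exp":1735689600}
```

The exp claim is a Unix timestamp; compare it against `date +%s` to see how long the nonce has left.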

Run the Operator Locally Against the Kubernetes Cluster

11. Run the operator:

make dev-operator

This runs the operator with Air for hot-reload and configures the namespace, kubeconfig, and nstance-operator config file automatically. Note how the runtime dependencies differ between running locally and in-cluster:

namespace: (for Secrets & ConfigMaps)

  • in-cluster: /var/run/secrets/kubernetes.io/serviceaccount/namespace
  • locally: NSTANCE_NAMESPACE=default

kubeconfig:

  • in-cluster: service account credentials under /var/run/secrets/kubernetes.io/serviceaccount/
  • locally: temp/operator/kubeconfig

nstance-operator config:

  • in-cluster: /etc/nstance/operator/config.yaml (Helm chart mounts from ConfigMap).
  • locally: temp/operator/config.yaml (specified via --config argument).

What Happens

The operator will:

  • Connect to the Kind cluster via the kubeconfig file
  • Load the cluster CA from the nstance-cluster-ca ConfigMap
  • Load or generate an Ed25519 keypair (stored in nstance-operator-key Secret)
  • Register with nstance-server using the nonce from nstance-operator-nonce Secret
  • Receive and store a client certificate in nstance-operator-cert Secret
  • Connect to the operator gRPC port (8993) with mTLS for sync/drain operations
  • Reconcile NstanceMachinePool, NstanceMachine, and NstanceShardGroup CRDs

This setup allows for rapid development cycles without needing to rebuild container images for each change.

Cleaning Up

kind delete cluster --name nstance-dev