Security
This page documents Nstance’s security model, which uses:
- TLS on all gRPC endpoints for in-transit protection.
- Short-lived registration nonce JWTs for bootstrap identity.
- mTLS client certificates for ongoing authenticated API access.
- Role-based authorization (`agent` vs `operator`) enforced server-side.
- Tenant scoping embedded in client certificates.
- Encrypted secret storage with pluggable backends.
At a high level:
- The registration endpoint uses TLS and accepts anonymous clients for initial bootstrap only, and those clients must present a valid nonce JWT.
- Agent and operator APIs require mTLS and reject clients without valid certificates signed by the cluster CA.
- Sensitive key material (`ca.key`, `registration-nonce.key`) is loaded from the configured secrets store.
Transport Security
Nstance Server uses TLS 1.3 for all gRPC services:
| Endpoint | Default purpose | TLS mode |
|---|---|---|
| Registration (`:8992`) | Initial bootstrap registration | TLS server-auth only (`NoClientCert`) |
| Agent (`:8994`) | Agent operations | Mutual TLS (`RequireAndVerifyClientCert`) |
| Operator (`:8993`) | Operator operations | Mutual TLS (`RequireAndVerifyClientCert`) |
This means:
- Registration is encrypted and server-authenticated. The client has no certificate yet, so it proves its identity with a nonce JWT and receives an issued certificate in return.
- After registration, all authenticated operations move to mTLS-protected endpoints.
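The two TLS modes in the table can be sketched with Go's standard `crypto/tls` package. This is a minimal illustration, not Nstance's actual listener setup: the function names `registrationTLS` and `mutualTLS` are hypothetical, and loading the server certificate and the cluster CA pool is left to the caller.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
)

// registrationTLS sketches the registration listener (hypothetical helper):
// TLS 1.3, server-authenticated, no client certificate requested.
func registrationTLS(serverCerts []tls.Certificate) *tls.Config {
	return &tls.Config{
		MinVersion:   tls.VersionTLS13,
		Certificates: serverCerts,
		ClientAuth:   tls.NoClientCert,
	}
}

// mutualTLS sketches the agent and operator listeners (hypothetical helper):
// TLS 1.3, and the client must present a certificate that chains to the
// cluster CA pool passed in clusterCA.
func mutualTLS(serverCerts []tls.Certificate, clusterCA *x509.CertPool) *tls.Config {
	return &tls.Config{
		MinVersion:   tls.VersionTLS13,
		Certificates: serverCerts,
		ClientAuth:   tls.RequireAndVerifyClientCert,
		ClientCAs:    clusterCA,
	}
}
```

With `RequireAndVerifyClientCert`, the handshake itself fails for clients that present no certificate or one that does not chain to `ClientCAs`, so unauthenticated requests never reach the service handlers.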
Identity Bootstrap (Registration Nonce JWT)
Initial identity is established through registration nonce JWTs signed with an Ed25519 private key (`registration-nonce.key`).
Required claims
The server validates nonce JWTs and requires the following claims:
- `kind` (`agent` or `operator`)
- `sub` (instance ID for agents, cluster ID for operators)
- `cluster_id`
- `tenant`
- standard time validity (`exp`, `nbf` where present)
Additional informational claims are included but not enforced by the JWT validator:
- `shard` (validated downstream during agent registration)
- `config_hash` (group runtime config hash at provision time)
- `group` (group key)
- `on_demand` (whether instance is on-demand)
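To make the claim checks concrete, the following sketch signs and verifies a compact JWT with Ed25519 using only the Go standard library. It is an illustration under assumptions: the `NonceClaims` struct, the helper names, and the treatment of `exp` and `nbf` as optional are not taken from Nstance's code, and a real deployment would more likely use a maintained JWT library. The `EdDSA` algorithm name is the standard JOSE identifier for Ed25519 signatures.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"errors"
	"strings"
)

// NonceClaims mirrors the claims listed above; Go field names are illustrative.
type NonceClaims struct {
	Kind      string `json:"kind"`
	Sub       string `json:"sub"`
	ClusterID string `json:"cluster_id"`
	Tenant    string `json:"tenant"`
	Exp       int64  `json:"exp,omitempty"`
	Nbf       int64  `json:"nbf,omitempty"`
	Shard     string `json:"shard,omitempty"` // informational, checked later
}

var b64 = base64.RawURLEncoding

// generateNonceKey creates an Ed25519 key pair for signing nonce JWTs.
func generateNonceKey() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// signNonceJWT encodes the claims as header.payload.signature using EdDSA.
func signNonceJWT(priv ed25519.PrivateKey, c NonceClaims) (string, error) {
	payload, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	input := b64.EncodeToString([]byte(`{"alg":"EdDSA","typ":"JWT"}`)) +
		"." + b64.EncodeToString(payload)
	sig := ed25519.Sign(priv, []byte(input))
	return input + "." + b64.EncodeToString(sig), nil
}

// verifyNonceJWT checks the algorithm, signature, required claims, and
// time validity. nowUnix is the current time in Unix seconds.
func verifyNonceJWT(pub ed25519.PublicKey, token string, nowUnix int64) (*NonceClaims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("malformed token")
	}
	headerJSON, err := b64.DecodeString(parts[0])
	if err != nil {
		return nil, errors.New("bad header encoding")
	}
	var header struct {
		Alg string `json:"alg"`
	}
	if err := json.Unmarshal(headerJSON, &header); err != nil || header.Alg != "EdDSA" {
		return nil, errors.New("unsupported algorithm")
	}
	sig, err := b64.DecodeString(parts[2])
	if err != nil || !ed25519.Verify(pub, []byte(parts[0]+"."+parts[1]), sig) {
		return nil, errors.New("invalid signature")
	}
	payload, err := b64.DecodeString(parts[1])
	if err != nil {
		return nil, errors.New("bad payload encoding")
	}
	var c NonceClaims
	if err := json.Unmarshal(payload, &c); err != nil {
		return nil, err
	}
	if c.Kind != "agent" && c.Kind != "operator" {
		return nil, errors.New("kind must be agent or operator")
	}
	if c.Sub == "" || c.ClusterID == "" || c.Tenant == "" {
		return nil, errors.New("missing required claim")
	}
	// Assumption: exp and nbf are enforced only when present.
	if c.Exp != 0 && nowUnix >= c.Exp {
		return nil, errors.New("token expired")
	}
	if c.Nbf != 0 && nowUnix < c.Nbf {
		return nil, errors.New("token not yet valid")
	}
	return &c, nil
}
```

The signature is verified before the payload is parsed, so claims from an untrusted token are never inspected.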
Additional validation
Agent registration checks:
- `kind == "agent"`
- `cluster_id` matches server config
- `shard` matches server config
- nonce exists in local SQLite state
- nonce has not already been used (`registered_at` check)
Operator registration checks:
- `kind == "operator"`
- `cluster_id` matches server config
- nonce passes operator nonce validation in SQLite
This prevents replay and cross-cluster misuse of nonce tokens.
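The agent checks above amount to a single-use gate on each nonce. The sketch below shows that logic with an in-memory map standing in for the SQLite state. All type and function names are hypothetical, and keying the nonce by instance ID is an assumption made for the example.

```go
package main

import (
	"errors"
	"sync"
)

// serverConfig holds the values the claims are compared against (illustrative).
type serverConfig struct {
	ClusterID string
	Shard     string
}

// agentClaims is the already-verified subset of nonce claims used here.
type agentClaims struct {
	Kind, Sub, ClusterID, Shard string
}

// nonceStore is an in-memory stand-in for the SQLite nonce table.
// A value of 0 means the nonce exists but has not been used yet.
type nonceStore struct {
	mu           sync.Mutex
	registeredAt map[string]int64
}

func newNonceStore(ids ...string) *nonceStore {
	s := &nonceStore{registeredAt: make(map[string]int64)}
	for _, id := range ids {
		s.registeredAt[id] = 0
	}
	return s
}

// registerAgent applies the agent checks in order and marks the nonce used.
// nowUnix must be non-zero so that a used nonce is distinguishable.
func (s *nonceStore) registerAgent(cfg serverConfig, c agentClaims, nowUnix int64) error {
	if c.Kind != "agent" {
		return errors.New("kind must be agent")
	}
	if c.ClusterID != cfg.ClusterID {
		return errors.New("cluster_id mismatch")
	}
	if c.Shard != cfg.Shard {
		return errors.New("shard mismatch")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	at, ok := s.registeredAt[c.Sub]
	if !ok {
		return errors.New("unknown nonce")
	}
	if at != 0 {
		return errors.New("nonce already used")
	}
	s.registeredAt[c.Sub] = nowUnix
	return nil
}
```

The lookup and the update happen under one lock so that two concurrent registrations with the same nonce cannot both succeed; a SQL-backed version would need an equivalent atomic update.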
mTLS Authentication and Authorization
After registration, clients authenticate with certificates issued by the cluster CA.
Certificate requirements
- Client certificate chain must validate against cluster CA.
- Certificate must include:
  - Common Name (`CN`) used as client ID.
  - Exactly one `Organization` value, used as tenant identity.
  - Custom role extension OID `1.3.6.1.4.1.999999.1` with role (`agent` or `operator`). Note: this OID uses an unregistered Private Enterprise Number and is intended for internal use only.
Role enforcement
Server enforces service-level authorization:
- Agent service requires role `agent`.
- Operator service requires role `operator`.
If the role in the client certificate does not match the role required by the endpoint, the request is rejected (`PermissionDenied`).
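A sketch of how a server could derive the client identity from a verified certificate and apply the role check is shown below. It is illustrative only: `clientIdentity`, `identityFromCert`, and `authorize` are hypothetical names, and the example assumes the role is stored in the extension value as raw UTF-8 bytes, which the source does not specify.

```go
package main

import (
	"crypto/x509"
	"encoding/asn1"
	"errors"
)

// roleOID is the custom role extension OID from the certificate requirements.
var roleOID = asn1.ObjectIdentifier{1, 3, 6, 1, 4, 1, 999999, 1}

// errPermissionDenied stands in for a gRPC PermissionDenied status.
var errPermissionDenied = errors.New("permission denied")

// clientIdentity is what the server derives from a verified client cert.
type clientIdentity struct {
	ID     string // certificate Common Name
	Tenant string // the single Organization value
	Role   string // "agent" or "operator"
}

// identityFromCert extracts the identity fields. It assumes the TLS layer
// has already validated the chain against the cluster CA.
func identityFromCert(cert *x509.Certificate) (clientIdentity, error) {
	if cert == nil {
		return clientIdentity{}, errors.New("no client certificate")
	}
	if cert.Subject.CommonName == "" {
		return clientIdentity{}, errors.New("missing common name")
	}
	if len(cert.Subject.Organization) != 1 {
		return clientIdentity{}, errors.New("expected exactly one organization")
	}
	for _, ext := range cert.Extensions {
		if !ext.Id.Equal(roleOID) {
			continue
		}
		role := string(ext.Value) // assumption: raw UTF-8 role string
		if role != "agent" && role != "operator" {
			return clientIdentity{}, errors.New("unknown role")
		}
		return clientIdentity{
			ID:     cert.Subject.CommonName,
			Tenant: cert.Subject.Organization[0],
			Role:   role,
		}, nil
	}
	return clientIdentity{}, errors.New("role extension missing")
}

// authorize rejects callers whose role differs from the service's role.
func authorize(id clientIdentity, requiredRole string) error {
	if id.Role != requiredRole {
		return errPermissionDenied
	}
	return nil
}
```

In a gRPC server this logic would typically live in an interceptor that reads the peer certificate from the connection state and calls `authorize` with the role of the service being invoked.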
PKI and Certificate Lifecycle
CA and key material
- `ca.crt` is loaded from cluster-scoped object storage.
- `ca.key` is loaded from the configured secrets store.
- If CA material does not exist, server bootstrap generates it and stores:
  - cert in object storage (`ca.crt`)
  - private key in secrets store (`ca.key`)
Registration nonce signing key
- `registration-nonce.key` is loaded from secrets store.
- If missing, only the cluster leader may generate and persist it.
Client certificate issuance
Clients must submit an Ed25519 public key as part of the registration request. The server enforces this key type. On successful registration:
- Server signs a client certificate using the Ed25519 CA key, binding the client’s Ed25519 public key.
- Tenant is embedded in certificate `Organization`.
- Role is embedded in custom OID extension.
- Certificate TTL is taken from config when provided, with defaults applied otherwise.
- A registration record is persisted to object storage and local SQLite for audit and state tracking.
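The issuance step can be sketched with Go's `crypto/x509`, which supports Ed25519 for both the CA key and the subject key. This is a simplified illustration, not Nstance's implementation: `newCA` and `issueClientCert` are hypothetical helpers, the CA lifetime and serial number scheme are arbitrary, and the role is written into the extension as raw UTF-8 bytes by assumption.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"errors"
	"math/big"
	"time"
)

var roleOID = asn1.ObjectIdentifier{1, 3, 6, 1, 4, 1, 999999, 1}

// newCA creates a throwaway self-signed Ed25519 CA for the example.
func newCA() (*x509.Certificate, ed25519.PrivateKey, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example-cluster-ca"},
		NotBefore:             now.Add(-time.Minute),
		NotAfter:              now.Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, pub, priv)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, nil, err
	}
	return cert, priv, nil
}

// issueClientCert binds the submitted Ed25519 public key to an ID, tenant,
// and role, and signs the result with the CA key.
func issueClientCert(ca *x509.Certificate, caKey ed25519.PrivateKey, clientPub []byte,
	id, tenant, role string, ttl time.Duration) (*x509.Certificate, error) {
	// Enforce the key type: only raw Ed25519 public keys are accepted.
	if len(clientPub) != ed25519.PublicKeySize {
		return nil, errors.New("client key must be an Ed25519 public key")
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber: serial,
		Subject:      pkix.Name{CommonName: id, Organization: []string{tenant}},
		NotBefore:    now.Add(-time.Minute),
		NotAfter:     now.Add(ttl),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
		// Assumption: the role is stored as raw UTF-8 bytes.
		ExtraExtensions: []pkix.Extension{{Id: roleOID, Value: []byte(role)}},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, ed25519.PublicKey(clientPub), caKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}
```

Because the client submits only its public key, the private key never leaves the client, and the server's signature is what ties that key to the tenant and role.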
Certificate serial log (certlog)
For batch certificate generation (e.g. agent file generation), Nstance writes certificate serial logs to object storage under the `certlog/` prefix. Each log entry is a JSON file stored at `certlog/{tenant}.{timestamp}.{instanceID}.json` containing:
- Instance ID and tenant
- Issuance timestamp
- List of certificate names, serial numbers, and expiry times
This provides an append-only audit trail of all certificates issued, scoped by tenant.
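As an illustration of the entry layout, the sketch below models a log entry and derives its object key. The JSON field names and the timestamp format are assumptions made for the example; only the key pattern and the list of recorded fields come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// certLogCert describes one issued certificate (field names are assumed).
type certLogCert struct {
	Name      string    `json:"name"`
	Serial    string    `json:"serial"`
	ExpiresAt time.Time `json:"expires_at"`
}

// certLogEntry is one JSON document under the certlog/ prefix.
type certLogEntry struct {
	InstanceID   string        `json:"instance_id"`
	Tenant       string        `json:"tenant"`
	IssuedAt     time.Time     `json:"issued_at"`
	Certificates []certLogCert `json:"certificates"`
}

// newCertLogEntry builds an entry with the issuance time in UTC.
func newCertLogEntry(tenant, instanceID string, issuedAtUnix int64) certLogEntry {
	return certLogEntry{
		InstanceID: instanceID,
		Tenant:     tenant,
		IssuedAt:   time.Unix(issuedAtUnix, 0).UTC(),
	}
}

// objectKey renders certlog/{tenant}.{timestamp}.{instanceID}.json.
// The compact UTC timestamp layout is an assumption.
func (e certLogEntry) objectKey() string {
	ts := e.IssuedAt.UTC().Format("20060102T150405Z")
	return fmt.Sprintf("certlog/%s.%s.%s.json", e.Tenant, ts, e.InstanceID)
}

// marshal serializes the entry for upload to object storage.
func (e certLogEntry) marshal() ([]byte, error) {
	return json.MarshalIndent(e, "", "  ")
}
```

Placing the tenant first in the key means a per-tenant review can be done with a simple prefix listing such as `certlog/acme.`.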
Renewal
Operator certificate renewal is supported via `OperatorService/RenewCertificate`:
- Requires valid existing operator mTLS certificate.
- Requires cluster leadership.
- Issues a new operator certificate for the authenticated cluster ID + tenant identity.
Secret Storage and Encryption
Nstance uses a pluggable secrets store abstraction for sensitive values.
Typical core secrets:
- `ca.key`
- `registration-nonce.key`
- additional operator or workload secrets as configured
When using the object-storage secrets provider, Nstance performs client-side encryption of secret blobs using AES-256-GCM before writing them to the storage backend (S3, GCS, etc.):
- Algorithm: AES-256-GCM (Galois/Counter Mode), providing both confidentiality and integrity.
- Key size: Encryption keys must be exactly 32 bytes (256-bit). Keys that do not match this length are rejected at startup.
- Nonce: A cryptographically random 12-byte nonce is generated per-encryption using `crypto/rand`. Each encrypted blob is stored as `nonce (12 bytes) || ciphertext + GCM tag`, so the nonce travels with the data and does not need to be managed separately.
- Key sources: Encryption keys can be loaded from multiple providers:
  - `env`: environment variable
  - `file`: local file path
  - `aws-secrets-manager`: AWS Secrets Manager secret (by ARN or name)
  - `gcp-secret-manager`: GCP Secret Manager secret (by name, with `project_id` option)
- Key rotation: The configuration supports a primary `encryption_key` (used for all new writes) and a list of `old_encryption_keys` (used for decryption only). On read, Nstance attempts decryption with each configured key in order until one succeeds, allowing a rotation window where old ciphertexts remain readable while new writes use the current key.
- Optional: If no encryption keys are configured, secrets are stored and retrieved in plaintext. Encryption is strongly recommended for production deployments.
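The blob layout and rotation behavior described above can be sketched with Go's `crypto/aes` and `crypto/cipher`. The function names are hypothetical and key loading is omitted; the sketch only shows the `nonce || ciphertext + tag` framing and the try-each-key read path.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"io"
)

// newGCM builds an AES-256-GCM AEAD and rejects keys that are not 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("encryption key must be exactly 32 bytes")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptSecret returns nonce || ciphertext || tag using a fresh random nonce.
func encryptSecret(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize()) // 12 bytes for standard GCM
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext and tag to the nonce slice.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decryptSecret tries each key in order (primary first, then old keys)
// and returns the first plaintext that authenticates.
func decryptSecret(keys [][]byte, blob []byte) ([]byte, error) {
	for _, key := range keys {
		gcm, err := newGCM(key)
		if err != nil {
			return nil, err
		}
		if len(blob) < gcm.NonceSize() {
			return nil, errors.New("blob too short")
		}
		nonce, ct := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
		if pt, err := gcm.Open(nil, nonce, ct, nil); err == nil {
			return pt, nil
		}
	}
	return nil, errors.New("no configured key could decrypt the blob")
}
```

Trying keys in order is safe because GCM authenticates the ciphertext: decryption with the wrong key fails the tag check rather than returning garbage, so the first success identifies the correct key.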
Leadership and Security-Critical Operations
Several security-sensitive operations are leader-gated:
- Agent registration requires shard leadership.
- Operator registration requires cluster leadership.
- Operator certificate renewal requires cluster leadership.
- Registration nonce key generation is restricted to the cluster leader when the key is missing.
This avoids split-brain issuance behavior across active replicas.
Operational Hardening Guidance
For production deployments:
- Restrict network access so registration, agent, and operator gRPC ports are reachable only by intended callers.
- Treat the secrets backend as a high-trust boundary, especially for `ca.key` and `registration-nonce.key`.
- Use least-privilege IAM for object storage and secret provider access.
- Rotate encryption keys and signing material through controlled procedures.
- Keep debug-only behavior (for example, gRPC reflection) disabled in production.
- Configure appropriate certificate TTLs and monitor for expiring certificates, especially operator certificates which support renewal via `RenewCertificate`.
- Periodically review `certlog/` entries in object storage for unexpected certificate issuance.