Proxmox Integration Guide for Nstance
This document describes how Nstance integrates with Proxmox VE for on-premise virtual machine lifecycle management.
Overview
Nstance Server can use Proxmox VE as a provider for VM lifecycle management, enabling on-premise orchestration of container management nodes, such as the nodes of a Kubernetes cluster.
Concepts Mapping
| Nstance Concept | Proxmox Equivalent | Notes |
|---|---|---|
| Region | n/a | Single cluster per provider instance |
| Zone | Proxmox Cluster | e.g. one VE cluster for us-east-1a, one for us-east-2a equivalents |
| SubnetID | Bridge + VLAN | e.g. vmbr0 or vmbr0.100 |
| InstanceType | Resource spec | cpu:4,memory:8192,disk:50 |
| ProviderInstanceID | VMID | Proxmox numeric VM ID (unique at any point in time, but may be reused after deletion) |
Bootstrap Scripts
Nstance provides a set of shell scripts for bootstrapping a Proxmox VE cluster and all of its nodes to run Nstance; see deploy/proxmox.
| Script | Purpose | Run where |
|---|---|---|
| vm-template-setup.sh | Create a VM template from a cloud image | Each PVE node |
| seaweedfs-test-setup.sh | Install single-node SeaweedFS (S3-compatible object storage) | One PVE node (dev/test only) |
| dnsmasq-test-setup.sh | Install dnsmasq DHCP server for a bridge interface | One PVE node per subnet (dev/test only) |
| create-shard-config.sh | Generate shard config (JSONC) and upload to object storage | Any machine with object storage access |
| server-with-keepalived.sh | Install nstance-server + keepalived as systemd services | Each PVE node |
All scripts support --dry-run and --help. See each script’s header for full options.
Requirements
Proxmox Environment
- Proxmox VE 8+ cluster (7.3 introduced GUI-visible VM tags, which are required for instance tracking; 8 changed the boot order format)
- API access enabled with dedicated API token
- Shared storage accessible from all nodes (for VM templates and cloud-init ISOs)
- DHCP server on target network bridge
- VM template with cloud-init support (qemu-guest-agent recommended for Proxmox management)
Network Requirements
- Nstance Server must have network connectivity to the Proxmox API (port 8006)
- VMs must have network connectivity to the Nstance Server gRPC endpoint
- Bridge/VLAN configuration consistent across all cluster nodes
If using keepalived for VIP failover:
- VRRP (IP protocol 112) multicast must be permitted between nodes on the interface where the VIP lives. If the Proxmox firewall is enabled, add a rule to allow VRRP from the VIP subnet (e.g. `IN ACCEPT -source 10.0.0.0/24 -p vrrp` in `/etc/pve/firewall/cluster.fw`). Without this, nodes cannot see each other's VRRP advertisements and both will hold the VIP simultaneously (split-brain).
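For example, a cluster firewall file allowing VRRP might look like this sketch (the `10.0.0.0/24` subnet is illustrative; substitute your VIP subnet):

```
# /etc/pve/firewall/cluster.fw
[OPTIONS]
enable: 1

[RULES]
IN ACCEPT -source 10.0.0.0/24 -p vrrp
```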
Object Storage Backend
Nstance Server requires an object storage backend that supports If-Match headers for leader election. Supported providers include AWS S3, Google Cloud Storage, and S3-compatible services such as Ceph RGW and SeaweedFS.
Option 1: Public Cloud Object Storage (Recommended for Hybrid Public/Private Deployments)
For hybrid public/private cloud setups, use a managed object storage service from a public cloud provider (e.g. AWS S3 or Google Cloud Storage).
AWS S3:

```shell
nstance-server --storage s3 --bucket nstance --shard <shard> --id <id>

# Standard AWS SDK authentication
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
```

Google Cloud Storage:

```shell
nstance-server --storage gcs --bucket nstance --shard <shard> --id <id>

# Standard GCP SDK authentication
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```

Option 2: SeaweedFS with etcd (Recommended for Private Cloud Deployments)
For fully private cloud deployments, run SeaweedFS with etcd as the metadata store:
Why SeaweedFS:
- Apache 2.0 License - Permissive licensing suitable for any deployment
- PutObject Preconditions - Supports `If-Match` headers required for Nstance leader election
- Lightweight - Simple deployment, low resource overhead
- S3 API Compatible - Works with standard AWS SDK
Why etcd: SeaweedFS needs a metadata store. To ensure high availability and strong consistency, even in the event of partitioning between Proxmox VE clusters, etcd is recommended as the metadata store.
Production Deployment with etcd:
```shell
# SeaweedFS with etcd for HA metadata
weed master -mdir=/data/master -peers=master1:9333,master2:9333,master3:9333
weed volume -mserver=master1:9333,master2:9333,master3:9333 -dir=/data/volume
weed filer -master=master1:9333,master2:9333,master3:9333
weed s3 -filer=localhost:8888 -port=8333
```

Nstance Server Configuration:

```shell
nstance-server --storage s3 --bucket nstance --shard <shard> --id <id>

# S3-compatible credentials
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=admin
export AWS_SECRET_ACCESS_KEY=admin
export AWS_ENDPOINT_URL=http://seaweedfs.example.com:8333
export AWS_S3_USE_PATH_STYLE=true
```

Use the create-shard-config.sh bootstrap script to generate the shard configuration file — see the proxmox bootstrap scripts for details.
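The `weed filer` processes above use a local metadata store by default. To point them at etcd as recommended, the filer's `filer.toml` needs an etcd section; a minimal sketch, assuming etcd endpoints `etcd1:2379` through `etcd3:2379` (check the SeaweedFS filer-store documentation for your version's exact option names):

```toml
# filer.toml - use etcd as the filer metadata store
[etcd]
enabled = true
servers = "etcd1:2379,etcd2:2379,etcd3:2379"
```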
Option 3: Proxmox Ceph RGW (Alternative for Private Cloud Deployments)
If your Proxmox cluster already runs Ceph for storage, you can enable the RADOS Gateway (RGW) for S3 compatibility. This avoids deploying additional infrastructure but Ceph is known to be more complex to configure and manage.
Deployment Topology
Run one object storage deployment per region, accessible by all Nstance Server shards in that region.
Authentication
Proxmox API connection is configured using environment variables:
```shell
export PROXMOX_API_URL='https://localhost:8006/api2/json' # defaults to this if not set
export PROXMOX_TOKEN_ID='nstance@pve!nstance-token'
export PROXMOX_TOKEN_SECRET='xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
```

Note: use single quotes around the PROXMOX_TOKEN_ID value so an interactive shell does not treat the exclamation mark as history expansion.
The Nstance Server will fail to start if PROXMOX_TOKEN_ID or PROXMOX_TOKEN_SECRET is missing when using the proxmox provider. PROXMOX_API_URL defaults to https://localhost:8006/api2/json if not set, which works when nstance-server runs on the Proxmox node itself.
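To check the credentials by hand, the token is sent as a single `Authorization` header of the form `PVEAPIToken=<token-id>=<secret>`. A quick sketch with placeholder values:

```shell
# Build the token auth header from the same environment variables
# (placeholder token values for illustration)
export PROXMOX_TOKEN_ID='nstance@pve!nstance-token'
export PROXMOX_TOKEN_SECRET='00000000-0000-0000-0000-000000000000'
AUTH_HEADER="Authorization: PVEAPIToken=${PROXMOX_TOKEN_ID}=${PROXMOX_TOKEN_SECRET}"
echo "$AUTH_HEADER"

# Then call a read-only endpoint to confirm the token works, e.g.:
# curl -ks -H "$AUTH_HEADER" "https://localhost:8006/api2/json/version"
```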
Creating an API Token:
- In the Proxmox web UI, navigate to Datacenter → Permissions → API Tokens → Add
- Select a user (e.g. `root@pam`, or create a dedicated `nstance@pve` user)
- Enter a Token ID (e.g. `nstance-token`)
- Uncheck Privilege Separation to inherit the user's permissions
- Click Add and copy the secret immediately (shown only once)
The PROXMOX_TOKEN_ID format is user@realm!tokenid, e.g. root@pam!nstance-token.
Required Proxmox Permissions:
The API token needs the following privileges. The simplest setup is
PVEVMAdmin + PVEDatastoreAdmin + PVEAuditor + PVESDNUser on /
(see bootstrap scripts for exact commands).
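The role grants above can be sketched with `pveum` (the bootstrap scripts contain the exact, tested commands; this sketch only prints what it would run unless you clear `DRY_RUN`):

```shell
# Prints the pveum commands by default; set DRY_RUN= to actually execute
# them on a PVE node as root. Role names match the recommendation above.
DRY_RUN="${DRY_RUN-echo}"
$DRY_RUN pveum user add nstance@pve --comment "Nstance service user"
for role in PVEVMAdmin PVEDatastoreAdmin PVEAuditor PVESDNUser; do
  $DRY_RUN pveum acl modify / -user nstance@pve -role "$role"
done
$DRY_RUN pveum user token add nstance@pve nstance-token -privsep 0
```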
For reference, the complete list of individual privileges used:
| Privilege | Used for | Justification |
|---|---|---|
| Sys.Audit | Query node resources for scheduling, cluster discovery, task status | GET /cluster/resources, GET /cluster/status, GET /nodes/{node}/status, GET /nodes/{node}/tasks/{upid}/status |
| VM.Allocate | Reserve next VMID, clone target allocation, delete VMs | GET /cluster/nextid, POST .../clone target, DELETE /nodes/{node}/qemu/{vmid} |
| VM.Audit | Read VM status/config, list VMs for template resolution | GET /nodes/{node}/qemu, GET .../qemu/{vmid}/status/current, GET .../qemu/{vmid}/config |
| VM.Clone | Clone VM templates | POST /nodes/{node}/qemu/{vmid}/clone |
| VM.PowerMgmt | Start and stop VMs | POST .../status/start, POST .../status/stop |
| VM.Config.CPU | Set CPU cores | POST .../config with cores |
| VM.Config.Memory | Set memory | POST .../config with memory |
| VM.Config.Network | Set network devices | POST .../config with net0 |
| VM.Config.Disk | Set boot order, resize disks | POST .../config with boot, PUT .../resize |
| VM.Config.CDROM | Attach/unmount cloud-init ISO | POST .../config with ide2 |
| VM.Config.Options | Set VM tags, description, start-on-boot | POST .../config with tags, description, onboot |
| Datastore.Audit | List and browse storage contents | GET /nodes/{node}/storage, GET .../content/{volume} |
| Datastore.Allocate | Delete cloud-init ISO volumes | DELETE /nodes/{node}/storage/{storage}/content/{volume} |
| Datastore.AllocateSpace | Allocate disk space during clone | POST .../clone target storage allocation |
| Datastore.AllocateTemplate | Upload cloud-init ISOs | POST /nodes/{node}/storage/{storage}/upload |
| SDN.Use | Access network bridges when cloning VMs | POST .../clone and POST .../config with net0 |
Provider Configuration
ProviderConfig
| Field | Type | Description |
|---|---|---|
| kind | string | Must be "proxmox" |
| region | string | Cluster name (for metadata only) |
| zone | string | Same as region |
| insecure_tls | bool | Skip TLS certificate verification (default: false, in options) |
| cloud_init_iso_storage | string | Storage for cloud-init ISOs (default: "local", in options) |
API URL and credentials (PROXMOX_API_URL, PROXMOX_TOKEN_ID, and PROXMOX_TOKEN_SECRET) are read from environment variables — see Authentication.
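For illustration, a provider block combining these fields might look like the following sketch (the nesting of `insecure_tls` and `cloud_init_iso_storage` under an `options` object is taken from the table notes; field placement may differ in your shard config):

```json
{
  "provider": {
    "kind": "proxmox",
    "region": "cluster-a",
    "zone": "cluster-a",
    "options": {
      "insecure_tls": true,
      "cloud_init_iso_storage": "local"
    }
  }
}
```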
Instance Template Args
Provider-specific arguments in instance templates or defaults:
| Arg | Type | Description |
|---|---|---|
| StoragePool | string | Required. Storage for VM disks (e.g. "local-lvm") |
| TemplateName | string | Template name to look up per node (mutually exclusive with TemplateVMID) |
| TemplateVMID | int | Template VMID to clone from (mutually exclusive with TemplateName) |
| Bridge | string | Network bridge (e.g. "vmbr0") |
| VLANTag | int | VLAN tag for network interface |
| Cores | int | Number of CPU cores |
| Memory | int | Memory in MB |
| DiskSize | string | Disk size (e.g. "50G") |
| StartOnBoot | bool | Start VM on Proxmox host boot |
| Pool | string | Proxmox resource pool |
Template Configuration:
Use TemplateVMID when you have either a single node cluster or shared storage and a single template accessible from all nodes. Use TemplateName when each node has its own local template with the same name but different VMIDs (Nstance will look up the correct VMID on the target node at VM creation time).
Either TemplateName or TemplateVMID must be specified (but not both). These can be set in defaults.args for a shard-wide default, or overridden per-template or per-group.
Example:
```json
{
  "defaults": {
    "args": {
      "StoragePool": "local-lvm",
      "TemplateName": "debian-13-template",
      "Bridge": "vmbr0",
      "Cores": 2,
      "Memory": 2048
    }
  },
  "templates": {
    "database": {
      "args": {
        "TemplateName": "debian-12-template",
        "Cores": 4,
        "Memory": 8192,
        "DiskSize": "100G"
      }
    }
  }
}
```

VM Lifecycle
Instance Creation
When creating a VM, Nstance:
- Selects a target node using the TOPSIS scheduling algorithm
- Clones the VM from the configured template (new VM name = instance ID)
- Configures the VM with requested resources (CPU, memory, disk)
- Sets the VM description with managed notes (metadata known as "annotations", such as group, kind, created timestamp)
- Generates and attaches a cloud-init ISO with the request userdata script
- Applies "association" metadata tags (`nstance`, `<cluster-id>`, `<shard>`)
- Starts the VM
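The flow above is roughly what you would do by hand with `qm` (illustrative VMIDs and values; Nstance performs these steps via the Proxmox API, and the exact options may differ):

```shell
# Clone template VMID 9000 to a new VM whose name is the instance ID
qm clone 9000 123 --name tst06dx9xy919t3v9kzd2xdsyzb3g --full --storage local-lvm

# Apply requested resources and the association tags
qm set 123 --cores 2 --memory 2048 --net0 virtio,bridge=vmbr0
qm resize 123 scsi0 50G
qm set 123 --tags nstance,example-cluster,dev-1

# Attach the generated cloud-init ISO, then start the VM
qm set 123 --ide2 local:iso/<instance-id>-cloudinit.iso,media=cdrom
qm start 123
```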
Instance Deletion
When deleting a VM, Nstance:
- Stops the VM gracefully (if running)
- Deletes the cloud-init ISO from storage
- Deletes the VM
Instance Status
Nstance queries VM status via the Proxmox API. Private IPs are populated at agent registration, not from the Proxmox provider.
Status Mapping:
| Proxmox Status | Nstance Status |
|---|---|
| running | Running |
| stopped | Stopped |
| paused | Suspended |
| Other | Unknown |
Instance Listing
Nstance uses the /cluster/resources?type=vm API to enumerate all VMs cluster-wide efficiently. VMs are filtered by the nstance tag.
Instance Metadata
Nstance uses a structured metadata model for Proxmox VMs:
Identifier:
- Instance ID is stored as the VM name (e.g. `tst06dx9xy919t3v9kzd2xdsyzb3g`) — this is the authoritative, globally unique identifier
- Provider instance ID is the Proxmox VMID — unique at any point in time but may be reused after a VM is deleted (Proxmox assigns VMIDs via `cluster.NextID()`, which recycles freed IDs)
Association Metadata (used for filtering and GC):
- `nstance` - Ownership tag identifying nstance-managed VMs
- `<cluster-id>` - Cluster ID as a tag (e.g. `example-cluster`)
- `<shard>` - Shard ID as a tag (e.g. `dev-1`)
Annotation Metadata (stored in VM notes, informational only): The VM description field contains a managed notes block:
```
# DO NOT EDIT BELOW - managed by nstance #
group: test
kind: tst
created: 2026-01-20T12:00:00Z
# DO NOT EDIT ABOVE - managed by nstance #
```

These annotations are never used by nstance for filtering or reconciliation - only for display purposes in the Proxmox UI/API to assist operators.
Node Scheduling
The scheduler selects the optimal node for VM placement using TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution), the same multi-criteria decision-making algorithm used by Proxmox VE’s built-in HA scheduler.
TOPSIS ranks nodes by their geometric distance to an ideal solution (best possible values) and anti-ideal solution (worst possible values). The node closest to ideal and farthest from anti-ideal wins.
Scheduling Criteria:
- Free memory (60% weight)
- Free CPU (40% weight)
Uses /cluster/resources?type=node to get all node resource usage in a single API call.
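To make the ranking concrete, here is a small self-contained TOPSIS sketch over the two criteria (node names and free-resource numbers are invented; the real scheduler reads them from `/cluster/resources`):

```shell
# Illustrative TOPSIS ranking over two benefit criteria:
# free memory (weight 0.6) and free CPU (weight 0.4).
result="$(awk '
function add(name, fm, fc) { n++; node[n]=name; mem[n]=fm; cpu[n]=fc }
BEGIN {
  # name, free memory (GB), free CPU (cores) - made-up values
  add("node1", 32, 8); add("node2", 16, 16); add("node3", 8, 4)
  wm = 0.6; wc = 0.4
  for (i = 1; i <= n; i++) { nm += mem[i]^2; nc += cpu[i]^2 }
  nm = sqrt(nm); nc = sqrt(nc)                  # vector normalisation
  for (i = 1; i <= n; i++) { m[i] = wm*mem[i]/nm; c[i] = wc*cpu[i]/nc }
  bm = cm = 0; am = ac = 1e9                    # ideal / anti-ideal values
  for (i = 1; i <= n; i++) {
    if (m[i] > bm) bm = m[i]; if (c[i] > cm) cm = c[i]
    if (m[i] < am) am = m[i]; if (c[i] < ac) ac = c[i]
  }
  best = 0; bestScore = -1
  for (i = 1; i <= n; i++) {
    dp = sqrt((bm - m[i])^2 + (cm - c[i])^2)    # distance to ideal
    dm = sqrt((m[i] - am)^2 + (c[i] - ac)^2)    # distance to anti-ideal
    score = dm / (dp + dm)                      # closeness coefficient
    printf "%s %.3f\n", node[i], score
    if (score > bestScore) { bestScore = score; best = i }
  }
  print "winner: " node[best]
}')"
echo "$result"
```

With these inputs, node1 wins: its large free-memory advantage outweighs node2's extra free CPU under the 60/40 weighting.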
Cloud-Init Integration
Nstance uses cloud-init ISOs for VM configuration. The cloud-init ISO contains:
- User-data from the instance template
- Meta-data with instance ID and hostname
Network and Load Balancer Operations
Leader network and load balancer operations are not currently implemented for the Proxmox provider:
- `AssignLeaderNetwork` - Not implemented (returns error)
- `ReleaseLeaderNetwork` - Not implemented (returns error)
- `CheckSubnetCapacity` - Always returns available
- `RegisterWithLB` - No-op
- `DeregisterFromLB` - No-op
- `ListLBInstances` - Returns empty
Deployment Topology
Recommended Architecture
Each Proxmox VE cluster requires exactly one Nstance Server shard (1 cluster = 1 shard):
```
┌─────────────────────────────────────────────────────────────────┐
│                   Proxmox Datacenter Manager                    │
│                       (Region equivalent)                       │
├─────────────────────┬─────────────────────┬─────────────────────┤
│   PVE Cluster A     │   PVE Cluster B     │   PVE Cluster C     │
│                     │                     │                     │
│  ┌─────┐ ┌─────┐    │  ┌─────┐ ┌─────┐    │  ┌─────┐ ┌─────┐    │
│  │node1│ │node2│    │  │node1│ │node2│    │  │node1│ │node2│    │
│  └─────┘ └─────┘    │  └─────┘ └─────┘    │  └─────┘ └─────┘    │
│                     │                     │                     │
│  ┌───────────────┐  │  ┌───────────────┐  │  ┌───────────────┐  │
│  │Nstance Server │  │  │Nstance Server │  │  │Nstance Server │  │
│  │   (HA CT)     │  │  │   (HA CT)     │  │  │   (HA CT)     │  │
│  └───────────────┘  │  └───────────────┘  │  └───────────────┘  │
└─────────────────────┴─────────────────────┴─────────────────────┘
                                │
                        SDN / EVPN Fabric
```

Recommendation: Run Nstance Server as a Proxmox HA container (LXC) within each cluster for automatic failover between nodes. Alternatively, run multiple containers and rely on Nstance Server leader election to ensure only one is the primary/leader.
Mapping to Cloud Concepts
| Cloud Concept | Proxmox Equivalent |
|---|---|
| Region | Proxmox Datacenter Manager deployment |
| Availability Zone | Individual PVE cluster |
| Shard | One per PVE cluster |
| VPC | SDN Zone (EVPN or Simple) |
| Subnet | SDN VNet / VLAN |
Multi-Cluster Networking
For spanning workloads across multiple Proxmox VE clusters:
- Deploy Proxmox Datacenter Manager to manage multiple clusters as a unified environment
- Configure SDN with EVPN for layer 2/3 connectivity between clusters:
- Create an SDN Zone (EVPN type for cross-cluster)
- Define VNets that span clusters
- Assign VNet to bridge on each cluster’s nodes
- Configure Nstance Server with one shard per cluster, all pointing to the same object storage backend
Example Shard Configuration:
```json
{
  "shards": [
    {
      "id": "dc1-cluster-a",
      "provider": {
        "kind": "proxmox",
        "region": "dc1",
        "zone": "cluster-a"
      }
    },
    {
      "id": "dc1-cluster-b",
      "provider": {
        "kind": "proxmox",
        "region": "dc1",
        "zone": "cluster-b"
      }
    }
  ]
}
```

Single Cluster Deployment
For simpler deployments with a single Proxmox VE cluster:
- One Nstance Server instance with one shard
- Zone scheduling distributes VMs across nodes within the cluster
- No Datacenter Manager or SDN required
Limitations
No Load Balancer Integration
Proxmox has no native load balancer. For Kubernetes ingress:
- Use MetalLB with BGP or L2 mode
- Deploy HAProxy/Traefik as a VM or on bare metal
- Use external load balancer appliance
If BGP is not an option for your deployment, you may wish to use VRRP via something like keepalived or use a load balancer like gobetween, for sending requests to a healthy node running an ingress controller like Traefik using a consistent “virtual IP”.
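As a sketch of the keepalived approach, a minimal `vrrp_instance` for such a virtual IP could look like the following (interface, router ID, and addresses are illustrative; the `server-with-keepalived.sh` script generates a configuration tailored to your VIP):

```
# /etc/keepalived/keepalived.conf (illustrative values)
vrrp_instance INGRESS_VIP {
    state BACKUP
    interface vmbr0
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        10.0.0.50/24
    }
}
```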
No Spot/Preemptible Instances
Proxmox does not have a spot instance concept. The agent’s spot termination monitoring is disabled for the proxmox provider.
VM Templates
Proxmox “templates” are roughly equivalent to AWS AMIs - they’re pre-built VM images that new instances are cloned from. Unlike AWS which has a marketplace of official AMIs, Proxmox requires you to create templates from cloud images.
| AWS | Proxmox |
|---|---|
| AMI (Amazon Machine Image) | VM Template |
| RunInstances with ImageId | Clone from template VMID |
| Official AMIs from AWS/Debian/etc. | Create from cloud images |
Template Requirements
The base VM template must have:
- cloud-init package installed and enabled
- qemu-guest-agent installed and running (recommended for Proxmox management, not required by Nstance)
- Network configured for DHCP
- SSH server enabled
Guide: Creating a Debian Template
Debian provides official cloud images with cloud-init pre-configured. This is the recommended base OS for Nstance cluster VMs.
Storage Options for Templates
There are two ways to specify a VM template in Nstance args: TemplateVMID or TemplateName.
If using TemplateVMID, note that VM templates in Proxmox VE multi-node clusters will either have a single VMID if you store them in shared storage, or different VMIDs if local to the node.
So if you don't have shared storage, you can instead use TemplateName with a consistent name across each node, and Nstance will look up the template VMID per node when provisioning a new VM.
For example, create a template named debian-13-template on each node:
- Node A: VMID 9000, name `debian-13-template` (local-lvm)
- Node B: VMID 9001, name `debian-13-template` (local-lvm)
- Node C: VMID 9002, name `debian-13-template` (local-lvm)
Then configure in your defaults or template args: "TemplateName": "debian-13-template"
Nstance queries VMs on the scheduled node and finds the matching template by name. An error is returned if zero or more than one template matches.
For single-node deployments, either option works — but note that TemplateVMID requires one fewer API call to Proxmox VE.
Template Creation
The easiest way to create a template is with the provided bootstrap script. Shell/SSH into each Proxmox node (or just one, if using shared storage) and run:
```shell
# Download and run the bootstrap script
curl -fSL -o vm-template-setup.sh https://raw.githubusercontent.com/nstance-dev/nstance/main/deploy/proxmox/vm-template-setup.sh
chmod +x vm-template-setup.sh

# Create a template with defaults (Debian 13 Trixie, local-lvm storage)
./vm-template-setup.sh

# Or customise options
# ./vm-template-setup.sh --storage ceph-pool --bridge vmbr1

# Or preview what will happen without making changes by doing a dry-run
# ./vm-template-setup.sh --dry-run
```

The script is idempotent — it skips creation if a template with the same name already exists on the node, downloads the cloud image only if not already present, and automatically selects the next available VMID starting from 9000. Run ./vm-template-setup.sh --help for all options.
Note: If bootstrapping multiple nodes simultaneously, two nodes may select the same VMID since Proxmox VMIDs are cluster-wide. The second node’s qm create will fail — simply re-run the script on that node. To avoid this, either run the script on one node at a time, or pass --min-vmid with a different starting VMID per node (e.g. --min-vmid 9000, --min-vmid 9100).
Note: We don’t resize the template disk. Nstance handles disk sizing via the DiskSize template argument when cloning VMs.
The template name (default debian-13-template) is then used in Nstance args as TemplateName.
Verifying the template:
```shell
# List templates
qm list | grep template

# Show template config (replace VMID with the one reported by the script)
qm config 9000
```

The template is now ready to be used.
Note: For other distributions (e.g. Ubuntu), use the --image-url flag to point to a different cloud image and --template-name for the name. Ensure the image includes cloud-init (qemu-guest-agent recommended).
Manual Template Creation (Advanced)
If you prefer to create the template manually, or need to customise steps beyond what the bootstrap script provides, follow these steps on each Proxmox node (or just one, if using shared storage):
Step 1: Download the Debian cloud image
```shell
# Download Debian 13 (Trixie) cloud image
cd /var/lib/vz/template/iso/
curl -fSLO https://cloud.debian.org/images/cloud/trixie/latest/debian-13-genericcloud-amd64.qcow2
```

Step 2: Create a VM and import the disk
Replace local-lvm below with your shared storage name if using shared storage.
VMID 9000 is commonly used for templates. Note that if you are not using shared storage, you must use a different VMID per node, such as 9001, 9002, and so on.
```shell
export VMID=9000

# Create a new VM
qm create $VMID --name debian-13-template --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci

# Import the cloud image as the primary disk
qm set $VMID --scsi0 local-lvm:0,import-from=/var/lib/vz/template/iso/debian-13-genericcloud-amd64.qcow2

# Add cloud-init drive
qm set $VMID --ide2 local-lvm:cloudinit

# Set boot order (Proxmox 8+ format)
qm set $VMID --boot order=scsi0

# Configure serial console (required by many cloud images)
qm set $VMID --serial0 socket --vga serial0

# Enable QEMU guest agent
qm set $VMID --agent enabled=1
```

Note: We don't resize the template disk here. Nstance handles disk sizing via the DiskSize template argument when cloning VMs.
Step 3: Configure cloud-init defaults (optional)
```shell
# Set default user (can be overridden via Nstance server config for userdata)
qm set $VMID --ciuser debian

# Set to use DHCP
qm set $VMID --ipconfig0 ip=dhcp
```

Step 4: Convert to template
```shell
# Convert the VM to a template
qm template $VMID
```

The template VMID (9000 in this example) is then used in Nstance args as TemplateVMID, OR the template name ("debian-13-template" in this example) as TemplateName.
Troubleshooting
Common Issues
- VM Creation Fails: Check storage pool has sufficient space and is accessible from target node
- No IP Address: Ensure network is configured and agent has registered with the server
- Cloud-Init Not Applied: Verify cloud-init service is enabled in the template
- Scheduling Fails: Check node status and resource availability
- VIP on multiple nodes (keepalived split-brain): VRRP multicast is blocked between nodes. Verify with `tcpdump -i <iface> vrrp` — you should see advertisements from the peer. Common causes:
  - Proxmox firewall blocking VRRP (protocol 112). Add `IN ACCEPT -source <vip-subnet> -p vrrp` to cluster firewall rules.
  - keepalived configured on the wrong interface (e.g. `vmbr0` instead of the VLAN interface where the VIP subnet lives). The `server-with-keepalived.sh` script auto-detects this from the VIP address.
Proxmox VE Debugging Commands
```shell
# Check cluster status
pvesh get /cluster/status

# List all VMs
pvesh get /cluster/resources --type vm

# Check node resources
pvesh get /nodes/<node>/status

# Get VM status
pvesh get /nodes/<node>/qemu/<vmid>/status/current
```

Further Reading
- go-proxmox Documentation - API client reference
- Proxmox VE API - Official API documentation