Kubernetes Deployment¶
This guide covers deploying Felix on Kubernetes with high availability, persistence, and production-grade configurations.
Overview¶
Felix is designed as a Kubernetes-native system. This guide demonstrates:
- StatefulSet deployment for stable network identity
- Headless Services for direct broker addressing
- Persistent storage with StatefulSet volumes
- Resource management and limits
- Health checks and readiness probes
- Horizontal scaling for high availability
Kubernetes Version
Felix requires Kubernetes 1.24 or later. Tested on 1.27+.
Prerequisites¶
- Kubernetes cluster: 1.24+ (Minikube, kind, GKE, EKS, AKS, etc.)
- kubectl: Configured to access your cluster
- Storage provisioner: For persistent volumes (e.g.,
gp3on AWS,pd-ssdon GCP) - 4GB RAM per broker pod: Minimum recommended
Verify Cluster Access¶
Quick Start¶
Basic Deployment¶
Deploy a single Felix broker:
# Create namespace
kubectl create namespace felix
# Apply manifests
kubectl apply -f deploy/kubernetes/broker.yaml -n felix
# Check status
kubectl get pods -n felix
kubectl logs -f deployment/felix-broker -n felix
Minimal broker deployment (deploy/kubernetes/broker.yaml):
apiVersion: v1
kind: Namespace
metadata:
name: felix
---
apiVersion: v1
kind: Service
metadata:
name: felix-broker
namespace: felix
labels:
app: felix-broker
spec:
type: ClusterIP
ports:
- port: 5000
targetPort: 5000
protocol: UDP
name: quic
- port: 8080
targetPort: 8080
protocol: TCP
name: metrics
selector:
app: felix-broker
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: felix-broker
namespace: felix
spec:
replicas: 1
selector:
matchLabels:
app: felix-broker
template:
metadata:
labels:
app: felix-broker
spec:
containers:
- name: broker
image: felix/broker:latest
imagePullPolicy: IfNotPresent
ports:
- containerPort: 5000
protocol: UDP
name: quic
- containerPort: 8080
protocol: TCP
name: metrics
env:
- name: FELIX_QUIC_BIND
value: "0.0.0.0:5000"
- name: FELIX_BROKER_METRICS_BIND
value: "0.0.0.0:8080"
- name: RUST_LOG
value: "info"
resources:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 10
periodSeconds: 10
readinessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
StatefulSet Deployment¶
For production, use StatefulSets for stable network identity and persistent storage:
apiVersion: v1
kind: Service
metadata:
name: felix-broker-headless
namespace: felix
labels:
app: felix-broker
spec:
clusterIP: None
ports:
- port: 5000
protocol: UDP
name: quic
- port: 8080
protocol: TCP
name: metrics
selector:
app: felix-broker
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: felix-broker
namespace: felix
spec:
serviceName: felix-broker-headless
replicas: 3
selector:
matchLabels:
app: felix-broker
template:
metadata:
labels:
app: felix-broker
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- felix-broker
topologyKey: kubernetes.io/hostname
containers:
- name: broker
image: felix/broker:latest
imagePullPolicy: IfNotPresent
ports:
- containerPort: 5000
protocol: UDP
name: quic
- containerPort: 8080
protocol: TCP
name: metrics
env:
- name: FELIX_QUIC_BIND
value: "0.0.0.0:5000"
- name: FELIX_BROKER_METRICS_BIND
value: "0.0.0.0:8080"
- name: FELIX_EVENT_BATCH_MAX_EVENTS
value: "64"
- name: FELIX_EVENT_BATCH_MAX_DELAY_US
value: "250"
- name: FELIX_CACHE_CONN_POOL
value: "8"
- name: FELIX_CACHE_STREAMS_PER_CONN
value: "4"
- name: FELIX_DISABLE_TIMINGS
value: "false"
- name: RUST_LOG
value: "info"
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
resources:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
volumeMounts:
- name: data
mountPath: /data
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 15
periodSeconds: 10
timeoutSeconds: 2
failureThreshold: 3
readinessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 2
failureThreshold: 3
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 15"]
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 50Gi
Deploy:
kubectl apply -f statefulset.yaml
kubectl get statefulset -n felix
kubectl get pods -n felix -l app=felix-broker
Configuration Management¶
ConfigMap for Settings¶
Externalize configuration:
apiVersion: v1
kind: ConfigMap
metadata:
name: felix-broker-config
namespace: felix
data:
broker.yml: |
quic_bind: "0.0.0.0:5000"
metrics_bind: "0.0.0.0:8080"
event_batch_max_events: 64
event_batch_max_delay_us: 250
event_batch_max_bytes: 65536
fanout_batch_size: 64
cache_conn_recv_window: 268435456
cache_stream_recv_window: 67108864
cache_send_window: 268435456
pub_workers_per_conn: 4
pub_queue_depth: 1024
subscriber_queue_capacity: 128
subscriber_writer_lanes: 4
subscriber_lane_queue_depth: 8192
max_subscriber_writer_lanes: 8
subscriber_lane_shard: auto
disable_timings: false
---
# Mount in StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: felix-broker
namespace: felix
spec:
template:
spec:
containers:
- name: broker
env:
- name: FELIX_BROKER_CONFIG
value: /etc/felix/broker.yml
volumeMounts:
- name: config
mountPath: /etc/felix
readOnly: true
volumes:
- name: config
configMap:
name: felix-broker-config
Secrets for Sensitive Data¶
Store credentials securely:
apiVersion: v1
kind: Secret
metadata:
name: felix-broker-secrets
namespace: felix
type: Opaque
stringData:
tls-cert.pem: |
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
tls-key.pem: |
-----BEGIN PRIVATE KEY-----
...
-----END PRIVATE KEY-----
---
# Mount in StatefulSet
volumeMounts:
- name: tls-certs
mountPath: /etc/felix/tls
readOnly: true
volumes:
- name: tls-certs
secret:
secretName: felix-broker-secrets
Storage Configuration¶
StorageClass for High Performance¶
Create optimized storage class:
# AWS EBS gp3 with provisioned IOPS
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: felix-fast-ssd
provisioner: ebs.csi.aws.com
parameters:
type: gp3
iops: "3000"
throughput: "125"
fsType: ext4
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
# GCP persistent disk SSD
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: felix-fast-ssd
provisioner: pd.csi.storage.gke.io
parameters:
type: pd-ssd
replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
PersistentVolumeClaim Templates¶
Define storage requirements in StatefulSet:
volumeClaimTemplates:
- metadata:
name: data
labels:
app: felix-broker
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: felix-fast-ssd
resources:
requests:
storage: 100Gi
Networking¶
Service for External Access¶
Expose broker externally:
# LoadBalancer (cloud providers)
apiVersion: v1
kind: Service
metadata:
name: felix-broker-external
namespace: felix
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
type: LoadBalancer
ports:
- port: 5000
targetPort: 5000
protocol: UDP
name: quic
selector:
app: felix-broker
---
# NodePort (bare metal)
apiVersion: v1
kind: Service
metadata:
name: felix-broker-nodeport
namespace: felix
spec:
type: NodePort
ports:
- port: 5000
targetPort: 5000
nodePort: 30500
protocol: UDP
name: quic
selector:
app: felix-broker
Ingress for HTTP Metrics¶
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: felix-metrics
namespace: felix
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
ingressClassName: nginx
rules:
- host: felix-metrics.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: felix-broker
port:
number: 8080
High Availability Setup¶
Multi-Zone Deployment¶
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: felix-broker
namespace: felix
spec:
replicas: 3
template:
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- felix-broker
topologyKey: topology.kubernetes.io/zone
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
app: felix-broker
Pod Disruption Budget¶
Protect against voluntary disruptions:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: felix-broker-pdb
namespace: felix
spec:
minAvailable: 2
selector:
matchLabels:
app: felix-broker
HorizontalPodAutoscaler¶
Scale based on metrics:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: felix-broker-hpa
namespace: felix
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: StatefulSet
name: felix-broker
minReplicas: 3
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
Monitoring and Observability¶
ServiceMonitor for Prometheus¶
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: felix-broker
namespace: felix
labels:
app: felix-broker
spec:
selector:
matchLabels:
app: felix-broker
endpoints:
- port: metrics
interval: 15s
path: /metrics
Grafana Dashboard ConfigMap¶
apiVersion: v1
kind: ConfigMap
metadata:
name: felix-dashboard
namespace: monitoring
labels:
grafana_dashboard: "1"
data:
felix-overview.json: |
{
"dashboard": {
"title": "Felix Broker Overview",
"panels": [...]
}
}
Resource Management¶
Quality of Service Classes¶
Guaranteed QoS (production):
Burstable QoS (development):
Resource Quotas¶
Limit namespace resources:
apiVersion: v1
kind: ResourceQuota
metadata:
name: felix-quota
namespace: felix
spec:
hard:
requests.cpu: "20"
requests.memory: "40Gi"
limits.cpu: "40"
limits.memory: "80Gi"
persistentvolumeclaims: "10"
LimitRange¶
Set default resource limits:
apiVersion: v1
kind: LimitRange
metadata:
name: felix-limits
namespace: felix
spec:
limits:
- type: Container
default:
cpu: "2000m"
memory: "4Gi"
defaultRequest:
cpu: "1000m"
memory: "2Gi"
max:
cpu: "4000m"
memory: "8Gi"
min:
cpu: "500m"
memory: "1Gi"
Security¶
Network Policies¶
Restrict network access:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: felix-broker-netpol
namespace: felix
spec:
podSelector:
matchLabels:
app: felix-broker
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: felix-clients
ports:
- protocol: UDP
port: 5000
- from:
- namespaceSelector:
matchLabels:
name: monitoring
ports:
- protocol: TCP
port: 8080
egress:
- to:
- namespaceSelector:
matchLabels:
name: felix-controlplane
ports:
- protocol: TCP
port: 8443
- to:
- podSelector: {}
Pod Security Standards¶
apiVersion: v1
kind: Namespace
metadata:
name: felix
labels:
pod-security.kubernetes.io/enforce: baseline
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
Security Context¶
spec:
securityContext:
runAsNonRoot: true
runAsUser: 10001
fsGroup: 10001
seccompProfile:
type: RuntimeDefault
containers:
- name: broker
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
Operations¶
Scaling¶
# Scale StatefulSet
kubectl scale statefulset felix-broker --replicas=5 -n felix
# Check scaling progress
kubectl rollout status statefulset/felix-broker -n felix
Rolling Updates¶
# Update image
kubectl set image statefulset/felix-broker \
broker=felix/broker:v0.2.0 -n felix
# Watch rollout
kubectl rollout status statefulset/felix-broker -n felix
# Rollback if needed
kubectl rollout undo statefulset/felix-broker -n felix
Debugging¶
# View logs
kubectl logs -f felix-broker-0 -n felix
# Shell into pod
kubectl exec -it felix-broker-0 -n felix -- /bin/sh
# Port forward for local access
kubectl port-forward felix-broker-0 5000:5000 8080:8080 -n felix
# Check events
kubectl get events -n felix --sort-by='.lastTimestamp'
# Describe pod
kubectl describe pod felix-broker-0 -n felix
Backup and Recovery¶
# Backup PVC data
kubectl exec felix-broker-0 -n felix -- tar czf - /data > backup.tar.gz
# List PVCs
kubectl get pvc -n felix
# Restore from backup
cat backup.tar.gz | kubectl exec -i felix-broker-0 -n felix -- tar xzf - -C /
Complete Production Example¶
Full production-ready manifest:
# Clone repository
git clone https://github.com/gabloe/felix.git
cd felix/deploy/kubernetes
# Review and customize
$EDITOR production/values.yaml
# Apply with kustomize
kubectl apply -k production/
# Or use Helm (when available)
helm install felix ./charts/felix -n felix --create-namespace
Troubleshooting¶
Pod Stuck in Pending¶
kubectl describe pod felix-broker-0 -n felix
# Check for: insufficient resources, PVC binding, node affinity
CrashLoopBackOff¶
kubectl logs felix-broker-0 -n felix --previous
# Check for: config errors, port conflicts, resource limits
Service Not Reachable¶
# Test from within cluster
kubectl run -it --rm debug --image=busybox --restart=Never -n felix -- sh
nc -zvu felix-broker-headless 5000
High Memory Usage¶
# Check metrics
kubectl top pods -n felix
# Adjust resource limits
kubectl set resources statefulset felix-broker \
--limits=memory=8Gi -n felix
Next Steps¶
- Monitor deployment: Observability Guide
- Tune performance: Performance Guide
- Configure fully: Configuration Reference
- Secure deployment: Security Guide