System Design¶
Felix is a sovereign-first, low-latency distributed data backend that unifies event streaming, message queueing, and distributed caching over a single QUIC-based transport layer.
Design Principles¶
1. Sovereignty by Default¶
Each Felix cluster represents a single sovereign region. Data is isolated by default and cannot leave the region unless an explicit, configured bridge exists. This is enforced in routing, metadata, and encryption boundaries rather than left to deployment discipline.
Why this matters:
- Regulatory Compliance: Regulations such as GDPR, CCPA, and HIPAA constrain where data may be stored and how it may be transferred
- Data Sovereignty: Government and enterprise data governance
- Security: Reduced attack surface with explicit data movement
- Auditability: Complete visibility into cross-region data flow
2. One Core Log, Many Semantics¶
Internally, Felix is built around a single append-only log abstraction. Different external semantics are projections over this core:
- Streams (Pub/Sub): Fanout cursors per subscription
- Queues: Shared consumer-group cursors with acknowledgements
- Cache: Key → latest value with TTL, backed by the same log for invalidation and replay
Because all three semantics share one log, there is a single storage path to operate and fewer cross-system consistency bugs than when running Kafka, Redis, and a separate queueing system side by side.
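The projection idea can be sketched in a few lines of Python. This is an illustrative model only, not Felix's implementation: `CoreLog`, `StreamSubscription`, `ConsumerGroup`, and `CacheView` are hypothetical names, and the sketch puts all record kinds in one in-memory list for brevity.

```python
import time
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Record:
    offset: int
    key: Optional[str]
    value: Any
    expires_at: Optional[float] = None  # absolute deadline, set only when a TTL is given


class CoreLog:
    """Hypothetical append-only log; an offset is the record's list index."""

    def __init__(self):
        self.records = []

    def append(self, value, key=None, ttl=None, now=None):
        now = time.monotonic() if now is None else now
        expires_at = now + ttl if ttl is not None else None
        record = Record(len(self.records), key, value, expires_at)
        self.records.append(record)
        return record.offset


class StreamSubscription:
    """Pub/sub projection: every subscription owns a private cursor."""

    def __init__(self, log):
        self.log = log
        self.cursor = 0

    def poll(self):
        batch = self.log.records[self.cursor:]
        self.cursor += len(batch)
        return [record.value for record in batch]


class ConsumerGroup:
    """Queue projection: consumers share one cursor and acknowledge records."""

    def __init__(self, log):
        self.log = log
        self.cursor = 0
        self.unacked = {}  # offset -> record handed out but not yet acknowledged

    def take(self):
        if self.cursor >= len(self.log.records):
            return None
        record = self.log.records[self.cursor]
        self.cursor += 1
        self.unacked[record.offset] = record
        return record

    def ack(self, offset):
        self.unacked.pop(offset, None)


class CacheView:
    """Cache projection: the newest unexpired record for a key wins."""

    def __init__(self, log):
        self.log = log

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        for record in reversed(self.log.records):
            if record.key == key:
                expired = record.expires_at is not None and now >= record.expires_at
                return None if expired else record.value
        return None
```

Two `StreamSubscription` objects over the same log each see every record, while two callers of `ConsumerGroup.take()` split the records between them. The linear scan in `CacheView.get` keeps the sketch short; a real cache would maintain an index.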
3. Low-Latency First¶
Felix prioritizes predictable low latency over maximum batch throughput:
- QUIC transport: Multiplexed, encrypted, congestion-aware
- Optional ephemeral streams: No disk on hot path
- Aggressive backpressure: Bounded memory everywhere
- Leader-based writes: Tunable acknowledgement policies
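As a rough illustration of "bounded memory everywhere", the sketch below shows a fixed-capacity queue that signals backpressure instead of growing. `BoundedQueue` and `try_push` are hypothetical names chosen for this example; how Felix actually propagates backpressure to clients is not specified here.

```python
from collections import deque


class BoundedQueue:
    """Fixed-capacity queue that refuses new items instead of growing."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = deque()

    def try_push(self, item):
        """Return False (the backpressure signal) when the queue is full."""
        if len(self._items) >= self.capacity:
            return False
        self._items.append(item)
        return True

    def pop(self):
        """Remove and return the oldest item, or None when the queue is empty."""
        return self._items.popleft() if self._items else None
```

A caller that receives `False` can slow down, retry later, or drop the item. The point of the design is that memory use is capped by `capacity` regardless of producer speed.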
4. Kubernetes-Native¶
Felix assumes Kubernetes for process lifecycle, identity (ServiceAccounts), networking and service discovery, and failure detection. Felix does not attempt to reimplement scheduling or node membership logic that Kubernetes already provides.
System Architecture (Current MVP)¶
The current implementation is a single-node broker for development and testing:
flowchart TB
subgraph Clients["Client Applications"]
P1["Publisher 1"]
P2["Publisher 2"]
S1["Subscriber 1"]
S2["Subscriber 2"]
C1["Cache Client"]
end
subgraph Broker["Felix Broker (Single Node)"]
direction TB
Transport["QUIC Transport Layer<br/>felix-transport"]
Wire["Wire Protocol Handler<br/>felix-wire framing"]
Router["Stream Router<br/>Control vs Event vs Cache"]
subgraph DataPlane["Data Plane"]
PubSub["Pub/Sub Engine<br/>felix-broker"]
Cache["Cache Engine<br/>TTL + eviction"]
Storage["Ephemeral Storage<br/>In-memory"]
end
Metrics["Metrics Server<br/>:8080"]
Transport --> Wire
Wire --> Router
Router --> PubSub
Router --> Cache
PubSub --> Storage
Cache --> Storage
end
P1 & P2 --> Transport
Transport --> S1 & S2
C1 <--> Transport
PubSub -.-> Metrics
Cache -.-> Metrics
Key Components:
- Transport Layer: Accepts QUIC connections, manages stream lifecycle
- Wire Protocol: Frames messages, validates envelopes, routes by type
- Pub/Sub Engine: Enqueues publishes, manages subscriptions, fans out events
- Cache Engine: Handles put/get operations with TTL and lazy expiration
- Storage: In-memory ring buffers and hash maps (ephemeral)
- Metrics Server: Prometheus-compatible endpoint for monitoring
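To make the framing and routing steps concrete, here is a minimal sketch of length-prefixed framing with dispatch by frame type. The layout (a 1-byte type followed by a 4-byte big-endian length) and the type codes are assumptions made for this example and are not the felix-wire format.

```python
import struct

# Assumed layout: 1-byte frame type, 4-byte big-endian payload length.
HEADER = struct.Struct(">BI")
FRAME_CONTROL, FRAME_EVENT, FRAME_CACHE = 0, 1, 2


def encode_frame(frame_type, payload):
    """Prefix the payload with its type and length."""
    return HEADER.pack(frame_type, len(payload)) + payload


def decode_frames(buffer):
    """Return (frames, remainder); remainder holds an incomplete trailing frame."""
    frames = []
    pos = 0
    while len(buffer) - pos >= HEADER.size:
        frame_type, length = HEADER.unpack_from(buffer, pos)
        end = pos + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((frame_type, buffer[pos + HEADER.size:end]))
        pos = end
    return frames, buffer[pos:]


def route(frame_type):
    """Pick the engine for a frame, mirroring the control/event/cache split."""
    handlers = {FRAME_CONTROL: "control", FRAME_EVENT: "pubsub", FRAME_CACHE: "cache"}
    if frame_type not in handlers:
        raise ValueError(f"unknown frame type {frame_type}")
    return handlers[frame_type]
```

Returning the unread remainder lets the caller prepend it to the next chunk read from the stream, so frames split across reads are still decoded correctly.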
Planned Multi-Node Architecture¶
The intended multi-node design adds explicit control-plane coordination and data-plane scalability:
flowchart TB
subgraph Clients["Clients"]
C1["Producers"]
C2["Consumers"]
C3["Cache Clients"]
end
LB["Load Balancer<br/>(L4 for QUIC)"]
Clients --> LB
subgraph ControlPlane["Control Plane (Raft)"]
direction LR
CONTROLPLANE1["controlplane-0"]
CONTROLPLANE2["controlplane-1"]
CONTROLPLANE3["controlplane-2"]
CONTROLPLANE1 <--> CONTROLPLANE2
CONTROLPLANE2 <--> CONTROLPLANE3
CONTROLPLANE1 <--> CONTROLPLANE3
Meta["Metadata Store<br/>• Topics/Streams<br/>• Tenants/Namespaces<br/>• Shard Placement<br/>• ACLs/Quotas"]
end
subgraph DataPlane["Data Plane (Brokers)"]
direction LR
B1["Broker A<br/>Shards 0-99"]
B2["Broker B<br/>Shards 100-199"]
B3["Broker C<br/>Shards 200-299"]
end
subgraph Storage["Storage Layer"]
direction LR
Ephemeral["Ephemeral<br/>(in-memory)"]
Durable["Durable Log<br/>(persistent volumes)"]
Snapshots["Snapshots<br/>(object storage)"]
end
LB --> DataPlane
ControlPlane --> Meta
DataPlane <--> ControlPlane
DataPlane --> Storage
Control Plane Responsibilities¶
- Metadata Management: Topics, tenants, namespaces, ACLs
- Shard Placement: Assign shards to broker nodes
- Health Monitoring: Track broker liveness and readiness
- Configuration: Cluster-wide retention, limits, feature flags
- Rebalancing: Migrate shards on node failures or scaling events
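Shard placement and failure-driven rebalancing might look roughly like the following. The functions `assign_shards` and `rebalance_on_failure` are hypothetical; the contiguous-range policy simply mirrors the shard ranges shown in the diagram above, and the actual placement algorithm is not defined in this document.

```python
def assign_shards(num_shards, brokers):
    """Split shard IDs into contiguous ranges, one range per broker."""
    if not brokers:
        raise ValueError("at least one broker is required")
    placement = {}
    base, extra = divmod(num_shards, len(brokers))
    start = 0
    for index, broker in enumerate(brokers):
        size = base + (1 if index < extra else 0)
        for shard in range(start, start + size):
            placement[shard] = broker
        start += size
    return placement


def rebalance_on_failure(placement, failed, survivors):
    """Move only the failed broker's shards, round-robin across survivors."""
    if not survivors:
        raise ValueError("no surviving brokers")
    updated = dict(placement)
    orphaned = sorted(s for s, b in placement.items() if b == failed)
    for index, shard in enumerate(orphaned):
        updated[shard] = survivors[index % len(survivors)]
    return updated
```

With 300 shards and brokers A, B, and C, `assign_shards` yields the 0-99, 100-199, and 200-299 ranges from the diagram. Moving only the orphaned shards keeps the migration cost proportional to the failed broker's share.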
Data Plane Responsibilities¶
- Client Connections: Accept and route QUIC streams
- Data Operations: Publish, subscribe, cache operations
- Shard Ownership: Host assigned shards (leaders and followers)
- Replication: (Future) Replicate log entries to followers
- Backpressure: Enforce flow control and isolation
Data Flow Patterns¶
Publish/Subscribe Flow¶
sequenceDiagram
participant P as Publisher
participant B as Broker
participant S1 as Subscriber 1
participant S2 as Subscriber 2
P->>B: Open control stream (QUIC bi)
S1->>B: Open control stream (QUIC bi)
S2->>B: Open control stream (QUIC bi)
S1->>B: Subscribe(tenant, namespace, stream)
B-->>S1: OK
B->>S1: Open event stream (QUIC uni)
S2->>B: Subscribe(tenant, namespace, stream)
B-->>S2: OK
B->>S2: Open event stream (QUIC uni)
loop Publishing
P->>B: Publish(batch of messages)
B-->>P: ACK (optional)
B->>B: Enqueue for fanout
par Fanout to subscribers
B->>S1: Event batch
B->>S2: Event batch
end
end
Key characteristics:
- Publishers use bidirectional control streams for publish requests
- Subscribers get dedicated unidirectional event streams
- Fanout happens independently per subscriber (isolation)
- Batching is time- and count-bounded for throughput optimization
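A batch bounded by both time and count can be sketched as follows. `Batcher`, `max_count`, and `max_delay_ms` are hypothetical names for this example, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
class Batcher:
    """Collects messages and flushes on a count limit or an age limit."""

    def __init__(self, max_count, max_delay_ms):
        self.max_count = max_count
        self.max_delay_ms = max_delay_ms
        self._pending = []
        self._opened_at_ms = None  # arrival time of the first pending message

    def add(self, message, now_ms):
        """Add a message; return the batch if the count bound is reached."""
        if not self._pending:
            self._opened_at_ms = now_ms
        self._pending.append(message)
        if len(self._pending) >= self.max_count:
            return self._flush()
        return None

    def poll(self, now_ms):
        """Return the pending batch once it has waited max_delay_ms."""
        if self._pending and now_ms - self._opened_at_ms >= self.max_delay_ms:
            return self._flush()
        return None

    def _flush(self):
        batch, self._pending = self._pending, []
        self._opened_at_ms = None
        return batch
```

The count bound caps batch size under load, while the time bound caps how long a lone message waits when traffic is light. A broker loop would call `poll` on a timer between arrivals.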
Cache Flow¶
sequenceDiagram
participant C as Client
participant B as Broker
C->>B: Open cache stream pool (N connections)
Note over C,B: M stream workers per connection
par Concurrent requests
C->>B: cache_put(tenant, namespace, cache, key1, value1, ttl)
C->>B: cache_get(tenant, namespace, cache, key2)
C->>B: cache_put(tenant, namespace, cache, key3, value3, ttl)
end
par Concurrent responses
B-->>C: OK (key1)
B-->>C: cache_value(key2, null)
B-->>C: OK (key3)
end
Key characteristics:
- Connection pooling reduces handshake overhead
- Request multiplexing over long-lived streams
- Request IDs for request/response matching
- Sub-millisecond latency at moderate concurrency
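Request/response matching by request ID can be illustrated with a small tracker on the client side. `RequestTracker` is a hypothetical helper, not part of any Felix client; it uses the standard library's `concurrent.futures.Future` as a placeholder for whatever completion mechanism a real client uses.

```python
import itertools
from concurrent.futures import Future


class RequestTracker:
    """Matches out-of-order responses to requests by request ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # request ID -> Future awaiting the response

    def register(self):
        """Allocate a request ID and a Future that will hold its response."""
        request_id = next(self._ids)
        future = Future()
        self._pending[request_id] = future
        return request_id, future

    def complete(self, request_id, response):
        """Resolve the matching request; return False for unknown IDs."""
        future = self._pending.pop(request_id, None)
        if future is None:
            return False
        future.set_result(response)
        return True
```

Because each response carries its request ID, the broker is free to answer in any order, which is what allows many requests to share one long-lived stream. This sketch is not thread-safe; a concurrent client would guard `_pending` with a lock.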
Cross-Broker Routing (Planned)¶
When a client connects to a broker that doesn't own the target shard:
sequenceDiagram
participant C as Client
participant B1 as Broker (ingress)
participant CONTROLPLANE as Control Plane
participant B2 as Broker (shard owner)
C->>B1: Publish(topic, batch)
B1->>CONTROLPLANE: Lookup shard placement(topic)
CONTROLPLANE-->>B1: owner = B2
B1->>B2: Forward publish (internal QUIC)
B2->>B2: Commit to log
B2-->>B1: ACK
B1-->>C: ACK
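The forwarding sequence above could be modelled as follows. Everything here is an assumption made for illustration: the `ControlPlane` and `Broker` classes, the CRC32-based `shard_for` mapping, and the in-process method call standing in for the internal QUIC hop.

```python
import zlib

NUM_SHARDS = 300  # assumed; matches the shard ranges in the planned diagram


def shard_for(topic):
    """Map a topic to a shard with a stable hash (assumed scheme)."""
    return zlib.crc32(topic.encode("utf-8")) % NUM_SHARDS


class ControlPlane:
    def __init__(self, placement):
        self.placement = placement  # shard ID -> broker name

    def owner_of(self, topic):
        return self.placement[shard_for(topic)]


class Broker:
    def __init__(self, name, control_plane, peers):
        self.name = name
        self.control_plane = control_plane
        self.peers = peers  # shared registry: broker name -> Broker
        self.log = []
        peers[name] = self

    def publish(self, topic, batch):
        """Commit locally if this broker owns the shard, otherwise forward."""
        owner = self.control_plane.owner_of(topic)
        if owner == self.name:
            self.log.extend((topic, message) for message in batch)
            return "ACK"
        # Stand-in for the internal QUIC forward to the shard owner.
        return self.peers[owner].publish(topic, batch)
```

The ingress broker relays the owner's ACK to the client unchanged, so the client cannot tell whether its publish was forwarded. A real broker would likely cache placement lookups rather than ask the control plane on every publish.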
Storage Architecture¶
Ephemeral (Current MVP)¶
- In-memory only: No disk writes on hot path
- Bounded buffers: Ring buffers with fixed capacity
- TTL support: Lazy expiration on access
- No persistence: Data lost on restart
Use cases:
- Ultra-low latency workloads
- Development and testing
- Temporary caching
- Non-critical event streams
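The two ephemeral structures named above, a fixed-capacity ring buffer and a map with lazy expiration, can be sketched like this. `RingBuffer` and `LazyTtlMap` are hypothetical names, and the overwrite-oldest policy is an assumption about how the bounded buffers behave when full.

```python
from collections import deque


class RingBuffer:
    """Fixed-capacity buffer; the oldest entry is dropped when full."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)
        self.next_offset = 0  # offset the next append will receive

    def append(self, item):
        self._items.append(item)
        self.next_offset += 1

    @property
    def first_offset(self):
        """Offset of the oldest entry still held in the buffer."""
        return self.next_offset - len(self._items)

    def read_from(self, offset):
        """Return (items, skipped): entries from offset on, plus the overwritten count."""
        start = max(offset, self.first_offset)
        skipped = start - offset
        return list(self._items)[start - self.first_offset:], skipped


class LazyTtlMap:
    """Hash map whose entries are expired only when they are next read."""

    def __init__(self):
        self._data = {}

    def put(self, key, value, ttl, now):
        self._data[key] = (value, now + ttl)

    def get(self, key, now):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if now >= deadline:
            del self._data[key]  # lazy expiration: evict on access
            return None
        return value

    def __len__(self):
        return len(self._data)
```

A slow reader of the ring buffer learns from `skipped` how many entries it missed, which is consistent with best-effort delivery. Lazy expiration avoids a background sweeper, at the cost of expired entries occupying memory until they are read again.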
Durable (Planned)¶
- Write-Ahead Log (WAL): Append-only log segments
- Segmented storage: Rotate segments based on time/size
- Retention policies: Time-based and size-based limits
- Snapshots: Periodic state snapshots for faster recovery
Use cases:
- Production event streaming
- Critical message delivery
- Long-term event retention
- Replay and audit trails
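Since the durable layer is still planned, the following is only a sketch of how size-based segment rotation and retention could interact. `SegmentedLog` is a hypothetical in-memory stand-in: it writes nothing to disk and covers only size-based rotation and a segment-count retention limit, not time-based policies or snapshots.

```python
class SegmentedLog:
    """In-memory stand-in for a segmented write-ahead log."""

    def __init__(self, max_segment_bytes, max_segments):
        self.max_segment_bytes = max_segment_bytes
        self.max_segments = max_segments
        self.segments = [[]]     # the last segment is the active one
        self.base_offsets = [0]  # first record offset of each segment
        self.next_offset = 0
        self._active_bytes = 0

    def append(self, record):
        """Append a bytes record, rotating first if it would overflow the segment."""
        overflow = self._active_bytes + len(record) > self.max_segment_bytes
        if self._active_bytes and overflow:
            self._rotate()
        self.segments[-1].append(record)
        self._active_bytes += len(record)
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def _rotate(self):
        """Open a new active segment and apply retention to the old ones."""
        self.segments.append([])
        self.base_offsets.append(self.next_offset)
        self._active_bytes = 0
        # Retention drops whole segments, oldest first.
        while len(self.segments) > self.max_segments:
            self.segments.pop(0)
            self.base_offsets.pop(0)
```

Dropping whole segments makes retention cheap, because no individual record has to be rewritten. Offsets keep increasing after old segments are removed, so `base_offsets[0]` tells a reader the earliest offset still available for replay.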
Consistency Model¶
Single-Node (MVP)¶
- Delivery: At-most-once (best-effort)
- Ordering: Per-stream ordering preserved per subscriber
- Durability: None (ephemeral only)
Multi-Node (Planned)¶
Tunable per stream:
- Leader-only acknowledgements: Lowest latency, leader commits before replicating
- Quorum acknowledgements: Higher durability, waits for majority replica confirmation
- Asynchronous replication: Background replication after ACK
- Synchronous replication: Blocks on replication before ACK
Delivery guarantees:
- At-least-once: With durable storage and replay on failure
- At-most-once: Best-effort with no retries
- Exactly-once: (Future roadmap) via idempotent producers and transactions
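The difference between leader-only and quorum acknowledgements comes down to how many replicas must confirm a write before the ACK is sent. The sketch below assumes the leader counts as one of the replicas; `AckPolicy`, `acks_required`, and `can_ack` are hypothetical names.

```python
from enum import Enum


class AckPolicy(Enum):
    LEADER_ONLY = "leader_only"
    QUORUM = "quorum"


def acks_required(policy, replica_count):
    """Replicas (leader included) that must confirm a write before the ACK."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    if policy is AckPolicy.LEADER_ONLY:
        return 1
    if policy is AckPolicy.QUORUM:
        return replica_count // 2 + 1  # strict majority
    raise ValueError(f"unsupported policy: {policy}")


def can_ack(policy, replica_count, confirmed):
    """True once enough replicas have confirmed the write."""
    return confirmed >= acks_required(policy, replica_count)
```

With three replicas, a quorum write needs two confirmations, so it survives the loss of any single replica. A leader-only write is acknowledged after one confirmation and can be lost if the leader fails before replicating.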
Multi-Region Architecture (Planned)¶
Felix enforces regional isolation with explicit bridges:
flowchart LR
subgraph Region1["Region: US-EAST"]
B1["Brokers<br/>US-EAST"]
CONTROLPLANE1["Control Plane<br/>US-EAST"]
end
subgraph Region2["Region: EU-WEST"]
B2["Brokers<br/>EU-WEST"]
CONTROLPLANE2["Control Plane<br/>EU-WEST"]
end
subgraph Bridge["Explicit Bridge"]
BridgeAgent["Bridge Agent<br/>• Allowlist<br/>• Encryption<br/>• Audit Log"]
end
B1 <-->|"Explicit config only"| BridgeAgent
BridgeAgent <-->|"Explicit config only"| B2
style Region1 fill:#e1f5ff
style Region2 fill:#fff4e1
style Bridge fill:#ffe1e1
Bridge characteristics:
- Explicit Configuration: No implicit data movement
- Stream Allowlist: Only specified streams replicate
- Independent Encryption: Per-region key contexts
- Audit Trail: Complete log of cross-region data movement
- Compliance: Satisfies strict data sovereignty requirements
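The allowlist and audit behaviour of a bridge can be sketched as a single decision function. The `Bridge` class below is hypothetical and models one direction only; the stream names are invented for the example, and per-region encryption is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Bridge:
    source_region: str
    target_region: str
    allowlist: frozenset  # stream names permitted to cross this bridge
    audit_log: list = field(default_factory=list)

    def forward(self, stream, source, target):
        """Return True if the stream may cross; every attempt is audited."""
        allowed = (
            source == self.source_region
            and target == self.target_region
            and stream in self.allowlist
        )
        self.audit_log.append(
            {"stream": stream, "from": source, "to": target, "allowed": allowed}
        )
        return allowed
```

Denied attempts are logged as well as permitted ones, so the audit trail also shows what was blocked. With no `Bridge` configured for a pair of regions, nothing can call `forward`, which corresponds to the default of no cross-region movement.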
Scalability Considerations¶
Vertical Scaling (Single-Node)¶
- CPU: More cores for parallel stream processing
- Memory: Larger buffers and cache capacity
- Network: Higher bandwidth for fanout
- Typical: 10k-50k msg/s on modern hardware
Horizontal Scaling (Multi-Node)¶
- Sharding: Partition streams across brokers
- Connection pooling: Reuse connections across shards
- Control plane: Raft quorum for metadata (3-5 nodes)
- Data plane: Many broker nodes for capacity
- Target: 100k-1M+ msg/s per cluster
Next Steps¶
- Components Deep Dive - Detailed component architecture
- Wire Protocol - Protocol specification
- Semantics - Delivery and consistency guarantees
- Performance Tuning - Optimize for your workload