System Design

Felix is a sovereign-first, low-latency distributed data backend that unifies event streaming, message queueing, and distributed caching over a single QUIC-based transport layer.

Design Principles

1. Sovereignty by Default

Each Felix cluster represents a single sovereign region. Data is isolated by default and cannot leave the region unless an explicit, configured bridge exists. This is enforced at the routing, metadata, and encryption boundaries rather than left to deployment discipline.

Why this matters:

  • Regulatory Compliance: Regulations such as GDPR, CCPA, and HIPAA constrain where data may be stored and how it may be transferred
  • Data Sovereignty: Government and enterprise data governance
  • Security: Reduced attack surface with explicit data movement
  • Auditability: Complete visibility into cross-region data flow

2. One Core Log, Many Semantics

Internally, Felix is built around a single append-only log abstraction. Different external semantics are projections over this core:

  • Streams (Pub/Sub): Fanout cursors per subscription
  • Queues: Shared consumer-group cursors with acknowledgements
  • Cache: Key → latest value with TTL, backed by the same log for invalidation and replay

This is intended to reduce operational complexity and the consistency bugs that arise when running Kafka, Redis, and a separate queueing system side by side.
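
To make the "one log, many projections" idea concrete, here is a minimal sketch in Python. It is not Felix's implementation: the class and function names (`Log`, `StreamCursor`, `QueueGroup`, `cache_view`) are hypothetical, and TTL handling and redelivery of unacknowledged entries are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    """Append-only log; an entry's offset is its list index."""
    entries: list = field(default_factory=list)

    def append(self, key, value):
        self.entries.append((key, value))
        return len(self.entries) - 1  # offset of the new entry


class StreamCursor:
    """Pub/sub projection: each subscription owns a private cursor."""

    def __init__(self, log):
        self.log, self.pos = log, 0

    def poll(self):
        batch = self.log.entries[self.pos:]
        self.pos = len(self.log.entries)
        return batch


class QueueGroup:
    """Queue projection: one cursor shared by a consumer group, with acks."""

    def __init__(self, log):
        self.log, self.next, self.unacked = log, 0, set()

    def take(self):
        if self.next >= len(self.log.entries):
            return None  # nothing left to hand out
        offset = self.next
        self.next += 1
        self.unacked.add(offset)
        return offset, self.log.entries[offset]

    def ack(self, offset):
        self.unacked.discard(offset)


def cache_view(log):
    """Cache projection: replay the log, keeping the latest value per key."""
    latest = {}
    for key, value in log.entries:
        latest[key] = value
    return latest
```

Two `StreamCursor` instances over the same log each see every entry, while consumers sharing one `QueueGroup` split the entries between them. The cache view can be rebuilt at any time by replaying the log.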

3. Low-Latency First

Felix prioritizes predictable low latency over maximum batch throughput:

  • QUIC transport: Multiplexed, encrypted, congestion-aware
  • Optional ephemeral streams: No disk on hot path
  • Aggressive backpressure: Bounded memory everywhere
  • Leader-based writes: Tunable acknowledgement policies

4. Kubernetes-Native

Felix assumes Kubernetes for process lifecycle, identity (ServiceAccounts), networking and service discovery, and failure detection. Felix does not attempt to reimplement scheduling or node membership logic that Kubernetes already provides.

System Architecture (Current MVP)

The current implementation is a single-node broker for development and testing:

flowchart TB
    subgraph Clients["Client Applications"]
        P1["Publisher 1"]
        P2["Publisher 2"]
        S1["Subscriber 1"]
        S2["Subscriber 2"]
        C1["Cache Client"]
    end

    subgraph Broker["Felix Broker (Single Node)"]
        direction TB
        Transport["QUIC Transport Layer<br/>felix-transport"]
        Wire["Wire Protocol Handler<br/>felix-wire framing"]
        Router["Stream Router<br/>Control vs Event vs Cache"]

        subgraph DataPlane["Data Plane"]
            PubSub["Pub/Sub Engine<br/>felix-broker"]
            Cache["Cache Engine<br/>TTL + eviction"]
            Storage["Ephemeral Storage<br/>In-memory"]
        end

        Metrics["Metrics Server<br/>:8080"]

        Transport --> Wire
        Wire --> Router
        Router --> PubSub
        Router --> Cache
        PubSub --> Storage
        Cache --> Storage
    end

    P1 & P2 --> Transport
    Transport --> S1 & S2
    C1 <--> Transport

    PubSub -.-> Metrics
    Cache -.-> Metrics

Key Components:

  • Transport Layer: Accepts QUIC connections, manages stream lifecycle
  • Wire Protocol: Frames messages, validates envelopes, routes by type
  • Pub/Sub Engine: Enqueues publishes, manages subscriptions, fans out events
  • Cache Engine: Handles put/get operations with TTL and lazy expiration
  • Storage: In-memory ring buffers and hash maps (ephemeral)
  • Metrics Server: Prometheus-compatible endpoint for monitoring
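
The framing and routing steps can be sketched as follows. The frame layout used here (a 1-byte type followed by a 4-byte big-endian payload length), the type constants, and the size cap are all assumptions made for illustration; they are not the actual felix-wire format.

```python
import struct

# Hypothetical header: 1-byte frame type + 4-byte big-endian payload length.
HEADER = struct.Struct("!BI")
FRAME_CONTROL, FRAME_EVENT, FRAME_CACHE = 1, 2, 3
MAX_PAYLOAD = 1 << 20  # illustrative cap so a bad length cannot exhaust memory

ROUTES = {FRAME_CONTROL: "control", FRAME_EVENT: "pubsub", FRAME_CACHE: "cache"}


def encode_frame(frame_type, payload):
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large")
    return HEADER.pack(frame_type, len(payload)) + payload


def decode_frames(buffer):
    """Return (frames, remainder): all complete frames plus leftover bytes."""
    frames, offset = [], 0
    while len(buffer) - offset >= HEADER.size:
        frame_type, length = HEADER.unpack_from(buffer, offset)
        if length > MAX_PAYLOAD:
            raise ValueError("declared length exceeds limit")
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # partial frame: wait for more bytes
        frames.append((frame_type, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


def route(frame_type):
    """Pick the engine for a frame; unknown types are rejected."""
    try:
        return ROUTES[frame_type]
    except KeyError:
        raise ValueError(f"unknown frame type {frame_type}") from None
```

Validating the declared length before buffering the payload keeps memory bounded even when a peer sends a malformed header.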

Planned Multi-Node Architecture

The intended multi-node design adds explicit control-plane coordination and data-plane scalability:

flowchart TB
    subgraph Clients["Clients"]
        C1["Producers"]
        C2["Consumers"]
        C3["Cache Clients"]
    end

    LB["Load Balancer<br/>(L4 for QUIC)"]

    Clients --> LB

    subgraph ControlPlane["Control Plane (Raft)"]
        direction LR
        CONTROLPLANE1["controlplane-0"]
        CONTROLPLANE2["controlplane-1"]
        CONTROLPLANE3["controlplane-2"]

        CONTROLPLANE1 <--> CONTROLPLANE2
        CONTROLPLANE2 <--> CONTROLPLANE3
        CONTROLPLANE1 <--> CONTROLPLANE3

        Meta["Metadata Store<br/>• Topics/Streams<br/>• Tenants/Namespaces<br/>• Shard Placement<br/>• ACLs/Quotas"]
    end

    subgraph DataPlane["Data Plane (Brokers)"]
        direction LR
        B1["Broker A<br/>Shards 0-99"]
        B2["Broker B<br/>Shards 100-199"]
        B3["Broker C<br/>Shards 200-299"]
    end

    subgraph Storage["Storage Layer"]
        direction LR
        Ephemeral["Ephemeral<br/>(in-memory)"]
        Durable["Durable Log<br/>(persistent volumes)"]
        Snapshots["Snapshots<br/>(object storage)"]
    end

    LB --> DataPlane
    ControlPlane --> Meta
    DataPlane <--> ControlPlane
    DataPlane --> Storage

Control Plane Responsibilities

  • Metadata Management: Topics, tenants, namespaces, ACLs
  • Shard Placement: Assign shards to broker nodes
  • Health Monitoring: Track broker liveness and readiness
  • Configuration: Cluster-wide retention, limits, feature flags
  • Rebalancing: Migrate shards on node failures or scaling events

Data Plane Responsibilities

  • Client Connections: Accept and route QUIC streams
  • Data Operations: Publish, subscribe, cache operations
  • Shard Ownership: Host assigned shards (leaders and followers)
  • Replication: (Future) Replicate log entries to followers
  • Backpressure: Enforce flow control and isolation

Data Flow Patterns

Publish/Subscribe Flow

sequenceDiagram
    participant P as Publisher
    participant B as Broker
    participant S1 as Subscriber 1
    participant S2 as Subscriber 2

    P->>B: Open control stream (QUIC bi)
    S1->>B: Open control stream (QUIC bi)
    S2->>B: Open control stream (QUIC bi)

    S1->>B: Subscribe(tenant, namespace, stream)
    B-->>S1: OK
    B->>S1: Open event stream (QUIC uni)

    S2->>B: Subscribe(tenant, namespace, stream)
    B-->>S2: OK
    B->>S2: Open event stream (QUIC uni)

    loop Publishing
        P->>B: Publish(batch of messages)
        B-->>P: ACK (optional)
        B->>B: Enqueue for fanout
        par Fanout to subscribers
            B->>S1: Event batch
            B->>S2: Event batch
        end
    end

Key characteristics:

  • Publishers use bidirectional control streams for publish requests
  • Subscribers get dedicated unidirectional event streams
  • Fanout happens independently per subscriber (isolation)
  • Batching is time- and count-bounded: a batch is flushed when it reaches a message limit or a maximum wait time, trading a small delay for throughput
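
A batcher bounded by both time and count could look like the sketch below. The `Batcher` class, its parameters, and the polling approach are hypothetical; a real broker would more likely drive the time bound from a timer than from explicit `poll()` calls.

```python
import time


class Batcher:
    """Flush when max_count items are buffered or max_delay seconds have passed."""

    def __init__(self, max_count, max_delay, clock=time.monotonic):
        self.max_count, self.max_delay, self.clock = max_count, max_delay, clock
        self.items, self.first_at = [], None

    def add(self, item):
        """Buffer an item; return a full batch if the count bound is reached."""
        if not self.items:
            self.first_at = self.clock()  # the time bound starts at the first item
        self.items.append(item)
        if len(self.items) >= self.max_count:
            return self._flush()
        return None

    def poll(self):
        """Return a batch if the oldest buffered item has waited long enough."""
        if self.items and self.clock() - self.first_at >= self.max_delay:
            return self._flush()
        return None

    def _flush(self):
        batch, self.items, self.first_at = self.items, [], None
        return batch
```

Giving each subscriber its own `Batcher` keeps a slow consumer from delaying batches destined for the others, which matches the per-subscriber isolation described above.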

Cache Flow

sequenceDiagram
    participant C as Client
    participant B as Broker

    C->>B: Open cache stream pool (N connections)
    Note over C,B: M stream workers per connection

    par Concurrent requests
        C->>B: cache_put(tenant, namespace, cache, key1, value1, ttl)
        C->>B: cache_get(tenant, namespace, cache, key2)
        C->>B: cache_put(tenant, namespace, cache, key3, value3, ttl)
    end

    par Concurrent responses
        B-->>C: OK (key1)
        B-->>C: cache_value(key2, null)
        B-->>C: OK (key3)
    end

Key characteristics:

  • Connection pooling reduces handshake overhead
  • Request multiplexing over long-lived streams
  • Request IDs for request/response matching
  • Sub-millisecond latency at moderate concurrency
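
Request/response matching by ID can be sketched as below. The message shape (a dict with `id`, `op`, and `value` fields) and the `PendingRequests` class are assumptions for illustration, not the Felix client API.

```python
import itertools


class PendingRequests:
    """Match responses to requests by ID on a multiplexed stream."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def send(self, op, **fields):
        """Assign an ID, remember the request, and return the message to write."""
        request_id = next(self._ids)
        self._pending[request_id] = op
        return {"id": request_id, "op": op, **fields}

    def on_response(self, response):
        """Resolve a response; raises KeyError for unknown or duplicate IDs."""
        op = self._pending.pop(response["id"])
        return op, response.get("value")

    def in_flight(self):
        return len(self._pending)
```

Because each response carries the ID of its request, the broker is free to answer out of order, and the client can keep many requests in flight on one long-lived stream.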

Cross-Broker Routing (Planned)

When a client connects to a broker that doesn't own the target shard:

sequenceDiagram
    participant C as Client
    participant B1 as Broker (ingress)
    participant CONTROLPLANE as Control Plane
    participant B2 as Broker (shard owner)

    C->>B1: Publish(topic, batch)
    B1->>CONTROLPLANE: Lookup shard placement(topic)
    CONTROLPLANE-->>B1: owner = B2
    B1->>B2: Forward publish (internal QUIC)
    B2->>B2: Commit to log
    B2-->>B1: ACK
    B1-->>C: ACK
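
The ingress decision in this flow can be sketched as follows. Everything concrete here is an assumption: the CRC32-based topic-to-shard hash, the broker names, and the static placement table, which mirrors the shard ranges in the planned multi-node diagram. In the planned design the placement would come from the control plane rather than a constant.

```python
import zlib

NUM_SHARDS = 300  # illustrative; matches the 0-299 range in the multi-node diagram

# Hypothetical placement table; the control plane would own this mapping.
PLACEMENT = [
    (range(0, 100), "broker-a"),
    (range(100, 200), "broker-b"),
    (range(200, 300), "broker-c"),
]


def shard_for(topic):
    """Map a topic to a shard with a hash that is stable across processes."""
    return zlib.crc32(topic.encode("utf-8")) % NUM_SHARDS


def owner_of(shard):
    for shards, broker in PLACEMENT:
        if shard in shards:
            return broker
    raise LookupError(f"no owner for shard {shard}")


def handle_publish(local_broker, topic):
    """Commit locally if this broker owns the shard, otherwise forward."""
    owner = owner_of(shard_for(topic))
    action = "commit" if owner == local_broker else "forward"
    return action, owner
```

A stable hash such as CRC32 is used instead of Python's built-in `hash()`, which is randomized per process for strings and would make brokers disagree about shard assignment.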

Storage Architecture

Ephemeral (Current MVP)

  • In-memory only: No disk writes on hot path
  • Bounded buffers: Ring buffers with fixed capacity
  • TTL support: Lazy expiration on access
  • No persistence: Data lost on restart

Use cases:

  • Ultra-low latency workloads
  • Development and testing
  • Temporary caching
  • Non-critical event streams
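
The two ephemeral structures described above can be sketched in a few lines. This is an illustration only: the drop-oldest overflow policy and the `RingBuffer` and `TtlCache` names are assumptions, and the actual broker may handle overflow differently.

```python
import time
from collections import deque


class RingBuffer:
    """Fixed-capacity buffer that overwrites the oldest entry when full."""

    def __init__(self, capacity):
        self._buf = deque(maxlen=capacity)
        self.dropped = 0

    def push(self, item):
        if len(self._buf) == self._buf.maxlen:
            self.dropped += 1  # the oldest entry is about to be overwritten
        self._buf.append(item)

    def snapshot(self):
        return list(self._buf)


class TtlCache:
    """Key/value store whose entries are expired lazily, on access."""

    def __init__(self, clock=time.monotonic):
        self._data, self._clock = {}, clock

    def put(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazy expiration: removed only when read
            return None
        return value
```

Lazy expiration avoids a background sweeper on the hot path, at the cost that expired keys which are never read again stay in memory until something evicts them.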

Durable (Planned)

  • Write-Ahead Log (WAL): Append-only log segments
  • Segmented storage: Rotate segments based on time/size
  • Retention policies: Time-based and size-based limits
  • Snapshots: Periodic state snapshots for faster recovery

Use cases:

  • Production event streaming
  • Critical message delivery
  • Long-term event retention
  • Replay and audit trails

Consistency Model

Single-Node (MVP)

  • Delivery: At-most-once (best-effort)
  • Ordering: Per-stream ordering preserved per subscriber
  • Durability: None (ephemeral only)

Multi-Node (Planned)

Tunable per stream:

  • Leader-only acknowledgements: Lowest latency; the leader acknowledges after its local commit, before replication completes
  • Quorum acknowledgements: Higher durability, waits for majority replica confirmation
  • Asynchronous replication: Background replication after ACK
  • Synchronous replication: Blocks on replication before ACK
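
The difference between leader-only and quorum acknowledgement comes down to how many replicas must confirm an entry before the publisher receives its ACK. A minimal sketch, in which the policy names `"leader"` and `"quorum"` and the function signatures are hypothetical:

```python
def quorum(replicas):
    """Smallest majority of a replica set, e.g. 2 of 3 or 3 of 5."""
    return replicas // 2 + 1


def can_ack(policy, replicas, confirmed):
    """Decide whether the leader may acknowledge a publish.

    `confirmed` counts replicas, leader included, that hold the entry.
    """
    if policy == "leader":
        return confirmed >= 1
    if policy == "quorum":
        return confirmed >= quorum(replicas)
    raise ValueError(f"unknown acknowledgement policy: {policy}")
```

With three replicas, a leader-only stream acknowledges as soon as the leader has the entry, while a quorum stream waits for one follower as well. An acknowledged entry on the quorum stream therefore survives the loss of any single replica.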

Delivery guarantees:

  • At-least-once: With durable storage and replay on failure
  • At-most-once: Best-effort with no retries
  • Exactly-once: (Future roadmap) via idempotent producers and transactions

Multi-Region Architecture (Planned)

Felix enforces regional isolation with explicit bridges:

flowchart LR
    subgraph Region1["Region: US-EAST"]
        B1["Brokers<br/>US-EAST"]
        CONTROLPLANE1["Control Plane<br/>US-EAST"]
    end

    subgraph Region2["Region: EU-WEST"]
        B2["Brokers<br/>EU-WEST"]
        CONTROLPLANE2["Control Plane<br/>EU-WEST"]
    end

    subgraph Bridge["Explicit Bridge"]
        BridgeAgent["Bridge Agent<br/>• Allowlist<br/>• Encryption<br/>• Audit Log"]
    end

    B1 <-->|"Explicit config only"| BridgeAgent
    BridgeAgent <-->|"Explicit config only"| B2

    style Region1 fill:#e1f5ff
    style Region2 fill:#fff4e1
    style Bridge fill:#ffe1e1

Bridge characteristics:

  • Explicit Configuration: No implicit data movement
  • Stream Allowlist: Only specified streams replicate
  • Independent Encryption: Per-region key contexts
  • Audit Trail: Complete log of cross-region data movement
  • Compliance: Intended to support strict data sovereignty requirements
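
The allowlist and audit behaviour can be sketched as a deny-by-default check. The `Bridge` class, its fields, and the one-directional configuration are assumptions for illustration. A bidirectional bridge like the one in the diagram could be modelled as two such objects.

```python
from dataclasses import dataclass, field


@dataclass
class Bridge:
    """Hypothetical one-way bridge between two regions."""
    source: str
    target: str
    allowlist: frozenset
    audit_log: list = field(default_factory=list)

    def forward(self, stream, source, target):
        """Allow only configured region pairs and allowlisted streams.

        Every attempt is recorded, whether or not it is allowed.
        """
        allowed = (
            (source, target) == (self.source, self.target)
            and stream in self.allowlist
        )
        self.audit_log.append(
            {"stream": stream, "from": source, "to": target, "allowed": allowed}
        )
        return allowed
```

Recording denied attempts alongside allowed ones gives auditors a complete picture of what tried to cross the region boundary, not only what succeeded.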

Scalability Considerations

Vertical Scaling (Single-Node)

  • CPU: More cores for parallel stream processing
  • Memory: Larger buffers and cache capacity
  • Network: Higher bandwidth for fanout
  • Typical: 10k-50k msg/s on modern hardware

Horizontal Scaling (Multi-Node)

  • Sharding: Partition streams across brokers
  • Connection pooling: Reuse connections across shards
  • Control plane: Raft quorum for metadata (3-5 nodes)
  • Data plane: Many broker nodes for capacity
  • Target: 100k-1M+ msg/s per cluster

Next Steps