
Durable Streams is designed to leverage CDN infrastructure for massive horizontal scaling. A single origin server can efficiently serve millions of concurrent readers through intelligent caching and request collapsing.

Why caching matters

Without caching, every reader requires a separate connection to your origin server:

Without CDN

1,000 viewers = 1,000 origin connections
Your origin server handles all load directly, requiring expensive vertical scaling.

With CDN

1,000,000 viewers = ~10 origin connections
The CDN caches and collapses requests, serving millions from edge nodes while the origin handles minimal load.

Cache-Control headers

Durable Streams servers set appropriate Cache-Control headers based on stream content:

Catch-up reads (historical data)

GET /stream?offset=100_5678

HTTP/1.1 200 OK
Cache-Control: public, max-age=60, stale-while-revalidate=300
ETag: "stream-123:100_5678:100_9999"
Stream-Next-Offset: 100_9999

[data bytes]

  • public: Response can be cached by shared CDN caches (not just the browser cache).
  • max-age=60: Response is fresh for 60 seconds.
  • stale-while-revalidate=300: After 60 seconds, the CDN can serve stale content for up to 5 minutes while revalidating in the background.
Historical data is highly cacheable because it never changes. Once written at offset X, those bytes remain at offset X forever.
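The freshness lifecycle these directives describe can be sketched as a small decision function. This is a conceptual illustration of standard `stale-while-revalidate` cache behavior, not part of any Durable Streams library; the numbers mirror the headers above:

```typescript
// Conceptual sketch of how a cache applies max-age and
// stale-while-revalidate to a stored response.

type CacheDecision = 'fresh' | 'stale-while-revalidate' | 'must-refetch'

function classify(ageSeconds: number, maxAge: number, swr: number): CacheDecision {
  if (ageSeconds <= maxAge) return 'fresh'                        // serve from cache
  if (ageSeconds <= maxAge + swr) return 'stale-while-revalidate' // serve stale, revalidate in background
  return 'must-refetch'                                           // too old: go to origin
}

// With Cache-Control: public, max-age=60, stale-while-revalidate=300
console.log(classify(30, 60, 300))   // 'fresh'
console.log(classify(120, 60, 300))  // 'stale-while-revalidate'
console.log(classify(400, 60, 300))  // 'must-refetch'
```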

HEAD requests (metadata)

HEAD /stream

HTTP/1.1 200 OK
Cache-Control: no-store
Stream-Next-Offset: 100_9999
Stream-Closed: false
HEAD requests are not cached because the tail offset and closure status change as new data arrives.

ETag-based validation

Servers generate ETags for efficient cache validation:
GET /stream?offset=100_5678

HTTP/1.1 200 OK
ETag: "stream-123:100_5678:100_9999"
Cache-Control: public, max-age=60

[data bytes]
Clients can use If-None-Match for efficient revalidation:
GET /stream?offset=100_5678
If-None-Match: "stream-123:100_5678:100_9999"

HTTP/1.1 304 Not Modified
ETag: "stream-123:100_5678:100_9999"
304 Not Modified responses are extremely efficient: no data transfer, just a tiny header response. CDNs use this to quickly validate cached content.
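The origin's side of this exchange reduces to a string comparison. A conceptual sketch (ignoring weak validators and multi-value If-None-Match lists, which full HTTP servers must handle):

```typescript
// Conceptual origin-side ETag revalidation: if the client's If-None-Match
// matches the current ETag, the server skips the body and returns 304.

function revalidationStatus(ifNoneMatch: string | null, currentEtag: string): 200 | 304 {
  return ifNoneMatch === currentEtag ? 304 : 200
}

const etag = '"stream-123:100_5678:100_9999"'
console.log(revalidationStatus(etag, etag))  // 304: cached copy is still valid
console.log(revalidationStatus(null, etag))  // 200: full response with data
```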

ETag format

ETags encode the stream ID and offset range:
{internal_stream_id}:{start_offset}:{end_offset}
Example: stream-123:100_5678:100_9999
ETags change when streams are closed, even if no new data is appended. This ensures clients receive the closure signal instead of stale 304 Not Modified responses.
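A conceptual helper for this format (illustrative only, not the server implementation; the `:closed` suffix matches the closed-stream ETag shown later on this page):

```typescript
// Build an ETag in the {internal_stream_id}:{start_offset}:{end_offset} format,
// with an optional ':closed' suffix so closure changes the validator.

function formatEtag(streamId: string, start: string, end: string, closed = false): string {
  return `"${streamId}:${start}:${end}${closed ? ':closed' : ''}"`
}

console.log(formatEtag('stream-123', '100_5678', '100_9999'))
// '"stream-123:100_5678:100_9999"'
console.log(formatEtag('stream-123', '100_9999', '100_9999', true))
// '"stream-123:100_9999:100_9999:closed"'
```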

Request collapsing with cursors

For live modes (long-polling and SSE), the protocol uses cursors to enable CDN request collapsing.

The problem without collapsing

Without collapsing, each viewer creates a separate long-poll request to the origin:
Viewer 1: GET /stream?offset=100_9999&live=long-poll
Viewer 2: GET /stream?offset=100_9999&live=long-poll  
Viewer 3: GET /stream?offset=100_9999&live=long-poll
...
Viewer 1000: GET /stream?offset=100_9999&live=long-poll

→ 1000 concurrent connections to origin server

Request collapsing with cursors

Cursors enable CDNs to collapse multiple viewer requests into a single upstream request:
  1. Server includes cursor in response:

     HTTP/1.1 200 OK
     Stream-Cursor: abc123
     Stream-Next-Offset: 100_9999

     [data]

  2. Client echoes cursor on next request:

     GET /stream?offset=100_9999&cursor=abc123&live=long-poll

  3. CDN collapses identical requests. All viewers with the same (offset, cursor) tuple are served from a single upstream request:

     1000 viewers → CDN → 1 origin request
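To see why the (offset, cursor) tuple collapses load, consider the cache key the CDN derives from each request. This is a simplified illustration; real CDNs also key on host, method, and configured headers:

```typescript
// Simplified CDN cache key: requests for the same offset and cursor are
// byte-identical, so one upstream fetch can satisfy all of them.

function cacheKey(path: string, offset: string, cursor: string): string {
  return `${path}?cursor=${cursor}&live=long-poll&offset=${offset}`
}

// 1000 viewers polling within the same cursor interval...
const keys = new Set<string>()
for (let viewer = 0; viewer < 1000; viewer++) {
  keys.add(cacheKey('/stream', '100_9999', 'abc123'))
}
console.log(keys.size)  // 1 — a single origin request serves everyone
```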

How cursors work

Cursors are time-interval identifiers generated by the server:
// Server-side cursor generation (conceptual)
function generateCursor(): string {
  const epoch = new Date('2024-10-09T00:00:00Z').getTime()
  const interval = 20000  // 20 seconds
  const now = Date.now()
  const intervalNumber = Math.floor((now - epoch) / interval)
  return intervalNumber.toString()
}
Cursors change every ~20 seconds, which:
  1. Prevents infinite CDN cache loops (viewers seeing same empty response forever)
  2. Enables request collapsing for viewers polling at similar times
  3. Balances freshness with collapsing efficiency
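Generalizing the conceptual generator above to take a timestamp makes the collapsing property easy to check: any two requests inside the same 20-second window receive the same cursor (same epoch and interval as the snippet above):

```typescript
// Same scheme as the conceptual generator above, parameterized by timestamp.
const EPOCH = new Date('2024-10-09T00:00:00Z').getTime()
const INTERVAL_MS = 20_000  // 20 seconds

function cursorAt(timestampMs: number): string {
  return Math.floor((timestampMs - EPOCH) / INTERVAL_MS).toString()
}

const t = EPOCH + 1_000_000
console.log(cursorAt(t) === cursorAt(t + 5_000))   // true: same 20s interval
console.log(cursorAt(t) === cursorAt(t + 25_000))  // false: next interval
```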

Client behavior

import { stream } from '@durable-streams/client'

const response = await stream({
  url: 'https://streams.example.com/events',
  offset: '-1',
  live: 'long-poll'
})

// Client libraries automatically handle cursors
response.subscribeJson(async (batch) => {
  console.log('Cursor:', batch.cursor)  // Server-provided cursor
  // Client echoes this cursor on next request
})
Client libraries handle cursors automatically. You don’t need to manually track or pass cursors—they’re managed internally for optimal CDN behavior.

Query parameter ordering

For optimal cache hit rates, clients should order query parameters lexicographically:
// ✅ Good: Consistent ordering
GET /stream?cursor=abc&live=long-poll&offset=100_9999

// ❌ Suboptimal: Different ordering (different cache key)
GET /stream?offset=100_9999&live=long-poll&cursor=abc
Client libraries automatically order parameters for optimal caching.
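If you construct requests manually, `URLSearchParams.sort()` (available in browsers and Node.js) produces the lexicographic ordering the cache expects:

```typescript
// Normalize query parameter order so equivalent requests share a cache key.
function normalizeQuery(search: string): string {
  const params = new URLSearchParams(search)
  params.sort()  // sorts entries by key name, lexicographically
  return params.toString()
}

console.log(normalizeQuery('offset=100_9999&live=long-poll&cursor=abc'))
// 'cursor=abc&live=long-poll&offset=100_9999'
```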

Private vs. public caching

Public streams (shared data)

Cache-Control: public, max-age=60, stale-while-revalidate=300
Use public for streams that contain non-user-specific data:
  • Public event feeds
  • Broadcast notifications
  • Shared game state
  • System status updates

Private streams (user-specific data)

Cache-Control: private, max-age=60, stale-while-revalidate=300
Use private for streams with user-specific or confidential data:
  • Personal notifications
  • User-specific state
  • Private chat rooms
  • Account activity logs
CDNs respect Authorization headers when configured properly. Private streams can still benefit from CDN caching with per-user cache keys based on auth headers.

Stream closure and caching

Closed streams remain fully cacheable:
  1. Data chunks are cached:

     GET /stream?offset=100_5678

     HTTP/1.1 200 OK
     Cache-Control: public, max-age=60
     ETag: "stream-123:100_5678:100_9999"
     Stream-Next-Offset: 100_9999

     [final data chunk]

     This chunk is cached normally, even if it’s the last chunk before closure.

  2. Closure is discovered on the next request:

     GET /stream?offset=100_9999

     HTTP/1.1 200 OK
     Cache-Control: public, max-age=60
     ETag: "stream-123:100_9999:100_9999:closed"
     Stream-Next-Offset: 100_9999
     Stream-Closed: true
     Stream-Up-To-Date: true

     []  // Empty body

     The closure signal itself is a distinct, cacheable response.
Why this design? It ensures:
  • All data chunks remain cacheable (no retroactive invalidation)
  • Closure is a distinct, cacheable signal
  • Cached chunks don’t become “stale” when the stream closes
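A client's read loop can treat the closure signal like any other chunk. Here is a hypothetical sketch with the HTTP layer abstracted behind a `fetchChunk` callback; the real client library handles this loop for you:

```typescript
// Hypothetical read loop: follow Stream-Next-Offset until the server
// reports Stream-Closed: true.

interface Chunk {
  body: string        // response body
  nextOffset: string  // Stream-Next-Offset header
  closed: boolean     // Stream-Closed header
}

type FetchChunk = (offset: string) => Promise<Chunk>

async function readToEnd(fetchChunk: FetchChunk, offset: string): Promise<string> {
  let data = ''
  for (;;) {
    const chunk = await fetchChunk(offset)
    data += chunk.body
    if (chunk.closed) return data  // closure signal: empty, cacheable final response
    offset = chunk.nextOffset      // advance to the next offset
  }
}
```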

CDN configuration examples

Cloudflare Workers

export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url)
    
    // Proxy to origin server
    const originUrl = `https://origin.example.com${url.pathname}${url.search}`
    const response = await fetch(originUrl, {
      headers: request.headers,
      method: request.method,
    })
    
    // Cloudflare respects Cache-Control headers automatically
    return response
  }
}

Fastly VCL

sub vcl_recv {
  # Normalize query parameters for better cache hits
  set req.url = querystring.sort(req.url);
}

sub vcl_hash {
  set req.hash += req.url;
  set req.hash += req.http.host;

  # Key the cache per user for private streams
  if (req.http.Authorization) {
    set req.hash += req.http.Authorization;
  }
  return(hash);
}

sub vcl_fetch {
  # Respect origin Cache-Control
  if (beresp.http.Cache-Control ~ "public") {
    set beresp.ttl = 60s;
  }

  # Enable stale-while-revalidate
  if (beresp.http.Cache-Control ~ "stale-while-revalidate") {
    set beresp.stale_while_revalidate = 300s;
  }
}

Nginx caching proxy

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=streams:10m max_size=1g inactive=60m;

server {
    listen 80;
    server_name streams.example.com;
    
    location / {
        proxy_pass http://origin-server:4437;
        proxy_cache streams;
        proxy_cache_valid 200 60s;
        proxy_cache_valid 204 10s;
        proxy_cache_key "$request_uri$http_authorization";

        # Collapse concurrent cache misses into a single upstream request
        proxy_cache_lock on;

        # Respect Cache-Control from origin
        proxy_cache_revalidate on;
        proxy_cache_use_stale error timeout updating;
        
        add_header X-Cache-Status $upstream_cache_status;
    }
}

Performance metrics

With proper CDN caching, you can achieve:

Cache hit ratio

85-95% for popular streams
Most requests served from CDN edge without touching origin.

Origin load reduction

100x - 1000x reduction
1M viewers → 1K origin requests (with 99.9% cache hit rate).

Latency

10-50ms from CDN edge
Near-instant responses for cached content vs. 100-500ms from origin.

Bandwidth savings

90%+ reduction in origin bandwidth
CDN serves the majority of bytes.

Real-world example: Live event streaming

Scenario: 1 million viewers watching a live event stream.

Without CDN caching

1,000,000 viewers
× 1 request/second (long-poll)
= 1,000,000 requests/second to origin

Origin server requirements:
- 1M concurrent connections
- Massive compute resources
- Expensive infrastructure

With CDN caching and request collapsing

1,000,000 viewers
× cursor-based request collapsing (1000:1 ratio)
= 1,000 requests/second to origin

Origin server requirements:
- 1K concurrent connections
- Modest compute resources  
- Cost-effective infrastructure
1000x reduction in origin load through request collapsing alone. Historical data caching provides even more savings.

Monitoring cache performance

Cache headers

CDNs typically add cache status headers:
GET /stream?offset=100_5678

HTTP/1.1 200 OK
X-Cache: HIT
X-Cache-Hits: 42
Age: 15
Cache-Control: public, max-age=60
  • X-Cache: HIT: Response served from cache
  • X-Cache: MISS: Response fetched from origin
  • X-Cache-Hits: Number of times this cached response has been served
  • Age: Seconds since response was cached

Client-side monitoring

import { stream } from '@durable-streams/client'

const response = await stream({
  url: 'https://streams.example.com/events',
  offset: '-1'
})

// Check cache status
const cacheStatus = response.headers.get('x-cache')
const age = response.headers.get('age')

console.log(`Cache status: ${cacheStatus}`)  // "HIT" or "MISS"
console.log(`Cached for: ${age}s`)
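To track hit rates over time, a minimal counter over observed `x-cache` values might look like this (illustrative only; the exact header name and values vary by CDN):

```typescript
// Minimal cache hit-rate tracker over observed X-Cache header values.
class CacheStats {
  private hits = 0
  private total = 0

  record(xCache: string | null): void {
    this.total++
    if (xCache !== null && xCache.toUpperCase().includes('HIT')) this.hits++
  }

  hitRate(): number {
    return this.total === 0 ? 0 : this.hits / this.total
  }
}

const stats = new CacheStats()
for (const value of ['HIT', 'HIT', 'MISS', 'HIT']) stats.record(value)
console.log(stats.hitRate())  // 0.75
```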

Best practices

  1. Use a CDN for production

     Deploy your Durable Streams server behind a CDN (Cloudflare, Fastly, AWS CloudFront) to benefit from caching and request collapsing:

     Viewers → CDN Edge → Origin Server

  2. Enable stale-while-revalidate

     Allow CDNs to serve slightly stale content while revalidating:

     Cache-Control: public, max-age=60, stale-while-revalidate=300

     This improves cache hit rates and reduces perceived latency.

  3. Normalize query parameters

     Ensure query parameters are sorted lexicographically for consistent cache keys. Client libraries do this automatically.

  4. Monitor cache hit rates

     Track CDN cache performance:

     • Target: 85%+ cache hit rate for popular streams
     • Alert if the hit rate drops below 70%
     • Investigate cache misses (authorization issues, improper headers, etc.)

  5. Configure cache keys properly

     For private streams, ensure CDN caches are keyed by authentication:

     proxy_cache_key "$request_uri$http_authorization";

Limitations and considerations

First viewer problem: The first viewer to request a new offset triggers a cache miss and waits for the origin. Subsequent viewers benefit from the cached response.

Cache invalidation: When a stream closes, cached responses don’t invalidate automatically. Clients discover closure by requesting the next offset, which returns a cacheable closure signal.

CDN request collapsing is probabilistic: Viewers must poll at similar times (within the same 20-second cursor interval) to benefit from collapsing. Higher viewer concurrency means better collapsing ratios.

Next steps

Protocol Overview

Review the core protocol concepts

Live Modes

Learn about real-time streaming patterns

Deployment Guide

Deploy Durable Streams to production with CDN