CDN-level speed with Memcached and GeoDNS
CDNs solve a simple problem: put content close to users. A request from Tokyo should not travel to Frankfurt to fetch a product image or a JSON payload that has not changed in hours. CDNs cache content at edge locations and serve it locally.
But CDNs have constraints. They work best with static assets — images, scripts, stylesheets. The moment you need dynamic data, computed results, or personalised content, a CDN either cannot help or requires complex invalidation logic that often causes more problems than it solves.
On Bahriya, you can achieve CDN-level latency for dynamic data by combining multi-region Memcached with GeoDNS. Your application runs in multiple regions, each with its own Memcached pool. GeoDNS routes users to the nearest region. Once a region's cache is warm, requests there are answered from the local pool with sub-millisecond cache lookups, wherever the user is.
The architecture
Here is the setup:
- Your container runs in multiple regions (e.g., Helsinki, Virginia, Singapore)
- A Memcached pool runs in the same regions as your container
- GeoDNS routes each user to the nearest region
- Your application checks the local Memcached pool before hitting the origin database
The key insight is that each region has its own Memcached pool. There is no cross-region cache replication — each region warms its own cache independently. This sounds like a downside, but it is actually an advantage:
- No replication lag between regions
- No consistency problems from stale cross-region cache entries
- Each region's cache reflects the access patterns of its own users
- A cache failure in one region does not affect other regions
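The independence of the regional pools can be illustrated with a small simulation. This is a sketch only: `RegionCache` is a dict-backed stand-in invented for illustration, not a Memcached client, and `load_from_origin` is a hypothetical placeholder for a database query.

```python
class RegionCache:
    """In-memory stand-in for one region's Memcached pool (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.origin_reads = 0

    def fetch(self, key, load_from_origin):
        # Check the local pool first and fall back to the origin on a miss.
        if key in self.store:
            return self.store[key]
        self.origin_reads += 1
        value = load_from_origin(key)
        self.store[key] = value
        return value


def load_from_origin(key):
    # Hypothetical origin lookup; a real application would query its database.
    return f"value-for-{key}"


helsinki = RegionCache("helsinki-1")
singapore = RegionCache("singapore-1")

# Two requests in Helsinki: one miss, then one hit.
helsinki.fetch("catalog:7", load_from_origin)
helsinki.fetch("catalog:7", load_from_origin)

# Singapore has its own pool, so its first request is still a miss.
singapore.fetch("catalog:7", load_from_origin)

# Losing Helsinki's cache leaves Singapore's entries untouched.
helsinki.store.clear()
```

Each region pays for its own first miss, and clearing one region's store has no effect on the other.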
Setting it up
Step 1: Deploy Memcached to all regions
kind: memcached
project: my-app
handle: content-cache
name: Content Cache
regions:
  - falkenstein-1
  - helsinki-1
  - virginia-1
  - singapore-1
tuning:
  memory: 512
topology:
  nodes: 2
Step 2: Deploy your container to the same regions
kind: container
project: my-app
handle: api
name: API
image: ghcr.io/myorg/api:latest
regions:
  - falkenstein-1
  - helsinki-1
  - virginia-1
  - singapore-1
workload:
  cpu: "500"
  memory: "512"
  port: "8080"
  healthcheck: /healthz
env:
  CACHE_HOST: content-cache
  CACHE_PORT: "11211"
  CACHE_TTL: "300"
scaling:
  replicas: "2"
  max_replicas: "8"
  autoscalingtargetcpu: "70"
Step 3: Enable GeoDNS
Set the DNS mode to geo on your container. European users resolve to Helsinki or Falkenstein, North American users to Virginia, Asian users to Singapore.
Each user hits the nearest region, and that region's container checks its local Memcached pool. Cache miss? The container queries the database, stores the result in the local cache, and returns. The next user in that region gets a cache hit.
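To get a feel for the routing decision, here is a toy model that picks the closest region by great-circle distance. It is only an illustration: real GeoDNS services generally rely on IP geolocation data rather than exact coordinates, the region coordinates below are rough assumptions rather than values published by Bahriya, and `nearest_region` is a hypothetical helper.

```python
import math

# Approximate (latitude, longitude) per region. Illustrative assumptions only.
REGIONS = {
    "falkenstein-1": (50.48, 12.37),
    "helsinki-1": (60.17, 24.94),
    "virginia-1": (39.04, -77.49),
    "singapore-1": (1.35, 103.82),
}


def great_circle_km(a, b):
    """Haversine distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def nearest_region(user_location):
    """Return the name of the region closest to the user's (lat, lon)."""
    return min(REGIONS, key=lambda name: great_circle_km(user_location, REGIONS[name]))


tokyo = (35.68, 139.69)
berlin = (52.52, 13.40)
new_york = (40.71, -74.01)
```

Under these assumed coordinates, a user in Tokyo maps to `singapore-1`, Berlin to `falkenstein-1`, and New York to `virginia-1`.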
Application-level caching pattern
Here is a typical implementation. The key is that your application does not need to know about regions — it just talks to content-cache:11211, and Bahriya's project-private networking routes it to the local pool.
import json

from pymemcache.client.base import Client

# "content-cache" resolves to the Memcached pool in the container's own region
cache = Client(("content-cache", 11211))

def get_product_catalog(category_id):
    cache_key = f"catalog:{category_id}"
    cached = cache.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    # Cache miss: query the database (db is the application's existing database handle)
    catalog = db.query(
        "SELECT * FROM products WHERE category_id = %s AND active = 1",
        [category_id]
    )
    # Store in local cache for 5 minutes
    cache.set(cache_key, json.dumps(catalog), expire=300)
    return catalog
Every region runs this same code. The first request for a key in each region is a cache miss; until the entry expires, every later request for the same data in that region is a local cache hit, with no network round-trip to a distant database or origin server.
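The container definition in Step 2 sets `CACHE_HOST`, `CACHE_PORT`, and `CACHE_TTL`, so the application could read its cache settings from the environment instead of hard-coding them. A minimal sketch, assuming those variables arrive as strings; `cache_settings` is a hypothetical helper and the fallback values simply mirror the ones used in this article.

```python
import os


def cache_settings(env=os.environ):
    """Return (host, port, ttl) from the environment, with this article's defaults."""
    host = env.get("CACHE_HOST", "content-cache")
    port = int(env.get("CACHE_PORT", "11211"))
    ttl = int(env.get("CACHE_TTL", "300"))
    return host, port, ttl


# Passing a plain dict makes the helper easy to exercise without touching os.environ.
host, port, ttl = cache_settings({"CACHE_HOST": "content-cache", "CACHE_TTL": "30"})
```

The resulting `(host, port)` pair can be passed to pymemcache's `Client`, and `ttl` to the `expire` argument of `set`.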
Cache warming strategies
The independent-cache-per-region model means each region starts cold after a deployment or a cache restart. For high-traffic applications, the cache warms naturally within seconds as real user requests come in. For lower-traffic regions, you can warm the cache proactively.
Warm on deploy
Add a startup script that pre-populates critical cache entries:
def warm_cache():
    """Pre-populate the most frequently accessed cache entries."""
    popular_categories = db.query("SELECT id FROM categories ORDER BY views DESC LIMIT 50")
    for cat in popular_categories:
        get_product_catalog(cat['id'])  # This populates the cache
Warm via background task
Run a periodic background job that refreshes cache entries before they expire:
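import time

def refresh_expiring_cache():
    """Refresh cache entries that are about to expire."""
    # get_tracked_cache_keys() and regenerate_cache_entry() are application helpers.
    # A plain get does not return a key's remaining TTL, so this assumes each entry
    # is written with a companion "<key>:expires_at" key holding its Unix expiry time.
    keys = get_tracked_cache_keys()
    for key in keys:
        expires_at = cache.get(f"{key}:expires_at")
        if expires_at and int(expires_at) - time.time() < 60:  # Less than 60 seconds remaining
            regenerate_cache_entry(key)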
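A plain Memcached `get` returns only the value, not how long the entry has left, so a refresh job needs some way to know when entries expire. One possible approach is to write a companion key holding the absolute expiry time alongside each entry. The sketch below assumes that convention; `set_with_expiry`, `seconds_remaining`, and the `:expires_at` suffix are hypothetical names, and `FakeCache` is a dict-backed stand-in so the example runs without a Memcached server.

```python
import time


class FakeCache:
    """Dict-backed stand-in exposing the get/set subset of a Memcached client."""

    def __init__(self):
        self.data = {}

    def set(self, key, value, expire=0):
        # The stand-in ignores expiry; a real pool would evict after `expire` seconds.
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


def set_with_expiry(cache, key, value, ttl, now=None):
    """Store a value plus a companion key holding its absolute expiry time."""
    now = time.time() if now is None else now
    cache.set(key, value, expire=ttl)
    cache.set(f"{key}:expires_at", str(int(now + ttl)), expire=ttl)


def seconds_remaining(cache, key, now=None):
    """Return seconds until expiry, or None if the entry is not tracked."""
    now = time.time() if now is None else now
    expires_at = cache.get(f"{key}:expires_at")
    if expires_at is None:
        return None
    return int(expires_at) - int(now)


cache = FakeCache()
set_with_expiry(cache, "catalog:7", "[]", ttl=300, now=1_000)
remaining = seconds_remaining(cache, "catalog:7", now=1_250)
```

Passing `now` explicitly keeps the helpers deterministic in tests; in production code both would default to the current time.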
When this approach beats a CDN
This pattern works better than a traditional CDN when:
- Your data is dynamic — product prices, inventory levels, search results, API responses that change frequently
- Your data is computed — aggregated analytics, recommendation engine output, filtered/sorted query results
- Your invalidation is complex — when a product is updated, you need to invalidate every cache entry that references it
- You need freshness measured in seconds: a 5-minute CDN TTL is too stale for your use case, but a 30-second Memcached TTL is fine because the cache is co-located with your application
The advantage over a CDN is control. You control the cache keys, the TTLs, the invalidation logic, and the warming strategy. You are not working around a black-box CDN configuration — you are using a standard Memcached client in your application code.
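As one example of that control, the "invalidate every entry that references a product" problem can be handled with versioned cache keys: every key for a product embeds a version number, and bumping the version makes all older keys unreachable in one step. This is a sketch of that general technique, not a Bahriya feature. The key layout and the helpers `product_key` and `invalidate_product` are assumptions, and `FakeCache` stands in for a real client.

```python
class FakeCache:
    """Dict-backed stand-in exposing the get/set subset of a Memcached client."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, expire=0):
        self.data[key] = value


def product_version(cache, product_id):
    """Return the product's current key version, starting at 1."""
    version = cache.get(f"product:{product_id}:version")
    if version is None:
        cache.set(f"product:{product_id}:version", "1")
        return 1
    return int(version)


def product_key(cache, product_id, view):
    """Build a cache key that embeds the product's current version."""
    return f"product:{product_id}:v{product_version(cache, product_id)}:{view}"


def invalidate_product(cache, product_id):
    """Bump the version so every key built for the old version is orphaned."""
    next_version = product_version(cache, product_id) + 1
    cache.set(f"product:{product_id}:version", str(next_version))


cache = FakeCache()
old_key = product_key(cache, 42, "detail")
cache.set(old_key, '{"price": 10}', expire=300)

invalidate_product(cache, 42)
new_key = product_key(cache, 42, "detail")
```

Orphaned entries are never read again and simply age out through their TTL. Because each region keeps its own pool, the version bump would have to run in every region where the product is cached.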
Cost comparison
CDNs typically charge per request and per gigabyte transferred. For API responses and dynamic data, those costs scale linearly with traffic and can be hard to predict.
With Memcached on Bahriya, you pay a per-minute rate for the memory you provision, with a 60-second minimum. A 512 MB, 2-node pool in 4 regions gives you 4 GB of total cache capacity (1 GB in each region) at a predictable cost. Whether you serve 1,000 or 1,000,000 cache hits per hour, the cost is the same.
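The capacity figure follows from simple arithmetic, worked through below on the assumption that `memory: 512` in the pool definition means 512 MB per node. No prices are included because the per-minute rate is not stated here.

```python
memory_per_node_mb = 512  # assumed to be the per-node value from the pool definition
nodes_per_region = 2
regions = ["falkenstein-1", "helsinki-1", "virginia-1", "singapore-1"]

# Capacity available to the application in any single region.
per_region_mb = memory_per_node_mb * nodes_per_region

# Capacity provisioned (and billed) across all regions.
total_mb = per_region_mb * len(regions)
total_gb = total_mb / 1024
```

Since the regional pools are independent, the working set that has to fit in memory is bounded by the per-region figure, while the total is what you provision across all four regions.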
Summary
Multi-region Memcached with GeoDNS gives you CDN-level latency for dynamic data. Each region has its own cache, warmed by local traffic, accessed over project-private networking with sub-millisecond round-trips. No replication lag, no CDN invalidation headaches, no per-request costs — just fast, local caching at every edge your application runs on.