Managed proxy caching is now live on Bahriya
Most HTTP traffic is repetitive. The same product page, the same catalogue listing, the same public API response — fetched thousands of times, served from your container thousands of times. Each one costs CPU. Each one adds a few milliseconds of tail latency. Each scraper that finds your endpoint costs you compute budget you wanted to spend on real users.
The standard fix is well known: put a cache in front of your origin. The problem has always been everything around that sentence. You stand up a cache cluster. You wire ingress to check it before forwarding. You manage failover, sizing, eviction, and key rules. You debug stale responses on a Sunday night. The cache idea took an afternoon. The infrastructure to run it cost months.
Starting today, Bahriya runs that layer for you. Flip proxy caching on for any HTTP container and Bahriya stands up a three-node managed cache cluster in every region the container runs in. Repeat requests are served from the cache in milliseconds and never touch your container. Misses go through, the response is stored, and the next caller benefits.
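The hit-and-miss flow described above is an ordinary read-through cache. Here is a minimal in-process sketch, for intuition only: `ReadThroughCache`, the `origin` callable, and the `"HIT"` / `"MISS"` labels are illustrative names, not Bahriya's implementation, and a plain dictionary stands in for the managed cluster.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Toy read-through cache: serve fresh entries, otherwise ask the origin."""

    def __init__(self, origin: Callable[[str], str], ttl_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._origin = origin      # stands in for your container
        self._ttl_s = ttl_s        # 300 s matches the default cache TTL
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Tuple[str, str]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl_s:
            return entry[1], "HIT"           # the origin is never called
        body = self._origin(key)             # miss: go through to the origin
        self._entries[key] = (now, body)     # store it for the next caller
        return body, "MISS"
```

Calling `get` twice with the same key invokes the origin once; the second call is answered from the stored entry.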
How it works
Open your HTTP container in the console, head to the Proxy Cache section, and turn it on. You pick:
- Cache size per region (256 MB to 8 GB, in 256 MB steps)
- Cache TTL — how long a response is fresh (default 300 s)
- Storage TTL — how long a stale entry sticks around as a fallback (optional)
- Methods, status codes, and content types the cache will store
- Where the cache sits relative to rate limiting — before (saves your rate-limit budget on bursty reads) or after (per-user fairness)
Save, deploy, and the cache is in front of your container on the next deployment cycle. No application changes. No image rebuild.
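To make the size and TTL options concrete, here is a small Python sketch that validates a settings object and classifies an entry by age. `ProxyCacheSettings` and `classify` are hypothetical names, not a Bahriya API. The sketch also assumes the storage TTL is measured from the moment an entry is stored, so it must be at least as long as the cache TTL; the console may define it differently.

```python
from dataclasses import dataclass
from typing import Optional

MIN_SIZE_MB = 256
MAX_SIZE_MB = 8 * 1024
STEP_MB = 256


@dataclass(frozen=True)
class ProxyCacheSettings:
    """Hypothetical mirror of the console options described above."""
    size_mb: int
    cache_ttl_s: int = 300                 # default from the console
    storage_ttl_s: Optional[int] = None    # optional stale-fallback window

    def __post_init__(self):
        if not MIN_SIZE_MB <= self.size_mb <= MAX_SIZE_MB:
            raise ValueError("size must be between 256 MB and 8 GB")
        if self.size_mb % STEP_MB != 0:
            raise ValueError("size must be a multiple of 256 MB")
        if self.cache_ttl_s <= 0:
            raise ValueError("cache TTL must be positive")
        # Assumption: storage TTL counts from store time, so it cannot be shorter.
        if self.storage_ttl_s is not None and self.storage_ttl_s < self.cache_ttl_s:
            raise ValueError("storage TTL must be >= cache TTL")


def classify(settings: ProxyCacheSettings, age_s: float) -> str:
    """Return 'fresh', 'stale' (kept only as a fallback), or 'expired'."""
    if age_s < settings.cache_ttl_s:
        return "fresh"
    if settings.storage_ttl_s is not None and age_s < settings.storage_ttl_s:
        return "stale"
    return "expired"
```

With the default 300 s cache TTL and no storage TTL, an entry goes straight from fresh to expired; setting a storage TTL adds the stale window in between.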
What you actually get
- Three-node cache cluster per region. If one node fails the others keep serving. If the whole cache is unavailable, traffic falls back to your container transparently — cache loss never causes request failures.
- Standard Cache-Control honoured by default. Your container can override the platform TTL per response with headers it already emits, and you can disable the override if you want platform settings to win unconditionally.
- Configurable cache key. Defaults to method + path + query. Extend with specific headers (Accept-Language, a tenant header), an explicit query-param allow-list, or JSON body fields for POST-style search endpoints.
- Standard or Premium tier — pick the latency / throughput profile you need.
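The cache-key and Cache-Control bullets above can be pictured with a short Python sketch. Everything here is an assumption for illustration: `build_cache_key` and `effective_ttl` are hypothetical helpers, the key is hashed with SHA-256, query parameters are sorted before hashing, and only the `no-store`, `private`, `s-maxage`, and `max-age` directives are handled. The platform's actual key format and header handling may differ.

```python
import hashlib
import json
from urllib.parse import parse_qsl, urlsplit


def build_cache_key(method, url, headers=None, vary_headers=(),
                    query_allow_list=None, body=None, body_fields=()):
    """Derive a key from method + path + query, optionally extended."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    if query_allow_list is not None:
        # Keep only the parameters that are allowed to affect the key.
        params = [(k, v) for k, v in params if k in query_allow_list]
    params.sort()  # assumption: parameter order should not change the key
    lowered = {k.lower(): v for k, v in (headers or {}).items()}
    header_part = [(h.lower(), lowered.get(h.lower(), "")) for h in vary_headers]
    body_part = {}
    if body and body_fields:
        doc = json.loads(body)  # assumes the body is a JSON object
        body_part = {f: doc.get(f) for f in body_fields}
    material = json.dumps(
        [method.upper(), parts.path, params, header_part, body_part],
        sort_keys=True,
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def effective_ttl(cache_control, platform_ttl_s=300, honour_origin=True):
    """Pick the TTL: origin header if honoured and usable, else the platform TTL."""
    if not honour_origin or not cache_control:
        return platform_ttl_s
    directives = {}
    for raw in cache_control.split(","):
        name, _, value = raw.strip().partition("=")
        directives[name.lower()] = value.strip('"')
    if "no-store" in directives or "private" in directives:
        return 0  # treated here as "do not cache"
    for name in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        value = directives.get(name, "")
        if value.isdigit():
            return int(value)
    return platform_ttl_s
```

With a query allow-list of `{"a"}`, `/items?a=1&utm=x` and `/items?a=1` map to the same key, which is the point of the allow-list: tracking parameters stop fragmenting the cache. Adding `Accept-Language` to `vary_headers` does the opposite and splits entries per language.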
When to reach for it
- Read-heavy public APIs and catalogue / listing endpoints — the classic high-leverage case.
- Expensive upstream work — joins, third-party calls, LLM inference. The first caller pays. Everyone else doesn't.
- Anti-scrape — a scraper hitting the same URL ten thousand times in a minute gets ten thousand cached responses and your origin doesn't notice.
It's not the answer for highly dynamic per-request data, per-user content (use application-level memcached instead), or sub-second freshness requirements. The KB article walks through the full set of trade-offs.
Pricing
A flat surcharge per region, plus the cache memory you pick, billed per GB in each region.
| Component | Standard | Premium |
| --- | --- | --- |
| Proxy cache surcharge | $5.00 / region / month | $7.50 / region / month |
| Managed memcached (cache memory) | $10 / GB / month | $15 / GB / month |
A 1 GB standard cache in two regions is $30/month all-in. A 4 GB premium cache in four regions is $270/month all-in. Pick the smallest size that fits your working set and resize when the hit-rate tells you to.
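The arithmetic behind those two figures, as a small Python sketch. `monthly_cost` is a hypothetical helper built from the price table above. It assumes memory is billed in every region the cache runs in (which is what the worked examples imply) and that sub-GB sizes are prorated linearly, which the table does not state.

```python
from decimal import Decimal

# Prices copied from the table above, in USD per month.
PRICES = {
    "standard": {"surcharge": Decimal("5.00"), "per_gb": Decimal("10")},
    "premium": {"surcharge": Decimal("7.50"), "per_gb": Decimal("15")},
}


def monthly_cost(tier, size_gb, regions):
    """Per-region surcharge plus per-region memory, times the region count."""
    price = PRICES[tier]
    per_region = price["surcharge"] + price["per_gb"] * Decimal(str(size_gb))
    return regions * per_region
```

`monthly_cost("standard", 1, 2)` works out to 2 × ($5 + $10) = $30, and `monthly_cost("premium", 4, 4)` to 4 × ($7.50 + $60) = $270, matching the examples above.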
Getting started
In the console, open your HTTP container, scroll to Proxy Cache, and switch it on. The cache cluster comes up alongside your container in every region it runs in — and shows up in your memcached list as read-only ("Managed by container {name}") so you can see what the platform provisioned.
Read the full documentation: Proxy Caching
What's next
Proxy caching is the first managed-layer feature that sits between the gateway and your container. There's more coming — we'd rather ship what works today and hear what you need next.