TTL synchronization in high-traffic systems can create deterministic load spikes, even when the cache hit ratio looks healthy.
At 02:13 AM, latency increased from 120ms to 3.4 seconds. Database CPU reached 95%. Redis was stable.
Cache hit ratio remained 92%.
The cache did not fail. Time alignment did.
Client → ASP.NET Core (.NET 8) → Redis → SQL Server
Cache-aside pattern. TTL = 10 minutes. No jitter. No soft refresh.
public async Task<UserProfile> GetProfileAsync(Guid userId)
{
    var key = $"user:profile:{userId}";

    // Fast path: return the cached profile if present.
    var cached = await _cache.GetAsync<UserProfile>(key);
    if (cached != null)
        return cached;

    // Cache miss: load from SQL Server and repopulate with a fixed 10-minute TTL.
    var profile = await _repository.GetAsync(userId);
    await _cache.SetAsync(key, profile, TimeSpan.FromMinutes(10));
    return profile;
}
Total traffic: 40,000 requests/second.
TTL: 600 seconds.
40,000 × 600 = 24,000,000 requests per TTL window.
The endpoint accounts for 70% of total traffic:
40,000 × 70% = 28,000 RPS.
Keys populated together expire together; assume the expiration wave spans about 5 seconds:
28,000 × 5 seconds = 140,000 requests arriving inside the wave.
Assume 60% hot-key skew (60% of wave requests target just-expired keys):
140,000 × 60% = 84,000 expired-key requests.
Distributed across 5 seconds:
84,000 / 5 = 16,800 RPS hitting the database.
Database safe capacity ≈ 5,000 RPS.
16,800 ÷ 5,000 ≈ 3.36× overload.
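The same arithmetic as a runnable sketch, useful for sanity-checking when the inputs change (a .NET 8 top-level program; the constants mirror the numbers above and the names are illustrations, not values from the incident tooling):

// Back-of-envelope model of the expiration wave.
const double totalRps = 40_000;
const double endpointShare = 0.70;  // endpoint's share of total traffic
const double waveSeconds = 5;       // assumed width of the expiration wave
const double hotKeySkew = 0.60;     // share of wave requests on just-expired keys
const double dbSafeRps = 5_000;

var endpointRps = totalRps * endpointShare;       // 28,000
var waveRequests = endpointRps * waveSeconds;     // 140,000
var expiredRequests = waveRequests * hotKeySkew;  // 84,000
var dbRps = expiredRequests / waveSeconds;        // 16,800

Console.WriteLine($"Wave load: {dbRps:N0} RPS ≈ {dbRps / dbSafeRps:F2}× safe capacity");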
The average miss rate was only 3,200 RPS (40,000 RPS × the 8% miss rate). But systems fail on variance, not averages.
When the expiration wave hits, the failure cascades:
TTL wave → DB spike → ThreadPool inflation (blocked requests hold threads) → tail latency explosion.
The first mitigation attempt was a per-key distributed lock (StackExchange.Redis LockTakeAsync), so only one caller would regenerate each expired value:
// Per-key lock: only the winner regenerates; the 5 s expiry guards against a crashed holder.
if (await db.LockTakeAsync(lockKey, value, TimeSpan.FromSeconds(5)))
{
    try
    {
        // regenerate
    }
    finally
    {
        await db.LockReleaseAsync(lockKey, value);
    }
}
Instead of reducing load, it increased latency variance and lock contention: callers that lost the race had nothing to serve, so they had to wait or retry behind the lock holder.
The fix that worked: add random jitter so expirations spread across a 120-second window instead of a single instant.

// Spread expirations across a 0–120 second window.
var jitter = Random.Shared.Next(0, 120);
var ttl = TimeSpan.FromMinutes(10) + TimeSpan.FromSeconds(jitter);
await db.StringSetAsync(key, value, ttl);
84,000 misses distributed across 120 seconds:
84,000 / 120 = 700 RPS
700 RPS is well inside the database's ≈ 5,000 RPS safe capacity.
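One possible refinement, a sketch only (JitteredTtl is a name introduced here, not from the incident code): make the jitter window proportional to the base TTL, so the spread scales if the TTL changes.

// Sketch: jitter as a reusable helper, scaled to the base TTL.
// The 20% default reproduces the 120 s window on the 600 s TTL above.
static TimeSpan JitteredTtl(TimeSpan baseTtl, double jitterFraction = 0.20)
{
    var maxJitter = baseTtl.TotalSeconds * jitterFraction;
    return baseTtl + TimeSpan.FromSeconds(Random.Shared.NextDouble() * maxJitter);
}

// Usage: await db.StringSetAsync(key, value, JitteredTtl(TimeSpan.FromMinutes(10)));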
The second layer: soft TTL. When an entry is soft-expired but still present, serve the stale value immediately and regenerate in the background:

if (cacheEntry.IsSoftExpired)
{
    // Serve the stale value now; refresh in the background (fire-and-forget).
    _ = Task.Run(() => RefreshAsync(key));
    return cacheEntry.Value;
}
Users never wait for regeneration. Latency stabilizes.
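A minimal sketch of one way to wire this up, assuming the entry carries its own soft-expiry timestamp and a per-process guard dedupes background refreshes (SoftCacheEntry, SoftRefreshCache, and the refresh guard are names introduced here, not from the incident code):

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Wrapper stored in the cache: the hard TTL is still enforced by Redis;
// the soft expiry is just a timestamp checked by the application.
public sealed record SoftCacheEntry<T>(T Value, DateTimeOffset SoftExpiresAt)
{
    public bool IsSoftExpired => DateTimeOffset.UtcNow >= SoftExpiresAt;
}

public sealed class SoftRefreshCache
{
    // Keys with an in-flight refresh, so only one background
    // regeneration runs per key per process.
    private readonly ConcurrentDictionary<string, byte> _refreshing = new();

    public T GetOrRefresh<T>(string key, SoftCacheEntry<T> entry, Func<string, Task> refreshAsync)
    {
        if (entry.IsSoftExpired && _refreshing.TryAdd(key, 0))
        {
            _ = Task.Run(async () =>
            {
                try { await refreshAsync(key); }
                finally { _refreshing.TryRemove(key, out _); }
            });
        }

        // Callers always get the current value back, stale or not.
        return entry.Value;
    }
}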
Cache design is about load shaping, not load reduction.