Production Lesson #03 — When a Read Query Blocked an Entire Write-Heavy System

Sometimes, the system isn’t slow. It’s waiting.

Table of Contents

Context

Our system is write-heavy. On average, more than 20,000 new records are inserted into the main table every single day.

The read side supports dynamic filtering. Users can filter by multiple fields — status, category, owner, flags, and more. Different combinations are possible.

That flexibility came with a trade-off:

No single dominant query pattern
No predictable access path
Indexing strategy becomes extremely hard to optimize

Composite indexes couldn’t realistically cover every filter combination. And adding too many indexes would significantly hurt write performance.

In a write-heavy system, every additional index is a tax on inserts.

The Business Request

One day, business requested a new feature:

“List all data — without any date range filter.”

This meant:

No selective predicate
No efficient index seek
Full table scan

In development, it worked fine. In production, it didn’t.

What Happened in Production

Once deployed:

The query scanned the entire main table.
SQL Server acquired long-lived shared locks.
Write operations required exclusive locks.
They couldn’t acquire them.

Write requests started queuing. Some were blocked for more than 15 seconds. Monitoring alerts flooded in.

The application was healthy. The database was not.

Why AsNoTracking Didn’t Help

On the read side, we already used AsNoTracking().

That reduces EF Core change tracking overhead. But it does NOT change database isolation behavior.

Under default READ COMMITTED isolation:

Reads acquire shared locks.
Writes require exclusive locks.
Shared and exclusive locks conflict.

A long-running full-table read holds shared locks long enough to starve write operations.

In a write-heavy system, that’s catastrophic.

The Indexing Constraint

Could we just add more indexes?

Not realistically.

Because:

Dynamic filters create many query shapes.
No stable leading column for composite indexing.
Multiple indexes increase insert/update cost.
This system’s critical path is writing.

In write-heavy systems, indexing strategy is not just about query speed — it’s about protecting write throughput.

The Tactical Fix: READ UNCOMMITTED

Given:

The feature was operational listing/reporting.
Absolute real-time accuracy was not critical.
System availability was business-critical.

We applied READ UNCOMMITTED to this specific query.

Effect:

No shared locks.
No blocking writes.
Possible dirty reads.

The blocking stopped immediately. Write latency returned to normal. Alerts disappeared.

Consistency was slightly relaxed. System stability was restored.

Why Snapshot Isolation Might Be the Strategic Fix

READ UNCOMMITTED solved the immediate problem. But it allows dirty reads.

A more robust alternative is Snapshot Isolation.

Instead of locking reads, it uses row versioning.

Readers access a consistent snapshot.
Writers continue normally.
No read-write blocking.

However, nothing is free.

Snapshot Isolation introduces:

TempDB pressure due to version storage
Additional I/O overhead
Potential update conflicts at commit time

In write-heavy environments, row versioning cost must be carefully monitored.

READ UNCOMMITTED is a tactical fix. Snapshot Isolation is a strategic design choice.

Concurrency control is never about right or wrong. It is about choosing which cost you are willing to pay.

Why Not Just Use a Search Engine?

A common question:

“If full-table reads are dangerous, why not move this to a search engine?”

That’s a valid architectural option. But context matters.

1. Data Freshness

Introducing a search engine means:

Event publishing
Index synchronization
Eventual consistency

You trade lock contention for replication delay.

2. Operational Overhead

A search engine is an operational system:

Cluster management
Shard balancing
Reindexing strategy
Monitoring and failure handling

For one listing feature, that may be architectural escalation.

3. Write Amplification

In a write-heavy system (20k+ inserts/day), every write would now:

Persist to database
Publish event
Index into search engine

That increases system complexity and write amplification.

4. Dynamic Filtering Still Exists

Even in a search engine, dynamic filters still create many query shapes.

You move the scalability challenge — you don’t eliminate it.

Architectural escalation should follow bottleneck patterns — not feature pressure.

The Real Lesson

In write-heavy systems:

A read query can become a write killer.
Indexing is a write-throughput decision.
Isolation level is a business trade-off.

In high-write systems, isolation level is not a database setting. It’s a business decision.

And sometimes, the most dangerous query is the one that “just reads everything.”