May 8, 2026 · 10 min read · Distributed Systems

Queue Integrity.

Redis Queue Serialization: Preventing Race Conditions in Event-Driven Systems

You deploy your webhook handler. It works fine with test data. Then a real customer sends 500 orders in quick succession, and your system creates duplicate orders, forgets confirmations, and overwrites inventory. The problem isn't your code—it's the assumption that webhook events arrive in order.

They don't. Shopify fires a "create" event and an "update" event for the same order. The "update" might process first. Your database tries to update an order that doesn't exist yet. Or worse: three webhook events for the same order hit your Node.js server simultaneously, and three separate database transactions try to change the order state at once. Whoever runs last wins, silently overwriting the others.

This is why **queue serialization** is non-negotiable for event-driven systems. I learned this the hard way building OrdersPilot. Now, every event that touches critical data goes through **BullMQ** with per-entity serialization, ensuring that no two operations on the same resource can run simultaneously.

The Problem: Concurrent Event Processing

Imagine this sequence happens in 50 milliseconds:

1. Shopify fires the orders/create webhook
2. Shopify fires the orders/paid webhook
3. Shopify fires the orders/updated webhook (inventory change)

Without a queue, your Express server immediately spawns three async handlers. Each handler does the same thing: fetch the order from MongoDB, modify it, and write it back. In a concurrent system, this is a race.

Race sequence

Handler 1, 2, and 3 all read the order while it's still "pending".

Handler 1 sets state to "created" and writes.

Handler 2 sets state to "confirmed" and writes (overwrites Handler 1).

Handler 3 sets state to "updated" and writes (overwrites Handler 2).

Final result: the order is "updated", but it never went through "created" or "confirmed" in the database. Your inventory count got lost.
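
The sequence above can be reproduced without a database. The sketch below is a minimal simulation, not the OrdersPilot code: the in-memory `db` map, `readOrder`, `writeOrder`, and `handleEvent` are hypothetical stand-ins for MongoDB calls, and the 5 ms delays only imitate network latency.

```javascript
// In-memory stand-in for a MongoDB collection (hypothetical, for illustration only).
const db = new Map([['order-1', { state: 'pending', history: [] }]]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Simulated read: returns a private copy of the document after a short delay.
async function readOrder(id) {
  await sleep(5);
  const doc = db.get(id);
  return { ...doc, history: [...doc.history] };
}

// Simulated write: replaces the whole document after a short delay.
async function writeOrder(id, doc) {
  await sleep(5);
  db.set(id, doc);
}

// Unsafe read-modify-write handler, mirroring the race sequence.
async function handleEvent(id, newState) {
  const order = await readOrder(id);
  order.state = newState;
  order.history.push(newState);
  await writeOrder(id, order);
}

// Fires three handlers at once, the way three simultaneous webhooks would.
async function simulateRace() {
  await Promise.all([
    handleEvent('order-1', 'created'),
    handleEvent('order-1', 'confirmed'),
    handleEvent('order-1', 'updated'),
  ]);
  return db.get('order-1');
}
```

Calling `simulateRace()` leaves a single entry in `history`, because all three handlers read the same "pending" snapshot and each write replaces the previous one. Two of the three transitions are never recorded.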

The Solution: BullMQ Serialization

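The key insight is to take processing out of the webhook handler. The handler only enqueues the event and acknowledges it; a worker picks the job up later. Combined with a per-order lock and deduplication, this **serializes operations on the same entity**, so two jobs for the same order never run at once.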

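// Pattern 1: Basic Queueing

// Assumes express.json() is installed so req.body is parsed.
// BullMQ's Queue.add takes (jobName, data, options); 'order-event' is an illustrative name.
app.post('/webhooks/order', async (req, res) => {
  const { id, event } = req.body;
  await orderQueue.add(
    'order-event',
    { orderId: id, event, timestamp: Date.now() },
    { jobId: `order-${id}-${Date.now()}` }
  );
  res.json({ queued: true });
});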

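// Pattern 2: Redis Locks

const { Worker } = require('bullmq');

// Delete the lock only if this job still owns it (atomic compare-and-delete).
const RELEASE_LOCK = `
  if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
  end
  return 0`;

// 'orders' and `connection` are placeholders for your queue name and Redis options.
new Worker('orders', async (job) => {
  const { orderId } = job.data;
  const lockKey = `lock:order:${orderId}`;
  // NX: set only if absent. EX 10: expires after 10 s, so the handler must finish sooner.
  const orderLock = await redis.set(lockKey, job.id, 'EX', 10, 'NX');

  // Throwing fails the job; it is retried only if it was added with `attempts` (ideally with `backoff`).
  if (!orderLock) throw new Error('Order is locked, retry later');

  try {
    await handleOrderEvent(orderId);
  } finally {
    // A plain DEL could remove a lock that expired and was re-acquired by another job.
    await redis.eval(RELEASE_LOCK, 1, lockKey, job.id);
  }
}, { connection });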

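// Pattern 3: Event Deduplication

// Runs inside the webhook handler; assumes the payload carries id, event, and created_at.
const { id, event, created_at } = req.body;
const eventHash = crypto.createHash('md5').update(`${id}-${event}-${created_at}`).digest('hex');

// SET ... NX checks and marks in one atomic step; it returns null if the key already exists.
// (GETDEL would delete the marker on read, letting a third delivery through.)
const firstSeen = await redis.set(`webhook:${eventHash}`, '1', 'EX', 86400, 'NX');
if (!firstSeen) return res.json({ duplicate: true });

await orderQueue.add('order-event', { orderId: id, event }, { jobId: eventHash });
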
Real-World Impact

Before

5,000-order flash sale resulted in 47 duplicate orders and 3 lost inventory updates.

After

15,000-order sale processed flawlessly. Zero duplicates. 100% data integrity.

Lessons Learned

01. Queues Aren't Just for Performance

Developers often think queues are for deferring slow work. That's half the story. Queues are also for safety. A queue serializes work that would otherwise race.
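
To make the "queue as a safety mechanism" idea concrete, here is a minimal in-process sketch of per-key serialization using promise chaining. `KeyedSerializer` and `demo` are hypothetical names, and this only coordinates work inside one Node.js process; it illustrates the principle rather than replacing BullMQ or Redis locks, which coordinate across processes.

```javascript
// Hypothetical helper: runs tasks that share a key one at a time, in arrival order.
// Tasks queued under different keys can still run concurrently.
class KeyedSerializer {
  constructor() {
    this.tails = new Map(); // key -> promise that settles when the last queued task ends
  }

  run(key, task) {
    const previous = this.tails.get(key) ?? Promise.resolve();
    // Start this task only after the previous one for the same key has finished.
    const result = previous.then(() => task());
    // The stored tail never rejects, so one failed task does not block later ones.
    const tail = result.catch(() => {}).finally(() => {
      // Drop the entry once the chain is idle so the map does not grow forever.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    this.tails.set(key, tail);
    return result;
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Three read-modify-write tasks on the same order, submitted at the same time.
async function demo() {
  const serializer = new KeyedSerializer();
  const order = { state: 'pending', history: [] };

  const apply = (state) => async () => {
    const snapshot = { ...order, history: [...order.history] }; // read
    await sleep(5);                                              // simulated I/O
    snapshot.state = state;
    snapshot.history.push(state);
    Object.assign(order, snapshot);                              // write
  };

  await Promise.all(
    ['created', 'confirmed', 'updated'].map((s) => serializer.run('order-1', apply(s)))
  );
  return order;
}
```

Because each task waits for the previous one under the same key, every transition sees the result of the one before it, and the order's `history` records all three states in submission order.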

02. Test Concurrency, Not Just Correctness

Only load tests with 500+ concurrent events exposed the race condition. Now, every critical path has a concurrency test using tools like autocannon.

03. Redis Locks Have Latency

Each lock operation is a Redis call. If your Redis lock latency creeps above 10ms, you need to shard your locks across multiple instances.
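
One way to shard locks is to hash the lock key and use the result to pick an instance. The sketch below is an assumption about how that could look, not a description of the OrdersPilot setup; `shardForKey` is a hypothetical helper, and the array of Redis clients it would index into is left to the caller.

```javascript
const { createHash } = require('node:crypto');

// Maps a lock key to a shard index in the range 0..shardCount-1.
// The same key always maps to the same shard, so every worker
// contends for a given order's lock on the same Redis instance.
function shardForKey(key, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError('shardCount must be a positive integer');
  }
  const digest = createHash('sha256').update(key).digest();
  // Use the first 4 bytes of the digest as an unsigned integer.
  return digest.readUInt32BE(0) % shardCount;
}
```

With an array of clients, say `lockClients`, a worker would pick `lockClients[shardForKey(lockKey, lockClients.length)]` before acquiring the lock. Changing the number of shards remaps keys, so that is best done while no locks are held.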

Queue serialization is the unglamorous foundation of reliable systems. It won't make you faster on the surface—it'll actually add latency. But it prevents the silent data corruption that wakes you up at 3 AM.