codewithzezo

Designing Idempotent APIs That Survive Network Failures

Networks fail. Clients retry. Without idempotency, you get duplicate charges, double-booked records, and 3 a.m. incidents. A practical guide to making POST endpoints safe to retry.

Every system that communicates over a network will eventually lose a packet at the wrong moment. The client times out, retries, and the server quietly processes the same request twice. If the request was GET /users/42, you get away with it. If it was POST /payments, you owe somebody a refund.

This post covers how to build write endpoints that are safe to retry, without burdening every caller with "just be careful."


The Problem

A naive checkout endpoint looks like this:

app.post("/charges", async (req, res) => {
  const { amount, customerId } = req.body;
  const charge = await stripe.charges.create({ amount, customer: customerId });
  await db.charges.insert(charge);
  res.json(charge);
});

Now consider this scenario: the client posts, the server processes the payment, and the response is lost in transit. The client doesn't know whether the charge succeeded. It retries. The customer is charged twice.

The root cause is that the request carries no identity. The server treats every POST as fresh intent.

Idempotency Keys

The fix is well-established: have the client send a unique key with every retryable request.

POST /charges HTTP/1.1
Idempotency-Key: 7f3a9c12-4b2e-4f1a-8d3c-aab1c45ee721
Content-Type: application/json
 
{ "amount": 4500, "customerId": "cus_pk_001" }
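On the client side, the key has to be generated once per logical operation and reused on every retry. A minimal sketch of that idea follows; `postWithRetry`, its `fetchImpl` and `maxAttempts` options, and the retry-on-5xx policy are assumptions made for illustration, not part of any particular library.

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical helper: POSTs `body` to `url`, retrying on network errors
// and 5xx responses. The key is generated once, outside the retry loop,
// so every attempt carries the same Idempotency-Key.
async function postWithRetry(url, body, { fetchImpl = fetch, maxAttempts = 3 } = {}) {
  const idempotencyKey = randomUUID();
  let lastError;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const res = await fetchImpl(url, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Idempotency-Key": idempotencyKey,
        },
        body: JSON.stringify(body),
      });
      // 4xx responses are returned as-is: repeating them will not help.
      if (res.status < 500) return res;
      lastError = new Error(`Server responded with ${res.status}`);
    } catch (err) {
      lastError = err; // Network failure: outcome unknown, so retry.
    }
  }
  throw lastError;
}
```

A production client would likely add a backoff delay between attempts. The important part is that the key is created before the loop, never inside it.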

The server contract is straightforward:

  1. If you have never seen this key, process the request and persist (key → response) atomically with the work itself.
  2. If you have seen it, return the stored response without repeating the work.

The phrase "atomically with the work itself" is where most implementations break down.

The Incorrect Approach

A common but broken implementation:

app.post("/charges", async (req, res) => {
  const key = req.headers["idempotency-key"];
 
  const existing = await db.idempotency.get(key);
  if (existing) return res.json(existing.response);
 
  const charge = await stripe.charges.create(...);
  await db.idempotency.insert({ key, response: charge });
  res.json(charge);
});

Two concurrent retries can both pass the get check before either has inserted a record. Both call Stripe, and the customer is charged twice. The check-then-act pattern is a classic race condition.
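The race is easy to reproduce without a real database. The sketch below uses an in-memory `Map` and a fake payment call (`fakeCreateCharge`, `brokenHandler`, and `demo` are illustrative names, not part of the original handler) to show two concurrent requests with the same key both reaching the side effect.

```javascript
// In-memory stand-ins for the database and the payment provider.
const store = new Map();
let chargesCreated = 0;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fakeCreateCharge() {
  await sleep(10); // Simulated network latency
  chargesCreated += 1;
  return { id: `ch_${chargesCreated}` };
}

// Same check-then-act shape as the handler above.
async function brokenHandler(key) {
  const existing = store.get(key);
  if (existing) return existing;

  const charge = await fakeCreateCharge(); // Both requests reach this line
  store.set(key, charge);
  return charge;
}

async function demo() {
  const [a, b] = await Promise.all([
    brokenHandler("key-1"),
    brokenHandler("key-1"),
  ]);
  return { a, b, chargesCreated };
}

// demo() resolves with chargesCreated === 2: one key, two charges.
```

Both calls run the existence check before either one has stored a result, so the check never has anything to find.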

The Correct Approach: Claim First

The fix is to claim the key before doing the work, in an atomic operation that cannot be won by two concurrent requests simultaneously.

app.post("/charges", async (req, res) => {
  const key = req.headers["idempotency-key"];
  if (!key) return res.status(400).json({ error: "Idempotency-Key header is required" });
 
  await db.transaction(async (tx) => {
    // Unique constraint acts as the distributed lock
    const claim = await tx.raw(`
      INSERT INTO idempotency_keys (key, status, request_hash)
      VALUES (?, 'processing', ?)
      ON CONFLICT (key) DO NOTHING
      RETURNING *
    `, [key, hash(req.body)]);
 
    if (claim.rows.length === 0) {
      const existing = await tx.idempotency.get(key);
 
      if (existing.status === 'processing') {
        return res.status(409).json({ error: "Request is still being processed" });
      }
      if (existing.request_hash !== hash(req.body)) {
        return res.status(422).json({ error: "Key reused with a different request body" });
      }
 
      return res.json(existing.response);
    }
 
    // We own the claim, so it is safe to perform the side effect
    const charge = await stripe.charges.create(
      { amount: req.body.amount, customer: req.body.customerId },
      { idempotencyKey: key } // Forward the key to Stripe as a request option
    );
 
    await tx.idempotency.update(key, { status: 'done', response: charge });
    res.json(charge);
  });
});

Three properties this gets right:

  • Atomic claim: ON CONFLICT DO NOTHING uses the database's unique constraint as the concurrency primitive. Exactly one transaction wins.
  • Request fingerprinting: detects clients reusing a key with a different payload, which almost always indicates a client-side bug.
  • Forwarding the key downstream: Stripe accepts an idempotency key of its own, sent as the Idempotency-Key header (the idempotencyKey request option in the Node library). If your server crashes between the Stripe call and the database write, the next retry presents the same key to Stripe and receives the original charge, not a new one.
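The handler calls a `hash(req.body)` helper without defining it. One possible implementation, assuming Node's built-in crypto module and a JSON body, is sketched below. The `canonicalize` step is an added assumption: it sorts object keys so that two bodies with the same fields in a different order produce the same fingerprint.

```javascript
const { createHash } = require("node:crypto");

// Serialize with object keys sorted so that {a:1,b:2} and {b:2,a:1}
// produce the same string. Arrays keep their order.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
    return `{${entries.join(",")}}`;
  }
  // Primitives; a missing body is treated as null.
  return JSON.stringify(value) ?? "null";
}

// One possible implementation of the hash() helper used in the handler.
function hash(body) {
  return createHash("sha256").update(canonicalize(body)).digest("hex");
}
```

Storing the hex digest rather than the raw body keeps the idempotency table small and avoids persisting a second copy of sensitive fields.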

Handling External Side Effects

The most dangerous failure window is between an external API call and your database write. If your server crashes there, you have a processed payment with no local record.

Two strategies:

Strategy | When to use
Forward your idempotency key to the downstream service | When the downstream supports it (Stripe, AWS, Square, etc.)
Outbox pattern: write intent to DB first, execute via background worker | When the downstream is not idempotent, or requires exactly-once semantics

For most payment workflows, forwarding the key is sufficient and significantly simpler.
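For the cases where forwarding is not an option, the outbox strategy can be sketched roughly as follows. This is an in-memory illustration only: the `outbox` array stands in for a database table, and `enqueueCharge`, `drainOutbox`, and `send` are hypothetical names.

```javascript
// In-memory stand-in for an outbox table. In a real system the insert
// below would share a transaction with the rest of the request's writes.
const outbox = [];

// Request path: record the intent only. No external call happens here.
function enqueueCharge(key, payload) {
  if (outbox.some((row) => row.key === key)) return; // Unique key: no duplicates
  outbox.push({ key, payload, status: "pending", result: null });
}

// Worker path: deliver each pending row, then mark it done.
// `send` is a hypothetical function that calls the downstream service.
async function drainOutbox(send) {
  for (const row of outbox) {
    if (row.status !== "pending") continue;
    try {
      row.result = await send(row.key, row.payload);
      row.status = "done";
    } catch (err) {
      // Leave the row pending so the next run retries it.
    }
  }
}
```

The request handler only ever writes to the outbox, so a retried request cannot trigger a second external call on its own. A crash between `send` and the status update would still cause a redelivery on the next run, so the worker needs some way to reconcile with the downstream before resending.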

Key Expiry

Idempotency records are not permanent. A reasonable policy:

  • 24 hours for high-volume transactional endpoints
  • 7 days for lower-frequency, higher-stakes operations such as refunds or account closures

Use a database index on created_at and run a periodic cleanup job, or use Redis with a TTL if the response payloads are small.
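The expiry rule itself is small. Below is a sketch of the policy above, written against plain objects so that it is independent of the storage layer; the `TTL_BY_ENDPOINT` map, the record shape (`endpoint`, `createdAt` in milliseconds), and the 24-hour default are assumptions for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical per-endpoint retention, mirroring the policy above.
const TTL_BY_ENDPOINT = {
  "/charges": 1 * DAY_MS,
  "/refunds": 7 * DAY_MS,
};

// A record is expired once its age reaches the TTL for its endpoint.
// Unknown endpoints fall back to 24 hours.
function isExpired(record, now = Date.now()) {
  const ttl = TTL_BY_ENDPOINT[record.endpoint] ?? 1 * DAY_MS;
  return now - record.createdAt >= ttl;
}

// Returns only the records that should be kept.
function purgeExpired(records, now = Date.now()) {
  return records.filter((record) => !isExpired(record, now));
}
```

In a SQL-backed store the same rule would normally run as a scheduled DELETE against the created_at index rather than filtering in application code.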

Checklist

Before shipping any write endpoint:

  1. Is the operation safe to repeat? If so, no idempotency layer is needed.
  2. Does the endpoint accept an Idempotency-Key header?
  3. Is the claim atomic, and made before the work begins?
  4. Is the idempotency key forwarded to the external services that support it?
  5. Is there a defined TTL for stored records?

Idempotency is one of those correctness properties that users never notice when it works. They notice catastrophically when it doesn't, typically through a support queue and a difficult conversation with the finance team.

The investment is modest: one table, one unique index, and a small amount of careful thought about ordering. The alternative is far more expensive.