Achieving Exactly-Once Behavior with Idempotency Keys API Design
Distributed systems do not offer exactly-once delivery for free. Networks drop responses, clients time out and retry, and message queues redeliver. The practical answer is idempotency-key API design: a contract in which the client attaches a unique key to a request, and the server guarantees that processing the same key more than once produces the same result as processing it exactly once.
This pattern is most familiar from payment APIs, where charging a customer twice is unacceptable. The same technique applies to any operation that creates state or has side effects: placing an order, sending an email, provisioning a resource. The IETF has drafted a standard header for this, and providers such as Stripe have documented their implementation in detail. This guide explains the moving parts, then shows a concrete store and filter in Java and SQL.
Why at-least-once delivery forces the issue
Consider a client that sends a payment request and never receives a response. Did the charge succeed? The client cannot tell, so it must retry to avoid losing the order. If the original request actually succeeded and only the response was lost, that retry would charge the customer twice. At-least-once delivery is the realistic guarantee most systems can offer, and it inevitably produces duplicates.
Idempotency closes this gap. The client generates a key once and reuses it across every retry of that same logical operation. The server records the outcome of the first attempt under that key and, on any subsequent request carrying the same key, returns the stored outcome instead of executing again. Eventually consistent systems lean on this heavily; the broader trade-offs appear in this guide to event-driven microservices and eventual consistency.
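The client half of that contract can be sketched as follows. This is a minimal illustration, not a real SDK: `RetryingClient`, `sendWithRetries`, and the `Function` standing in for the HTTP transport are all hypothetical names, and a production client would also add backoff between attempts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Function;

// Hypothetical client-side sketch; the transport is any function from key to response.
final class RetryingClient {

    /** Sends one logical operation, reusing a single key across all attempts. */
    static String sendWithRetries(Function<String, String> transport, int maxAttempts) {
        String idempotencyKey = UUID.randomUUID().toString(); // generated once per operation
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return transport.apply(idempotencyKey); // same key on every retry
            } catch (RuntimeException e) {
                last = e; // e.g. a timeout: the outcome is unknown, so try again
            }
        }
        throw last != null ? last : new IllegalArgumentException("maxAttempts must be >= 1");
    }

    public static void main(String[] args) {
        List<String> seenKeys = new ArrayList<>();
        // Fake transport: the first two attempts "time out", the third succeeds.
        Function<String, String> flaky = key -> {
            seenKeys.add(key);
            if (seenKeys.size() < 3) {
                throw new RuntimeException("timeout");
            }
            return "201 Created";
        };
        System.out.println(sendWithRetries(flaky, 5) + " after " + seenKeys.size() + " attempts");
        System.out.println("distinct keys sent: " + seenKeys.stream().distinct().count());
    }
}
```

The important detail is that the key is created outside the retry loop. Generating a fresh key per attempt would make every retry look like a new operation to the server.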
The Idempotency-Key header pattern
The convention is a request header, commonly named Idempotency-Key, carrying a client-generated unique value such as a UUID. Stripe popularized this design, and the IETF has published a draft specification standardizing the header’s semantics; both are worth reading at stripe.com and datatracker.ietf.org. The key is opaque to the server, which treats it purely as a deduplication token.
Scope matters. A key is meaningful only within a single endpoint and account; the same UUID sent to two different operations should not collide. Therefore the storage key is typically a composite of the account identifier, the route, and the client key. This prevents one tenant’s keys from interfering with another’s and keeps the namespace clean.
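As a small illustration of that composite, the sketch below joins the three parts with a colon. The class and method names, the account identifiers, and the routes are assumptions made for the example.

```java
// Hypothetical helper; names and sample values are illustrative only.
final class ScopedKeys {

    /** Joins tenant, route, and client key into one storage key. */
    static String scopedKey(String accountId, String route, String clientKey) {
        if (accountId.isEmpty() || route.isEmpty() || clientKey.isEmpty()) {
            throw new IllegalArgumentException("all three parts are required");
        }
        return accountId + ":" + route + ":" + clientKey;
    }

    public static void main(String[] args) {
        String uuid = "3f1c2a9e-0000-4000-8000-000000000001"; // illustrative client key
        System.out.println(scopedKey("acct_1", "/v1/charges", uuid));
        System.out.println(scopedKey("acct_2", "/v1/charges", uuid)); // other tenant: no collision
        System.out.println(scopedKey("acct_1", "/v1/refunds", uuid)); // other route: no collision
    }
}
```

Because the account and route are derived on the server, a client cannot forge them by choosing an unusual key value; only the last segment is client-controlled.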
Request fingerprint versus key
A common failure mode deserves explicit handling: a client reuses a key but changes the request body. Perhaps a bug sends two genuinely different charges under one key. If the server blindly replays the first response, it silently drops the second operation, which is worse than a duplicate.
The defense is a request fingerprint: a hash of the meaningful request parameters, stored alongside the key. When a request arrives with a known key, the server compares the incoming fingerprint to the stored one. If they match, it is a legitimate retry, so the server replays the cached response. If they differ, the key is being misused, and the server should reject the request with a clear error, conventionally a 422. This turns a silent data-loss bug into a loud, debuggable failure.
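The comparison can be sketched with the JDK's `MessageDigest`. The `Fingerprints` class, the `Decision` enum, and the sample JSON bodies are hypothetical. The sketch also assumes the body has already been reduced to a canonical string, since two byte-different encodings of the same JSON would otherwise hash differently.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of the fingerprint check; not tied to any framework.
final class Fingerprints {

    enum Decision { REPLAY, REJECT_KEY_REUSE }

    /** Returns the SHA-256 of the body as 64 lowercase hex characters. */
    static String sha256Hex(String canonicalBody) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(canonicalBody.getBytes(StandardCharsets.UTF_8));
            return String.format("%064x", new BigInteger(1, digest)); // zero-padded hex
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
    }

    /** Same fingerprint means a legitimate retry; a different one means key misuse. */
    static Decision decide(String storedFingerprint, String incomingBody) {
        return storedFingerprint.equals(sha256Hex(incomingBody))
                ? Decision.REPLAY
                : Decision.REJECT_KEY_REUSE;
    }

    public static void main(String[] args) {
        String stored = sha256Hex("{\"amount\":500,\"currency\":\"usd\"}");
        System.out.println(decide(stored, "{\"amount\":500,\"currency\":\"usd\"}"));
        System.out.println(decide(stored, "{\"amount\":900,\"currency\":\"usd\"}"));
    }
}
```

A caller would map `REJECT_KEY_REUSE` to the 422 response described above and `REPLAY` to returning the stored response.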
Storing and replaying the original response
To replay a response, the server must persist it. The idempotency record therefore holds the key, the request fingerprint, the response status and body, and a status field tracking the lifecycle of the operation. The lifecycle typically moves from in-progress to completed. Recording the response only after the underlying operation commits is essential, because a crash mid-operation must not leave behind a record that replays a success that never happened.
The Java filter below intercepts requests, manages the idempotency record, and replays completed responses. It pairs with the SQL schema that follows. This kind of cross-cutting concern fits naturally at the adapter layer; the boundaries described in this hexagonal architecture guide are a clean place to put it.
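import java.io.IOException;

// Spring 6+ uses jakarta.servlet.*; on Spring 5 import javax.servlet.* instead.
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import org.springframework.web.util.ContentCachingResponseWrapper;

@Component
public class IdempotencyFilter extends OncePerRequestFilter {

    private final IdempotencyRepository repo;

    // Constructor injection; a final field must be assigned here.
    public IdempotencyFilter(IdempotencyRepository repo) {
        this.repo = repo;
    }

    // accountId, sha256, bodyOf, and replay are helpers not shown here.
    // bodyOf(req) must read from a re-readable (cached) copy of the body;
    // otherwise the handler downstream finds the input stream already consumed.

    @Override
    protected void doFilterInternal(HttpServletRequest req,
                                    HttpServletResponse res,
                                    FilterChain chain)
            throws ServletException, IOException {
        String key = req.getHeader("Idempotency-Key");
        if (key == null) { // header is optional; pass through
            chain.doFilter(req, res);
            return;
        }

        String scopedKey = accountId(req) + ":" + req.getRequestURI() + ":" + key;
        String fingerprint = sha256(bodyOf(req));

        // Atomic claim: INSERT ... ON CONFLICT DO NOTHING inserts nothing if the
        // key already exists, so tryClaim returns false and the
        // concurrent-duplicate race is closed.
        boolean claimed = repo.tryClaim(scopedKey, fingerprint);
        if (!claimed) {
            IdempotencyRecord existing = repo.find(scopedKey);
            if (!existing.fingerprint().equals(fingerprint)) {
                res.setStatus(422); // same key, different payload
                res.getWriter().write("Idempotency-Key reused with a different request");
                return;
            }
            if (existing.status() == Status.IN_PROGRESS) {
                res.setStatus(409); // a concurrent request is still running
                return;
            }
            replay(existing, res); // completed: return the stored response
            return;
        }

        // Capture the response so it can be stored before it is sent.
        // If the chain throws, the record stays IN_PROGRESS until it expires,
        // and retries receive 409 in the meantime.
        ContentCachingResponseWrapper wrapper = new ContentCachingResponseWrapper(res);
        chain.doFilter(req, wrapper);
        repo.complete(scopedKey, wrapper.getStatus(), bodyOf(wrapper));
        wrapper.copyBodyToResponse();
    }
}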
Handling concurrent duplicates with a unique constraint
Retries do not always arrive sequentially. Two copies of the same request can hit the server at once, and a naive check-then-insert leaves a window where both see no record and both execute. The robust fix lives in the database: a unique constraint on the scoped key, combined with an atomic insert.
The first writer wins the insert; the second receives a conflict and knows a sibling is already in flight. The schema below enforces this. The losing request returns a 409 Conflict and the client retries shortly after, by which point the original has usually completed and the response can be replayed. This pushes the race resolution down to the one component that can settle it atomically.
CREATE TABLE idempotency_record (
    scoped_key    VARCHAR(512) PRIMARY KEY,  -- account:route:client_key
    fingerprint   CHAR(64)     NOT NULL,     -- sha256 of request body
    status        VARCHAR(16)  NOT NULL,     -- IN_PROGRESS | COMPLETED
    response_code INT,
    response_body TEXT,
    created_at    TIMESTAMPTZ  NOT NULL DEFAULT now(),
    expires_at    TIMESTAMPTZ  NOT NULL DEFAULT (now() + INTERVAL '24 hours')
);

-- Atomic claim. Returns a row only when this caller wins the insert;
-- a concurrent duplicate gets nothing and must wait or replay.
INSERT INTO idempotency_record (scoped_key, fingerprint, status)
VALUES (:scopedKey, :fingerprint, 'IN_PROGRESS')
ON CONFLICT (scoped_key) DO NOTHING
RETURNING scoped_key;

-- Background cleanup of expired keys.
DELETE FROM idempotency_record WHERE expires_at < now();
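The first-writer-wins behavior of the unique constraint can be observed without a database. The sketch below is an in-memory analogue only: `InMemoryClaims` and `raceForKey` are hypothetical names, and `ConcurrentHashMap.putIfAbsent` stands in for the atomic insert. It starts many threads at the same moment and counts how many of them win the claim.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for the table; a real service would use the database.
final class InMemoryClaims {

    private final ConcurrentHashMap<String, String> statusByKey = new ConcurrentHashMap<>();

    /** Mirrors INSERT ... ON CONFLICT DO NOTHING: true only for the first caller. */
    boolean tryClaim(String scopedKey) {
        return statusByKey.putIfAbsent(scopedKey, "IN_PROGRESS") == null;
    }

    /** Fires `threads` simultaneous claims for one key and returns the number of winners. */
    static int raceForKey(String scopedKey, int threads) {
        InMemoryClaims store = new InMemoryClaims();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // hold every worker until the gate opens
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (store.tryClaim(scopedKey)) {
                    winners.incrementAndGet();
                }
            });
        }
        start.countDown(); // release all workers at once
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("winners: " + raceForKey("acct_1:/v1/charges:key-123", 16));
    }
}
```

However many threads compete, only one claim succeeds, which is the property the primary key provides across separate application servers. An in-process map cannot replace the database here, because it is not shared between instances.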
Key scope, expiry, and the limits of the guarantee
Idempotency records cannot live forever, since the store would grow without bound. A common policy, and the one Stripe documents, is a 24-hour retention window. Once the record has expired and been removed, a request carrying the same key is treated as new. This is an acceptable trade-off because legitimate client retries happen within minutes, not days.
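One subtlety is that a record can be past its expiry but not yet deleted by a cleanup job. The in-memory sketch below shows one way to treat such a record as absent at claim time. `ExpiringClaims`, its constructor parameters, and the injected clock are assumptions made for illustration, with `ConcurrentHashMap.compute` providing the atomic check-and-replace.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory sketch of a retention window.
final class ExpiringClaims {

    private final ConcurrentHashMap<String, Instant> expiresAtByKey = new ConcurrentHashMap<>();
    private final Duration ttl;
    private final Supplier<Instant> now; // injectable clock so tests can move time

    ExpiringClaims(Duration ttl, Supplier<Instant> now) {
        this.ttl = ttl;
        this.now = now;
    }

    /** True if the key was unseen or its previous record has expired. */
    boolean tryClaim(String scopedKey) {
        Instant current = now.get();
        boolean[] claimed = {false};
        // compute() runs atomically per key, so two callers cannot both claim.
        expiresAtByKey.compute(scopedKey, (key, expiresAt) -> {
            if (expiresAt == null || !expiresAt.isAfter(current)) {
                claimed[0] = true;
                return current.plus(ttl); // fresh record, fresh window
            }
            return expiresAt; // still live: keep the existing record
        });
        return claimed[0];
    }

    public static void main(String[] args) {
        Instant[] clock = {Instant.parse("2024-01-01T00:00:00Z")}; // mutable fake clock
        ExpiringClaims store = new ExpiringClaims(Duration.ofHours(24), () -> clock[0]);
        String key = "acct_1:/v1/charges:key-1";
        System.out.println(store.tryClaim(key)); // first use of the key
        System.out.println(store.tryClaim(key)); // retry inside the window
        clock[0] = clock[0].plus(Duration.ofHours(25));
        System.out.println(store.tryClaim(key)); // the window has passed
    }
}
```

The first claim succeeds, the retry inside the window is refused, and the claim made 25 hours later succeeds again because the old record is treated as gone.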
It is worth being precise about what this buys you. Idempotency keys deduplicate a specific client operation; they are not the same as system-wide message deduplication or distributed consensus. The classic failure mode is a partial write: if your operation touches two systems and only one commits before a crash, an idempotency record alone will not heal the inconsistency. For multi-step writes that must publish events reliably, pair idempotency with the transactional outbox pattern, so the response replay and the downstream event stay consistent.
When not to add idempotency keys
Not every endpoint needs this machinery. Naturally idempotent operations, such as a GET or a PUT that fully replaces a resource, already produce the same result on repeat by definition; layering keys on top adds cost for no benefit. The pattern earns its keep specifically on non-idempotent, side-effecting operations, the POST requests that create or charge.
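A service can encode that rule as a small policy check. The sketch below follows the HTTP method semantics in RFC 9110, which define GET, HEAD, OPTIONS, TRACE, PUT, and DELETE as idempotent. The class and method names are hypothetical, and the check assumes the handlers actually honor those semantics.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical policy helper; assumes handlers respect HTTP method semantics.
final class IdempotencyPolicy {

    // Methods that RFC 9110 defines as idempotent, so retries are already safe.
    private static final Set<String> IDEMPOTENT_METHODS =
            Set.of("GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE");

    /** True for methods such as POST and PATCH, where a retry may repeat a side effect. */
    static boolean needsIdempotencyKey(String httpMethod) {
        return !IDEMPOTENT_METHODS.contains(httpMethod.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String method : new String[] {"GET", "PUT", "POST", "PATCH"}) {
            System.out.println(method + " needs key: " + needsIdempotencyKey(method));
        }
    }
}
```

A method-level check is only a starting point. The cost considerations still apply per endpoint, so a low-stakes POST may reasonably opt out.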
There is also an honest cost to weigh. Every protected request now reads and writes an extra record, and that store becomes a dependency on the hot path. For low-stakes operations where an occasional duplicate is harmless, the simpler design may be the right one. Reserve the full apparatus for operations where a duplicate is genuinely expensive.
In conclusion, sound idempotency-key API design converts the messy reality of at-least-once delivery into safe, predictable, effectively exactly-once behavior for the operations it protects. Scope keys carefully, fingerprint requests to catch misuse, replay stored responses, and lean on a database unique constraint to settle concurrent duplicates. Applied to the side-effecting operations that need it, this pattern makes retries safe and your API trustworthy under the failures that distributed systems guarantee.