
Exactly Once Is a Lie (And What to Do About It)

Caution

SEMANTIC AUDIT IN PROGRESS: “Exactly once” guarantee detected in vendor documentation. Cross-referencing with distributed systems literature. Discrepancy identified. Initiating accuracy protocol.

Every messaging system makes a promise about how many times each message will be delivered. The options are:

  • At most once: the message may be delivered zero or one time. It could be lost. It will never be duplicated.
  • At least once: the message will definitely be delivered. It may be delivered more than once. It will never be lost.
  • Exactly once: the message will be delivered precisely one time. No loss. No duplication.

Exactly once sounds like the obvious choice, which is exactly why delivery guarantees are one of the most important topics in streaming systems: true exactly-once requires significant infrastructure to implement, carries real performance costs, and is still not perfectly achievable across all system boundaries.


🔁 Why Duplicates Happen

Before looking at solutions, understand the failure mode.

A Kafka consumer reads a message and processes it. Before it can commit (acknowledge) the offset back to Kafka, it crashes.

Consumer                  Kafka
    │                       │
    │── read offset 42 ────▶│
    │                       │
    │  [process event 42]   │
    │  [write to DB] ✓      │
    │                       │
    │  [CRASH]              │
    │                       │
    [restart]               │
    │── read offset 42 ────▶│  ← Consumer hasn't committed; must re-read
    │                       │
    │  [process event 42]   │
    │  [write to DB] ← DUPLICATE!

The event has been written to the database twice. From the consumer’s perspective, it has no way to know the first write succeeded—it crashed before it could confirm.

This is not a Kafka bug. It is a fundamental reality of distributed systems: networks can be interrupted at any moment, and there is no way to atomically link the acknowledgement to an external side effect.
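The failure mode above can be reproduced in a few lines of plain Python, with no Kafka involved. This is a toy model (all names are illustrative): an in-memory log stands in for the topic, a variable stands in for the committed offset, and a list stands in for the database.

```python
# Toy model of at-least-once redelivery: an in-memory log, an offset
# store, and a "database" that records every write (the side effect).
log = [{"id": 42, "fare": 12.5}]       # one event at offset 0
committed_offset = 0                   # next offset to read from
db = []                                # downstream sink

def run_consumer(crash_before_commit):
    """Read from the last committed offset, write to the DB, then commit."""
    global committed_offset
    offset = committed_offset
    while offset < len(log):
        db.append(log[offset])         # external side effect happens first
        if crash_before_commit:
            return                     # crash: the offset is never committed
        committed_offset = offset + 1  # acknowledge
        offset += 1

run_consumer(crash_before_commit=True)   # first attempt crashes after the write
run_consumer(crash_before_commit=False)  # restart re-reads from offset 0
print(len(db))  # 2 -- the same event was written twice
```

The restarted consumer has no record that the first write succeeded, so it must replay from the last committed offset, producing the duplicate.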


🎛️ At Most Once: Accept the Loss

The simplest config: commit offsets before processing.

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'rides',
    enable_auto_commit=True,      # Offsets committed on a timer, in the background
    auto_commit_interval_ms=1000, # Every second
    auto_offset_reset='latest'
)

for message in consumer:
    # The offset may already have been committed before this line runs.
    # If we crash here, the event is lost.
    process_and_write(message.value)

If the consumer crashes between the commit and the write, the event is lost. This is acceptable for use cases where losing occasional events is tolerable: log collection, non-critical metrics, analytics where approximate counts are sufficient.
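Mirroring the earlier sketch, the loss scenario can also be simulated in plain Python (toy model, no Kafka): the only change is that the offset is acknowledged before the side effect.

```python
# Toy model of at-most-once loss: commit the offset FIRST, then process.
log = [{"id": 42, "fare": 12.5}]
committed_offset = 0
db = []

def run_consumer(crash_before_write):
    global committed_offset
    offset = committed_offset
    while offset < len(log):
        committed_offset = offset + 1  # acknowledge before processing
        if crash_before_write:
            return                     # crash: acknowledged but never written
        db.append(log[offset])
        offset += 1

run_consumer(crash_before_write=True)   # crash after commit, before the write
run_consumer(crash_before_write=False)  # restart resumes AFTER offset 0
print(len(db))  # 0 -- event 42 is gone
```

The restart skips straight past the acknowledged offset, so the event is never written anywhere.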


🔂 At Least Once: The Default Streaming Mode

The safer default: commit offsets only after successful processing.

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'rides',
    enable_auto_commit=False,   # Manual offset management
    group_id='rides-processor'
)

for message in consumer:
    result = process_and_write(message.value)
    if result.success:
        consumer.commit()       # Acknowledge only after the write is confirmed
    # On failure, the offset is never committed, so the event is
    # redelivered after a restart or rebalance.

If the consumer crashes after writing but before committing, the event is reprocessed. We get duplicates. But we never lose data.

At least once is the default posture for most streaming pipelines. The implication is that your downstream sink must be able to handle duplicate writes gracefully.


🎯 Exactly Once: The Full Machinery

True exactly once—no loss, no duplicates—requires three coordinated components:

  1. Idempotent producer: Kafka ensures that if the same message is sent twice due to a retry, the broker deduplicates it.
  2. Transactional producer: Multiple messages can be written atomically to Kafka—all succeed or all fail.
  3. Flink checkpointing aligned with Kafka offset commits: the processor’s state and the consumer offset are snapshotted atomically.

Flink implements exactly once end-to-end using a two-phase commit protocol between its checkpointing mechanism and the sink connector.

When a checkpoint begins:

  1. Flink’s Kafka source records the current Kafka offsets in the checkpoint.
  2. The Kafka sink connector begins a transaction for all records produced since the last checkpoint.

When the checkpoint completes:

  3. The Kafka sink commits the transaction—records become visible to downstream consumers.

If the checkpoint fails:

  4. The transaction is aborted, its records are never visible, and the job restarts from the previous checkpoint.

This ensures that records appear in the output topic if and only if the corresponding Kafka input offsets have been checkpointed.
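The coupling between checkpoints and transactions can be sketched as a toy state machine in Python. This is a drastic simplification of Flink's two-phase-commit sink machinery; the class and method names are illustrative, not Flink's API.

```python
# Toy sketch of a transactional sink tied to checkpoints. Records are
# buffered in an open transaction and become visible downstream only
# when the checkpoint that covers them completes.
class TransactionalSink:
    def __init__(self):
        self.visible = []   # what downstream consumers can read
        self.pending = []   # records in the current open transaction

    def write(self, record):
        self.pending.append(record)        # invisible until commit

    def checkpoint_complete(self):
        self.visible.extend(self.pending)  # phase 2: commit the transaction
        self.pending = []

    def checkpoint_failed(self):
        self.pending = []                  # abort: records are discarded

sink = TransactionalSink()
sink.write("ride-1")
sink.write("ride-2")
sink.checkpoint_failed()    # job restarts; nothing leaked downstream
sink.write("ride-1")        # replayed from the last checkpointed offset
sink.write("ride-2")
sink.checkpoint_complete()
print(sink.visible)  # ['ride-1', 'ride-2'] -- each record appears exactly once
```

Even though the records were produced twice, the aborted transaction never reached downstream consumers, so the output contains each record exactly once.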

// Enabling exactly-once in the Flink Kafka sink (Java/Scala API)
KafkaSink.builder()
    .setBootstrapServers("localhost:9092")
    .setRecordSerializer(MySerializer.create())
    .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
    .setTransactionalIdPrefix("my-job")  // Unique prefix for transactional IDs
    .build();

The Cost of Exactly Once

This performance profile matters in practice:

Guarantee       Latency   Throughput                         Complexity
At most once    Lowest    Highest                            Trivial
At least once   Low       High                               Low
Exactly once    Higher    Lower (~20-30% overhead typical)   High

Exactly once requires longer checkpoint intervals (to reduce the overhead of transaction coordination), which increases end-to-end latency. For many use cases, the latency penalty is unacceptable.


🏥 The Practical Alternative: Idempotent Sinks

In most production streaming systems, the engineering answer is not exactly-once semantics—it is at least once delivery with an idempotent sink.

An idempotent operation produces the same result whether applied once or a hundred times. If your sink is designed to handle duplicate writes without producing duplicate data, you get the effective behaviour of exactly once without the overhead of distributed transactions.
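In code, the property is simply that applying the write once or a hundred times leaves the sink in the same state. A minimal sketch (keys and values illustrative): a keyed overwrite is idempotent, while an append is not.

```python
# A keyed overwrite is idempotent: repeating it changes nothing.
store = {}

def upsert(key, value):
    store[key] = value  # replacement semantics: safe to repeat

for _ in range(100):    # deliver the same event a hundred times
    upsert(("zone-132", "14:00"), {"num_trips": 47, "total_revenue": 892.40})

print(len(store))  # 1 -- duplicates are absorbed
```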

UPSERT as Idempotency

-- PostgreSQL: write the same row twice, get one row
INSERT INTO hourly_revenue (zone_id, window_start, num_trips, total_revenue)
VALUES (132, '2026-03-23 14:00:00', 47, 892.40)
ON CONFLICT (zone_id, window_start)
DO UPDATE SET
    num_trips = EXCLUDED.num_trips,
    total_revenue = EXCLUDED.total_revenue;

Running this statement ten times produces one row—the final values. This works because we are overwriting with the same values, not accumulating.
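You can verify this behaviour locally with Python's built-in sqlite3 module, which supports the same ON CONFLICT syntax (SQLite 3.24+). The schema mirrors the PostgreSQL example above.

```python
import sqlite3

# Run the UPSERT twice against an in-memory SQLite database and
# confirm that the duplicate delivery is absorbed into one row.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE hourly_revenue (
        zone_id       INTEGER,
        window_start  TEXT,
        num_trips     INTEGER,
        total_revenue REAL,
        PRIMARY KEY (zone_id, window_start)
    )
""")
upsert = """
    INSERT INTO hourly_revenue VALUES (132, '2026-03-23 14:00:00', 47, 892.40)
    ON CONFLICT (zone_id, window_start)
    DO UPDATE SET num_trips     = excluded.num_trips,
                  total_revenue = excluded.total_revenue
"""
conn.execute(upsert)
conn.execute(upsert)   # duplicate delivery
rows = conn.execute(
    "SELECT COUNT(*), MAX(total_revenue) FROM hourly_revenue"
).fetchone()
print(rows)  # one row, with the final values
```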

-- This is NOT idempotent: accumulation instead of replacement
INSERT INTO hourly_revenue (zone_id, window_start, num_trips, total_revenue)
VALUES (132, '2026-03-23 14:00:00', 47, 892.40)
ON CONFLICT (zone_id, window_start)
DO UPDATE SET total_revenue = hourly_revenue.total_revenue + EXCLUDED.total_revenue;
-- Running twice: 892.40 becomes 1784.80

The key is to use replacement semantics (overwrite with final value) rather than accumulation semantics (add to running total) when designing for idempotency.

Deduplication with a Unique Constraint

For event-level sinks where each event should appear exactly once:

-- Create a unique constraint on your event ID
CREATE TABLE processed_rides (
    ride_id     UUID PRIMARY KEY,
    zone_id     INTEGER,
    amount      DOUBLE PRECISION,
    pickup_time TIMESTAMP
);

-- Any duplicate insertion silently does nothing
INSERT INTO processed_rides VALUES (...)
ON CONFLICT (ride_id) DO NOTHING;

The event ID becomes your deduplication key. The database enforces uniqueness. Duplicate deliveries from your Kafka consumer are absorbed without side effects.
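This pattern can also be exercised locally with sqlite3, which accepts the same ON CONFLICT ... DO NOTHING clause (the table and values mirror the example above):

```python
import sqlite3
import uuid

# Duplicate deliveries are absorbed by the primary-key constraint:
# the second INSERT of the same ride_id is a silent no-op.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE processed_rides (
        ride_id     TEXT PRIMARY KEY,
        zone_id     INTEGER,
        amount      REAL,
        pickup_time TEXT
    )
""")
event = (str(uuid.uuid4()), 132, 23.50, '2026-03-23 14:07:12')
insert = """
    INSERT INTO processed_rides VALUES (?, ?, ?, ?)
    ON CONFLICT (ride_id) DO NOTHING
"""
conn.execute(insert, event)   # first delivery
conn.execute(insert, event)   # duplicate delivery: silently ignored
count = conn.execute("SELECT COUNT(*) FROM processed_rides").fetchone()[0]
print(count)  # 1
```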


🧬 Choosing Your Guarantee

Use this decision framework for each pipeline you build:

Can I tolerate losing events?
├── Yes → At Most Once (simple, fast)
└── No
    ├── Can my sink handle duplicate writes idempotently?
    │   ├── Yes → At Least Once + Idempotent Sink (recommended for most cases)
    │   └── No (e.g., sending emails, charging cards, external API calls)
    │       └── Exactly Once (with all its complexity and cost)
    └── Do I need sub-second latency AND no duplicates?
        └── Reconsider your architecture — this tradeoff is fundamental

For the vast majority of data engineering use cases—aggregation pipelines, analytical dashboards, feature stores—at least once with UPSERT sinks is the correct and practical choice. Save exactly once for payment processing, notification dispatch, and other irreversible side effects.


🔬 What the Industry Actually Does

At Uber, the real-time analytics platform (Apache Pinot + Kafka) uses at least once delivery with time-series based deduplication. At LinkedIn (where Kafka was born), the standard recommendation for data pipelines is at least once with idempotent consumers. Netflix’s Flink deployments use exactly once selectively—for financial settlement pipelines and billing—and at least once for their analytics workloads.

The pattern is consistent: match the delivery guarantee to the cost of the duplication. If a duplicate record in your dashboard causes a minor inaccuracy, at least once is fine. If a duplicate triggers a double charge, invest in exactly once.


← Previous: The Late Data Problem: Dead Zones, Watermarks, and Event Time

Next: Lambda, Kappa, and the Architecture You Actually Need →