Kafka Guarantees
- Messages sent by a producer to a particular topic partition are appended in the order in which they are sent.
- A consumer instance reads messages in the order in which they are stored in the partition.
- A topic with replication factor N tolerates up to N-1 server failures without losing committed messages.
Message Processing Guarantees
- No guarantee - No explicit guarantee is provided, so consumers may process messages once, multiple times, or never at all.
- At most once - This is "best effort" delivery semantics. Each message is processed once or not at all: messages may be lost, but they are never reprocessed.
- At least once - Consumers will receive and process every message, but they may process the same message more than once.
- Effectively once - Also (somewhat contentiously) known as exactly once, this promises that the effect of processing each message is applied exactly once.

Exactly Once Semantics (EOS)
Key Building Blocks for EOS
Kafka achieves EOS using three main features:
(a) Idempotent Producer
- Every message sent by a producer gets a producer ID (PID) and a sequence number.
- If retries happen (e.g., network glitch), Kafka detects duplicates by checking the sequence number and ignores them.
- Ensures no duplicates on retries → gives idempotent writes.
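A minimal sketch of an idempotent producer, assuming String keys and values; the broker address, topic, and record contents are placeholders:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class IdempotentProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Deduplication on the broker uses the producer ID (PID) + per-partition sequence numbers.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // required by (and implied with) idempotence

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A retry of this send (e.g., after a network glitch) will not create a duplicate record.
            producer.send(new ProducerRecord<>("orders", "order-123", "{\"id\":123}"));
        }
    }
}
```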
(b) Transactions
- A producer can group multiple writes (to one or more partitions, topics) into a transaction.
- Kafka ensures that either:
- All writes in the transaction are committed, or
- None are (atomicity).
- Consumers using read_committed isolation see only committed data.
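A sketch of a consumer restricted to committed data; group id, topic, and broker address are placeholders:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadCommittedConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processor");         // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Only records from committed transactions are returned; aborted writes are filtered out.
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
        }
    }
}
```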
(c) Replication Protocol
- With acks=all, Kafka ensures that once a transaction is committed, it is durably replicated to all in-sync replicas (ISRs).
- This guarantees durability even if the leader fails immediately after commit.
Putting It Together
Here’s how EOS works in practice:
- Producer starts a transaction → initTransactions().
- Writes messages with idempotence enabled.
- Calls commitTransaction() → Kafka writes a transaction marker ensuring atomic visibility.
- Messages are replicated to all ISRs (with acks=all).
- Consumer reads with isolation.level=read_committed → sees each committed message exactly once.
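A minimal producer-side sketch of this flow, assuming String keys/values; the transactional id, topic names, and broker address are placeholders:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.serialization.StringSerializer;

public class TransactionalProducerFlow {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "order-producer-1"); // placeholder transactional id

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions(); // registers the transactional id with the coordinator
            try {
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("orders", "123", "{\"event\":\"OrderCreated\"}"));
                producer.send(new ProducerRecord<>("payments", "123", "{\"event\":\"PaymentRequested\"}"));
                producer.commitTransaction(); // writes transaction markers: all writes become visible, or none do
            } catch (KafkaException e) {
                // For fatal errors (e.g., ProducerFencedException) the producer should be closed instead.
                producer.abortTransaction(); // aborted writes stay invisible to read_committed consumers
                throw e;
            }
        }
    }
}
```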
Limitations
- EOS works only within Kafka (end-to-end with producers + Kafka + consumers).
- If you write to an external DB/system, you need two-phase commit or outbox patterns.
- Slight overhead due to transaction coordination.
Replication Protocol
- When a producer writes a message to a partition, it first goes to the leader replica of that partition.
- The producer can specify the acks setting:
  - acks=0 → Producer doesn't wait for acknowledgment.
  - acks=1 → Producer gets an acknowledgment once the leader writes the message.
  - acks=all (or -1) → Producer gets an acknowledgment only after the message is written to the leader and replicated to all in-sync replicas (ISRs).
Important nuance: Kafka does not guarantee replication to all available replicas after the leader write. Instead, it guarantees replication to the set of in-sync replicas (ISRs).
- If a follower is out of sync, it won’t block the acknowledgment.
- As long as every replica currently in the ISR set (which can shrink to just the leader) has the message, Kafka considers it durable; the min.insync.replicas setting enforces a lower bound on that set for acks=all writes.
Kafka’s replication protocol guarantees that once a message has been acknowledged with acks=all, it has been written successfully to the leader replica and replicated to all in-sync replicas (ISRs).
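A sketch of how these durability settings fit together: creating a topic with replication factor 3 and min.insync.replicas=2 means an acks=all write is acknowledged only once at least two replicas have it. Topic name, partition count, and broker address are placeholders:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class DurableTopicSetup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        try (AdminClient admin = AdminClient.create(props)) {
            // Replication factor 3 tolerates up to 2 broker failures;
            // min.insync.replicas=2 makes acks=all wait for at least two replicas.
            NewTopic orders = new NewTopic("orders", 6, (short) 3)
                    .configs(Map.of("min.insync.replicas", "2"));
            admin.createTopics(List.of(orders)).all().get();
        }
        // A producer writing to this topic with acks=all fails with NotEnoughReplicasException
        // (rather than silently downgrading durability) if the ISR shrinks below two replicas.
    }
}
```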
Outbox Pattern
The Outbox Pattern is one of the most popular patterns used to achieve reliable event-driven integration between a database (like MySQL, PostgreSQL) and a message broker (like Kafka).
Problem it Solves
When a service needs to:
- Update its own database (say, insert a new Order), and
- Publish an event to Kafka (like OrderCreated)
You run into the dual-write problem:
- If you write to the DB but crash before publishing to Kafka → event is lost.
- If you publish to Kafka but crash before writing to DB → system state is inconsistent.
This breaks exactly-once semantics across DB + Kafka.
The Outbox Pattern
Instead of doing both separately, you write only to your database in a single atomic transaction:
- Service writes business data (e.g., a new Order).
- In the same transaction, the service writes an event record into a special outbox table.
Example (Postgres/MySQL):
BEGIN;
INSERT INTO orders (id, customer, amount, status) VALUES (123, 'Deepak', 1000, 'CREATED');
INSERT INTO outbox (event_id, type, payload, status) VALUES (uuid(), 'OrderCreated', '{"id":123,"amount":1000}', 'NEW'); -- uuid() stands in for gen_random_uuid() (Postgres) or UUID() (MySQL)
COMMIT;
Now DB and event are guaranteed consistent.
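The same atomic write, sketched from application code with plain JDBC. Table and column names follow the SQL above; the connection string, credentials, and the choice to store event_id as text are assumptions:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.UUID;

public class OutboxWriter {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/shop", "app", "secret")) { // placeholder connection details
            conn.setAutoCommit(false); // both inserts commit or roll back together
            try (PreparedStatement order = conn.prepareStatement(
                         "INSERT INTO orders (id, customer, amount, status) VALUES (?, ?, ?, 'CREATED')");
                 PreparedStatement outbox = conn.prepareStatement(
                         "INSERT INTO outbox (event_id, type, payload, status) VALUES (?, 'OrderCreated', ?, 'NEW')")) {
                order.setInt(1, 123);
                order.setString(2, "Deepak");
                order.setInt(3, 1000);
                order.executeUpdate();

                outbox.setString(1, UUID.randomUUID().toString()); // assumes event_id stored as text
                outbox.setString(2, "{\"id\":123,\"amount\":1000}");
                outbox.executeUpdate();

                conn.commit();   // order row and outbox event become visible atomically
            } catch (Exception e) {
                conn.rollback(); // neither the order nor the event is persisted
                throw e;
            }
        }
    }
}
```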
Event Relay (Debezium / Polling)
Next step:
- A relay process reads events from the outbox table and publishes them to Kafka.
- Can be done via:
- Change Data Capture (CDC) tools like Debezium (reads DB transaction log, automatically streams outbox rows to Kafka).
- Polling (a background service queries the outbox table and sends events).
After publishing, event status can be marked as SENT or the row can be archived.
Benefits
- Atomicity → DB update + event log are one transaction.
- Reliability → No lost or duplicated events.
- Event-driven architecture works smoothly with existing DB.
- Plays well with Kafka + EOS when you integrate external systems.
Example Flow
Order Service:
- User places order.
- Service saves order + inserts event into outbox table.
- Debezium streams the outbox table to the Kafka topic order.events.
Inventory Service:
- Consumes from order.events.
- Updates stock accordingly.
Outbox to Kafka - EOS
1. The Challenge
- The outbox table is already atomic with your business data.
- But when your producer service reads from the outbox and publishes to Kafka, two risks appear:
- Duplicate sends (retry on failure, service crash before marking event as sent).
- Lost events (service marks row as sent, but crash happens before actually producing to Kafka).
We need EOS here.
2. Solution Approaches
Approach A: Kafka Transactions (EOS Producer)
Kafka has idempotent producers + transactions:
- Configure the producer with:
  - enable.idempotence=true
  - acks=all
  - transactional.id=outbox-relay-1
- Start a transaction:
  - Read a batch of unsent events from the outbox table.
  - Begin a Kafka transaction.
  - Send all events to Kafka.
  - Commit the Kafka transaction AND update the outbox table status (sent=true) in the same transactional flow.
But here’s the catch: Kafka transactions and DB transactions are separate. You cannot commit both atomically in one step.
So you need a transactional outbox + idempotent updates pattern.
Approach B: Idempotent Deduplication Key
Each event in outbox has a unique event_id (UUID).
- Use event_id as the Kafka message key (or put it in headers).
- The Kafka producer with idempotence ensures retries don't create duplicates.
- Consumers can deduplicate by event_id if needed (a sketch follows this list).
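A sketch of consumer-side deduplication keyed on event_id. The in-memory set stands in for whatever durable store (DB table, cache) a real service would use; topic, group id, and broker address are placeholders:

```java
import java.time.Duration;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DeduplicatingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "inventory-service");       // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        Set<String> seenEventIds = new HashSet<>(); // stand-in for a durable dedup store

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("order.events"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    String eventId = record.key(); // event_id used as the message key, as described above
                    if (!seenEventIds.add(eventId)) {
                        continue; // duplicate delivery: already processed, skip it
                    }
                    // ... apply the business effect of record.value() exactly once ...
                }
            }
        }
    }
}
```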
Approach C: Debezium / CDC Outbox (Best Practice)
Instead of writing custom relay code:
- Use Debezium (reads the DB's transaction log).
- When your service commits the DB transaction (business row + outbox row), Debezium streams the outbox change directly into Kafka.
- This is already exactly-once because:
  - The DB transaction log guarantees no duplicates.
  - Kafka's EOS producer (in Debezium) guarantees idempotence.
This avoids the "dual-write" problem entirely.
3. Practical Design (if writing custom relay service)
- Outbox row has: event_id (UUID, PK), payload, status, created_at
- Relay service logic (a condensed sketch follows this list):
  - Poll outbox rows where status='NEW'.
  - Start a Kafka transaction.
  - For each row: producer.send(new ProducerRecord("topic", event_id, payload));
  - Commit the Kafka transaction.
  - After a successful commit, mark those rows as SENT.
- If a crash happens:
  - Rows still marked NEW → safe to retry.
  - Kafka idempotence + unique key ensure no duplicates.
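A condensed Java sketch of such a relay, assuming the outbox schema above, a topic named order.events, and placeholder broker and database connection details; batching, scheduling, and error handling are deliberately simplified:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OutboxRelay {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "outbox-relay-1");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props);
             Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/shop", "app", "secret")) { // placeholder connection details
            producer.initTransactions();
            conn.setAutoCommit(false);

            try (PreparedStatement select = conn.prepareStatement(
                         "SELECT event_id, payload FROM outbox WHERE status = 'NEW' LIMIT 100");
                 PreparedStatement markSent = conn.prepareStatement(
                         "UPDATE outbox SET status = 'SENT' WHERE event_id = ?");
                 ResultSet rs = select.executeQuery()) {

                producer.beginTransaction();
                while (rs.next()) {
                    String eventId = rs.getString("event_id");
                    // event_id doubles as the message key so consumers can deduplicate on it
                    producer.send(new ProducerRecord<>("order.events", eventId, rs.getString("payload")));
                    markSent.setString(1, eventId);
                    markSent.addBatch();
                }
                producer.commitTransaction();

                // Rows are marked SENT only after the Kafka commit succeeds. If the process
                // crashes between the two steps, the rows stay NEW and the batch is re-sent
                // on restart, which producer idempotence and consumer dedup by event_id absorb.
                markSent.executeBatch();
                conn.commit();
            }
        }
    }
}
```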
4. Key Points
- Idempotence in Kafka removes duplicates caused by retries.
- Outbox status column ensures no message is lost.
- CDC (Debezium) is the cleanest and production-proven way.
- If you roll your own relay, you must use idempotent producer + unique event_id.
So, the safest way to ensure EOS from Outbox → Kafka is:
- Use CDC (Debezium Outbox pattern) if possible.
- If using a custom service, use Kafka idempotent producer with unique event_id + mark rows after commit.
Idempotent Writer
A writer produces Events that are written into an Event Stream, and under stable conditions, each Event is recorded only once. However, in the case of an operational failure or a brief network outage, an Event Source may need to retry writes. This may result in multiple copies of the same Event ending up in the Event Stream, as the first write may have actually succeeded even though the producer did not receive the acknowledgement from the Event Streaming Platform. This type of duplication is a common failure scenario in practice and one of the perils of distributed systems.
Solution
Generally speaking, this can be addressed by native support for idempotent clients. This means that a writer may try to produce an Event more than once, but the Event Streaming Platform detects and discards duplicate write attempts for the same Event.
Implementation
To make an Apache Kafka® producer idempotent, configure your producer with the following setting:
enable.idempotence=true
The Kafka producer tags each batch of Events that it sends to the Kafka cluster with a sequence number. Brokers in the cluster use this sequence number to enforce deduplication of Events sent from this specific producer. Each batch's sequence number is persisted so that even if the leader broker fails, the new leader broker will also know if a given batch is a duplicate.
To enable exactly-once processing within an Apache Flink® application that uses Kafka sources and sinks, configure the delivery guarantee to be exactly once, either via the DeliveryGuarantee.EXACTLY_ONCE KafkaSink configuration option if the application uses the DataStream Kafka connector, or by setting the sink.delivery-guarantee configuration option to exactly-once if it uses one of the Table API connectors. Confluent Cloud for Apache Flink provides built-in exactly-once semantics. Downstream of the Flink application, be sure to configure any Kafka consumer with an isolation.level of read_committed since Flink leverages Kafka transactions in the embedded producer to implement exactly-once processing.
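As a rough sketch with the DataStream Kafka connector (topic, broker address, and transactional id prefix are placeholders; exact builder method names may vary slightly across Flink connector versions):

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

public class ExactlyOnceKafkaSink {
    public static KafkaSink<String> build() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092")                 // placeholder broker
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("order.events")                      // placeholder topic
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                // EXACTLY_ONCE makes the sink write through Kafka transactions, so downstream
                // consumers must read with isolation.level=read_committed.
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setTransactionalIdPrefix("flink-order-sink")          // required for EXACTLY_ONCE
                .build();
    }
}
```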
To enable exactly-once processing guarantees in Kafka Streams or ksqlDB, configure the application with the following setting, which includes enabling idempotence in the embedded producer:
processing.guarantee=exactly_once_v2
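A minimal sketch of that configuration in application code (application id and broker address are placeholders):

```java
import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class ExactlyOnceStreamsProps {
    public static Properties build() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-aggregator");  // placeholder application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        // exactly_once_v2 enables the idempotent, transactional producer inside Kafka Streams and
        // commits input offsets, state-store changelogs, and output records atomically.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
        return props;
    }
}
```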
Considerations
Enabling idempotency for a Kafka producer not only ensures that duplicate Events are fenced out from the topic, it also ensures that they are written in order. This is because the brokers accept a batch of Events only if its sequence number is exactly one greater than that of the last committed batch; otherwise, it results in an out-of-sequence error.
Exactly-once semantics (EOS) allow Event Streaming Applications to process data without loss or duplication. This ensures that computed results are always consistent and accurate, even for stateful computations such as joins, aggregations, and windowing. Any solution that requires EOS guarantees must enable EOS at all stages of the pipeline, not just on the writer. An Idempotent Writer is therefore typically combined with an Idempotent Reader and transactional processing.