Replicator
15 Things You Should Know About Replicator
- Replicator enables hybrid cloud architectures
- Replicator supports out-of-the-box configuration with Confluent Control Center
- Confluent Control Center enables Replicator monitoring
- Replicator can migrate schemas between Confluent Schema Registry clusters
- Replicator supports Role-Based Access Control (RBAC)
- Replicator helps applications recover after a disaster
- Replicator can aggregate messages from multiple clusters
- Replicator supports topic renaming during replication
- Replicator prevents cyclic replication
- Replicator can run as an executable
- Replicator can start from any position within the source partitions
- Replicator supports complex topic replication requirements
- Replicator synchronizes topic configuration
- Replicator supports Single Message Transforms (SMTs) - This feature was deprecated as of CP 7.5
- Replicator verification tool
Execution modes
1. Executable Mode (Standalone CLI)
This is what you used in our session. You run a single binary command (replicator) that spins up its own internal Kafka Connect worker.
- How it works: It reads your properties files and launches a "hidden" Connect cluster consisting of just one node.
- Best For: Quick migrations, development testing, or "Bridge" scenarios where you don't want to manage a full Connect cluster.
- Default Port: 8083. If you have another Connect instance running, this mode will fail to bind to the port.
replicator --consumer.config source.properties --producer.config target.properties --replication.config replication.properties
2. Connector Mode (Plugin / Distributed)
In this mode, you treat Replicator as just another connector (like a JDBC or S3 connector) and install it into an existing Kafka Connect cluster.
- How it works: You download the Replicator JARs/ZIP from Confluent Hub, put them in your Connect
plugin.path, and restart Connect. You then "submit" a JSON configuration to the Connect REST API. - Best For: Production environments. It allows Replicator to scale across multiple Connect workers and be managed via Control Center.
- Key Advantage: High Availability. If one Connect worker dies, another worker in the cluster picks up the replication task.
curl -X POST -H "Content-Type: application/json" --data @replicator.json http://localhost:8083/connectors
3. Docker / Container Mode
Confluent provides a specific Docker image for Replicator (cp-enterprise-replicator). This is essentially an automated version of Mode 1.
- How it works: You pass your configurations as Environment Variables or mount them as volumes.
- Best For: Kubernetes (k8s) or automated CI/CD pipelines.
- Configuration Logic: Environment variables follow a specific naming convention (e.g.,
REPLICATOR_BOOTSTRAP_SERVERS).
Comparison Matrix
| Feature | Executable (CLI) | Connector (Plugin) | Docker |
|---|---|---|---|
| Ease of Setup | High (Single command) | Medium (Needs Connect setup) | High (If using k8s/Compose) |
| Scalability | Vertical (One process) | Horizontal (Cluster-wide) | Vertical/Horizontal |
| Management | Terminal / CLI | Control Center / REST API | Container Orchestrator |
| Isolation | Completely isolated | Shares Connect resources | Isolated Container |
| Reliability | Manual restart needed | Auto-failover | Auto-restart by Orchestrator |
Provenance Headers
Confluent Replicator uses provenance headers to avoid cyclic message replication. When provenance headers are enabled (provenance.header.enable=true), Replicator adds a header to each replicated message that records the origin cluster, topic, and timestamp. This allows Replicator to detect if a message has already been replicated to a cluster, and thus prevents it from being replicated again, which would otherwise cause cycles or infinite loops in active-active (bi-directional) replication setups