Streaming CDC: PostgreSQL to ClickHouse via Redpanda Connect
It's common practice to use Debezium as the engine for Change Data Capture (CDC), push the changes into Apache Kafka, and then route them to various destinations via Kafka Connect or Apache Flink.
However, managing these JVM-based components can be challenging and costly, in terms of both hardware resources and human effort.
With the new version of Redpanda Connect and its built-in integration with Timeplus, you can now set up streaming CDC without any Java components, which keeps the pipeline simple, powerful, and cost-efficient.
Redpanda Connect is a declarative data streaming service that solves a wide range of data engineering problems with simple, chained, stateless processing steps. Written in Go, it is simple to deploy and comes with a wide range of connectors. Since version 4.40, it supports streaming data changes from a PostgreSQL database using logical replication, with plans to extend support to MySQL and other databases.
In this tutorial, we will configure Redpanda Connect to load existing data and real-time changes from a PostgreSQL database, send the CDC events to a Timeplus stream, and then write them to a ClickHouse table via a materialized view.
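To make the data flow concrete, here is a minimal conceptual sketch in Python. It is not Redpanda Connect configuration, and it does not reflect how Timeplus or ClickHouse store data internally. It assumes a hypothetical event shape with `op`, `key`, and `row` fields, and simply shows how a snapshot of existing rows followed by real-time changes can be replayed, in order, into a keyed target table. The function name `apply_cdc_events` and the sample rows are illustrative only.

```python
from typing import Any, Iterable

# Hypothetical event shape, for illustration only:
# {"op": "read" | "insert" | "update" | "delete", "key": ..., "row": {...} or None}
Event = dict[str, Any]


def apply_cdc_events(events: Iterable[Event]) -> dict[Any, dict[str, Any]]:
    """Replay CDC events in order and return the resulting table state."""
    table: dict[Any, dict[str, Any]] = {}
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("read", "insert", "update"):
            # Snapshot reads, inserts, and updates all upsert the latest row.
            table[key] = event["row"]
        elif op == "delete":
            # Deletes remove the row if it exists.
            table.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return table


# Existing rows arrive first as a snapshot, followed by real-time changes.
snapshot = [
    {"op": "read", "key": 1, "row": {"id": 1, "name": "alice"}},
    {"op": "read", "key": 2, "row": {"id": 2, "name": "bob"}},
]
changes = [
    {"op": "update", "key": 1, "row": {"id": 1, "name": "alicia"}},
    {"op": "delete", "key": 2, "row": None},
    {"op": "insert", "key": 3, "row": {"id": 3, "name": "carol"}},
]

final_state = apply_cdc_events(snapshot + changes)
```

Because the snapshot and the live changes share one ordered sequence, the target ends up consistent with the source as long as events are applied in the order they were captured.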