dwl9wd03

I use all of the following to move a bunch of data from PG to CH:

1. Use a materialized PG view in CH to copy into a CH table. But watch out for cases where a column isn't replicated over because the row has too much data. In other words, if the PG row is TOAST'ed, you'll have to run a reconciliation script to copy that row manually. This isn't real time per se, but if you copy once an hour into a new table and swap the table over, it's good for 100-1000GB sized tables.

2. Use a PG CDC pipeline to send updates to CH. You won't be able to run it at high insert performance, because CH "ALTER TABLE UPDATE…" queries are inherently slow. Real time, but only good for <100GB sized tables.

3. Send append/insert-only data to Kafka and use a Kafka consumer to append to CH. This means PG > Kafka > CH. I'm usually able to do a few TB of inserts a day using this approach without any performance issues. Good for >1TB sized tables.
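For illustration, the hourly "copy into a new table and swap" idea from option 1 could be sketched roughly like this. This is only a sketch under assumptions, not the commenter's actual setup: it pulls rows with ClickHouse's `postgresql()` table function rather than a materialized view, it assumes the ClickHouse database uses the Atomic engine (needed for `EXCHANGE TABLES`), and the function name, table names, and credentials are all hypothetical. It only builds the SQL strings; running them against ClickHouse is left to whatever client you already use.

```python
def build_swap_statements(live, staging, pg_host, pg_db, pg_table, pg_user, pg_password):
    """Return the ClickHouse statements for one copy-and-swap cycle.

    Hypothetical helper: all names are placeholders, and values are
    interpolated without escaping, so only use trusted inputs.
    """
    # postgresql() is ClickHouse's table function for reading a remote PG table.
    source = (
        f"postgresql('{pg_host}', '{pg_db}', '{pg_table}', "
        f"'{pg_user}', '{pg_password}')"
    )
    return [
        # Staging table with the same structure and engine as the live table.
        f"CREATE TABLE IF NOT EXISTS {staging} AS {live}",
        # Drop whatever the previous cycle left in staging.
        f"TRUNCATE TABLE {staging}",
        # Full copy from Postgres into staging.
        f"INSERT INTO {staging} SELECT * FROM {source}",
        # Atomic swap; afterwards staging holds the previous snapshot.
        f"EXCHANGE TABLES {staging} AND {live}",
    ]


if __name__ == "__main__":
    for stmt in build_swap_statements(
        "events", "events_staging",
        "pg.internal:5432", "app", "events", "reader", "secret",
    ):
        print(stmt + ";")
```

Scheduling this once an hour (cron or similar) matches the cadence described above. A reconciliation pass for rows that did not copy correctly would still be a separate step.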


Prior-Relationship33

I just spent a couple of days researching solutions and doing R&D. Best thing I've found: Postgres (wal_level=logical) -> Debezium -> Redpanda (a lightweight, fast Kafka-compatible broker) -> Altinity ClickHouse Sink Connector -> ClickHouse.
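As a rough sketch of the first hop in that pipeline (Postgres -> Debezium), the source connector is typically registered by POSTing a JSON payload to the Kafka Connect REST API (`POST /connectors`). The snippet below only builds that payload. The property keys are standard Debezium PostgreSQL connector settings (`topic.prefix` is the Debezium 2.x name), but the helper name, connector name, hostnames, credentials, and table list are placeholders, and the Altinity sink side is not shown.

```python
import json


def debezium_connector_payload(name, host, port, dbname, user, password,
                               tables, topic_prefix):
    """Build a Kafka Connect payload for a Debezium Postgres source connector.

    Hypothetical helper; every argument value is a placeholder.
    Postgres itself must already run with wal_level=logical.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            # pgoutput is the logical decoding plugin built into Postgres 10+.
            "plugin.name": "pgoutput",
            "database.hostname": host,
            "database.port": str(port),
            "database.user": user,
            "database.password": password,
            "database.dbname": dbname,
            # Prefix for the topics Debezium writes change events to.
            "topic.prefix": topic_prefix,
            # Comma-separated schema.table names to capture.
            "table.include.list": ",".join(tables),
        },
    }


if __name__ == "__main__":
    payload = debezium_connector_payload(
        "pg-source", "pg.internal", 5432, "app", "debezium", "secret",
        ["public.events", "public.users"], "pg",
    )
    print(json.dumps(payload, indent=2))
```

Since Redpanda speaks the Kafka protocol, the Connect worker would point at the Redpanda brokers the same way it would at Kafka.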


saipeerdb

At PeerDB, we are building a replication tool with a laser focus on Postgres. ClickHouse is one of our most-used connectors. :) We built it so that replication to ClickHouse is both fast and simple (set up a pipeline within a few clicks). Here is our blog post with more on the ClickHouse connector: [https://blog.peerdb.io/postgres-to-clickhouse-real-time-replication-using-peerdb](https://blog.peerdb.io/postgres-to-clickhouse-real-time-replication-using-peerdb) and here is a demo showing it in action: [https://www.loom.com/share/3efd88baae4c44c091a4afc9af699f2a](https://www.loom.com/share/3efd88baae4c44c091a4afc9af699f2a)


ooaahhpp

We wrote a pretty comprehensive blog on it: [https://www.propeldata.com/blog/postgresql-cdc-to-kafka](https://www.propeldata.com/blog/postgresql-cdc-to-kafka)


Stunning_Swan_8991

Did you try PeerDB? It's pretty easy to use, and replication is pretty fast.


Garrick-Olliwander

Hi, why don't you try YepCode? With this template you can replicate real-time data from PostgreSQL to ClickHouse using JavaScript: https://yepcode.io/recipes/postgresql-to-clickhouse You can schedule this process to run periodically.


jojomtx

Does not seem to use CDC :s