Description
What did you do?
We use TiCDC 6.5.2 to sync data into our Kafka cluster. The Kafka cluster has approximately 4000 topics, and we created more than 50 TiCDC jobs.
What did you expect to see?
All components run normally.
What did you see instead?
The latency of the Kafka controller (including both produce and consume latency) increased immediately after we started the jobs.
We checked the authorizer log on the Kafka controller node and found a huge number of Topic Describe requests.
After more experiments, we found that every TiCDC job tried to describe all the topics in the Kafka cluster every 5 seconds, which overloaded the controller.
After checking the source code, we found an unnecessary operation that sinkv2 performs when it generates metrics:
m.updateBrokers()
This call is meant to fetch broker info, but it also triggers unnecessary Describe requests for all topics.
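To give a rough sense of the load this produces, here is a back-of-the-envelope sketch in Go. It assumes that each job's refresh results in one Describe check per topic in the authorizer log, which is our reading of the behavior and not something confirmed from the Kafka or TiCDC source. The function name `describeChecksPerSecond` is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// describeChecksPerSecond estimates how many per-topic Describe checks reach
// the cluster each second, ASSUMING every job describes every topic once per
// interval (an assumption based on the behavior reported in this issue).
func describeChecksPerSecond(jobs, topics, intervalSeconds int) float64 {
	if intervalSeconds <= 0 {
		// A non-positive interval is meaningless here; report zero load.
		return 0
	}
	return float64(jobs) * float64(topics) / float64(intervalSeconds)
}

func main() {
	// Figures from this report: 50 jobs (we run "50+"), about 4000 topics,
	// and a 5-second refresh interval.
	fmt.Printf("%.0f per-topic Describe checks per second\n",
		describeChecksPerSecond(50, 4000, 5))
	// prints "40000 per-topic Describe checks per second"
}
```

Under these assumptions the load grows with the product of job count and topic count, so it would be consistent with the controller latency rising as soon as the jobs were started.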
Versions of the cluster
Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):
(paste TiDB cluster version here)
Upstream TiKV version (execute tikv-server --version):
(paste TiKV version here)
TiCDC version (execute cdc version):
v6.5.2