Description
Steps to reproduce
- Deploy two cross-communicating ohmyglb clusters locally
$ make deploy-full-local-setup
- Check the generated localtargets.* dnsendpoint config
$ kubectl -n test-gslb get dnsendpoints test-gslb -o yaml
...
spec:
endpoints:
- dnsName: localtargets.app3.cloud.example.com
recordTTL: 30
recordType: A
targets:
- 172.17.0.2
- 172.17.0.3
- 172.17.0.4
...
- Check if coredns returns matching A records
$ dig +short @localhost localtargets.app3.cloud.example.com
172.17.0.2
172.17.0.4
172.17.0.3
This is the expected result. After some time, however, localtargets.* can 'lose' one of the records in the following way:
- The localtargets.* dnsendpoint config is always consistent
$ kubectl -n test-gslb get dnsendpoints test-gslb -o yaml
...
spec:
endpoints:
- dnsName: localtargets.app3.cloud.example.com
recordTTL: 30
recordType: A
targets:
- 172.17.0.2
- 172.17.0.3
- 172.17.0.4
...
- Meanwhile the actual DNS response may be missing one of the A records
$ dig +short @localhost localtargets.app3.cloud.example.com
172.17.0.2
172.17.0.4
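A quick way to confirm the discrepancy is to diff the live DNS answer against the targets stored in the DNSEndpoint CR. This is only a minimal sketch, assuming the record name and test-gslb namespace from the steps above:
$ diff \
    <(dig +short @localhost localtargets.app3.cloud.example.com | sort) \
    <(kubectl -n test-gslb get dnsendpoints test-gslb \
        -o jsonpath='{.spec.endpoints[?(@.dnsName=="localtargets.app3.cloud.example.com")].targets[*]}' \
      | tr ' ' '\n' | sort)
Any line prefixed with ">" in the diff output is a target that the DNSEndpoint CR lists but coredns did not return.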
The issue is not deterministic in its behaviour, but we have faced it several times over multiple deployments.
In a 2-cluster setup only a single cluster is affected, which effectively means that only 5 out of 6 k8s workers are exposed through coredns.
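Because the failure only shows up after some time, it can help to periodically log the A records that coredns returns and note when one disappears. A rough sketch, assuming the same record name as above:
$ while true; do date +%T; dig +short @localhost localtargets.app3.cloud.example.com | sort; echo ---; sleep 30; done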
DNSEndpoint CR generation always looks correct, so the problem is somewhere in the etcd coredns backend area.
make debug-test-etcd can help with debugging this issue at runtime.
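If it is indeed the etcd backend, inspecting the keys coredns reads can narrow it down. The following is only a sketch: it assumes the coredns etcd plugin is used with its default /skydns path prefix and that etcdctl can reach the backing etcd (e.g. via make debug-test-etcd or a port-forward):
$ ETCDCTL_API=3 etcdctl get --prefix /skydns/com/example/cloud/app3/localtargets/
If fewer host entries show up here than targets in the DNSEndpoint CR, the records are being dropped on the way into etcd rather than by coredns itself.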