
Missing endpoints in localtargets.* A records #62

Closed
@ytsarev

Description

Steps to reproduce

  • Deploy two cross-communicating ohmyglb clusters locally
    $ make deploy-full-local-setup
  • Check the generated localtargets.* DNSEndpoint configuration
$ kubectl -n test-gslb get dnsendpoints test-gslb -o yaml
...
spec:
  endpoints:
  - dnsName: localtargets.app3.cloud.example.com
    recordTTL: 30
    recordType: A
    targets:
    - 172.17.0.2
    - 172.17.0.3
    - 172.17.0.4
...
  • Check that CoreDNS returns matching A records
$ dig +short @localhost localtargets.app3.cloud.example.com
172.17.0.2
172.17.0.4
172.17.0.3

This is the expected result. After some time, however, localtargets.* can 'lose' one of the records, as follows:

  • The localtargets.* DNSEndpoint configuration stays consistent
$ kubectl -n test-gslb get dnsendpoints test-gslb -o yaml
...
spec:
  endpoints:
  - dnsName: localtargets.app3.cloud.example.com
    recordTTL: 30
    recordType: A
    targets:
    - 172.17.0.2
    - 172.17.0.3
    - 172.17.0.4
...
  • Meanwhile, the actual DNS response may be missing one of the A records
$ dig +short @localhost localtargets.app3.cloud.example.com
172.17.0.2
172.17.0.4
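
The comparison above can be automated. The following is a minimal sketch, not part of ohmyglb: missing_targets is a hypothetical helper that prints every expected target absent from the DNS answer. The commented kubectl and dig lines show one way the two lists might be collected, assuming the namespace and names used in this report.

```shell
# Hypothetical helper: print each expected IP that is absent from the
# resolved set. Both arguments are whitespace-separated lists of IPs.
missing_targets() {
  for ip in $1; do
    found=no
    for got in $2; do
      if [ "$ip" = "$got" ]; then found=yes; fi
    done
    if [ "$found" = no ]; then echo "$ip"; fi
  done
  return 0
}

# Values taken from this report:
expected="172.17.0.2 172.17.0.3 172.17.0.4"
actual="172.17.0.2 172.17.0.4"
missing_targets "$expected" "$actual"   # prints 172.17.0.3

# Against a live setup the lists could be gathered like this (assumption):
# expected=$(kubectl -n test-gslb get dnsendpoints test-gslb -o \
#   jsonpath='{.spec.endpoints[?(@.dnsName=="localtargets.app3.cloud.example.com")].targets[*]}')
# actual=$(dig +short @localhost localtargets.app3.cloud.example.com)
```

Empty output means the DNS answer contains every target listed in the DNSEndpoint.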

The issue is not deterministic, but we have hit it several times across multiple deployments.
In a two-cluster setup only one cluster is affected, so only 5 of the 6 k8s workers are exposed through CoreDNS.
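
Because the record goes missing only occasionally, polling the resolver repeatedly may be the easiest way to catch it. A rough sketch, with count_short_answers as a hypothetical helper and the poll count and interval chosen arbitrarily:

```shell
# Hypothetical helper: run a resolver command repeatedly and count how
# many answers contain fewer records than expected.
# $1: expected record count, $2: number of polls, $3: seconds between
# polls, remaining arguments: the resolver command to run.
count_short_answers() {
  want=$1; polls=$2; pause=$3
  shift 3
  short=0
  i=0
  while [ "$i" -lt "$polls" ]; do
    # Count non-empty lines in the answer; "|| true" keeps a zero count
    # from being treated as a failure.
    got=$("$@" | grep -c . || true)
    if [ "$got" -lt "$want" ]; then
      short=$((short + 1))
      echo "$(date -u +%FT%TZ) got $got of $want records" >&2
    fi
    i=$((i + 1))
    if [ "$i" -lt "$polls" ]; then sleep "$pause"; fi
  done
  echo "$short"
}

# Example (assumed values: 3 records expected, 20 polls, 30 s apart):
# count_short_answers 3 20 30 dig +short @localhost localtargets.app3.cloud.example.com
```

Each short answer is logged to stderr with a UTC timestamp, which can then be correlated with CoreDNS and etcd logs; the final count goes to stdout.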

DNSEndpoint CR generation always looks correct, so the problem is likely somewhere in the etcd backend of CoreDNS.
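
To look at the etcd side directly, it helps to know which key a DNS name maps to. The CoreDNS etcd plugin uses the SkyDNS layout: the labels of the name in reverse order under a path prefix, /skydns by default. The sketch below assumes that default prefix; skydns_key is a hypothetical helper, and how to reach etcd in this setup is not covered here.

```shell
# Hypothetical helper: map a DNS name to its SkyDNS-style etcd key,
# assuming the default "/skydns" path of the CoreDNS etcd plugin.
# The body runs in a subshell so the IFS and globbing changes stay local.
skydns_key() (
  set -f                  # no pathname expansion while splitting
  IFS=.
  key=""
  for label in ${1%.}; do # ${1%.} drops a trailing dot, if present
    key="/$label$key"     # prepending reverses the label order
  done
  printf '/skydns%s\n' "$key"
)

skydns_key localtargets.app3.cloud.example.com
# prints /skydns/com/example/cloud/app3/localtargets

# With access to the etcd instance (connection details are an assumption),
# the stored keys under that prefix could be listed with:
# etcdctl get --prefix --keys-only "$(skydns_key localtargets.app3.cloud.example.com)"
```

If the number of keys under the prefix is lower than the number of targets in the DNSEndpoint, the record was lost before CoreDNS read it; if all keys are present, the CoreDNS side is the more likely culprit.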

Running make debug-test-etcd can help debug this issue at runtime.


Labels: bug