
Implement failover load balancing strategy #46

Closed
@donovanmuller

Description

As per the supported load balancing strategies in the initial design, a failover strategy should be implemented to ensure the stated guarantees:

Failover - Pinned to a specified primary cluster until that cluster has no available Pods, at which point the next available cluster's Ingress node IPs will be resolved. When Pods are again available on the primary cluster, it will once again be the only eligible cluster for which Ingress node IPs will be resolved.
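
For illustration, here is a minimal Go sketch of what this resolution logic could look like. Everything in it (Cluster, ResolveFailover, the Healthy flag) is hypothetical, not ohmyglb's actual API:

package strategy

// Cluster is a hypothetical view of one member cluster: its name, the
// Ingress node IPs it exposes, and whether the backing Deployment
// currently has at least one healthy Pod.
type Cluster struct {
	Name       string
	IngressIPs []string
	Healthy    bool
}

// ResolveFailover returns the Ingress node IPs to serve for a Gslb host.
// While the primary cluster has healthy Pods, it is the only eligible
// cluster; otherwise the first available other cluster is used.
func ResolveFailover(primary string, clusters []Cluster) []string {
	for _, c := range clusters {
		if c.Name == primary && c.Healthy {
			return c.IngressIPs // pinned to the primary
		}
	}
	for _, c := range clusters {
		if c.Name != primary && c.Healthy {
			return c.IngressIPs // fail over to the next available cluster
		}
	}
	return nil // no healthy cluster; nothing to resolve
}

Because the sketch is stateless, failback is automatic: as soon as the primary reports healthy Pods again, the first loop wins and only the primary's IPs are resolved.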

Scenario 1:

  • Given 2 separate Kubernetes clusters, X and Y
  • Each cluster has a healthy Deployment with a backend Service called app, and that backend Service is exposed with a Gslb resource on both clusters as:
apiVersion: ohmyglb.absa.oss/v1beta1
kind: Gslb
metadata:
  name: app-gslb
  namespace: test-gslb
spec:
  ingress:
    rules:
      - host: app.cloud.example.com
        http:
          paths:
            - backend:
                serviceName: app
                servicePort: http
              path: /
  strategy: failover
  primary: cluster-x
  • Each cluster has one worker node that accepts Ingress traffic. The worker node in each cluster has the following name and IP:
cluster-x-worker-1: 10.0.1.10
cluster-y-worker-1: 10.1.1.11

When issuing the following command, curl -v http://app.cloud.example.com, I would expect the resolved IPs to be as follows (if the command were executed 3 times consecutively):

$ curl -v http://app.cloud.example.com # execution 1
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 2
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 3
*   Trying 10.0.1.10...
...

The resolved node IPs to which ingress traffic is sent should be "pinned" to the primary cluster named explicitly in the Gslb resource above: even though there is a healthy Deployment in cluster Y, the Ingress node IPs for cluster Y would not be resolved.

Scenario 2:

  • Same configuration as Scenario 1, except that the Deployment only has healthy Pods on one cluster, cluster Y; i.e. the Deployment on cluster X has no healthy Pods.

When issuing the following command, curl -v http://app.cloud.example.com, I would expect the resolved IPs to be as follows (if the command were executed 3 times consecutively):

$ curl -v http://app.cloud.example.com # execution 1
*   Trying 10.1.1.11...
...

$ curl -v http://app.cloud.example.com # execution 2
*   Trying 10.1.1.11...
...

$ curl -v http://app.cloud.example.com # execution 3
*   Trying 10.1.1.11...
...

In this scenario, only the Ingress node IPs for cluster Y are resolved, given that there is no healthy Deployment for the Gslb host on the primary cluster, cluster X. The "failover" cluster(s) are therefore resolved instead (cluster Y in this scenario).

Now, given that the Deployment on cluster X (the primary cluster) becomes healthy once again, I would expect the resolved IPs to be as follows (if the command were executed 2 times consecutively):

$ curl -v http://app.cloud.example.com # execution 1
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 2
*   Trying 10.0.1.10...
...

The primary cluster's Ingress node IPs are now resolved exclusively once again.
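
Both scenarios could be pinned down as a table-driven test against the hypothetical ResolveFailover sketch above (failback is just the first case re-run, since the sketch is stateless):

package strategy

import (
	"reflect"
	"testing"
)

func TestResolveFailover(t *testing.T) {
	tests := []struct {
		name     string
		xHealthy bool
		want     []string
	}{
		{"scenario 1: healthy primary stays pinned", true, []string{"10.0.1.10"}},
		{"scenario 2: unhealthy primary fails over", false, []string{"10.1.1.11"}},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			clusters := []Cluster{
				{Name: "cluster-x", IngressIPs: []string{"10.0.1.10"}, Healthy: tt.xHealthy},
				{Name: "cluster-y", IngressIPs: []string{"10.1.1.11"}, Healthy: true},
			}
			if got := ResolveFailover("cluster-x", clusters); !reflect.DeepEqual(got, tt.want) {
				t.Errorf("got %v, want %v", got, tt.want)
			}
		})
	}
}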

NOTE:

  • The specification for indicating the primary cluster, as described in this issue, is solely for the purpose of describing the scenario. It should not be considered a design proposal.
  • The existence of multiple "secondary" failover clusters should also be considered. For example, if there were 3 clusters (X, Y and Z) in Scenario 2 above, could the Ingress node IPs for both secondary clusters (Y and Z) be resolved? If so, how (in terms of "load balancing") would the Ingress node IPs across those secondary/failover clusters be resolved: would they use the default round robin strategy, if any strategy at all? One option is sketched below.
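
To make that question concrete, one possible answer, sketched under the same hypothetical types as above, is to pool the Ingress node IPs of all healthy secondaries and let the DNS server round-robin the returned A records:

// ResolveFailoverAll is a hypothetical variant for the multi-secondary
// question: when the primary is unhealthy, the Ingress node IPs of all
// healthy secondaries are pooled, leaving the "load balancing" across
// them to ordinary DNS round robin over the returned records.
func ResolveFailoverAll(primary string, clusters []Cluster) []string {
	for _, c := range clusters {
		if c.Name == primary && c.Healthy {
			return c.IngressIPs // primary healthy: stay pinned
		}
	}
	var ips []string
	for _, c := range clusters {
		if c.Name != primary && c.Healthy {
			ips = append(ips, c.IngressIPs...) // pool all healthy secondaries
		}
	}
	return ips
}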
