【Envoy-04】Envoy xDS Dynamic Configuration and Control Plane Interactions

Posted by Hao Liang's Blog on Sunday, December 24, 2023

1. Interacting with Control Plane

What’s a control plane

To manage all these configuration files in a central place, we need to introduce a control plane. The control plane propagates the network configuration to the data plane.

Why is it useful

Envoy subscribes to the control plane for configuration updates. Whenever a cluster changes or routes and listeners are added, the control plane sends those updates to Envoy, which applies the new configuration dynamically without restarting. Operationally, this makes configuration easier to manage because it all lives in one place. When Envoy starts or restarts, it loads its configuration from the control plane dynamically instead of relying on extra config files copied onto the node.

2. xDS

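  • Envoy’s versioned configuration
  • Various specific discovery services (DS): CDS, EDS, LDS, RDS…
  • Various transport methods: files, REST, gRPC
  • Eventual consistency vs ADS (Aggregated Discovery Service): separate xDS streams are only eventually consistent with each other, while ADS carries all resource types on a single stream so the control plane can order updates and keep the configuration consistent.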

How xDS works

  • Envoy sends a DiscoveryRequest to the control plane.
  • The control plane responds with a DiscoveryResponse containing the resources Envoy needs to apply.
  • Envoy validates the response and decides whether to accept it (ACK) or reject it (negative acknowledgement, NACK), and reports the outcome in its next DiscoveryRequest.
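
The request/response/ACK cycle above can be sketched in a few lines of Go. This is only an illustration: the struct types below are simplified, hypothetical stand-ins for the real xDS protobuf messages, and validate is an invented check, not Envoy's actual validation.

```go
package main

import (
	"errors"
	"fmt"
)

// DiscoveryRequest and DiscoveryResponse are simplified, hypothetical
// stand-ins for the real xDS protobuf messages.
type DiscoveryRequest struct {
	VersionInfo   string // last version the client accepted
	ResponseNonce string // nonce of the response being ACKed or NACKed
	ErrorDetail   string // non-empty means NACK
}

type DiscoveryResponse struct {
	VersionInfo string
	Nonce       string
	Resources   []string
}

// client remembers the last accepted version for one resource type.
type client struct {
	accepted string
}

// validate is an invented check for this sketch: reject empty resource names.
func validate(resources []string) error {
	for _, r := range resources {
		if r == "" {
			return errors.New("resource with empty name")
		}
	}
	return nil
}

// handle applies a response and builds the follow-up request (ACK or NACK).
func (c *client) handle(resp DiscoveryResponse) DiscoveryRequest {
	req := DiscoveryRequest{ResponseNonce: resp.Nonce}
	if err := validate(resp.Resources); err != nil {
		// NACK: keep the previously accepted version and report the error.
		req.VersionInfo = c.accepted
		req.ErrorDetail = err.Error()
		return req
	}
	// ACK: adopt the new version.
	c.accepted = resp.VersionInfo
	req.VersionInfo = c.accepted
	return req
}

func main() {
	c := &client{}
	ack := c.handle(DiscoveryResponse{VersionInfo: "1", Nonce: "a", Resources: []string{"cluster_a"}})
	nack := c.handle(DiscoveryResponse{VersionInfo: "2", Nonce: "b", Resources: []string{""}})
	fmt.Printf("ack:  version=%q error=%q\n", ack.VersionInfo, ack.ErrorDetail)
	fmt.Printf("nack: version=%q error=%q\n", nack.VersionInfo, nack.ErrorDetail)
}
```

After the NACK the client still reports version "1", which is how the control plane can tell that the second update was not applied.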

SotW vs Delta configuration update policy

Envoy supports 2 ways for the control plane to propagate configuration:

  • State of the World (SotW): every control plane response carries the entire set of resources of that type (e.g. whenever one Cluster changes, Envoy receives the full list of Clusters from the control plane).
  • Delta: only the resources that were added, changed, or removed are sent to Envoy.
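
To make the difference concrete, here is a minimal sketch that computes what each policy would put on the wire when one cluster out of three changes. The maps and the function names sotwUpdate and deltaUpdate are hypothetical and only model resource names and versions, not real xDS messages.

```go
package main

import (
	"fmt"
	"sort"
)

// sotwUpdate models a State-of-the-World response: every resource in the
// current state is sent, no matter which one changed.
func sotwUpdate(current map[string]string) []string {
	names := make([]string, 0, len(current))
	for name := range current {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// deltaUpdate models a Delta response: only added or changed resources
// are sent, plus the names of resources that were removed.
func deltaUpdate(previous, current map[string]string) (changed, removed []string) {
	for name, cfg := range current {
		if old, ok := previous[name]; !ok || old != cfg {
			changed = append(changed, name)
		}
	}
	for name := range previous {
		if _, ok := current[name]; !ok {
			removed = append(removed, name)
		}
	}
	sort.Strings(changed)
	sort.Strings(removed)
	return changed, removed
}

func main() {
	previous := map[string]string{"cluster_a": "v1", "cluster_b": "v1", "cluster_c": "v1"}
	current := map[string]string{"cluster_a": "v1", "cluster_b": "v2", "cluster_c": "v1"}

	fmt.Println("SotW sends:", sotwUpdate(current))
	changed, removed := deltaUpdate(previous, current)
	fmt.Println("Delta sends:", changed, "removed:", removed)
}
```

With many resources and frequent small changes, the delta form keeps each update proportional to what changed rather than to the total state.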

3. Demo

Here is an example of changing the route configuration dynamically in a canary scenario.

xDS yaml demo

# https://github.com/solo-io/hoot/blob/master/04-xds/xds.yaml
node:
  id: node-1
  cluster: edge-gateway

admin:
  access_log_path: /dev/stdout
  address:
    socket_address: { address: 127.0.0.1, port_value: 9901 }

layered_runtime:
  # use runtime to limit number of connections to prevent file descriptor exhaustion
  layers:
    - name: static_layer_0
      static_layer:
        overload:
          global_downstream_max_connections: 100

dynamic_resources:
  ads_config:
    # allows limiting the rate of discovery requests.
    # for edge cases with very frequent requests or due to a bug.
    rate_limit_settings:
      max_tokens: 10
      fill_rate: 3
    # we use v3 xDS framing
    transport_api_version: V3
    # over gRPC
    api_type: GRPC
    grpc_services:
      - envoy_grpc:
          cluster_name: xds_cluster
  # Use ADS for LDS and CDS; request V3 clusters and listeners.
  lds_config: {ads: {}, resource_api_version: V3}
  cds_config: {ads: {}, resource_api_version: V3}

static_resources:
  clusters:
  - name: xds_cluster
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    # as we are using gRPC xDS we need to set the cluster to use http2
    http2_protocol_options: {}
    upstream_connection_options:
      # important:
      # configure a TCP keep-alive to detect and reconnect to the admin
      # server in the event of a TCP socket half open connection
      # the default values are very conservative, so you will want to tune them.
      tcp_keepalive: {}
    load_assignment:
      cluster_name: xds_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 9977
  • node.id identifies which node is sending the xDS request (it tells the control plane which Envoy this is).
  • dynamic_resources.lds_config and dynamic_resources.cds_config both use ADS, so listeners and clusters are delivered over the ADS stream.
  • dynamic_resources.ads_config.rate_limit_settings limits the rate at which discovery requests can be sent to the control plane; it is set to prevent excessive CPU usage.
  • static_resources.clusters defines the control plane server as a static cluster named xds_cluster (127.0.0.1:9977).
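
The rate_limit_settings values describe a token bucket: max_tokens is the bucket size and fill_rate is the refill rate in tokens per second. The sketch below is a simplified model of that idea with the values from xds.yaml; the tokenBucket type is hypothetical and is not Envoy's implementation.

```go
package main

import "fmt"

// tokenBucket is a simplified, hypothetical model of a token bucket.
type tokenBucket struct {
	maxTokens float64 // bucket capacity (max_tokens)
	fillRate  float64 // tokens added per second (fill_rate)
	tokens    float64 // tokens currently available
}

// newTokenBucket starts with a full bucket.
func newTokenBucket(maxTokens, fillRate float64) *tokenBucket {
	return &tokenBucket{maxTokens: maxTokens, fillRate: fillRate, tokens: maxTokens}
}

// advance refills the bucket for the elapsed seconds, capped at maxTokens.
func (b *tokenBucket) advance(seconds float64) {
	b.tokens += seconds * b.fillRate
	if b.tokens > b.maxTokens {
		b.tokens = b.maxTokens
	}
}

// allow consumes one token if one is available.
func (b *tokenBucket) allow() bool {
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// burst tries to send n requests at once and returns how many were allowed.
func burst(b *tokenBucket, n int) int {
	sent := 0
	for i := 0; i < n; i++ {
		if b.allow() {
			sent++
		}
	}
	return sent
}

func main() {
	b := newTokenBucket(10, 3) // max_tokens: 10, fill_rate: 3
	fmt.Println("burst of 15 requests, allowed:", burst(b, 15))
	b.advance(1) // one second passes
	fmt.Println("one second later, allowed:", burst(b, 15))
}
```

In this model a burst drains the 10 available tokens, after which requests are limited to roughly 3 per second.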

xDS server demo

A simple implementation of an xDS server using envoyproxy/go-control-plane

Refer to: 04-xds/xds

const (
	// don't use dots in resource name
	ClusterName1  = "cluster_a"
	ClusterName2  = "cluster_b"
	RouteName     = "local_route"
	ListenerName  = "listener_0"
	ListenerPort  = 10000
	UpstreamHost  = "127.0.0.1"
	UpstreamPort1 = 8082
	UpstreamPort2 = 8083

	xdsPort                  = 9977
	grpcMaxConcurrentStreams = 1000000
)
func makeRoute(routeName string, weight uint32, clusterName1, clusterName2 string) *route.RouteConfiguration {
	routeConfiguration := &route.RouteConfiguration{
		Name: routeName,
		VirtualHosts: []*route.VirtualHost{{
			Name:    "local_service",
			Domains: []string{"*"},
		}},
	}
	switch weight {
	case 0:
		// update route to clusterName1
		routeConfiguration.VirtualHosts[0].Routes = []*route.Route{{
			Match: &route.RouteMatch{
				PathSpecifier: &route.RouteMatch_Prefix{
					Prefix: "/",
				},
			},
			Action: &route.Route_Route{
				Route: &route.RouteAction{
					ClusterSpecifier: &route.RouteAction_Cluster{
						Cluster: clusterName1,
					},
					HostRewriteSpecifier: &route.RouteAction_HostRewriteLiteral{
						HostRewriteLiteral: UpstreamHost,
					},
				},
			},
		}}

	case 100:
		// update route to clusterName2
		routeConfiguration.VirtualHosts[0].Routes = []*route.Route{{
			Match: &route.RouteMatch{
				PathSpecifier: &route.RouteMatch_Prefix{
					Prefix: "/",
				},
			},
			Action: &route.Route_Route{
				Route: &route.RouteAction{
					ClusterSpecifier: &route.RouteAction_Cluster{
						Cluster: clusterName2,
					},
					HostRewriteSpecifier: &route.RouteAction_HostRewriteLiteral{
						HostRewriteLiteral: UpstreamHost,
					},
				},
			},
		}}
	default:
		// canary rollout: split traffic between the two clusters by weight
		// (assumes 0 < weight < 100; a larger value would wrap the unsigned 100 - weight)
		routeConfiguration.VirtualHosts[0].Routes = []*route.Route{{
			Match: &route.RouteMatch{
				PathSpecifier: &route.RouteMatch_Prefix{
					Prefix: "/",
				},
			},
			Action: &route.Route_Route{
				Route: &route.RouteAction{
					ClusterSpecifier: &route.RouteAction_WeightedClusters{
						WeightedClusters: &route.WeightedCluster{
							TotalWeight: &wrapperspb.UInt32Value{
								Value: 100,
							},
							Clusters: []*route.WeightedCluster_ClusterWeight{
								{
									Name: clusterName1,
									Weight: &wrapperspb.UInt32Value{
										Value: 100 - weight,
									},
								},
								{
									Name: clusterName2,
									Weight: &wrapperspb.UInt32Value{
										Value: weight,
									},
								},
							},
						},
					},
					HostRewriteSpecifier: &route.RouteAction_HostRewriteLiteral{
						HostRewriteLiteral: UpstreamHost,
					},
				},
			},
		}}

	}
	return routeConfiguration
}
  • For a canary rollout of a route update, Envoy sends an xDS request to the xDS server and waits for configuration updates.
  • The xDS server generates the route configuration from the weight assigned to each cluster.
  • The xDS server responds to Envoy with the new route configuration.
  • Envoy applies the new route configuration dynamically.
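
To see what the weighted route means for individual requests, the following sketch simulates a weighted choice between the two clusters. It is an approximation for illustration: the weightedCluster type and pick function are hypothetical, and walking cumulative weights over a random value is an assumed model rather than Envoy's actual selection code.

```go
package main

import (
	"fmt"
	"math/rand"
)

// weightedCluster is a hypothetical, simplified view of one weighted entry.
type weightedCluster struct {
	name   string
	weight uint32
}

// pick maps a value in [0, total weight) to a cluster by walking the
// cumulative weights in order.
func pick(clusters []weightedCluster, value uint32) string {
	var cumulative uint32
	for _, c := range clusters {
		cumulative += c.weight
		if value < cumulative {
			return c.name
		}
	}
	// Only reached if value is outside the total weight.
	return clusters[len(clusters)-1].name
}

func main() {
	const weight = 5 // percentage of traffic sent to cluster_b
	clusters := []weightedCluster{
		{name: "cluster_a", weight: 100 - weight},
		{name: "cluster_b", weight: weight},
	}

	rng := rand.New(rand.NewSource(1))
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pick(clusters, uint32(rng.Intn(100)))]++
	}
	fmt.Println(counts)
}
```

Over many requests the counts should land close to a 95/5 split, which is what the curl experiment in the demo shows on a small scale.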

Application server demo

Refer to: 04-xds/server.go

// https://github.com/solo-io/hoot/blob/master/04-xds/server.go
func redServer(rw http.ResponseWriter, r *http.Request) {
    fmt.Fprintln(rw, "red")
}
func blueServer(rw http.ResponseWriter, r *http.Request) {
    rw.WriteHeader(http.StatusAccepted)
    fmt.Fprintln(rw, "blue")
}

func main() {
	l, err := net.Listen("tcp", ":8082")
	if err != nil {
		log.Fatal("listen error:", err)
	}
	l2, err := net.Listen("tcp", ":8083")
	if err != nil {
		log.Fatal("listen error:", err)
	}
	go http.Serve(l, http.HandlerFunc(redServer))
	go http.Serve(l2, http.HandlerFunc(blueServer))

	select {}
}
  • server.go starts 2 servers listening on 2 different ports: 8082 and 8083.
  • The server on listener l (port 8082) is handled by redServer(), which returns status code 200 and the message red.
  • The server on listener l2 (port 8083) is handled by blueServer(), which returns status code 202 and the message blue.

Running the Demo

  • Run the xDS server as control plane
# running the xds server as a control plane
go run xds.go
  • Run the application server and Envoy as the data plane
# run an application server that serves 2 cluster endpoints for route switching
go run server.go
# run Envoy with the xds.yaml configuration
envoy -c xds.yaml
  • Send a curl request to the application server through Envoy
# request Envoy on port 10000; Envoy proxies the request to the application server
curl localhost:10000
# always returns red with a 200 status code because all the traffic goes to cluster-a
red
  • Use the interactive prompt of the xds.go control plane to set a weight of 5% for cluster-b
Enter weight for cluster-b: 5
setting weight to 5
publishing version: snapshow-2
  • Send a curl request to the application server through Envoy again
# request Envoy on port 10000; Envoy proxies the request to the application server
curl localhost:10000
# 95% chance of returning red with a 200 status code
red
# 5% chance of returning blue with a 202 status code
blue
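
A single curl call only shows one outcome, so it can help to send many requests and count the answers. The small client below is a sketch for that purpose; it assumes Envoy is listening on localhost:10000 as in the demo, and the helper names sample and tally as well as the request count of 200 are arbitrary choices for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// tally counts how often each (whitespace-trimmed) response body occurs.
func tally(bodies []string) map[string]int {
	counts := make(map[string]int)
	for _, b := range bodies {
		counts[strings.TrimSpace(b)]++
	}
	return counts
}

// sample sends n GET requests to url and returns the response bodies.
func sample(client *http.Client, url string, n int) ([]string, error) {
	bodies := make([]string, 0, n)
	for i := 0; i < n; i++ {
		resp, err := client.Get(url)
		if err != nil {
			return nil, err
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		bodies = append(bodies, string(body))
	}
	return bodies, nil
}

func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	// Assumes the demo's Envoy listener on port 10000.
	bodies, err := sample(client, "http://localhost:10000", 200)
	if err != nil {
		fmt.Println("request failed (is Envoy running?):", err)
		return
	}
	fmt.Println(tally(bodies))
}
```

With a weight of 5 for cluster-b, most of the counted responses should be red and a small share blue; the exact numbers vary from run to run.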

4. Reference