D2 Zookeeper Configuration Properties

Tiers of Configuration

D2 configuration is organized in tiers. The outline below shows how the tiers nest.

  • List of all clusters
    • cluster A
      • cluster level configuration (see below for more info)
      • services (all the services that belong to cluster A)
        • service A-1
          • service level properties (see below for more info)
          • loadBalancerStrategyProperties
            • "loadBalancerStrategy" level properties
            • http.loadBalancer.updateIntervalMs
            • http.loadBalancer.globalStepDown
            • other load balancer properties
          • transportClientProperties
            • "transportClient" level properties
            • http.maxResponseSize
            • http.shutdownTimeout
            • other transport client properties
          • degraderProperties
            • "degraderProperties" level properties
            • degrader.lowLatency
            • degrader.maxDropDuration
            • other degrader properties
        • service A-2
          • service level properties
        • other services under cluster A
      • partitionProperties for cluster A
        • "partitionProperties" level properties
        • partitionType
        • partitionKeyRegex
        • other partitionProperties level properties
    • cluster B
      • partitionProperties for cluster B (optional)
        • etc
      • services (all the services that belong to cluster B)
        • service B-1
        • service B-2
        • etc
    • cluster C
    • cluster D
    • etc

As the outline shows, configuration has multiple tiers. The sections below list each level and the properties that belong to it.
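As a rough illustration, the tiers can be pictured as a nested mapping. This is only a sketch: the cluster and service names, the `path` value, the regex, and the `lookup` helper are placeholders invented for the example, and the layout actually stored in Zookeeper may differ (for instance, `services` is shown here as a mapping only to make the nesting visible).

```python
# Hypothetical nested view of the configuration tiers.
# All names and the regex below are placeholders, not real D2 data.
clusters = {
    "clusterA": {
        # cluster level configuration
        "partitionProperties": {
            "partitionType": "RANGE",
            "partitionKeyRegex": r"/members/(\d+)",  # made-up pattern
        },
        "services": {
            "serviceA1": {
                # service level properties
                "path": "/serviceA1",
                "loadBalancerStrategyProperties": {
                    "http.loadBalancer.updateIntervalMs": 5000,
                    "http.loadBalancer.globalStepDown": 0.2,
                },
                "transportClientProperties": {
                    # assuming "2 MB" means 2 * 1024 * 1024 bytes
                    "http.maxResponseSize": 2 * 1024 * 1024,
                    "http.shutdownTimeout": 10000,
                },
                "degraderProperties": {
                    "degrader.lowLatency": 500,
                    "degrader.maxDropDuration": 60000,
                },
            },
        },
    },
}


def lookup(cluster, service, group, key):
    """Walk cluster -> service -> property group -> property."""
    return clusters[cluster]["services"][service][group][key]


print(lookup("clusterA", "serviceA1", "degraderProperties", "degrader.lowLatency"))
```

Reading a property means walking the tiers from the outside in, which is why each property name in the following sections is listed under the level that owns it.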

Cluster Level Properties

Property Name

Description

partitionProperties

A map containing all the properties used to partition the cluster (see below for more details).

services

A list of d2 services that belong to this cluster.

partitionProperties Level Properties

Property Name

Description

partitionType

The type of partitioning your cluster uses. Valid values are RANGE and HASH.

partitionKeyRegex

The regex pattern used to extract the partition key from the URI.

partitionSize

Only if you choose partitionType RANGE. The size of a partition, i.e. the size of the key range covered by one partition.

partitionCount

The number of partitions in the cluster.

keyRangeStart

Only if you choose partitionType RANGE. This is the number where the key starts. Normally we start at 0.

hashAlgorithm

Only if you choose partitionType HASH. The hash algorithm to use. Valid values are MODULO and MD5.
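The following sketch shows one plausible way these properties could combine to map a URI to a partition. It is not taken from the D2 source: the `partition_id` function, the example regex, the URIs, and the exact arithmetic (integer division for RANGE, a remainder for HASH, and using the first 8 bytes of the MD5 digest) are all assumptions made for illustration.

```python
import hashlib
import re


def partition_id(uri, props):
    """Map a URI to a partition id under the assumed semantics (sketch only)."""
    match = re.search(props["partitionKeyRegex"], uri)
    if match is None:
        raise ValueError("no partition key found in URI: " + uri)
    key = match.group(1)
    count = props["partitionCount"]

    if props["partitionType"] == "RANGE":
        # Assumed: consecutive key ranges of partitionSize, starting at keyRangeStart.
        start = props.get("keyRangeStart", 0)
        pid = (int(key) - start) // props["partitionSize"]
        if not 0 <= pid < count:
            raise ValueError("key %s falls outside the configured range" % key)
        return pid

    if props["partitionType"] == "HASH":
        if props["hashAlgorithm"] == "MD5":
            # Assumed: hash the key text, then reduce to a partition index.
            digest = hashlib.md5(key.encode("utf-8")).digest()
            return int.from_bytes(digest[:8], "big") % count
        # Any other value is treated here as a plain remainder on a numeric key.
        return int(key) % count

    raise ValueError("unknown partitionType: " + str(props["partitionType"]))


# Hypothetical configurations; the regex and sizes are made up.
range_props = {
    "partitionType": "RANGE",
    "partitionKeyRegex": r"/members/(\d+)",
    "partitionSize": 100,
    "partitionCount": 10,
    "keyRangeStart": 0,
}
hash_props = {
    "partitionType": "HASH",
    "partitionKeyRegex": r"/members/(\d+)",
    "partitionCount": 4,
    "hashAlgorithm": "REMAINDER_PLACEHOLDER",
}

print(partition_id("/members/250", range_props))
print(partition_id("/members/250", hash_props))
```

Under these assumptions, RANGE keeps neighbouring keys in the same partition, while HASH spreads keys across partitions regardless of their order.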

Service Level Properties

Property Name

Description

loadBalancerStrategyList

The list of strategies that you want to use in your LoadBalancer. Valid values are random, degraderV2, and degraderV3. Only degraderV3 supports partitioning. The random load balancer simply chooses a random server to send each request to, so you can't do sticky routing if you choose it.

path

The context path of your service.

loadBalancerStrategyProperties

The properties of D2 LoadBalancer.

transportClientProperties

A map of all properties related to the creation of the transport client.

degraderProperties

Properties of the D2 Degrader: a map of all properties that control how D2 perceives a single server's health, so that D2 can redirect traffic to healthier servers. Contrast this with the LoadBalancer properties, which are used to determine the health of the entire cluster. The difference is that if the health of the cluster deteriorates, D2 will start dropping requests instead of redirecting traffic.

banned

A list of all the servers that shouldn't be used.
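The constraints stated for loadBalancerStrategyList can be expressed as a small check. This is a hypothetical helper written for this page, not part of D2; the function name and the idea of passing a `cluster_is_partitioned` flag are assumptions.

```python
# Strategy names taken from the loadBalancerStrategyList description.
VALID_STRATEGIES = {"random", "degraderV2", "degraderV3"}


def check_strategy_list(strategies, cluster_is_partitioned):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for name in strategies:
        if name not in VALID_STRATEGIES:
            problems.append("unknown strategy: " + name)
    # Only degraderV3 supports partitioning.
    if cluster_is_partitioned and "degraderV3" not in strategies:
        problems.append("partitioned cluster needs degraderV3 in the list")
    # The random strategy cannot do sticky routing.
    if strategies and all(name == "random" for name in strategies):
        problems.append("only 'random' configured: sticky routing is not possible")
    return problems


print(check_strategy_list(["degraderV2"], cluster_is_partitioned=True))
print(check_strategy_list(["degraderV3", "degraderV2"], cluster_is_partitioned=True))
```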

transportClient Level Properties

Properties used to create a client to talk to a server.

Property Name

Description

http.queryPostThreshold

The maximum length of a URL before a GET is converted into a POST, because the server's header buffer size may be limited. Default is Integer.MAX_VALUE (i.e. not enabled).

http.poolSize

Maximum size of the underlying HTTP connection pool. Default is 200.

http.requestTimeout

Timeout, in ms, to get a connection from the pool or create one, send the request, and receive a response (if applicable). Default is 10000.

http.idleTimeout

Interval, in ms, after which idle connections will be automatically closed. Default is 25000.

http.shutdownTimeout

Timeout, in ms, the client should wait after shutdown is initiated before terminating outstanding requests. Default is 10000.

http.maxResponseSize

Maximum response size, in bytes, that the client can process. Default is 2 MB.
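As a sketch of how these settings might be applied, the snippet below merges per-service overrides onto the defaults listed above and shows the GET-to-POST decision driven by http.queryPostThreshold. The helper names are hypothetical, and two details are assumptions: that "2 MB" means 2 * 1024 * 1024 bytes, and that the conversion happens when the URL length strictly exceeds the threshold.

```python
# Defaults as listed in the table above.
TRANSPORT_CLIENT_DEFAULTS = {
    "http.queryPostThreshold": 2**31 - 1,     # Integer.MAX_VALUE, i.e. not enabled
    "http.poolSize": 200,
    "http.requestTimeout": 10000,             # ms
    "http.idleTimeout": 25000,                # ms
    "http.shutdownTimeout": 10000,            # ms
    "http.maxResponseSize": 2 * 1024 * 1024,  # bytes, assuming 2 MB = 2 * 1024 * 1024
}


def effective_properties(overrides):
    """Defaults with any service-specific overrides applied on top."""
    merged = dict(TRANSPORT_CLIENT_DEFAULTS)
    merged.update(overrides)
    return merged


def should_convert_get_to_post(url, props):
    """True when the URL is longer than the configured threshold (assumed strict)."""
    return len(url) > props["http.queryPostThreshold"]


props = effective_properties({"http.queryPostThreshold": 64})
long_url = "http://example.invalid/search?ids=" + ",".join(str(i) for i in range(50))
print(should_convert_get_to_post(long_url, props))
print(should_convert_get_to_post(long_url, effective_properties({})))
```

With the default threshold the conversion effectively never triggers, which matches the "not enabled" note in the table.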

degraderProperties Level Properties

Note that each degrader represents one server among the many servers in a cluster.

Property Name

Description

degrader.name

Name that will show up in the logs (makes debugging easier).

degrader.logEnabled

Whether or not logging is enabled in the degrader.

degrader.latencyToUse

Which latency measure to use for the calculation. Supported values are AVERAGE (default), PCT50, PCT90, PCT95, and PCT99.

degrader.overrideDropDate

The fraction of calls that should be dropped. A value larger than 0 means this client will permanently drop that fraction of the calls. Default is -1.0.

degrader.maxDropRate

The maximum fraction of calls that can be dropped. A value greater than or equal to 0 and less than 1 means the client can never be degraded to the point of dropping all calls, even when that would otherwise be necessary. Default is 1.0.

degrader.maxDropDuration

The maximum duration, in ms, for which all requests may be dropped. For example, if maxDropDuration is 1 min and the last request that was not dropped is older than 1 min, then the next request should not be dropped. Default is 60000.

degrader.upStep

The drop rate incremental step every time a degrader crosses the high water mark. Default is 0.2.

degrader.downStep

The drop rate decremental step every time a degrader recovers below the low water mark. Default is 0.2.

degrader.minCallCount

The minimum number of calls needed before we use the tracker statistics to determine whether a client is healthy or not. Default is 5.

degrader.highLatency

If the latency of the client exceeds this value then we'll increment the computed drop rate. The higher the computed drop rate, the less traffic will go to this server. Default is 3000.

degrader.lowLatency

If the latency of the client is less than this value then we'll decrement the computed drop rate. The lower the computed drop rate, the more traffic will go to this server. Default is 500.

degrader.highErrorRate

If the error rate is higher than this value then we'll increment the computed drop rate, which causes less traffic to go to this server.

degrader.lowErrorRate

If the error rate is lower than this value then we'll decrement the computed drop rate, which in turn causes more traffic to go to this server.

degrader.highOutstanding

If the number of outstanding calls is higher than this value then we'll increment the computed drop rate, which causes less traffic to go to this server. Default is 10000.

degrader.lowOutstanding

If the number of outstanding calls is lower than this value then we'll decrement the computed drop rate, which causes more traffic to go to this server. Default is 500.

degrader.minOutstandingCount

The number of outstanding calls should be greater than or equal to this value for the degrader to use the average outstanding latency to determine whether a high or low water mark condition has been met. High and low water mark conditions are any of these: errorRate, latency, and outstandingCount. Default is 5.

degrader.overrideMinCallCount

If overridden, this value is used as the minimum number of calls needed before we compute the drop rate. Default is -1.
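To make the interplay of upStep, downStep, the latency water marks, minCallCount, and maxDropRate more concrete, here is a deliberately simplified sketch of one update step. It is not the D2 implementation: it looks at latency only (error rate and outstanding calls are left out), and the class name, function name, and clamping at 0 are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DegraderConfig:
    """Subset of degrader properties, using the defaults listed above."""
    up_step: float = 0.2
    down_step: float = 0.2
    max_drop_rate: float = 1.0
    min_call_count: int = 5
    high_latency: float = 3000.0
    low_latency: float = 500.0


def next_drop_rate(current, latency, call_count, cfg):
    """One simplified update of the computed drop rate for a single server."""
    if call_count < cfg.min_call_count:
        # Too few calls to trust the statistics; leave the drop rate alone.
        return current
    if latency > cfg.high_latency:
        # Crossed the high water mark: step up, capped at maxDropRate.
        return min(cfg.max_drop_rate, current + cfg.up_step)
    if latency < cfg.low_latency:
        # Recovered below the low water mark: step down, floored at 0 (assumed).
        return max(0.0, current - cfg.down_step)
    return current


cfg = DegraderConfig()
rate = 0.0
for observed_latency in (4000, 4000, 200):
    rate = next_drop_rate(rate, observed_latency, call_count=10, cfg=cfg)
    print(observed_latency, rate)
```

Latencies between the two water marks leave the drop rate unchanged in this sketch, which gives the server a stable band in which traffic is neither added nor removed.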

loadBalancerStrategy Level Properties

Properties for load balancers. This affects all servers in a cluster.

Property Name

Description

http.loadBalancer.hashMethod

The hash method to use (this is relevant to stickiness). Valid values are none and uriRegex.

http.loadBalancer.hashConfig

If you declare this, you need to define the list of regexes used to parse the URL.

http.loadBalancer.updateIntervalMs

The time interval, in ms, at which the load balancer updates its state (for example, deciding whether to rebalance traffic or increase the drop rate). Default value is 5000.

http.loadBalancer.pointsPerWeight

The maximum number of points a client gets in a hashring per 1.0 of weight. Default is 100. Increasing this number increases the computation needed to create a hashring but leads to a more even distribution in the hashring.

http.loadBalancer.lowWaterMark

If the cluster average latency, in ms, is lower than this, we'll reduce the drop rate of the entire cluster. (This affects all the clients in the same cluster, regardless of whether they are healthy or not.) Default value is 500.

http.loadBalancer.highWaterMark

If the cluster average latency, in ms, is higher than this, we'll increase the cluster drop rate. (This affects all the clients in the same cluster, regardless of whether they are healthy or not.) Default value is 3000.

http.loadBalancer.initialRecoveryLevel

Once a cluster is totally degraded, this is the baseline the cluster uses to start recovering. Say a healthy client has 100 points in a hashring and, in a completely degraded state, 0 points. With an initial recovery level of 0.005, the client gets 0.5 points, which is not enough to be reintroduced (a client needs at least 1 point). Default value is 0.01.

http.loadBalancer.ringRampFactor

Once a cluster is in recovery mode, this is the multiplication factor used to increase the number of points for a client in the ring. For example, a healthy client has 100 points in a hashring and is now completely degraded with 0 points. The initialRecoveryLevel is set to 0.005 and ringRampFactor is set to 2. On turn #1 of recovery the client gets 0.5 points, which is not enough to be reintroduced into the ring. On turn #2, because ringRampFactor is 2, it gets 1 point. On turn #3 it gets 2 points, and so on. Default value is 1.

http.loadBalancer.globalStepUp

The step size used when incrementing the drop rate in the cluster. Default value is 0.2. For example, if globalStepUp = 0.2, the drop rate goes from 0.0 to 0.2, then to 0.4, and so on as the cluster becomes more degraded.

http.loadBalancer.globalStepDown

Same as http.loadBalancer.globalStepUp, except that it is used for decrementing the drop rate.
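The arithmetic in the initialRecoveryLevel, ringRampFactor, and globalStepUp examples can be reproduced with a few lines of code. This is a sketch of the calculations as described in the table, not the D2 implementation; the function names, the `max_turns` guard, the clamping of the drop rate to the range 0 to 1, and the rounding used to hide floating-point noise are all assumptions.

```python
def recovery_points(healthy_points, initial_recovery_level, ring_ramp_factor, turn):
    """Points a fully degraded client gets on recovery turn `turn` (1-based)."""
    return healthy_points * initial_recovery_level * ring_ramp_factor ** (turn - 1)


def first_turn_back_in_ring(healthy_points, initial_recovery_level,
                            ring_ramp_factor, max_turns=64):
    """First turn on which the client has at least 1 point, or None if never."""
    for turn in range(1, max_turns + 1):
        points = recovery_points(healthy_points, initial_recovery_level,
                                 ring_ramp_factor, turn)
        if points >= 1:
            return turn
    return None


def step_drop_rate(current, step, increase):
    """Move the cluster drop rate one step up or down, kept within 0..1 (assumed)."""
    moved = current + step if increase else current - step
    return round(min(1.0, max(0.0, moved)), 10)


# Example from the table: 100 points, initialRecoveryLevel 0.005, ringRampFactor 2.
for turn in (1, 2, 3):
    print(turn, recovery_points(100, 0.005, 2, turn))
print(first_turn_back_in_ring(100, 0.005, 2))

# Example from the table: globalStepUp = 0.2, starting at a drop rate of 0.0.
rates = [0.0]
for _ in range(3):
    rates.append(step_drop_rate(rates[-1], 0.2, increase=True))
print(rates)
```

Note that with the default ringRampFactor of 1 the points never grow in this model, so a client whose initial recovery level yields less than 1 point would not be reintroduced; `first_turn_back_in_ring` returns None in that case.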