# Available cluster options
The cluster options can be set inside your app.config file under the cluster_options key:
```erlang
{erlcass, [
    {log_level, 3},
    {keyspace, <<"keyspace">>},
    {cluster_options, [
        {contact_points, <<"172.17.3.129,172.17.3.130,172.17.3.131">>},
        {latency_aware_routing, true},
        {token_aware_routing, true},
        {number_threads_io, 4},
        {queue_size_io, 128000},
        {core_connections_host, 1},
        {tcp_nodelay, true},
        {tcp_keepalive, {true, 60}},
        {connect_timeout, 5000},
        {request_timeout, 5000},
        {retry_policy, {default, true}},
        {default_consistency_level, 6}
    ]}
]}.
```
- Use token_aware_routing and latency_aware_routing.
- Don't set number_threads_io higher than the number of your cores.
- Use tcp_nodelay and also enable tcp_keepalive.
- Don't use large values for core_connections_host. The driver is system-call bound and performs better with fewer I/O threads and connections because it can batch a larger number of writes into a single system call (the driver will naturally attempt to coalesce these operations). You may want to reduce the number of I/O threads to 2 or 3 and keep the core connections at 1 (the default).
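These options are normally set statically in app.config or sys.config, but they can also be applied at runtime through the standard OTP application environment before erlcass starts. A minimal sketch, reusing values from the example above:

```erlang
%% Hedged sketch: configure erlcass at runtime instead of via app.config.
%% application:set_env/3 and application:ensure_all_started/1 are standard OTP;
%% the keys are the ones documented on this page.
start_erlcass() ->
    ok = application:set_env(erlcass, keyspace, <<"keyspace">>),
    ok = application:set_env(erlcass, cluster_options, [
        {contact_points, <<"172.17.3.129">>},
        {token_aware_routing, true},
        {latency_aware_routing, true},
        {tcp_nodelay, true},
        {tcp_keepalive, {true, 60}}
    ]),
    {ok, _} = application:ensure_all_started(erlcass).
```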
Example: {contact_points, <<"172.17.3.129">>}

Sets/appends contact points. The first call sets the contact points and any subsequent calls append additional contact points. Passing an empty string will clear the contact points. White space is stripped from the contact points.

Accepted values: <<"127.0.0.1">>, <<"127.0.0.1,127.0.0.2">> or <<"server1.domain.com">>
Example:

```erlang
{ssl, [
    {trusted_certs, [<<"cert1">>, <<"cert2">>]},
    {cert, <<"cert_here">>},
    {private_key, {<<"private_key_here">>, <<"private_key_pwd_here">>}},
    {verify_flags, ?CASS_SSL_VERIFY_PEER_CERT}
]}
```

Sets the SSL context and enables SSL.

Default: None
```erlang
{ssl, [
    {trusted_certs, CertsList::list()},
    {cert, Cert::binary()},
    {private_key, {PrivateKey::binary(), KeyPassword::binary()}},
    {verify_flags, VerifyFlags::integer()}
]}
```
- trusted_certs: Adds one or more trusted certificates. This is used to verify the peer's certificate.
- cert: Sets the client-side certificate chain. This is used to authenticate the client on the server side and should contain the entire certificate chain starting with the certificate itself.
- private_key: Sets the client-side private key. This is used to authenticate the client on the server side. PrivateKey is a PEM-formatted key string and KeyPassword is the password used to decrypt the key.
- verify_flags: Sets the verification performed on the peer's certificate.
For verify_flags use one of the values defined in erlcass.hrl:

```erlang
-define(CASS_SSL_VERIFY_NONE, 0).
-define(CASS_SSL_VERIFY_PEER_CERT, 1).
-define(CASS_SSL_VERIFY_PEER_IDENTITY, 2).
```
- CASS_SSL_VERIFY_NONE - No verification is performed.
- CASS_SSL_VERIFY_PEER_CERT - The certificate is present and valid.
- CASS_SSL_VERIFY_PEER_IDENTITY - The IP address matches the certificate's common name or one of its subject alternative names. This implies the certificate is also present.

You can also use a combination like ?CASS_SSL_VERIFY_PEER_CERT bor ?CASS_SSL_VERIFY_PEER_IDENTITY.

Default: CASS_SSL_VERIFY_PEER_CERT
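Note that the binaries above are the PEM contents themselves, not file paths, so in practice you would typically read them from disk first. A minimal sketch, assuming hypothetical file names ca.pem, client-cert.pem and client-key.pem:

```erlang
%% Hedged sketch: load PEM files from disk (hypothetical paths) and build
%% the ssl option documented above. file:read_file/1 is standard OTP; the
%% -include_lib path is assumed, the macros are from erlcass.hrl as shown.
-include_lib("erlcass/include/erlcass.hrl").

ssl_option() ->
    {ok, Ca}   = file:read_file("ca.pem"),
    {ok, Cert} = file:read_file("client-cert.pem"),
    {ok, Key}  = file:read_file("client-key.pem"),
    {ssl, [
        {trusted_certs, [Ca]},
        {cert, Cert},
        {private_key, {Key, <<"key_password">>}},
        {verify_flags, ?CASS_SSL_VERIFY_PEER_CERT bor ?CASS_SSL_VERIFY_PEER_IDENTITY}
    ]}.
```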
Example: {protocol_version, 2}
Sets the protocol version. This will automatically downgrade to the lowest protocol version supported.
Default: 4
Example: {number_threads_io, 1}
Sets the number of IO threads. This is the number of threads that will handle query requests.
Default: 1
Example: {queue_size_io, 8192}
Sets the size of the fixed-size queue that stores pending requests.
Default: 8192
Example: {queue_size_event, 8192}
Sets the size of the fixed-size queue that stores events.
Default: 8192
Example: {core_connections_host, 1}
Sets the number of connections made to each server in each IO thread.
Default: 1
Example: {max_connections_host, 2}
Sets the maximum number of connections made to each server in each IO thread.
Default: 2
Example: {reconnect_wait_time, 2000}
Sets the amount of time to wait before attempting to reconnect.
Default: 2000 milliseconds
Example: {max_concurrent_creation, 1}
Sets the maximum number of connections that will be created concurrently. Connections are created when the current connections are unable to keep up with request throughput.
Default: 1
Example: {max_requests_threshold, 100}
Sets the threshold for the maximum number of concurrent requests in-flight on a connection before creating a new connection. The number of new connections created will not exceed max_connections_host.
Default: 100
Example: {requests_per_flush, 128}
Sets the maximum number of requests processed by an IO worker per flush.
Default: 128
Example: {constant_reconnect, 0}
Configures the cluster to use a reconnection policy that waits a constant time between each reconnection attempt. Time is specified in milliseconds. Use 0 to perform a reconnection immediately.
Example: {exponential_reconnect, {2000, 60000}}
Configures the cluster to use a reconnection policy that waits exponentially longer between each reconnection attempt; however will maintain a constant delay once the maximum delay is reached.
A random amount of jitter (+/- 15%) will be added to the pure exponential delay value. This helps to prevent situations where multiple connections are in the reconnection process at exactly the same time. The jitter will never cause the delay to be less than the base delay, or more than the max delay.
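As an illustration of the resulting schedule, here is a sketch of the described behavior before jitter is applied (the driver's internal implementation may differ):

```erlang
%% Hedged sketch: approximate delay schedule for {exponential_reconnect, {2000, 60000}},
%% before the +/- 15% jitter described above is applied.
delays(BaseMs, MaxMs, Attempts) ->
    [min(BaseMs bsl (N - 1), MaxMs) || N <- lists:seq(1, Attempts)].

%% delays(2000, 60000, 7) -> [2000,4000,8000,16000,32000,60000,60000]
```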
Example: {coalesce_delay, 200}
Sets the amount of time, in microseconds, to wait for new requests to coalesce into a single system call. This should be set to a value around the latency SLA of your application’s requests while also considering the request’s roundtrip time. Larger values should be used for throughput bound workloads and lower values should be used for latency bound workloads.
Default: 200
Example: {request_ratio, 50}
Sets the ratio of time spent processing new requests versus handling the I/O and processing of outstanding requests. The range of this setting is 1 to 100, where larger values allocate more time to processing new requests and smaller values allocate more time to processing outstanding requests.
Default: 50
Example: {max_schema_wait_time, 10000}
Sets the maximum time to wait for schema agreement after a schema change is made (e.g. creating, altering or dropping a table/keyspace/view/index).
Default: 10000
Example: {token_aware_routing_shuffle_replicas, true}
Configures token-aware routing to randomly shuffle replicas. This can reduce the effectiveness of server-side caching, but it can better distribute load over replicas for a given partition key.
Note: Token-aware routing must be enabled for the setting to be applicable.
Default: true (enabled)
Example: {max_reusable_write_objects, 4294967295}
Sets the maximum number of "pending write" objects that will be saved for re-use for marshalling new requests. These objects may hold on to a significant amount of memory and reducing the number of these objects may reduce memory usage of the application. The cost of reducing the value of this setting is potentially slower marshalling of requests prior to sending.
Default: Max unsigned integer value (4294967295)
Example: {speculative_execution_policy, null}
Enable/disable constant speculative executions with the supplied settings.
Accepted values:

- null will disable speculative executions
- {ConstantDelayMs, MaxSpeculativeExecutions} will enable them with these settings
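For example, to allow up to 2 speculative executions, each issued after a 500 ms constant delay (illustrative values, not recommendations):

```erlang
{speculative_execution_policy, {500, 2}}
```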
Example: {connect_timeout, 5000}
Sets the timeout for connecting to a node.
Default: 5000 milliseconds
Example: {heartbeat_interval, 30}
Sets the amount of time between heartbeat messages and controls the amount of time the connection must be idle before sending heartbeat messages. This is useful for preventing intermediate network devices from dropping connections.
Default: 30 seconds
Example: {idle_timeout, 60}
Sets the amount of time a connection is allowed to be without a successful heartbeat response before being terminated and scheduled for reconnection.
Default: 60 seconds
Example: {request_timeout, 12000}
Sets the timeout for waiting for a response from a node.
Default: 12000 milliseconds
Example: {credentials, {<<"username">>, <<"password">>}}
Sets credentials for plain text authentication.
Example: {load_balance_round_robin, true}
Configures the cluster to use round-robin load balancing. The driver discovers all nodes in a cluster and cycles through them per request. All are considered 'local'.
Example: {load_balance_dc_aware, {"dc_name", 2, true}}
Configures the cluster to use DC-aware load balancing. For each query, all live nodes in a primary 'local' DC are tried first, followed by any node from other DCs.
This is the default policy, and it does not need to be set unless switching an existing cluster from another policy or changing settings. Without further configuration, a default local_dc is chosen from the first connected contact point, and no remote hosts are considered in query plans. If relying on this mechanism, be sure to use only contact points from the local DC.
{load_balance_dc_aware, {LocalDc, UsedHostsPerRemoteDc, AllowRemoteDcsForLocalCl}}

- LocalDc - The primary data center to try first.
- UsedHostsPerRemoteDc - The number of hosts used in each remote DC if no hosts are available in the local DC.
- AllowRemoteDcsForLocalCl - Allows remote hosts to be used if no local DC hosts are available and the consistency level is LOCAL_ONE or LOCAL_QUORUM.
Example: {token_aware_routing, true}
Configures the cluster to use token-aware request routing, or not. This routing policy composes the base routing policy, routing requests first to replicas on nodes considered 'local' by the base load balancing policy.
Default is true (enabled).
Example:

- {latency_aware_routing, true}
- {latency_aware_routing, {true, {2.0, 100, 10000, 100, 50}}}

Configures the cluster to use latency-aware request routing, or not. This routing policy is a top-level routing policy. It uses the base routing policy to determine locality (dc-aware) and/or placement (token-aware) before considering the latency.

{Enabled, {ExclusionThreshold, ScaleMs, RetryPeriodMs, UpdateRateMs, MinMeasured}}
- Enabled - State of the feature (enables or disables latency-aware routing).
- ExclusionThreshold - Controls how much worse the latency must be compared to the average latency of the best performing node before it is penalized.
- ScaleMs - Controls the weight given to older latencies when calculating the average latency of a node. A bigger scale will give more weight to older latency measurements.
- RetryPeriodMs - The amount of time a node is penalized by the policy before being given a second chance when the current average latency exceeds the calculated threshold (ExclusionThreshold * BestAverageLatency).
- UpdateRateMs - The rate at which the best average latency is recomputed.
- MinMeasured - The minimum number of measurements per host required to be considered by the policy.

Defaults: {false, {2.0, 100, 10000, 100, 50}}
Example: {tcp_nodelay, false}
Enable/disable Nagle's algorithm on connections.

Default: true (Nagle's algorithm disabled).
Example: {tcp_keepalive, {true, 60}}
Enable/disable TCP keep-alive.

Default: false (disabled).
Example: {default_consistency_level, ?CASS_CONSISTENCY_LOCAL_QUORUM}
Sets the default consistency level.
Default: ?CASS_CONSISTENCY_LOCAL_QUORUM
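The consistency macros live in erlcass.hrl and mirror the CQL protocol consistency codes, which is why the opening example can pass the bare integer 6 for LOCAL_QUORUM. A few of the values for reference (assumed from the protocol codes; see erlcass.hrl for the authoritative list):

```erlang
%% Assumed from the CQL protocol consistency codes; erlcass.hrl is authoritative.
-define(CASS_CONSISTENCY_ONE,          1).
-define(CASS_CONSISTENCY_QUORUM,       4).
-define(CASS_CONSISTENCY_ALL,          5).
-define(CASS_CONSISTENCY_LOCAL_QUORUM, 6).
-define(CASS_CONSISTENCY_LOCAL_ONE,    10).
```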
Example: {retry_policy, {default, false}}
Retry polices allow the driver to automatically handle server-side failures when Cassandra is unable to fulfill the consistency requirement of a request.
Important: Retry policies do not handle client-side failures such as client-side timeouts or client-side connection issues. In these cases application code must handle the failure and retry the request. The driver will automatically recover requests that haven't been written, but once a request is written the driver will return an error for in-flight requests and will not try to automatically recover. This is done because not all operations are idempotent and the driver is unable to distinguish which requests can be automatically retried without side effects. It's up to application code to make this distinction.
By default, the driver uses the default retry policy for all requests unless it is overridden.
Supported values:

{default, Logging}

This policy retries queries in the following cases:

- On a read timeout, if enough replicas replied but data was not received.
- On a write timeout, if a timeout occurs while writing the distributed batch log.
- On unavailable, it will move to the next host.

In all other cases the error will be returned. This policy always uses the query's original consistency level.
{downgrading_consistency, Logging}

This policy may attempt to retry requests with a lower consistency level. Using this policy can break consistency guarantees.

It will retry in the same scenarios as the default policy, but it will also retry in the following cases:

- On a read timeout, if some replicas responded but the number is lower than required by the current consistency level, retry with a lower consistency level.
- On a write timeout, retry unlogged batches at a lower consistency level if at least one replica responded. For single queries and batches, if any replicas responded, consider the request successful and swallow the error.
- On unavailable, retry at a lower consistency level if at least one replica responded.

The goal of this policy is to attempt to save a request if there's any chance of success. A write succeeds as long as there's a single copy persisted, and a read will succeed if there's some data available, even if it increases the risk of reading stale data.
{fallthrough, Logging}

This policy never retries or ignores a server-side failure. The error is always returned.
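For example, to disable retries entirely (following the {Policy, Logging} tuple form used in the opening example; illustrative, not a recommendation):

```erlang
{retry_policy, {fallthrough, false}}
```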