PDP 41 (Enabling Transport Layer Security (TLS) for External Clients)

Status: Draft

Related Issues:


Motivation

There are two types of targets for securing communications with TLS (or its older form SSL) in Pravega:

  • Internal communications among Pravega server-side components: Controller-to-Segment Store, Controller-to-Zookeeper, Segment Store-to-Zookeeper, Segment Store-to-Bookkeeper, etc.
  • Client-server communications between Pravega clients and server-side components like Controller and Segment Store services.

In this document, we deal with the latter.

Pravega already supports securing client-server communications using one-way TLS. What we really need to address here is the logistics of that mechanism for different types of Pravega deployments. For the purpose of this document, we divide them into two types:

  • Self-hosted deployments on physical/virtual machines
  • Containerized deployments on self-hosted or cloud-based managed container clusters - especially Kubernetes (K8s) based clusters

Specifically, we need to decide on the approach(es) that address the questions below and implement them.

Questions common to all deployment types:

  1. How are certificates issued by an internal or external certificate authority (CA) for the client-facing Pravega services (the Controller and Segment Store services)?
  2. How are self-signed certificates provisioned for those services? While it isn't recommended to use self-signed certificates in production, it is not uncommon to use them for development/testing purposes.
  3. How are certificates and the corresponding material such as keys and trust stores installed on those services?
  4. Are certificates and keys rotated on expiry or compromise, and if yes, how is that done?
  5. How do clients trust the server (or rather the client-facing services)?

Questions specific to containerized deployments:

  1. Where does SSL/TLS terminate if the containers are mortal and dynamic?
  2. How are 1, 2, 3, 4 and 5 done automatically, as new containers (or rather Pods*/task groups) come up?
  3. Does server hostname verification work for clients and if yes how? Bypassing hostname verification is not an option, due to security implications.

Considerations Common to Both Types of Deployments

  1. Pravega clients talk to Controller and Segment Store services. Clients may discover Controllers using either of these two mechanisms: a) a static list of instances supplied by the application, or b) dynamic discovery of instances via one or more known Controller instances.

    As for Segment Store services, clients discover them dynamically via the Controller.

  2. Many companies have a strict requirement of using certificates signed by a trusted CA for production applications. Admins must have a way to supply trusted CA-signed certificates to be used for the server components. This is especially true for "static" environments such as self-hosted deployments in hardware/virtual machines.

    The CA that signs the node certificates can be a public/external CA, an internal CA, or even a dedicated internal CA created for the Pravega cluster. It is common to use an internal CA when using private DNS names and IP addresses in certificates. An internal CA could be as simple as a public-private key pair and a certificate and as complicated as a full-blown private CA that issues certificates via APIs and Web-based user interfaces to authorized users/accounts.

    Even in the case where certificates are automatically generated, a mechanism for obtaining CA-signed certificates must be in place. For instance, in more dynamic environments such as Kubernetes (K8s) based environments, pods can come up automatically. So, the server components may need to work with automated provisioning and installation of TLS certificates. Even in that scenario, for the establishment of trust via the standard chain-of-trust model, it is essential that the certificates are signed by a trusted CA.

    For development/testing purposes, many prefer using self-signed certificates instead. Self-signed certificates can be used at two levels:

    • The CA is created for the sole purpose of the deployment, and that CA has a self-signed certificate.
    • The certificates created for the individual nodes may themselves be self-signed.
  3. The certificates used by the edge services can be trusted by the clients, either via chain-of-trust or direct trust.

    • Chain-of-Trust: In this trust model, the client trusts the CA (the CA's certificate is in its truststore), and not the server directly. If the server presents a valid certificate issued to them (the subject) by the trusted CA, the client trusts the server for the purpose mentioned in the certificate. This is the standard trust model in PKI.
    • Direct Trust: In this model, the client has a set of one or more certificates it trusts, i.e., those certificates are in its truststore. If the server's certificate is not in that set, the client doesn't trust it. There is no third-party arbiter of trust, unlike in the chain-of-trust model. It is a non-standard way of establishing trust, but it is especially useful when using self-signed server certificates, or certificates that aren't issued to the subject presenting them (via either the subject name field or the Subject Alternative Name (SAN) extension field).

    Of the two trust models, chain-of-trust is the preferred one. The direct trust model is often considered inferior to chain-of-trust, but it is still preferable to forgoing hostname verification on the client side in scenarios where chain-of-trust is not an option. A client-side sketch of both models follows.

TLS for Client Communications in Self-hosted Deployments

By "self-hosting", we mean hosting Pravega in physical or virtual machines that you own or control. Such deployments are usually on-premise, but they can also be hosted in VMs provisioned in an infrastructure-as-a-service platform such as Amazon Web Services EC2.

Additional Considerations

  1. We'll assume that the traditional self-hosted deployments are "static" in nature, unlike containerized deployments which are much more "dynamic"/elastic. In such static environments, production applications are often deployed (i.e., installed, configured and orchestrated) manually.

  2. Controller and Segment Store services may (or may not) be front-ended by reverse proxies. When traffic is forwarded to them via reverse proxies, all the Controller services are likely to be serviced by one reverse proxy rule. Segment Store services, on the other hand, are proxied on a rule-per-Segment-Store basis.

Alternatives & Selected Approach

Approach 1: Manual TLS Setup

In static environments, where production systems may be deployed using a series of manual/semi-automated steps, it may be feasible for the operators to obtain CA-signed certificates (and other TLS material like private keys) and configure the individual service instances with that material during deployment.

Approach 2: TLS Management at the Application-Layer

A second approach is to use/run a custom certificate management component inside the Pravega cluster, and use it to automatically issue certificates to the various services. Since client-facing services need to be able to use existing corporate CAs (or even public CAs), there needs to be a way for the built-in certificate management to work with other CAs deployed in the environment.

The primary advantage of this approach is that its implementation would work uniformly in both containerized and non-containerized deployments. The primary disadvantage is that it brings infrastructure-level concerns into Pravega. Also, managing certificates, encryption keys, CAs, passwords, keystores, etc. requires addressing a number of security and compliance concerns, which can quickly become complicated. There are specialized products that do these tasks.

TLS Termination Options When Using Reverse Proxies

As mentioned earlier, Controller and Segment Store services may be set up with reverse proxies front-ending them.

There are two ways of handling TLS, when using reverse proxies:

  1. Terminate TLS at the proxies: At a high-level, here's how it works:

    • A single proxy rule receives and forwards client requests directed at the Controllers. The client is preconfigured with a network address (domain name/IP address and port) that is used in the proxy rule for directing traffic to one of the backend Controller services.
    • Each Segment Store service is proxied via a proxy rule on a rule-per-segment store basis. The address specified in the proxy rule is configured by the admin/operator as the published address and port in the respective Segment Store service. That ensures that the Segment Store service registers the published address and port in Zookeeper. Controller services discover those Segment Store addresses from Zookeeper.
    • TLS terminates at the Reverse Proxy for both Controller and Segment Store services. TLS certificates are configured for the rules. The TLS certificates have the domain name/IP address used in the proxy rules. Administrators may use either a single certificate with all of the domain names/IP addresses specified in the certificate's Subject Alternative Name extension field or separate certificates with the individual domain name/IP address.
    • All traffic between the client and the proxy is encrypted, but the traffic between the proxy and the edge services is over plaintext channels.
  2. Terminate TLS at the services: Alternatively, we can terminate TLS directly at the services. The consequences are:

    • End-to-end traffic is encrypted.
    • The proxy lets the TLS traffic pass through unchanged.
    • TLS configuration is done entirely using Pravega configuration files.

Selected Approach

The approach selected was approach 1, i.e., manual TLS setup.

TLS for Client/Server Communications in Containerized Deployments

In this section, we focus on SSL/TLS for containerized deployments of Pravega.

* A terminology note:

We use the terms "container" and "pod" interchangeably in this document for convenience, even though they are not the same. The term "Pod" is used by Kubernetes (K8s) to refer to a higher-level wrapper that groups one or more containers together into a single, atomic unit of deployment/scheduling. Some people describe a pod as an "atomic container group". Mesos has a similar concept in "Task Groups".

Additional Considerations

Here are some considerations that apply exclusively to containerized deployments:

  1. We'll assume that the scope is limited to deploying Pravega in Kubernetes. The scope includes neither standalone Pravega deployment using docker run, nor Compose- and Swarm-based deployments.

  2. Containers (or rather Pods) are ephemeral. They can go down without a fuss, be migrated to another container cluster host, or spin up automatically in response to scaling and other events.

  3. The Pravega Operator can be used to deploy a Pravega cluster in Kubernetes. As of this writing, when deploying containerized Pravega clusters using the Pravega Operator:

    • Segment Store service pods are part of a Kubernetes "StatefulSet". So, their identities (including network identities such as IP addresses/DNS names) are stable and long-lived.
    • That's not the case with the controller pods, which are members of a "Kubernetes Deployment" object. While each controller pod also gets its own IP address/DNS name, those addresses can change over time.
  4. The following table lists the Kubernetes resources used for exposing the edge services for internal and external access. Internal access means that the edge components are accessible only inside the Kubernetes cluster. External access, on the other hand, exposes them for traffic coming from outside the Kubernetes cluster.

    Components         | For internal-only access                    | For external access
    Controller Pods    | Single Kubernetes Service of type ClusterIP | Single Kubernetes Service of type LoadBalancer
    Segment Store Pods | Single Kubernetes Headless Service          | One Kubernetes Service per pod, of type LoadBalancer or NodePort
  5. How do we automatically issue CA-signed certificates for pods or their agents, such as Services? There are a few options, including (but not limited to):

    • Kubernetes Cluster Root CA: According to the Kubernetes documentation, every Kubernetes cluster has a cluster root CA, which cluster components use to validate the API server's certificate. This CA also has an API that can be used for generating certificates for the pods, which will be trusted by the rest of the pods and other resources in the cluster.
    • JetStack cert-manager: It is another widely used tool for generating certificates for resources in a Kubernetes cluster. According to its readme file, it is a Kubernetes add-on to automate the issuance and management of TLS certificates. Read more about it here.
  6. An alternative to automatically generating and distributing certificates is to do those tasks manually, very much like the manual process we discussed earlier in the context of self-hosted deployments. Here are the steps (a sketch of the final step follows this list):

    a) First, we need a CA that will sign the certificates used for the services. Let's assume we have a CA certificate and its private key.
    b) Generate a certificate and key for each of the services. In this case, we can set the subject's common name to the Kubernetes service name.
    c) Have each certificate signed by the CA.
    d) Distribute the certificates and keys to the appropriate pods. This can be done using Kubernetes Secrets.

    See this document from Linkerd for a detailed example.

  7. Both the Kubernetes LoadBalancer service and Ingress controllers support TLS termination. The former typically operates at OSI layer 4 (the transport level), while the latter operates at layer 7 (the application level, e.g., HTTP).

Alternatives & Selected Approach

Here are some of the alternatives for setting up SSL/TLS for Kubernetes based Pravega deployments:

  • Approach 1: Automated certificate provisioning, chain-of-trust for clients-to-controller services (C-to-C) and direct trust for clients-to-segment store services (C-to-SS)
  • Approach 2: Automated certificate provisioning, chain-of-trust for both C-to-C and C-to-SS
  • Approach 3: Automated certificate provisioning, chain-of-trust using side-cars in pods for both C-to-C and C-to-SS
  • Approach 4: Manual certificate provisioning and deployment, chain-of-trust for both C-to-C and C-to-SS

We'll cover each one of these in greater detail in the following sub-sections.

Approach 1: Certificates Handled at Infrastructure and Application Layers with Use of Both Chain-of-Trust & Direct Trust

This is the approach that was proposed in the issue.

This approach has the following major ingredients:

  • Automated certificate provisioning
  • Chain-of-trust for clients-to-controllers (C-to-C) communications, and
  • Direct trust for clients-to-segment stores (C-to-SS)

Here are some of the specifics w.r.t. Pravega Controllers in this approach:

  • SSL/TLS termination point: Controller services are proxied by an HTTP(S) Load Balancer (an application load balancer) using an Ingress resource. SSL/TLS is terminated at the Ingress.
  • Hostname verification and trust model: As Ingress objects provide stable endpoints, we can assign them certificates bearing stable network identities (DNS names/IP addresses). If those certificates are signed by a CA that clients trust, clients can establish chain-of-trust based trust in the Controller services.
  • Service discovery: The clients shall have the address of the Kubernetes Ingress object pre-configured in them.

As for the segment store services, things work in a different way:

  • SSL/TLS termination point: SSL/TLS terminates in the container itself.

  • Hostname verification and trust model: Certificates assigned to segment store pods do not have the respective pod's network identities set in them (via the SAN extension), the assumption being that it is difficult to know the network identities of the segment store pod upfront.

    As the network identity of the segment store pod is missing from the certificate, if the client were to use chain-of-trust, hostname verification would fail. One solution would then be to disable hostname verification altogether. However, that solution is really not an option as mentioned earlier in one of the considerations.

    The solution proposed in the issue was to have the controller instead generate a truststore for the clients that contains the segment store pods' certificates. The clients then use the truststore to establish direct trust in the segment store services (a sketch of this truststore assembly follows this list).

  • Service discovery: The controllers find the addresses of the segment store services/pods from Zookeeper (ZK), as they do today. Segment store pods shall also register their certificates in ZK, in addition to their endpoint addresses. Controllers then can locate the segment stores' certificates, pack them in a truststore and pass the truststore along to the client.

This approach solves the problem stated in the issue, but also has some shortcomings:

  1. The standard way for clients to establish trust in a server is to rely on a third-party point of trust - the CA - which arbitrates trust between the two parties. The whole point is to avoid having to rely on direct trust. In this approach, the client still relies on its trust in the CA for client-to-controller interaction, and it's really the Controllers that serve up the certificates of the Segment Stores to the clients; even so, this may be viewed as more of a workaround than a real solution.

  2. This approach brings infrastructure/deployment-level concerns into the Pravega application layer. Ideally, such concerns should be addressed at higher levels - i.e., at the level of K8s/container orchestrators and/or the Pravega Operator.

Approach 2: Certificate Management and TLS Termination at the Infrastructure Layer Supporting Chain-of-Trust

In this approach, TLS is terminated at Kubernetes Ingresses for both Controller and Segment Store pods. The following diagram depicts a representative topology:

[Image: a representative topology, with a single Ingress for the Controllers and one Ingress per Segment Store]

As shown in the figure above, a single Ingress object routes gRPC (HTTP/2) and REST (HTTP/1.1) traffic to one of the Controllers in the set. As for the Segment Store services, Ingresses are deployed on an Ingress-per-Segment-Store basis.

SSL/TLS is terminated at the Ingress objects. Any Ingress controller that supports the respective protocols and SSL/TLS termination can be used. It should be noted that not all Ingress controllers support load-balancing HTTP/2 traffic; this may be especially important from the Controllers' perspective, as the proposed approach utilizes a single Ingress. Among the notable ones that do support HTTP/2 are:

  • ingress-nginx
  • Envoy-based ingress
  • The default GKE ingress controller (a beta feature, which spins up a Google Cloud Platform (GCP) HTTP(S)/L7 load balancer)

Certificates assigned to the Ingresses have the Ingresses' DNS names and/or IP addresses set in them (via the SAN extension). This allows the chain-of-trust model to be used for both C-to-C and C-to-SS communications. These certificates are provisioned automatically using the K8s cluster root CA or JetStack cert-manager.

W.r.t. service discovery:

  • The endpoint address of the Ingress front-ending the Controllers is pre-configured in the client.
  • Each Segment Store pod publishes the endpoint address of its Ingress to Zookeeper upon startup (a registration sketch follows this list). The endpoint address of the Ingress is obtained through a Kubernetes API call. This can ride on the same mechanism that is used today to obtain the endpoint addresses of the LoadBalancer and NodePort services for Segment Store pods.
  • Controllers locate the endpoint addresses of the Ingresses front-ending the Segment Stores from Zookeeper.

Here are the pros and cons of this approach:

  • Pros
    • SSL/TLS is deployed entirely using infrastructure-level objects created in Kubernetes by the Pravega Operator. This prevents abstraction leaks into the application layer (Pravega).
    • Clients can use chain-of-trust to verify the services they are interacting with. Certificates may be signed by an internal CA, a public CA, or even a dedicated CA created for the deployment.
  • Cons
    • Ingress objects are Kubernetes resources, and they have a cost, especially when using the managed Kubernetes engines of cloud vendors.
    • Ingress objects also consume some IP addresses accessible from outside the Kubernetes cluster. This is usually not a big deal, but sometimes conserving IP addresses becomes necessary.

Approach 3: Certificate Management at the Infrastructure Layer and TLS Termination in Sidecars Supporting Chain-of-Trust

In this approach, we use an add-on sidecar (such as an Nginx container) inside the pods containing "application" containers (Controller and Segment Store containers). The sidecar proxies all traffic to the application containers and terminates TLS traffic.

Note that the sidecar is just another container sitting alongside the application container in the same pod. It is co-scheduled on the same K8s host, as containers in a pod form an "atomic container group". The containers also share resources such as hostname, IP address, parts of the filesystem, etc.

[Image: sidecar containers terminating TLS inside the Controller and Segment Store pods]

In this scheme, the application container serves exclusively on the loopback address (127.0.0.1), which prevents it from being accessed directly from outside the pod. How the pods are exposed for external and internal access, how clients discover Controllers, how Controllers discover Segment Stores, and what K8s resources are used, can all remain as they are today; that's a major advantage of this approach. A sketch of the loopback-only binding follows.

Here are the pros and cons of this approach:

  • Pros
    • SSL/TLS is deployed entirely using infrastructure-level objects created in Kubernetes by the Pravega Operator. This prevents abstraction leaks into the application layer (Pravega).
    • Clients can use chain-of-trust to verify the services they are interacting with. Certificates may be signed by an internal CA, a public CA, or even a dedicated CA created for the deployment.
    • As TLS termination happens inside the pod (in the sidecar), this approach gives us encrypted traffic all the way up to the pod.
    • It is also (arguably) the easiest to implement, as the sidecar transparently terminates TLS.
  • Cons
    • Using this approach makes it necessary to use an additional component (the sidecar), making things a bit more complex.

Approach 4: Manual Certificate Management and Deployment Supporting Chain-of-Trust

This approach is similar to Approach 1 for self-hosted deployments: admins obtain certificates as described earlier and configure the services with them.

There are a few variations in how this approach can be applied:

  • TLS terminates inside the Controller and Segment Store containers. This variation requires some changes in how pods (especially Segment Store pods) come up. There needs to be a way for admins/operators to supply and install certificates before the pod starts servicing requests.
  • TLS terminates in the ingress for the controllers, but in the containers for the segment stores.

More on this later.

Conclusion

TBD
