Network Policy

The OpenShift cluster leverages Network Policies to manage expected network traffic flows. More details can be found here.

By default, all namespaces have a policy named platform-services-controlled-deny-by-default which denies all network traffic in the namespace. This follows the Zero-Trust Security (ZTS) model. In a nutshell, ZTS means that all network traffic is disabled by default and must be explicitly defined in a Network Policy before it is allowed through. This model improves security by enforcing microsegmentation and ensuring that only explicitly defined network traffic is allowed. One of the core benefits of the ZTS model is that it can prevent data breaches and severely limit the attack surface of compromised pods, thereby improving overall security.
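
For reference, a deny-by-default policy like this is simply a NetworkPolicy that selects every pod in the namespace and defines no ingress rules. The manifest below is an illustrative sketch of what such a policy generally looks like; the actual platform-managed policy is maintained by Platform Services and its exact spec may differ.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: platform-services-controlled-deny-by-default
spec:
  # An empty podSelector matches every pod in the namespace
  podSelector: {}
  # No ingress rules are listed, so all inbound traffic is denied
  ingress: []
  policyTypes:
    - Ingress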

OpenShift Routes

For the most part, connections into the cluster can take advantage of OpenShift Routes. In order for routes to function, we need to apply a general ingress rule which allows any network traffic coming from an OpenShift router/load balancer to pass through to the service. This can be done with the network policy template here.

export NAMESPACE=<YOURNAMESPACE>

oc apply -n $NAMESPACE -f "https://raw.githubusercontent.com/wiki/bcgov/common-service-showcase/assets/templates/default.np.yaml"
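
For context, the linked template essentially boils down to a NetworkPolicy that permits ingress from the OpenShift router namespace. The sketch below is an assumption of what that looks like, using the standard network.openshift.io/policy-group: ingress namespace label; consult the template itself for the authoritative definition.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  # Apply to every pod in the namespace
  podSelector: {}
  ingress:
    - from:
        # Allow traffic originating from the OpenShift router/load balancer
        - namespaceSelector:
            matchLabels:
              network.openshift.io/policy-group: ingress
  policyTypes:
    - Ingress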

(Archives) Network Security Policy

The OpenShift Silver cluster no longer uses Aporeto-based Network Security Policies. Instead, it uses Network Policies. More details can be found here, with a major note being that at the time of writing (October 22, 2021), only ingress rules are supported. The rest of this article is historic in nature and is no longer relevant other than for first principles and decision-making.

As of October 9, 2019, the Pathfinder OpenShift Platform transitioned to a Zero-Trust Security (ZTS) model. In a nutshell, ZTS means that all network traffic is disabled by default and must be explicitly defined in a Network Security Policy before it is allowed. This model improves security by enforcing microsegmentation and ensuring that only explicitly defined network traffic is allowed. One of the core benefits of the ZTS model is that it can prevent data breaches and severely limit the attack surface of compromised pods, thereby improving overall security.

Previous Security Model

Before the transition to ZTS, the Pathfinder OpenShift Platform effectively ran with the following three policies:

  1. All pods are allowed egress traffic to the internet.
  2. All pods are allowed to connect to any other pod sharing the same namespace as itself.
  3. All pods are allowed to interact with the k8s API.

The above three "default" policies provided most namespaces and pods with reasonable security within the cluster, while being permissive enough to allow most common forms of network traffic. However, they are by no means completely secure and should only be used as a temporary measure until you are able to apply your own custom Network Security Policies. If you need this oc template, you can download the above three policies here.
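
For illustration, those three grandfathered policies roughly corresponded to NSP objects along the following lines. This is a sketch only; the apiVersion, names, and exact selectors are assumptions and may not match the downloadable template. ${namespace} is a template parameter for your namespace.

apiVersion: secops.pathfinder.gov.bc.ca/v1alpha1
kind: NetworkSecurityPolicy
metadata:
  name: egress-internet
spec:
  description: Allow all pods in the namespace to reach the internet
  source:
    - - "$namespace=${namespace}"
  destination:
    - - ext:network=any
---
apiVersion: secops.pathfinder.gov.bc.ca/v1alpha1
kind: NetworkSecurityPolicy
metadata:
  name: intra-namespace-comms
spec:
  description: Allow all pods in the namespace to talk to each other
  source:
    - - "$namespace=${namespace}"
  destination:
    - - "$namespace=${namespace}"
---
apiVersion: secops.pathfinder.gov.bc.ca/v1alpha1
kind: NetworkSecurityPolicy
metadata:
  name: int-cluster-k8s-api-comms
spec:
  description: Allow all pods in the namespace to talk to the k8s API
  source:
    - - "$namespace=${namespace}"
  destination:
    - - int:network=internal-cluster-api-endpoint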

Applying Zero Trust Security

While the idea of Zero Trust Security may sound daunting to implement at first, it fundamentally amounts to thinking about what kinds of network traffic your applications require, and then restricting/dropping everything else.

Best Practices

There are some best practices which can be found on DevHub. Before you begin implementing your own custom NSPs, please take a look at the following resources to make sure you have 1) a working understanding of how NSPs work and 2) a grasp of the best practices for creating and maintaining custom NSPs.

Notes

Some of the key points to take away from the resources above:

  • Ensure your deployment configs are properly labelled. They need enough labels to be uniquely identified and selected on, but should not be more complicated than necessary. (ref) A label sketch follows this list.
  • Custom NSPs only need to be defined for outbound traffic from the resource. If you have an application with a defined route and service, you do not need to define an inbound rule for it, as the route itself is an NSP definition.
  • Our implementation of ZTS is done through Aporeto. The key takeaway is that Aporeto has the concept of a "Processing Unit" (PU) which is how it identifies and manages network policies and enforcement.
    • A PU can represent many things, but is normally associated with a pod.
    • PUs are uniquely identifiable, but for most use cases you do not need to worry about this.
    • The identifiers for a PU and a pod are not the same, even though they may point to the same resources.
  • You can get, describe, and manipulate NSP objects via the OC CLI with the shorthand nsp selector. For example, oc describe nsp your-custom-nsp-name
  • Network Security Policies are NOT atomic! Expect a small window of delay between when an NSP object is created and when it is enforced on the cluster.
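
As a small illustration of the labelling note above, the pod template in a deployment config only needs enough labels to be uniquely selectable; the label names and values below are hypothetical.

# Illustrative metadata labels on a deployment config / pod template;
# enough to uniquely identify and select the workload, and no more
metadata:
  labels:
    app: myapp
    role: frontend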

Decision Points

For the most part, we make a best effort to adhere to the best practices outlined in the previous section. However, we do have some exceptions, which we explain below:

  • We elected to only use the app and role labels. We decided not to use the env label as we found it redundant. Since our environments are already logically separated into namespaces, we ensure that all of our NSP definitions also have a selector for the namespace explicitly added into the definition (see the sketch after this list).
  • Instead of allowing all containers access to the k8s API, we have explicitly allowed ONLY the deployer (and Patroni if applicable) service accounts to access the k8s API. This is because most of the application containers have no business talking with the k8s control plane. While there is also a builder service account, as our pipelines build our containers in the tools namespace, the actual deployment environments have no need to access the builder portion of the API. An example source and destination NSP for the k8s API is
  source:
    - - "$namespace=${namespace}"
      - "@app:k8s:serviceaccountname=deployer"
  destination:
    - - int:network=internal-cluster-api-endpoint
  • For the moment, any pods that require internet access can egress to the internet without restriction. We would like to create more fine-tuned policies, perhaps restricting network traffic to just *.pathfinder.gov.bc.ca DNS entries, but we are not sure how to achieve this and are waiting for further input.
  • We have embedded our NSP manifests directly into our deployment configuration templates. While this was done to minimize effort, we have since learned that the non-atomic nature of NSP propagation can cause a service to be up and running before the policy has been propagated and enforced in the cluster. This can potentially cause the deployment to fail because of failed network traffic. We are currently awaiting a programmatic solution to be able to poll for when an NSP is fully in effect in the cluster.
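
Putting the namespace and app/role label selectors together, an NSP allowing our application pods to reach their database pods might look roughly like the following. The name, label values, apiVersion, and label selector syntax here are illustrative assumptions, not our exact manifests.

apiVersion: secops.pathfinder.gov.bc.ca/v1alpha1
kind: NetworkSecurityPolicy
metadata:
  name: myapp-app-to-db
spec:
  description: Allow the application pods to reach the database pods
  source:
    - - "$namespace=${namespace}"
      - "app=myapp"
      - "role=app"
  destination:
    - - "$namespace=${namespace}"
      - "app=myapp"
      - "role=db"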

Process

Our transition to custom Network Security Policies is straightforward.

  1. Analyze and enumerate all expected outgoing network traffic in the namespace
  2. Create an NSP object for each expected network connection
  3. Add the NSP manifests into the existing deployment configurations in the pipeline
  4. Add the new/missing labels to your deployment configurations as required
  5. Remove the grandfathered permissive network policies from the namespace
  6. Test to make sure deployments and applications are behaving as intended

The hardest part of transitioning to the ZTS model is ensuring that you have handled steps 1 and 2 thoroughly. Once all expected traffic has been properly identified and covered, the rest of the transition to ZTS should go as intended.