Database Considerations
This page has been deprecated. Please see the official Kubernetes Black Duck Installation Guide here.
Introduction
Before you install Black Duck, consider how your Black Duck data will be persisted in your Kubernetes/OpenShift cluster.
Background
Your Black Duck instance stores data in a PostgreSQL database. In the simplest Black Duck configuration, this data is stored in an "emptyDir" volume in the cluster. In this configuration, data could be lost if the PostgreSQL container is stopped and restarted.
For temporary installations, evaluations, proofs of concept, and other non-production deployments, running without persistent DB storage ("emptyDir") should suffice. It is easy to configure and performs well. This configuration also minimizes the likelihood that the Black Duck server fails because a persistent volume failed.
Note: For OpsSight users, even if the Black Duck Postgres database container is lost, OpsSight annotations and labels will persist in the cluster.
Data Persistence
To avoid loss of data when the DB container restarts, you can do one of three things:
- configure persistent storage (a persistent volume claim) for the PostgreSQL container
- use an external database (e.g., Amazon RDS)
- configure periodic database backups
Each option is discussed below. Work with your database administrators to find the solution that is right for your organization.
Persistent Volume Claims
Using a persistent volume provides immediate recovery if the database container is lost. The costs are potentially reduced performance and the need to properly configure and maintain the persistent volume claim. Failures in the underlying persistent volume can cause failures in the Black Duck database.
This option is discussed in both the Persistent Volume Considerations page and in the "Persistent Storage" section in the Black Duck Installation Parameters page.
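As a rough illustration of the difference between the two storage modes, the sketch below builds a generic Kubernetes PersistentVolumeClaim manifest and the matching pod volume entry as plain Python dictionaries. It is not the Black Duck installer's own mechanism. The claim name `blackduck-postgres`, the volume name `postgres-data`, and the `150Gi` size are hypothetical placeholders; use the values and procedure given in the Black Duck Installation Parameters page for a real deployment.

```python
import json
from typing import Optional


def build_pvc(name: str, size: str, storage_class: Optional[str] = None) -> dict:
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    spec = {
        "accessModes": ["ReadWriteOnce"],  # mounted read-write by a single node
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # Omitting storageClassName lets the cluster default apply.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


def postgres_volume(volume_name: str, claim_name: Optional[str] = None) -> dict:
    """Return a pod volume entry: emptyDir when no claim name is given."""
    if claim_name is None:
        return {"name": volume_name, "emptyDir": {}}
    return {
        "name": volume_name,
        "persistentVolumeClaim": {"claimName": claim_name},
    }


if __name__ == "__main__":
    # Hypothetical names and size, for illustration only.
    print(json.dumps(build_pvc("blackduck-postgres", "150Gi"), indent=2))
    print(json.dumps(postgres_volume("postgres-data", "blackduck-postgres"), indent=2))
```

The printed JSON can be inspected to see exactly what changes between the two modes: only the volume entry differs, while the claim itself is a separate object that must exist and be bound before the pod starts.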
External Databases
You can configure your Black Duck deployment to use an external database. The tradeoff is reliability and ease-of-use versus financial cost and potential performance issues during connectivity failures.
Heavy database loads (for example, OpsSight installations) can be hard to manage with a containerized database. As load increases over time, many database parameters become tricky to manage and monitor from inside a container. Although a containerized Postgres instance can be tuned in the same way as any other Postgres instance, many users who want maximum performance prefer to run Black Duck's Postgres database on RDS, Cloud SQL, or an internal VM administered by their IT department. Database backups are also easier to manage for externally managed databases.
Before you can use this option, you must properly initialize the external database. For instructions, see the Initializing an External Database wiki page.
Once the external database is established, you can configure Black Duck to use it as described in the "External Database" section of the Black Duck Installation Parameters page.
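Because connectivity failures are one of the risks of an external database, it can be useful to confirm that the database host is reachable from the cluster network before switching Black Duck over. The following is a minimal sketch of such a check using only the Python standard library. The host name shown is a hypothetical placeholder, and a successful TCP connection only shows that the port is open, not that the database is initialized or that credentials are valid.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint; 5432 is the default PostgreSQL port.
    host, port = "blackduck-db.example.internal", 5432
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```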
Periodic Database Backups
To maximize performance but still provide some protection against data loss, you can use emptyDir with Postgres, but back up data regularly to a persistent volume.
This approach may make sense for OpsSight users, for the following reasons:
- The OpsSight Connector can have a high rate of scanning in busy clusters, so high performance is a priority.
- Black Duck's Postgres database can be fragile if run on unpredictable filesystems, such as GlusterFS or even a slow NFS device.
Black Duck does not provide a way to configure periodic database backups. You must work with your database administrator to set this up independently.
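As a starting point for that conversation, here is one possible shape for a periodic backup job, sketched in Python. It is not a Black Duck feature. It assumes the standard PostgreSQL `pg_dump` client is installed where the script runs, that the password is supplied through the usual PostgreSQL mechanisms (the `PGPASSWORD` environment variable or a `.pgpass` file), and that `backup_dir` is a directory on a persistent volume. The host, user, database name, mount path, and retention count are all hypothetical placeholders, and scheduling (for example, cron or a Kubernetes CronJob) is left to your environment.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def backup_path(backup_dir: Path, dbname: str, now: datetime) -> Path:
    """Build a timestamped dump file path; names sort chronologically."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return backup_dir / f"{dbname}-{stamp}.dump"


def pg_dump_command(host: str, port: int, user: str, dbname: str,
                    out_file: Path) -> List[str]:
    """Return the pg_dump argument list for a custom-format dump."""
    return ["pg_dump", "-h", host, "-p", str(port), "-U", user,
            "-F", "c", "-f", str(out_file), dbname]


def prune_old_backups(backup_dir: Path, dbname: str, keep: int) -> List[Path]:
    """Delete all but the newest `keep` dumps; return the deleted paths."""
    dumps = sorted(backup_dir.glob(f"{dbname}-*.dump"))
    stale = dumps[:-keep] if keep > 0 else dumps
    for path in stale:
        path.unlink()
    return stale


def run_backup(host: str, port: int, user: str, dbname: str,
               backup_dir: Path, keep: int = 7) -> Path:
    """Take one backup, then prune older ones. Raises if pg_dump fails."""
    target = backup_path(backup_dir, dbname, datetime.now(timezone.utc))
    subprocess.run(pg_dump_command(host, port, user, dbname, target), check=True)
    prune_old_backups(backup_dir, dbname, keep)
    return target


if __name__ == "__main__":
    # All values below are placeholders, not Black Duck defaults.
    run_backup("postgres", 5432, "backup_user", "example_db",
               Path("/mnt/blackduck-backups"))
```

Pruning runs only after `pg_dump` succeeds, so a failed backup never removes the last good dump. Restoring from a custom-format dump is done with `pg_restore`; test the restore path before relying on the backups.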