Gitlab CICD - vmware/versatile-data-kit GitHub Wiki
CI is based on Gitlab CI.
The project CICD can be seen at https://gitlab.com/vmware-analytics/versatile-data-kit/-/pipelines
Contact us if you need access to the CI/CD pipelines.
On each change, CICD automatically deploys the latest version of Control Service and quickstart-vdk. The deployment is used for demos and by the automated release acceptance tests.
- Get KUBECONFIG from either LastPass (ask the team if you need access) or from Gitlab Variables
- You also need to install aws-cli
- Set the following variables (see CICD Variables below for definitions; their values can be taken from Gitlab Variables or LastPass)
export KUBECONFIG=..
export AWS_ACCESS_KEY_ID=..
export AWS_SECRET_ACCESS_KEY=..
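Before running kubectl against the CICD cluster, it can help to confirm that the three variables above are actually set. The following is a minimal sketch; the function name `check_cicd_env` is hypothetical and not part of the project tooling.

```shell
# Report which of the required variables are unset or empty.
# Returns 0 when all three are present, 1 otherwise.
# NOTE: check_cicd_env is an illustrative helper, not part of the VDK repository.
check_cicd_env() {
  missing=0
  for name in KUBECONFIG AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible use is `check_cicd_env && kubectl get pods`, so that kubectl only runs once the environment is complete.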
The API is deployed behind API Gateway at https://iaclqhm5xk.execute-api.us-west-1.amazonaws.com/data-jobs/swagger-ui.html
Configuration of API Gateway is at https://us-west-1.console.aws.amazon.com/apigateway/main/api-detail?api=iaclqhm5xk&integration=qbji9io&region=us-west-1&routes=taal8qu&stage=$default . Only a limited number of people have access to make changes. Open a GitHub issue if you need something changed.
The CICD uses the following variables. They are injected as environment variables in Gitlab CI. Actual values can be seen in CICD -> Variables of the Gitlab project.
Edited using https://www.tablesgenerator.com/markdown_tables# (use File -> Paste table data ...)
Name | Description | (most likely) Value |
---|---|---|
PIP_EXTRA_INDEX_URL | Extra index used to download packages from PyPI | https://pypi.org/simple |
PIP_REPO_UPLOAD_URL | PyPI URL where Python packages are released/uploaded | https://upload.pypi.org/legacy/ |
PIP_REPO_UPLOAD_USER_NAME | User name used for the PyPI upload URL | |
PIP_REPO_UPLOAD_USER_PASSWORD | Password used for the PyPI upload URL | |
VDK_DOCKER_REGISTRY_URI | The main container registry of Versatile Data Kit where build artifacts are uploaded. It is public and anyone can pull from it. | registry.hub.docker.com/versatiledatakit |
VDK_DOCKER_REGISTRY_USERNAME | User used to authenticate against VDK container registry in order to publish. | |
VDK_DOCKER_REGISTRY_PASSWORD | Password of user used to authenticate against VDK container registry in order to publish. | |
NPM_REGISTRY | NPM package registry hostname. | registry.npmjs.org |
NPM_USERNAME | User used to authenticate against NPM registry. | versatiledatakit |
NPM_PASSWORD | Password of user used to authenticate against NPM registry. | |
NPM_TOKEN | Token generated for automation purposes in NPM registry in order to publish. | |
Name | Description | (most likely) Value |
---|---|---|
KUBECONFIG | Contains the kubeconfig for the Kubernetes cluster (in AWS EKS) used to deploy the CICD Gitlab runners and the CICD demo deployment of Versatile Data Kit | |
AWS_DEFAULT_REGION | Default AWS region of the EKS (Kubernetes) cluster | us-west-1 |
AWS_ACCESS_KEY_ID | Used to authenticate against AWS to get access to kubernetes | |
AWS_SECRET_ACCESS_KEY | Used to authenticate against AWS to get access to kubernetes | |
CICD_CONTAINER_REGISTRY_URI | Container registry used to house cicd/demo data job images | ghcr.io/versatile-data-kit-dev/dp |
CICD_CONTAINER_REGISTRY_USER_NAME | User used to authenticate against cicd container registry | versatile-data-kit-dev |
CICD_CONTAINER_REGISTRY_USER_PASSWORD | User password used to authenticate against cicd container registry | |
CICD_GIT_URI | CICD Git uri used to house cicd/demo data jobs sources | github.com/versatile-data-kit-dev/cicd-data-jobs.git |
CICD_GIT_USER | User used to authenticate against CICD git | versatile-data-kit-dev |
CICD_GIT_PASSWORD | User password used to authenticate against cicd git | |
VDK_API_TOKEN | OAuth2 (CSP) API token used to authenticate against Control Service API | |
VDK_CONTROL_SERVICE_REST_API_URL | Control Service API URI deployed in demo/cicd environment | https://iaclqhm5xk.execute-api.us-west-1.amazonaws.com/data-jobs/swagger-ui.html# |
CICD_VDK_TRINO_HOST | Hostname of the Trino database deployed in our demo/cicd env; use kubectl port-forward service/test-trino 8080:8080 to access it locally | test-trino |
CICD_VDK_TRINO_PORT | Port on which the Trino database is deployed in our demo/cicd env | 8080 |
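The port-forward mentioned for CICD_VDK_TRINO_HOST can be checked with a small probe once it is running. This is a sketch under assumptions: the helper names `trino_info_url` and `trino_ready` are hypothetical, and it assumes the demo Trino serves its `/v1/info` REST endpoint over plain HTTP without authentication.

```shell
# Build the URL of the Trino info endpoint as seen through the local port-forward.
# $1 is the local port (defaults to 8080, matching the port-forward above).
# NOTE: both helpers are illustrative, not part of the VDK repository.
trino_info_url() {
  printf 'http://localhost:%s/v1/info\n' "${1:-8080}"
}

# Succeed only if the forwarded Trino endpoint answers with an HTTP success status.
# ASSUMPTION: plain HTTP and no authentication on the demo deployment.
trino_ready() {
  curl -fsS "$(trino_info_url "${1:-8080}")" >/dev/null
}
```

With `kubectl port-forward service/test-trino 8080:8080` running in one terminal, `trino_ready && echo "Trino is reachable"` in another reports whether the forward works.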
Connecting to the PostgreSQL instance inside the PSQL pod requires a password.
The password can be retrieved from the Kubernetes secret:
kubectl get secret psql-postgresql -o jsonpath="{.data.postgres-password}" | base64 --decode
Once you have the password, use it to access the PostgreSQL instance from within the PSQL pod.
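The two steps can be combined into one helper, sketched below. The pod name `psql-postgresql-0` and the user `postgres` are assumptions made for illustration (override them with the hypothetical `PSQL_POD` and `PSQL_USER` variables); only the secret name and key come from the command above.

```shell
# Open psql inside the PostgreSQL pod using the password stored in the secret.
# ASSUMPTIONS: pod name "psql-postgresql-0" and user "postgres" are illustrative
# defaults, not confirmed for this deployment; override via PSQL_POD / PSQL_USER.
psql_in_pod() {
  pod="${PSQL_POD:-psql-postgresql-0}"
  user="${PSQL_USER:-postgres}"
  # Same retrieval command as above: read the secret and decode it from base64.
  password="$(kubectl get secret psql-postgresql \
    -o jsonpath="{.data.postgres-password}" | base64 --decode)" || return 1
  if [ -z "$password" ]; then
    echo "could not read postgres-password from secret psql-postgresql" >&2
    return 1
  fi
  # PGPASSWORD is the environment variable psql reads the password from.
  kubectl exec -it "$pod" -- env PGPASSWORD="$password" psql -U "$user"
}
```

Run `kubectl get pods` first to confirm the actual pod name. Note that the password is passed on the command line of `kubectl exec`, which is acceptable for a demo environment but may be visible in the local process list.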