# Kubeflow Pipelines Components
The package `components` contains Kubeflow pipeline components for interacting with Google Cloud.
The following components are implemented:

- `extract_table`: Extract a table from BigQuery to Cloud Storage.
- `lookup_model`: Look up a model in the Vertex AI Model Registry given a name (and optionally a version).
- `model_batch_predict`: Run a Batch Prediction Job.
- `upload_model`: Upload a new model version to the Vertex AI Model Registry, import a model evaluation, and update the "default" tag on the model if the new version (challenger) is superior to the previous (champion) model.

These components augment or extend existing functionality, or add new functionality that is not found in the Google Cloud Pipeline Components list.
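
To illustrate how these components are consumed, here is a minimal sketch of a pipeline that chains two of them together. The argument names and the `model` output used below are assumptions for the sake of the example, not the components' actual signatures.

```python
from kfp import dsl

# Assumed import path; the argument names below are illustrative.
# See the package itself for the real component interfaces.
from components import lookup_model, model_batch_predict


@dsl.pipeline(name="batch-prediction-sketch")
def batch_prediction_pipeline(project_id: str, location: str, model_name: str):
    # Find an existing model version in the Vertex AI Model Registry.
    model_task = lookup_model(
        project_id=project_id,
        location=location,
        model_name=model_name,
    )

    # Run a Batch Prediction Job against the model that was found.
    model_batch_predict(
        project_id=project_id,
        location=location,
        model=model_task.outputs["model"],
    )
```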

## Creating a new pipeline components package

Update the Python dependencies in `poetry.lock`, `pyproject.toml`, and in `packages_to_install` (in the `@component` decorator):

- In `pyproject.toml`, add any dependencies that your component uses under `[tool.poetry.dependencies]` (each pinned to a specific version)
- In `packages_to_install` (in the `@component` decorator used to define your component), add any dependencies that your component uses (each pinned to a specific version)
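
For the `packages_to_install` half of this step, here is a minimal sketch (assuming KFP v2's `kfp.dsl.component`; the component name, package, and version are placeholders, and the same pin would also be added under `[tool.poetry.dependencies]`):

```python
from kfp.dsl import component


@component(
    # Pin each runtime dependency to a specific version, mirroring pyproject.toml.
    packages_to_install=["google-cloud-bigquery==3.14.1"],
)
def my_new_component(table_id: str):
    # Import inside the function body so the pinned package is installed
    # and imported in the component's container at runtime.
    from google.cloud import bigquery

    bigquery.Client().get_table(table_id)
```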

Define your pipeline components using the `@component` decorator in Python files under `components/src/components`.
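
For example, here is a sketch of what a new component file could look like, with typed artifact inputs and outputs so that KFP can pass data between pipeline steps. The file location follows the directory above, but the component name, signature, and base image are hypothetical:

```python
# components/src/components/train_model.py (hypothetical file)
from kfp.dsl import Dataset, Input, Model, Output, component


@component(base_image="python:3.10")  # assumed base image
def train_model(
    input_data: Input[Dataset],
    trained_model: Output[Model],
    project_id: str,
):
    """Illustrative body: read a dataset artifact and write a model artifact."""
    from pathlib import Path

    # KFP artifacts expose a local `.path` backed by the pipeline root.
    data = Path(input_data.path).read_text()

    # A real component would train something here; this sketch just
    # records where the "model" was written.
    out = Path(trained_model.path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(f"model for {project_id}, trained on {len(data)} bytes")
```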

You will also need to update the `__init__.py` file so that your new components can be imported, and provide unit tests for them (see Testing components below).
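
For example, assuming `__init__.py` re-exports one component per module (with module names matching the component names), adding a new component might look like this:

```python
# components/src/components/__init__.py (sketch; module names are assumed)
from .extract_table import extract_table
from .lookup_model import lookup_model
from .model_batch_predict import model_batch_predict
from .upload_model import upload_model
from .train_model import train_model  # hypothetical new component

__all__ = [
    "extract_table",
    "lookup_model",
    "model_batch_predict",
    "upload_model",
    "train_model",
]
```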
See the Kubeflow Pipelines documentation for more information about writing pipeline components.

Finally, you will need to install this new components package into the `pipelines` package. Run `make install` from the root of the repository to install the new components.

## Testing components

Unit tests for components are defined using `pytest` and should be created under `components/tests`.

Take a look at the tests for the existing components for examples of how to write these tests and how to mock/patch KFP Artifact types.
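
As a sketch of what such a test can look like, here is a pytest test for the hypothetical `train_model` component above. It assumes KFP v2, where the Python function wrapped by `@component` is exposed as `.python_func`, and it stands in for the KFP Artifact types with simple mocks:

```python
from unittest import mock

from components import train_model  # hypothetical component from the sketch above


def test_train_model(tmp_path):
    # Replace the KFP Artifact types with mocks that expose the `.path`
    # attribute the component reads and writes.
    data_file = tmp_path / "data.csv"
    data_file.write_text("a,b\n1,2\n")
    model_file = tmp_path / "model" / "model.txt"

    input_data = mock.Mock(path=str(data_file))
    trained_model = mock.Mock(path=str(model_file))

    # Call the Python function wrapped by the @component decorator
    # directly, rather than the compiled KFP component.
    train_model.python_func(
        input_data=input_data,
        trained_model=trained_model,
        project_id="test-project",
    )

    assert model_file.exists()
```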

To run the unit tests, run `make test` from the root of the repository.