Modularising your Configuration Files - pathfinder-analytics-uk/dab_project GitHub Wiki
resources/demo_job.job.yml
It is good practice to modularise your bundle by defining each workflow's configuration in its own `.yml` file stored in the `resources` folder.

When you move a job definition into `resources/`, remember to update the relative paths to any referenced assets (such as notebooks), since paths are resolved relative to the location of the YAML file that declares them.
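For the bundle to pick these files up, the root `databricks.yml` needs to reference them, typically with an `include` glob. A minimal sketch (the bundle name here is illustrative):

```yaml
# databricks.yml (bundle root) — "dab_project" is an assumed name
bundle:
  name: dab_project

# Pull in every resource definition stored under resources/
include:
  - resources/*.yml
```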
```yaml
resources:
  jobs:
    demo_job:
      name: demo_job
      tasks:
        - task_key: notebook_1_task
          notebook_task:
            notebook_path: ../notebooks/notebook_1.ipynb
            source: WORKSPACE
          job_cluster_key: job_cluster
      job_clusters:
        - job_cluster_key: job_cluster
          new_cluster:
            spark_version: 15.4.x-scala2.12
            spark_conf:
              # Quoted: an unquoted leading "[" would be parsed as YAML flow syntax
              spark.master: "local[*, 4]"
              spark.databricks.cluster.profile: singleNode
            azure_attributes:
              first_on_demand: 1
              availability: SPOT_WITH_FALLBACK_AZURE
              spot_bid_max_price: -1
            node_type_id: Standard_DS3_v2
            driver_node_type_id: Standard_DS3_v2
            custom_tags:
              ResourceClass: SingleNode
            spark_env_vars:
              PYSPARK_PYTHON: /databricks/python3/bin/python3
            enable_elastic_disk: true
            data_security_mode: SINGLE_USER
            runtime_engine: STANDARD
            num_workers: 0
      queue:
        enabled: true
```
Commands

```bash
# Deploy the bundle to the default target
databricks bundle deploy

# Deploy the bundle to the "test" target
databricks bundle deploy -t test
```
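The `-t` flag selects a deployment target defined in the root `databricks.yml`. A minimal sketch of a default target alongside a `test` target (the workspace host URL is a placeholder, not a real workspace):

```yaml
# databricks.yml — targets section (host is a placeholder)
targets:
  dev:
    mode: development
    default: true          # used when no -t flag is given
    workspace:
      host: https://adb-0000000000000000.0.azuredatabricks.net
  test:
    mode: development
    workspace:
      host: https://adb-0000000000000000.0.azuredatabricks.net
```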