Unity Catalog

https://learn.microsoft.com/en-gb/azure/databricks/data-governance/unity-catalog/create-metastore

1. What Unity Catalog actually is

  • Unity Catalog is a Databricks account-level governance solution.
  • The metastore is the central data governance container in Unity Catalog. It holds catalogs, schemas, and tables, plus permissions.
  • The metastore is created once per region per Databricks account — not per workspace.
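
The containment described above (metastore, then catalogs, schemas, and tables) can be sketched as a small data structure. This is only an illustrative model, not a Databricks API: the `Metastore` class, its methods, and the names `primary`, `uksouth`, `sales`, and `orders` are all hypothetical. It assumes tables are addressed by the three-level name `catalog.schema.table`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Metastore:
    """Illustrative stand-in for a Unity Catalog metastore (not a Databricks API)."""
    name: str
    region: str
    # catalog name -> schema name -> set of table names
    catalogs: Dict[str, Dict[str, Set[str]]] = field(default_factory=dict)

    def add_table(self, catalog: str, schema: str, table: str) -> str:
        """Register a table and return its three-level name."""
        self.catalogs.setdefault(catalog, {}).setdefault(schema, set()).add(table)
        return f"{catalog}.{schema}.{table}"

    def tables(self) -> List[str]:
        """List every table as catalog.schema.table, sorted for stable output."""
        return sorted(
            f"{c}.{s}.{t}"
            for c, schemas in self.catalogs.items()
            for s, tbls in schemas.items()
            for t in tbls
        )


# Hypothetical example: one metastore holding per-environment catalogs.
metastore = Metastore(name="primary", region="uksouth")
metastore.add_table("prod", "sales", "orders")
metastore.add_table("dev", "sales", "orders")
print(metastore.tables())
```

Keeping environments as separate catalogs inside one metastore is one common way to get per-environment separation without giving up the shared governance layer.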

2. Relationship between Unity Catalog and workspaces

  • Workspaces (Dev, Test, WebAuth, Prod, etc.) attach to a Unity Catalog metastore, but do not host it.

  • Once attached, users in that workspace can see the governed data (if permitted).

  • Best practice:

    • One primary metastore per region that all relevant workspaces attach to.
    • Avoid creating separate metastores for each environment unless you want total isolation (different policies, different lineage, different governance).
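
As a rough sketch of the relationship above, the toy model below treats the account as the owner of metastores and workspaces as things that attach to one. Everything here is hypothetical (the `Account` class, its methods, and the workspace and region names), and it assumes a workspace attaches to the metastore in its own region. It does not call any real Databricks API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Account:
    """Toy model of a Databricks account; all names here are hypothetical."""
    metastores: Dict[str, str] = field(default_factory=dict)   # region -> metastore name
    assignments: Dict[str, str] = field(default_factory=dict)  # workspace -> metastore name

    def create_metastore(self, name: str, region: str) -> None:
        # One metastore per region per account.
        if region in self.metastores:
            raise ValueError(
                f"region {region!r} already has metastore {self.metastores[region]!r}"
            )
        self.metastores[region] = name

    def attach(self, workspace: str, workspace_region: str) -> str:
        # Assumption: a workspace attaches to the metastore in its own region.
        if workspace_region not in self.metastores:
            raise LookupError(f"no metastore in region {workspace_region!r}")
        self.assignments[workspace] = self.metastores[workspace_region]
        return self.assignments[workspace]


account = Account()
account.create_metastore("primary", "uksouth")
for ws in ("dev", "test", "prod"):
    account.attach(ws, "uksouth")
print(account.assignments)
```

The point of the model is that the metastore lives on the account, so attaching or detaching a workspace never moves or copies the governed data definitions.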

3. Recommended best practice for environments

  • Prod, Test, Dev workspaces: attach them all to the same Unity Catalog metastore. This ensures consistent governance, lineage, data sharing, and central policy management.
  • WebAuth workspace: if it’s just for authentication or a separate function and doesn’t need governed data, you don’t have to attach it. If it does need governed data, it can also attach to the same metastore.
  • If you need hard isolation (for compliance reasons, e.g. regulated data), you can spin up a separate metastore, but this means duplicating policies and governance.
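
The recommendations above amount to a simple decision rule, sketched here under stated assumptions. The `Workspace` fields, the `metastore_for` helper, and the metastore names are hypothetical and exist only to make the rule explicit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Workspace:
    name: str
    needs_governed_data: bool = True
    needs_hard_isolation: bool = False


def metastore_for(ws: Workspace, shared: str = "primary") -> Optional[str]:
    """Return the metastore a workspace should attach to, or None to leave it unattached."""
    if not ws.needs_governed_data:
        return None  # e.g. a WebAuth workspace that only handles authentication
    if ws.needs_hard_isolation:
        # Separate metastore: policies and governance must be duplicated there.
        return f"{ws.name.lower()}-isolated"
    return shared


workspaces = [
    Workspace("Dev"),
    Workspace("Test"),
    Workspace("Prod"),
    Workspace("WebAuth", needs_governed_data=False),
]
plan = {ws.name: metastore_for(ws) for ws in workspaces}
print(plan)
```

Checking `needs_governed_data` first reflects the guidance that a workspace with no need for governed data does not have to be attached at all.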

4. Think of it like this:

  • Unity Catalog Metastore = central policy + governance plane.
  • Workspaces = execution environments that plug into that plane.

In summary:

  • The Unity Catalog metastore is created at the Databricks account level (not inside any one workspace).
  • You then attach your Dev, Test, and Prod workspaces to that same metastore so governance is consistent across environments.
  • Your WebAuth workspace can be attached if it actually needs to interact with governed data.