Managed Identities - razmipatel/Random GitHub Wiki
The distinction between the Databricks Managed Identity (DBMI) and the Azure Databricks Access Connector (ADAC) is subtle but strategically important for security and architecture design. Let’s break it down clearly.
| Feature | Databricks Managed Identity (DBMI) | Azure Databricks Access Connector (ADAC) |
|---|---|---|
| Purpose | Used by the Databricks control plane to access resources in your Azure subscription (e.g., to provision VMs, read/write to storage accounts, etc.) | Used by Databricks users or workloads (within the workspace) to securely access data in external Azure resources (e.g., ADLS Gen2, Key Vault) |
| Scope | Workspace-level system identity (infrastructure operations) | User/workload-level identity (data access operations) |
| Configured in | Managed by Azure Databricks internally, visible under “Managed identity” in the Databricks workspace resource in Azure Portal | Created manually by you as a standalone resource (Microsoft.Databricks/accessConnectors) and referenced from the workspace, for example by a Unity Catalog storage credential |
| Used by | Azure Databricks backend and jobs that interact with Azure APIs | Unity Catalog storage credentials, and the Databricks clusters, SQL Warehouses, and notebooks that access data through them |
| Typical RBAC Role Assignments | Contributor/Reader roles to required Azure resources for provisioning or monitoring | Storage Blob Data Contributor, Key Vault Secrets User, etc. on data resources |
| Authentication Flow | System-to-Azure-resource | User/workload-to-Azure-resource (via Entra ID) |
- What the DBMI is: When you deploy an Azure Databricks workspace, Azure automatically creates a managed identity for that workspace. This identity is used by the Databricks control plane (operated by Microsoft) to perform Azure Resource Manager (ARM) operations within your subscription.
- Typical use cases:
  - Accessing or writing to storage during workspace creation.
  - Writing diagnostic logs to Azure Log Analytics.
  - Managing network interfaces or private endpoints if configured in a VNet.
  - Reading from Key Vault for workspace-level secrets.
- Key takeaway: Think of the DBMI as “the identity of the workspace itself”, used by Azure Databricks infrastructure, not by your data workloads.
- What the ADAC is: A separate Azure resource that carries a managed identity (system-assigned by default, or user-assigned) dedicated to data access from Databricks.
- Resource type: Microsoft.Databricks/accessConnectors
- Primary function: Enables secure Entra ID-based authentication from Databricks to Azure resources such as:
  - Azure Data Lake Storage (ADLS Gen2)
  - Azure Synapse
  - Azure SQL
  - Key Vault
  - Other data sources that support Entra ID auth
- Why it exists: Historically, Databricks accessed Azure data using service principals and client secrets, which have to be stored and rotated (less secure). With the Access Connector, Databricks can use a managed identity and OAuth 2.0 token exchange to obtain short-lived tokens automatically, with no secrets or credentials to manage.
- Key takeaway: The ADAC connects Databricks users and jobs to Azure data plane resources through a managed identity, enabling secure, least-privilege access.
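
To make the "no secrets" point concrete, here is a minimal sketch of how an access connector is typically referenced: by its ARM resource ID, inside a Unity Catalog storage credential. The field names `azure_managed_identity` and `access_connector_id` reflect my understanding of the Unity Catalog storage credentials REST API and should be checked against the current Databricks documentation. The subscription ID, resource group, connector name, and credential name are hypothetical placeholders.

```python
import json


def access_connector_id(subscription_id: str, resource_group: str, connector_name: str) -> str:
    """Build the ARM resource ID of a Microsoft.Databricks/accessConnectors resource."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Databricks/accessConnectors/{connector_name}"
    )


def storage_credential_payload(credential_name: str, connector_id: str) -> dict:
    """Request body for a storage credential backed by the connector's managed identity.

    Assumption: field names follow the Unity Catalog storage credentials API.
    Note that the payload carries only a resource ID, never a client secret.
    """
    return {
        "name": credential_name,
        "azure_managed_identity": {"access_connector_id": connector_id},
    }


if __name__ == "__main__":
    # Hypothetical placeholder values, for illustration only.
    connector = access_connector_id(
        "00000000-0000-0000-0000-000000000000", "rg-data-platform", "adac-lakehouse"
    )
    print(json.dumps(storage_credential_payload("lakehouse-cred", connector), indent=2))
```

The payload contains an identifier rather than a credential, so there is nothing to rotate or leak. Access is governed entirely by the RBAC roles granted to the connector's identity.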
Think of an airline analogy:
| Role | Analogy |
|---|---|
| DBMI | The airline company’s operational pass – allows the company to fuel planes, move luggage, and manage gates (infrastructure-level) |
| ADAC | The pilot or crew’s access card – allows them to open cockpit doors or use airport services (data-level access for specific workloads) |

Which identity applies in common scenarios:

| Scenario | Use DBMI | Use ADAC |
|---|---|---|
| Databricks needs to provision or manage Azure infrastructure | ✅ | ❌ |
| Databricks users need to read/write to ADLS or Key Vault | ❌ | ✅ |
| Unity Catalog accessing ADLS through a storage credential | ❌ | ✅ |
| Deploying the workspace or configuring monitoring | ✅ | ❌ |
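
For the data-access rows, the access connector's identity needs an RBAC role on the target resource. The sketch below builds the ARM path and request body for a role assignment scoped to a single storage container. It assumes the standard `Microsoft.Authorization/roleAssignments` request shape and the built-in role GUID for Storage Blob Data Contributor; verify both against the Azure documentation before use. All subscription, account, and principal values are hypothetical.

```python
import uuid

# Built-in role definition GUID for "Storage Blob Data Contributor".
# Assumption: verify against the Azure built-in roles reference.
STORAGE_BLOB_DATA_CONTRIBUTOR = "ba92f5b4-2d11-453d-a403-e96b0029c9fe"


def container_scope(subscription_id: str, resource_group: str, account: str, container: str) -> str:
    """ARM scope for one blob container, the narrowest scope for blob data roles."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


def role_assignment_request(scope: str, subscription_id: str, principal_id: str, role_guid: str):
    """Return (path, body) for a PUT that creates a role assignment at `scope`.

    The caller still has to append an api-version query parameter and send the
    request with a valid ARM bearer token; neither is handled here.
    """
    assignment_name = str(uuid.uuid4())  # role assignment names are GUIDs
    path = f"{scope}/providers/Microsoft.Authorization/roleAssignments/{assignment_name}"
    body = {
        "properties": {
            "roleDefinitionId": (
                f"/subscriptions/{subscription_id}"
                f"/providers/Microsoft.Authorization/roleDefinitions/{role_guid}"
            ),
            "principalId": principal_id,  # object ID of the connector's managed identity
            "principalType": "ServicePrincipal",  # managed identities use this principal type
        }
    }
    return path, body
```

Scoping to a container rather than the storage account or subscription keeps the grant as small as the workload allows.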
- Enable both properly:
  - The DBMI should have minimal roles (e.g., Contributor, scoped only to the resources it provisions).
  - The ADAC should have data access roles aligned to your governance model (e.g., “Storage Blob Data Contributor” on required containers only).
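- Use Privileged Identity Management (PIM) or RBAC delegation for access connector permissions if operating in regulated environments.
- Avoid using service principals with client secrets for data access; migrate to the Access Connector for passwordless Entra ID authentication.
- Monitor usage: Audit Azure Activity Logs for the DBMI and ADAC separately to ensure they are used correctly. Activity Logs record control-plane operations only, so data reads and writes need the resource logs of the storage account or Key Vault as well.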
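
As a rough illustration of auditing the two identities separately, the sketch below groups exported Activity Log events by caller. It assumes each event has been flattened to a dict with string `caller` and `operationName` keys, and that the caller of a managed identity appears as its principal (object) ID. Real exports may nest these fields differently, so adapt the accessors to your export format. The IDs and events shown are hypothetical.

```python
from collections import defaultdict


def group_operations_by_identity(events, identities):
    """Group distinct operation names by the identity that performed them.

    events:     iterable of dicts with "caller" and "operationName" (assumed shape)
    identities: mapping of label (e.g. "DBMI") -> principal object ID
    Returns a dict of label -> sorted list of operation names; callers that are
    not in `identities` are ignored.
    """
    label_by_caller = {pid.lower(): label for label, pid in identities.items()}
    grouped = defaultdict(set)
    for event in events:
        label = label_by_caller.get(str(event.get("caller", "")).lower())
        if label is not None:
            grouped[label].add(event.get("operationName", "unknown"))
    return {label: sorted(ops) for label, ops in grouped.items()}


if __name__ == "__main__":
    # Hypothetical sample events, for illustration only.
    sample = [
        {"caller": "AAAA-1", "operationName": "Microsoft.Network/networkInterfaces/write"},
        {"caller": "bbbb-2", "operationName": "Microsoft.Storage/storageAccounts/read"},
        {"caller": "someone@example.com", "operationName": "Microsoft.Resources/deployments/write"},
    ]
    print(group_operations_by_identity(sample, {"DBMI": "aaaa-1", "ADAC": "bbbb-2"}))
```

An infrastructure operation showing up under the ADAC label, or the reverse, would be a signal that role assignments have drifted from the intended split.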
- Microsoft Docs:
  - Use managed identities in Azure Databricks
  - Configure the Azure Databricks Access Connector