Databricks Architecture & Security Workshop – Discussion Summary

Note: AI-generated summary. Please validate technical accuracy.


Key Topics & Discussion Summaries

1. Project Scope, Success Criteria & Onboarding

  • Reviewed overall project scope and the need to resolve key technical questions before the workshop.
  • Onboarding to AIBE is ongoing.
  • Clear success criteria must be defined.
  • Emphasis on removing blockers prior to engaging a wider stakeholder group.

2. High-Level Architecture & Network Design

  • Azure-based architecture presented, including:
    • VPN / ExpressRoute connectivity
    • Azure Blob Storage for raw data
    • Medallion architecture (Bronze / Silver / Gold)
  • Working assumption: no public IPs are used, and all traffic is routed over the Microsoft backbone.
  • Network topology includes:
    • Control plane, compute plane, data plane
    • VNet injection
    • Dedicated subnets for clusters and relay services
    • Private Endpoints for Azure PaaS services
  • Clarification required on the relay agent role in the host subnet.
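
The subnet layout above can be sketched as a small planning helper. This is a minimal illustration only: the address space, the prefix length, and the subnet names ("host", "container", "private-endpoints") are assumptions made for the example, since the workshop did not fix any CIDR ranges.

```python
import ipaddress

# Hypothetical address space; replace with the range AIB actually allocates.
VNET_CIDR = "10.20.0.0/22"

# Azure reserves five addresses in every subnet (the first four and the last).
AZURE_RESERVED_PER_SUBNET = 5


def plan_subnets(vnet_cidr, new_prefix, names):
    """Assign consecutive, equally sized subnets of the VNet to the given names."""
    vnet = ipaddress.ip_network(vnet_cidr)
    blocks = vnet.subnets(new_prefix=new_prefix)
    plan = {}
    for name in names:
        try:
            plan[name] = next(blocks)
        except StopIteration:
            raise ValueError(f"{vnet_cidr} is too small for subnet '{name}'")
    return plan


def usable_addresses(subnet):
    """Addresses left for workloads after Azure's per-subnet reservation."""
    return subnet.num_addresses - AZURE_RESERVED_PER_SUBNET


# Illustrative names only; the role of the host subnet is still an open question.
plan = plan_subnets(VNET_CIDR, 24, ["host", "container", "private-endpoints"])

for name, subnet in plan.items():
    print(f"{name}: {subnet} ({usable_addresses(subnet)} usable addresses)")
```

Sizing the cluster subnets this way makes it easy to check early whether the allocated range leaves enough addresses for the expected number of cluster nodes.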

3. Authentication & Identity Management

  • Authentication is handled by Microsoft Entra ID, with connectivity over Azure Private Link.
  • Confirmed that authentication tokens remain on the Microsoft backbone.
  • Proposed core user groups:
    • Workspace Admins
    • Data Scientists
    • Data Engineers

4. Databricks Control Plane Connectivity

  • Explored how the Databricks control plane connects to customer VNets and Azure services.
  • Open questions around:
    • Azure Firewall routing
    • UDR usage
    • Whether logs / Hive Metastore require public IPs
  • Azure Databricks confirmed as a first-party Microsoft service; further validation of its FQDN/IP requirements was requested.
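
One way to gather evidence for the public-IP and FQDN questions is to resolve each required hostname from inside the VNet and check whether the answers are private addresses. The sketch below uses only the Python standard library; the function names are made up for this example, and the hostnames to test must come from the validated Databricks FQDN list, which this summary does not contain.

```python
import ipaddress
import socket


def resolve(fqdn, port=443):
    """Return the sorted set of IP addresses that DNS returns for the FQDN."""
    infos = socket.getaddrinfo(fqdn, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def classify_addresses(addresses):
    """Label each address 'private' or 'public'.

    Note: is_private is also true for loopback and link-local ranges,
    so a 'private' result should still be compared with the expected
    Private Endpoint address range.
    """
    return {
        addr: "private" if ipaddress.ip_address(addr).is_private else "public"
        for addr in addresses
    }


# Usage from a machine inside the VNet (hostname is a placeholder, not a real endpoint):
#   print(classify_addresses(resolve("<workspace-hostname>")))

# Offline demonstration with literal addresses:
print(classify_addresses(["10.1.2.3", "8.8.8.8"]))
```

A hostname that resolves to a public address does not by itself prove that traffic leaves the Microsoft backbone, but it does show which firewall rules would need to reference public ranges.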

5. Azure PaaS Services (Phase Zero)

  • Required services:
    • Azure Data Lake Storage (ADLS)
    • Azure Key Vault (AKV)
  • Databricks clusters require AKV access for secrets.
  • No requirement for ADF or other PaaS services at this stage.

6. Databricks Control Plane & Storage Access

  • Classic compute
    • Control plane does not access ADLS directly.
    • Compute plane accesses storage via Managed Identity.
  • Serverless compute
    • Requires Private Link connectivity from the Databricks-managed cloud account, where serverless compute runs.
    • Planned for later phases.

7. Data Movement & Access Patterns

  • Preferred approach is ADLS for governance and simplicity.
  • Copying data to Blob Storage is possible if required.
  • MVP decision:
    • Single container for Bronze / Silver / Gold layers.
    • Option to separate later as scale increases.
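
The single-container MVP layout can be sketched as a path helper. This is an illustration under stated assumptions: the storage account name, container name, and dataset name are hypothetical, and the layer-as-top-level-folder convention is one possible layout rather than something agreed in the workshop. Only the `abfss://<container>@<account>.dfs.core.windows.net/<path>` URI shape is standard ADLS Gen2 syntax.

```python
# Medallion layers kept as top-level folders inside one container (assumed layout).
LAYERS = ("bronze", "silver", "gold")


def layer_path(account, container, layer, dataset):
    """Build the abfss URI for a dataset in one medallion layer."""
    layer = layer.lower()
    if layer not in LAYERS:
        raise ValueError(f"unknown layer '{layer}', expected one of {LAYERS}")
    return f"abfss://{container}@{account}.dfs.core.windows.net/{layer}/{dataset}"


# Hypothetical names for illustration only.
for layer in LAYERS:
    print(layer_path("examplestore", "lakehouse", layer, "orders"))
```

Keeping the layer in a single function argument means that a later move to one container per layer only changes this helper, not the code that calls it.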

8. Service Principals & RBAC

  • Databricks-managed and Entra ID service principals were compared.
  • No limitations identified with Entra ID service principals.
  • Least-privilege RBAC reviewed for:
    • ADLS
    • Key Vault
  • Documentation links shared via chat.

9. Custom OS Images & Security Hardening

  • Question raised around custom OS images and installing security/monitoring agents.
  • Databricks guidance:
    • OS customisation not recommended.
    • Native monitoring and compliance tooling available.
  • Identified as a potential security blocker requiring AIB review.

10. Infrastructure Deployment & Automation

  • All infrastructure to be provisioned via Terraform, including:
    • VNets
    • Private Endpoints
    • Databricks workspaces
  • Databricks Terraform provider to be used.
  • Application teams manage Databricks configuration post-deployment.

11. Security Analysis Tool (SAT)

  • SAT introduced to assess Databricks workspace security posture.
  • Limitation:
    • Does not assess broader Azure infrastructure.
  • Additional observability tooling is required to meet AIB standards.

12. Integration with Bitbucket & Cloudbase

  • Requirement for outbound connectivity from Databricks compute.
  • Used for:
    • Source control (Bitbucket)
    • CI/CD (Cloudbase)
  • Network specifics still to be confirmed.

13. Traffic Flow & Security Considerations

  • Traffic is predominantly outbound from AIB to Databricks.
  • Minimal inbound traffic.
  • Firewall traversal and routing remain under review.
  • Must align with AIB network security patterns.

14. Workshop Readiness

  • Workshop will proceed with:
    • Explicit assumptions documented
    • Open items tracked for follow-up
  • Outstanding questions to be resolved post-workshop.

Outstanding Questions & Potential Blockers

  1. Relay Agent / Host Subnet

    • Is a VM or agent required for Secure Cluster Connectivity (SCC)?
    • What is hosted in the host subnet?
  2. Azure Firewall Traffic Routing

    • Does traffic to the Databricks control plane stay on the Microsoft backbone?
    • Are public IPs involved?
  3. FQDN / IP Resolution

    • How are Databricks control plane FQDNs resolved?
    • Firewall rule implications?
  4. Custom OS Images & Hardening

    • Can AIB-mandated security agents be deployed?
    • Are alternative controls acceptable?
  5. Outbound Connectivity Patterns

    • Any bidirectional or special outbound requirements?
    • Bitbucket and Cloudbase specifics?
  6. Serverless Compute Connectivity

    • When is it required (Phase Zero vs later)?
    • What network changes are needed?
  7. Minimum RBAC Permissions

    • Confirm least-privilege RBAC for storage and Key Vault.
  8. Bitbucket & Cloudbase Network Path

    • Exact connectivity paths and firewall requirements.
  9. Databricks Control Plane → ADLS

    • Any direct connectivity required, or compute-only access?
  10. IAM Roles & Entra ID Groups

    • Minimum required roles and group model.

These items were identified as potential blockers and require follow-up with Databricks and internal AIB stakeholders before finalising the architecture.
