Data Governance - sgml/signature GitHub Wiki

Videos

  1. Apache Atlas Introduction: Need for Governance and Metadata Management
  2. Installation & Configuration of Apache ATLAS Part 1
  3. Installation & Configuration of Apache ATLAS Part 2
  4. Data Governance using Apache ATLAS
  5. Apache Atlas: A Hands-on Course
  6. Apache Atlas Wiki

Prerequisites

data_governance_limits:
  data_quality: "Automation relies on high-quality data. Inaccurate or incomplete data can lead to errors and poor decision-making."
  complexity: "Data governance involves complex processes and policies, which makes automation difficult across diverse data sources and systems."
  human_oversight: "Human oversight is necessary for complex decision-making, exception handling, and ensuring compliance with regulations."
  integration: "Integrating automated tools with existing systems and processes can be challenging, especially with legacy systems."
  scalability: "Keeping automated governance tools scalable as data volumes grow can be difficult."
  security: "Ensuring automated processes are secure and comply with data protection regulations is crucial."
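
The data_quality and human_oversight limits above can be illustrated with a small completeness check that routes incomplete records to a person instead of processing them automatically. This is only a minimal sketch: the field names in `REQUIRED_FIELDS` and the `triage` helper are hypothetical and are not part of Apache Atlas or any other tool on this page.

```python
# Hypothetical required metadata fields; adjust to your own catalog schema.
REQUIRED_FIELDS = ("name", "owner", "source")


def triage(records, required=REQUIRED_FIELDS):
    """Split records into those safe to automate and those needing human review."""
    accepted, needs_review = [], []
    for record in records:
        # A field counts as missing when it is absent or empty.
        missing = [field for field in required if not record.get(field)]
        if missing:
            needs_review.append({"record": record, "missing": missing})
        else:
            accepted.append(record)
    return accepted, needs_review


if __name__ == "__main__":
    sample = [
        {"name": "orders", "owner": "sales", "source": "hive"},
        {"name": "clicks", "owner": "", "source": "kafka"},
    ]
    accepted, needs_review = triage(sample)
    print(len(accepted), "accepted;", len(needs_review), "sent for review")
```

Keeping the review queue separate from the accepted list makes the exception-handling path explicit rather than silently dropping or auto-correcting incomplete records.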

Apache Atlas

Comparison Chart

| Tool          | License Type       | Key Features                                                |
| ------------- | ------------------ | ----------------------------------------------------------- |
| Apache Atlas  | Apache License 2.0 | Metadata management, data lineage tracking, data cataloging |
| Amundsen      | Apache License 2.0 | Data discovery, metadata management, collaboration tools    |
| DataHub       | Apache License 2.0 | Data cataloging, metadata management, data lineage tracking |
| Magda         | Apache License 2.0 | Data cataloging, metadata management, data lineage tracking |
| Open Metadata | Apache License 2.0 | Metadata management, data cataloging, data lineage tracking |
| Egeria        | Apache License 2.0 | Metadata management, data lineage tracking, data cataloging |
| Truedat       | Apache License 2.0 | Data cataloging, metadata management, data lineage tracking |

ENV

environment_variables:
  - METADATA_CLIENT_HEAP: "1024m"
  - JAVA_HOME: "/path/to/your/java"
  - LOG_DIR: "/path/to/your/logs"
  - METADATA_COLLECTOR_ENABLED: true
  - KNOX_ENABLED: true
  - LDAP_ENABLED: true
  - TLS_ENABLED: true
  - KERBEROS_ENABLED: true
  - METADATA_OPTS: "-Xmx1024m"
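
One way to apply the settings above is to render them as shell `export` lines that can be pasted into an environment script. The sketch below assumes the list has already been loaded into Python as a list of single-key mappings (the same shape as the YAML). The `to_export_lines` helper is hypothetical, and whether your Atlas version reads each of these names, or expects lowercase `true`/`false` for the flags, is an assumption to verify against your installation.

```python
import shlex

# Values copied from the environment_variables list above.
ENVIRONMENT_VARIABLES = [
    {"METADATA_CLIENT_HEAP": "1024m"},
    {"JAVA_HOME": "/path/to/your/java"},
    {"LOG_DIR": "/path/to/your/logs"},
    {"METADATA_COLLECTOR_ENABLED": True},
    {"KNOX_ENABLED": True},
    {"LDAP_ENABLED": True},
    {"TLS_ENABLED": True},
    {"KERBEROS_ENABLED": True},
    {"METADATA_OPTS": "-Xmx1024m"},
]


def to_export_lines(entries):
    """Turn a list of single-key mappings into shell `export` lines."""
    lines = []
    for entry in entries:
        for name, value in entry.items():
            # Write booleans as lowercase true/false, matching the YAML form.
            if isinstance(value, bool):
                value = "true" if value else "false"
            # shlex.quote adds quotes only when the value needs them.
            lines.append(f"export {name}={shlex.quote(str(value))}")
    return lines


if __name__ == "__main__":
    print("\n".join(to_export_lines(ENVIRONMENT_VARIABLES)))
```

Replace the placeholder paths for `JAVA_HOME` and `LOG_DIR` with real locations before using the generated lines.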