How to set up Data Manager
- For the dm web app:
  - You must set up a database server
  - You must create a database user
  - You must create a database
- For airflow:
  - You must set up a database server
  - You must create a database user
  - You must create a database
Here is an example:
# create database
CREATE SCHEMA `beta-dm` DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
CREATE SCHEMA `beta-airflow` DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
# create user and grant user permission
CREATE USER 'airflow'@'localhost' IDENTIFIED BY 'XYZ';
CREATE USER 'airflow'@'%' IDENTIFIED BY 'XYZ';
FLUSH PRIVILEGES;
GRANT ALL ON `beta-airflow`.* TO 'airflow'@'localhost';
GRANT ALL ON `beta-airflow`.* TO 'airflow'@'%';
FLUSH PRIVILEGES;
CREATE USER 'dm'@'localhost' IDENTIFIED BY 'XYZ';
CREATE USER 'dm'@'%' IDENTIFIED BY 'XYZ';
FLUSH PRIVILEGES;
GRANT ALL ON `beta-dm`.* TO 'dm'@'localhost';
GRANT ALL ON `beta-dm`.* TO 'dm'@'%';
FLUSH PRIVILEGES;
Make sure the db.json in your mordor config matches these database settings (database name, user, and password).
I use Ubuntu 18.04 as my dev machine. Make sure you have Python 3.6 (or above) and Node.js 10.x installed.
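The prerequisites above can be checked with a short script. This is a minimal sketch, not part of DataManager: the helper names version_ge and report_tool are made up for illustration, and treating 10.x as a minimum for node is an assumption (a newer major version may or may not work with this project).

```shell
# Return success when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort), available on Ubuntu and CentOS.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether a tool is installed and meets the minimum version.
# $1 = label, $2 = detected version (may be empty), $3 = minimum version
report_tool() {
    if [ -z "$2" ]; then
        echo "$1: not found"
    elif version_ge "$2" "$3"; then
        echo "$1: $2 (ok, need >= $3)"
    else
        echo "$1: $2 (too old, need >= $3)"
    fi
}

# `python3 --version` prints e.g. "Python 3.6.9";
# `node --version` prints e.g. "v10.24.1".
py_ver=$(python3 --version 2>/dev/null | awk '{print $2}')
node_ver=$(node --version 2>/dev/null | sed 's/^v//')

report_tool "python3" "$py_ver" "3.6"
report_tool "node" "$node_ver" "10.0"
```

The script only reports what it finds; it does not install or change anything.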
Check out the source code
mkdir ~/dm
cd ~/dm
git clone https://github.com/stonezhong/DataManager.git
Create a directory ~/dm/.mordor to hold the mordor configuration. You can use existing examples as a reference when configuring mordor:
mkdir ~/dm/.mordor
# modify ~/dm/.mordor/config.json, make sure it fits your environment
Create a virtual environment
mkdir ~/dm/.venv
python3 -m venv ~/dm/.venv
source ~/dm/.venv/bin/activate
pip install pip setuptools --upgrade
pip install wheel mordor2 libsass
Set up the environment variable
# You can put it in ~/.bashrc
export MORDOR_CONFIG_DIR=~/dm/.mordor
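Before running any mordor command, it may help to confirm that the variable points at a directory that actually contains config.json. The function below is a hypothetical helper (check_mordor_config is not part of mordor); it only checks that the file exists and is readable, not that its contents are valid.

```shell
# Check that a mordor config directory looks usable.
# $1 = directory to check (defaults to $MORDOR_CONFIG_DIR)
check_mordor_config() {
    cfg_dir="${1:-${MORDOR_CONFIG_DIR:-}}"
    if [ -z "$cfg_dir" ]; then
        echo "MORDOR_CONFIG_DIR is not set"
        return 1
    fi
    if [ ! -d "$cfg_dir" ]; then
        echo "directory does not exist: $cfg_dir"
        return 1
    fi
    if [ ! -r "$cfg_dir/config.json" ]; then
        echo "missing or unreadable: $cfg_dir/config.json"
        return 1
    fi
    echo "ok: $cfg_dir/config.json"
}

# With no argument, the function checks the exported variable.
check_mordor_config || echo "fix the problem above before running mordor"
```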
My target machine runs CentOS 7. In this example, the target machine's hostname is dmdemo2.
Install some necessary OS packages
yum install tmux mysql-devel graphviz
Optionally, if you want to use pyspark, you need to install a JRE:
yum install java-1.8.0-openjdk
# on Oracle Linux 8, you can do
yum install jdk1.8
First, initialize target for mordor
source ~/dm/.venv/bin/activate
mordor -a init-host -o beta
Build and deploy
cd ~/dm/DataManager/server
DM_STAGE=beta ./build.sh
mordor -a stage -p dm -s beta --update-venv
mordor -a stage -p dmapps -s beta --update-venv
Initialize airflow
# This will install airflow and initialize airflow
mordor -a run -p dm -s beta -cmd="-a setup-airflow"
Initialize dm
# This will initialize dm web app's database
mordor -a run -p dm -s beta -cmd="-a setup-dm"
# log in to the target
# To start the datamanager web app
eae dm
# In a production environment, you can do the following
python -m pip install gunicorn
gunicorn --workers=10 DataCatalog.wsgi -b 0.0.0.0:8888
# In a beta environment, you can do
python manage.py runserver 0.0.0.0:8888
# To start datamanager scheduler
eae dm
./scheduler.py
# start airflow
source ~/airflow/.venv/bin/activate
airflow scheduler -D
airflow webserver -D -p 8080
# If you have a firewall, you need to open the ports
sudo firewall-cmd --zone=public --permanent --add-port=8080/tcp
sudo firewall-cmd --zone=public --permanent --add-port=8888/tcp
sudo firewall-cmd --reload
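Once the services are running and the ports are open, you can check from your dev machine that both are reachable. This is a rough sketch: port_open is a hypothetical helper built on bash's /dev/tcp pseudo-device and coreutils' timeout, and DM_HOST is an invented variable for overriding the example host name dmdemo2.

```shell
# Return success if a TCP connection to host:port opens within 2 seconds.
# $1 = host, $2 = port. Needs bash (for /dev/tcp) and coreutils' timeout.
port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# dmdemo2 is the example target from this page; 8080 is the airflow
# web server and 8888 is the datamanager web app.
host="${DM_HOST:-dmdemo2}"
for port in 8080 8888; do
    if port_open "$host" "$port"; then
        echo "$host:$port is reachable"
    else
        echo "$host:$port is NOT reachable"
    fi
done
```

A failed check does not tell you whether the firewall or the service is at fault, so confirm on the target that the process is actually listening before changing firewall rules.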
# on target
eae dmapps
./build.sh <app name>
# Note: if you are using OCI Data Flow, you need to
1. add oci==2.26.0 and oci-core==0.0.6 to the data-apps virtual env
2. add oci-core and oci to the dependencies of every Data Application
3. add oci to the airflow environment