How to build and deploy data applications - stonezhong/DataManager GitHub Wiki
Step 1: deploy data-apps to target using mordor
Here is my mordor application config:
"dmapps_beta": {
"stage" : "beta",
"name" : "dmapps",
"home_dir" : "/home/stonezhong/DATA_DISK/projects/DataManager/data-apps",
"deploy_to" : [ "dmhost" ],
"use_python3" : true,
"config" : {
"config.json": "copy"
}
},
Here is what config.json looks like:
```json
{
    "deployer": {
        "class": "spark_etl.deployers.HDFSDeployer",
        "args": [
            {
                "bridge": "spnode1",
                "stage_dir": "/root/.stage"
            }
        ]
    },
    "job_submitter": {
        "class": "spark_etl.job_submitters.livy_job_submitter.LivyJobSubmitter",
        "args": [
            {
                "service_url": "http://10.0.0.18:60008/",
                "username": "root",
                "password": "changeme",
                "bridge": "spnode1",
                "stage_dir": "/root/.stage",
                "run_dir": "hdfs:///beta/etl/runs"
            }
        ]
    },
    "job_run_options": {
        "conf": {
            "spark.yarn.appMasterEnv.PYSPARK_PYTHON": "python3",
            "spark.executorEnv.PYSPARK_PYTHON": "python3"
        }
    },
    "deploy_base": "hdfs:///beta/etl/apps"
}
```
Once you have mordor set up, you can run the command below to deploy it:

```shell
mordor -a stage -p dmapps --stage beta --update-venv T
```
Step 2: Build and deploy
```shell
ssh dmhost
eae dmapps
./build.sh generate_trading_samples
./build.sh execute_sql

# or you can run the command below to build all apps
./build_all.sh
```
After the deployment finishes, ssh to the bridge to verify it:
```
[root@spnode1 ~]# hdfs dfs -ls /etl-prod/apps
Found 4 items
drwxr-xr-x   - root supergroup          0 2020-12-18 07:45 /etl-prod/apps/dummy
drwxr-xr-x   - root supergroup          0 2020-12-18 07:45 /etl-prod/apps/execute_sql
drwxr-xr-x   - root supergroup          0 2020-12-18 07:46 /etl-prod/apps/generate_trading_samples
drwxr-xr-x   - root supergroup          0 2020-12-18 07:46 /etl-prod/apps/get_schema
```
Run an application
```shell
# run the CLI application
./etl.py -a run -p spark_cli --version=1.0.0.0 --cli-mode

# to run a non-CLI application, remove the --cli-mode option; you can
# optionally add --input foo.json to feed the application foo.json as input
```
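The `--input` payload is just a JSON document whose schema is up to the individual application. A sketch that prepares one (the field names here are hypothetical, purely for illustration):

```python
import json

# Hypothetical input payload; the actual fields depend on the application.
payload = {"action": "generate", "count": 100}

with open("foo.json", "w") as f:
    json.dump(payload, f, indent=2)

print(open("foo.json").read())
```

You would then pass it along with `--input foo.json` on the `./etl.py -a run` command line.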