Deploy FATE v1.5.0 with Docker Compose - FederatedAI/KubeFATE GitHub Wiki
FATE v1.5.0 is a long-term support release. This document outlines how to quickly deploy it with docker-compose and use it for federated learning.
This document will illustrate the deployment of FATE clusters on two Linux hosts with docker-compose.
Deployment of FATE with docker-compose requires three Linux machines, including two worker machines and one deployment machine.
- Worker Machines: Linux OS (required); 8 cores and 16GB RAM (recommended)
- Deployment Machine: Any machine running a Linux environment
In this tutorial, we use VMware Workstation to create two virtual machines as the worker machines. Either the local Windows WSL environment or one of the two virtual machines can serve as the deployment machine.
The details of the installation and deployment environments are as follows:
Role | Hostname | OS | IP |
---|---|---|---|
Worker Machine | partyA | CentOS 7 | 192.168.0.9 |
Worker Machine | partyB | CentOS 7 | 192.168.0.10 |
Deployment Machine (WSL) | localhost | Ubuntu-18.04 | localhost |
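Before continuing, it can help to compare each worker machine against the recommended 8 cores and 16GB RAM. The following is a minimal sketch, assuming a Linux host with `nproc` and `/proc/meminfo`; the function name `meets_recommendation` is hypothetical and not part of FATE.

```shell
# Succeed when the machine meets the recommended minimums.
# $1 = CPU cores, $2 = RAM in GiB,
# $3 = minimum cores (default 8), $4 = minimum RAM in GiB (default 16)
meets_recommendation() {
  [ "$1" -ge "${3:-8}" ] && [ "$2" -ge "${4:-16}" ]
}

cores=$(nproc)
# MemTotal is reported in kB. Round up to whole GiB, because the kernel
# reserves some memory and a 16GB machine reports slightly less than 16 GiB.
mem_gb=$(awk '/^MemTotal:/ {printf "%d", ($2 + 1048575) / 1048576}' /proc/meminfo)

if meets_recommendation "$cores" "$mem_gb"; then
  echo "OK: ${cores} cores, ${mem_gb} GiB RAM"
else
  echo "Below recommendation: ${cores} cores, ${mem_gb} GiB RAM (recommended: 8 cores, 16 GiB)"
fi
```

The values are recommendations rather than hard requirements, so the script only reports the result and does not abort.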
The following steps assume the FATE hosts run CentOS; they can be adapted to other Linux distributions with slight adjustments.
Perform these operations on both worker machines.
# Turn firewall off
sudo systemctl stop firewalld
# View firewall status
sudo systemctl status firewalld
# step 1: Install the necessary system tools
sudo yum install -y yum-utils
# Step 2: Add the source information of the software
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# Step 3: Install Docker
sudo yum install docker-ce docker-ce-cli containerd.io
# Step 4: Start Docker service
sudo service docker start
# Verify that Docker is installed properly.
sudo docker run hello-world
If Docker cannot be installed with the method above, try one of the others listed in the installation methods.
# step 1: Download Docker Compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
# step 2: Add the executable privilege
sudo chmod +x /usr/local/bin/docker-compose
# step 3: Create a symbolic link to /usr/bin
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
# step 4: Test the installation
sudo docker-compose --version
If Docker Compose cannot be installed with the method above, try one of the others listed in the installation methods.
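Once both installations are done, a quick check that the two commands are on the `PATH` can catch a failed install before the deployment step. This is a small sketch using only POSIX shell features; `check_tools` is a hypothetical helper name.

```shell
# Print the location of each required tool, or report it as missing.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if path=$(command -v "$tool"); then
      echo "found   $tool -> $path"
    else
      echo "missing $tool"
      missing=1
    fi
  done
  return "$missing"
}

check_tools docker docker-compose || echo "Install the missing tools before continuing."
```

Run it on each worker machine. It only confirms that the binaries exist; `sudo docker run hello-world` remains the real test that the Docker daemon works.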
# Add a new user fate
sudo useradd -s /bin/bash -g docker -d /home/fate fate
# Set the user password
sudo passwd fate
Perform this operation on the deployment machine
# step 1: Generate public and private keys
ssh-keygen
# step 2: Send to partyA
ssh-copy-id -i ~/.ssh/id_rsa.pub fate@192.168.0.9
# step 3: Send to partyB
ssh-copy-id -i ~/.ssh/id_rsa.pub fate@192.168.0.10
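The deployment script later relies on scp and ssh, so it is worth confirming that key-based login works without a password prompt. A minimal sketch, assuming the `fate` user and the worker IPs from the table above; `can_login` and the `SSH_BIN` override are hypothetical names introduced here for illustration.

```shell
# Succeed only if key-based login works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
# SSH_BIN is a hypothetical override for dry runs; it defaults to ssh.
can_login() {
  "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example, run on the deployment machine:
#   for host in fate@192.168.0.9 fate@192.168.0.10; do
#     can_login "$host" && echo "$host: ok" || echo "$host: key login failed"
#   done
```

If a host reports a failure, repeat the `ssh-copy-id` step for it before deploying.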
To install FATE, download the corresponding installation package, unpack it, configure it, and then install it. The specific steps are as follows.
Download the docker-compose FATE installation package on the deployment machine:
# Download
wget https://github.com/FederatedAI/KubeFATE/releases/download/v1.5.0/kubefate-docker-compose-v1.5.0.tar.gz
# Unzip
tar -xvf kubefate-docker-compose-v1.5.0.tar.gz
# Go to the installation directory
cd docker-deploy
Installation packages for all versions of FATE are available at https://github.com/FederatedAI/KubeFATE/releases
Deploy party 9999 on partyA and party 10000 on partyB. Fill in the party IDs and IP addresses so that each party ID and its IP address appear at the same position in their respective lists.
$ vi parties.conf
#!/bin/bash
user=fate # The system user (corresponding to the new user above) who runs FATE
dir=/data/projects/fate # The file directory in which FATE runs
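party_list=(10000 9999) # The party_id of each party to deploy
party_ip_list=(192.168.0.10 192.168.0.9) # The IP address of each party's FATE deployment, in the same order as party_list
serving_ip_list=(192.168.0.10 192.168.0.9) # The IP address of each party's FATE-Serving deployment, in the same order as party_list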
# computing_backend could be eggroll or spark.
computing_backend=eggroll # The computing engine of FATE
# default
exchangeip=
# modify if you are going to use an external db
mysql_ip=mysql
mysql_user=fate
mysql_password=fate_dev
mysql_db=fate_flow
# modify if you are going to use an external redis
redis_ip=redis
redis_port=6379
redis_password=fate_dev
- Modify `party_ip_list` and `serving_ip_list` to match the worker machines. Leave the other fields at their default values.
- To use the Spark computing engine, set `computing_backend=spark`.
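Because party IDs and IP addresses are matched by position, a swapped or missing entry is easy to overlook. The sketch below prints the pairing that `parties.conf` implies. It assumes each array is written on a single line, as in the listing above; `list_items` and `show_party_map` are hypothetical helpers, not part of KubeFATE.

```shell
# Extract the space-separated items of an array assignment such as
#   party_list=(10000 9999) # comment
# $1 = variable name, $2 = path to parties.conf
list_items() {
  sed -n "s/^$1=(\([^)]*\)).*/\1/p" "$2"
}

# Print "party <id> -> <ip>" for each position, or fail on a length mismatch.
show_party_map() {
  conf=$1
  ids=$(list_items party_list "$conf")
  ips=$(list_items party_ip_list "$conf")
  # Word splitting is intentional here: it counts the entries in each list.
  set -- $ids; n_ids=$#
  set -- $ips; n_ips=$#
  if [ "$n_ids" -ne "$n_ips" ]; then
    echo "mismatch: $n_ids party IDs but $n_ips IP addresses" >&2
    return 1
  fi
  # The positional parameters now hold the IPs; consume one per party ID.
  for id in $ids; do
    echo "party $id -> $1"
    shift
  done
}

# Example: show_party_map parties.conf
```

With the configuration shown above, the expected pairing is party 10000 on 192.168.0.10 and party 9999 on 192.168.0.9.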
On first deployment, the images are downloaded from Docker Hub. To pull them from a different registry, set RegistryURI accordingly.
$ vi .env
RegistryURI=hub.c.163.com # Use a registry mirror hosted in China
TAG=1.5.0-release
SERVING_TAG=2.0.0-release
# PREFIX: namespace on the registry's server.
# RegistryURI: address of the local registry
# TAG: tag of module images.
Generate the corresponding installation file packages based on the above configuration.
bash generate_config.sh
This step generates installation packages for both worker machines. If `parties.conf` or `.env` is modified afterwards, the installation packages must be regenerated.
Deploy FATE and FATE-Serving on the two machines.
bash docker_deploy.sh all
This step copies the installation packages to the corresponding directory on each destination host using scp and ssh, and then starts the FATE cluster.
The parameter `all` deploys FATE and FATE-Serving directly to party 10000 and party 9999. For more details about the parameter, refer to the usage details.
# Go to the work directory
cd /data/projects/fate/confs-<party_id>/
# View the status of containers
docker-compose ps
# Check if fateflow is running successfully
docker-compose logs python
If the status of every component is Up and the log message "\* Running on http://x.x.x.x:9380/ (Press CTRL+C to quit)" appears in the python container's log, the FATE cluster has been deployed successfully.
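The log check can also be scripted instead of read by eye. A minimal sketch, assuming the startup line has the form quoted above; `flow_is_up` is a hypothetical helper name.

```shell
# Read log text on stdin and succeed if the fateflow startup line is present.
flow_is_up() {
  grep -q 'Running on http://.*:9380/'
}

# Example, run in /data/projects/fate/confs-<party_id>/ on a worker machine:
#   docker-compose logs python | flow_is_up && echo "fateflow is up"
```

The pattern matches anywhere in a line, so the container-name prefix that docker-compose adds to each log line does not interfere.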
# Go to the container of fateflow
docker-compose exec python bash
# Go to the toy_example directory
cd ../examples/toy_example/
# Run the toy_example command
python run_toy_example.py 9999 10000 1
# Note the sequence of <party_id> if on partyB machine
# python run_toy_example.py 10000 9999 1
The message "job status is running" indicates that the job has started. The message "success to calculate secure_sum, it is 2000.0" indicates that the job has completed.
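When the toy example is run from a script, its output can be classified by the two messages described above. This is only a sketch that matches on those message fragments; `toy_status` is a hypothetical helper, and the exact wording of the output may differ between FATE versions.

```shell
# Read the toy example output on stdin and print one of:
#   completed, running, unknown
toy_status() {
  out=$(cat)
  case $out in
    *"success to calculate secure_sum"*) echo completed ;;
    *"job status is running"*) echo running ;;
    *) echo unknown ;;
  esac
}

# Example, inside the python container:
#   python run_toy_example.py 9999 10000 1 2>&1 | toy_status
```

The completed case is tested first because a finished run normally contains the running message as well.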
When WSL is used for deployment, the screen displays the FATEBoard and Notebook addresses.
Open FATEBoard for both parties in the browser:
- partyA: http://192.168.0.9:8080
- partyB: http://192.168.0.10:8080

Open the Notebook for both parties in the browser:
- partyA: http://192.168.0.9:20000
- partyB: http://192.168.0.10:20000
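If a page does not load, probing the addresses from the command line helps separate a browser problem from a network or container problem. A minimal sketch, assuming port 8080 for FATEBoard and port 20000 for the Notebook as listed above, and that `curl` is installed; `party_urls` and `probe` are hypothetical helper names.

```shell
# Print the FATEBoard and Notebook URLs for one party host.
party_urls() {
  echo "http://$1:8080"    # FATEBoard
  echo "http://$1:20000"   # Notebook
}

# Probe each URL of a host. curl exits non-zero when the connection
# fails or when the 5-second limit is exceeded.
probe() {
  for url in $(party_urls "$1"); do
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "reachable   $url"
    else
      echo "unreachable $url"
    fi
  done
}

# Example: probe 192.168.0.9; probe 192.168.0.10
```

An unreachable result on a worker that passed the container status check usually points to the firewall or to the network between the machines.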
Step 1: Open the toy_example notebook in partyA's Notebook: http://192.168.0.9:20000/notebooks/Toy_Example/toy_example_submit_job.ipynb
Step 2: Modify the default party_id
Step 3: Run toy_example
If the "success" message appears, toy_example ran successfully from the Notebook.
https://github.com/FederatedAI/KubeFATE/blob/v1.5.0/docker-deploy/README.md
https://github.com/FederatedAI/FATE/blob/v1.5.0/examples/toy_example/README.md