Deploy FATE on TKG with KubeFATE - FederatedAI/KubeFATE GitHub Wiki
TKG (Tanzu Kubernetes Grid) is a platform based on open-source Kubernetes that is built, signed, and supported by VMware. A Tanzu Kubernetes cluster (TKC) can be configured and run on a supervisor cluster by using the Tanzu Kubernetes Grid service. A supervisor cluster is a vSphere cluster on which vSphere with Tanzu is enabled.
Then, let's start the tutorial.
item | version |
---|---|
Kubernetes | v1.18.15+vmware.1 |
KubeFATE | v1.6.0-a |
We will install FATE on two clusters, but all operations are performed on the same host. Two working directories therefore need to be created, one per party: PartyA and PartyB.
(PartyA)$ # This represents the command running in PartyA
(PartyB)$ # This represents the command running in PartyB
$ # This represents a command that is run in both working directories
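As a minimal sketch, the two working directories can be created in one step. The location (the current directory) is an assumption; any path works as long as each party's files stay separate.

```shell
# Create one working directory per party under the current directory.
# -p makes the command safe to repeat if the directories already exist.
mkdir -p PartyA PartyB
```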
Item | PartyA | PartyB |
---|---|---|
PartyID | 9999 | 10000 |
KubeFATE serviceurl | partya.example.com | partyb.example.com |
Kubernetes context | tkc-1 | tkc-2 |
ingress IP | 192.168.18.131 | 192.168.20.135 |
The Kubernetes context is obtained by logging in to the TKC. The ingress IP becomes available after the ingress controller is installed.
Here we have prepared two TKC clusters. Log in with kubectl vsphere login, then check the Kubernetes version.
(PartyA)$ kubectl --context=tkc-1 version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.1", GitCommit:"c4d752765b3bbac2237bf87cf0b1c2e307844666", GitTreeState:"clean", BuildDate:"2020-12-18T12:09:25Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.15+vmware.1", GitCommit:"9a9f80f2e0b85ce6280dd9b9f1e952a7dbf49087", GitTreeState:"clean", BuildDate:"2021-01-19T22:59:52Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
(PartyB)$ kubectl --context=tkc-2 version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.1", GitCommit:"c4d752765b3bbac2237bf87cf0b1c2e307844666", GitTreeState:"clean", BuildDate:"2020-12-18T12:09:25Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.15+vmware.1", GitCommit:"9a9f80f2e0b85ce6280dd9b9f1e952a7dbf49087", GitTreeState:"clean", BuildDate:"2021-01-19T22:59:52Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
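The login step mentioned above is not shown in this tutorial. The sketch below only prints a candidate kubectl vsphere login command so it can be reviewed before running. The server address, user name, and supervisor namespace are placeholders (assumptions), and the helper name tkc_login_cmd is hypothetical.

```shell
# Print the login command for one TKC instead of executing it.
# All four arguments are placeholders chosen for illustration.
tkc_login_cmd() {
  server=$1      # supervisor cluster control plane address (assumption)
  user=$2        # vCenter Single Sign-On user (assumption)
  namespace=$3   # supervisor namespace that contains the TKC (assumption)
  cluster=$4     # TKC name, used as the kubectl context in this tutorial
  printf 'kubectl vsphere login --server=%s --vsphere-username %s --tanzu-kubernetes-cluster-namespace %s --tanzu-kubernetes-cluster-name %s\n' \
    "$server" "$user" "$namespace" "$cluster"
}

tkc_login_cmd 192.0.2.10 administrator@vsphere.local fate-ns tkc-1
```

Running the printed command prompts for the password. Repeat it with the second cluster name to obtain the tkc-2 context.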
If you cannot access Docker Hub, you need to download the images and upload them to the corresponding worker nodes or to your own Harbor registry. For this installation we pull directly from Docker Hub.
Since we can access GitHub directly, we don't need to download the chart manually. In an offline environment, you need to download and upload the chart file, fate-v1.6.0-a.tgz.
Install an ingress controller (for example, ingress-nginx). After the installation is complete, obtain the ingress IP through kubectl get svc -n ingress-nginx.
(PartyA)$ kubectl --context=tkc-1 get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller LoadBalancer 10.108.42.103 192.168.18.131 80:32250/TCP,443:32437/TCP 1d
ingress-nginx-controller-admission ClusterIP 10.99.180.187 <none> 443/TCP 1d
(PartyB)$ kubectl --context=tkc-2 get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller LoadBalancer 10.102.226.6 192.168.20.135 80:30036/TCP,443:30941/TCP 1d
ingress-nginx-controller-admission ClusterIP 10.104.177.237 <none> 443/TCP 1d
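When scripting the later steps, it can help to capture the ingress IP instead of copying it by hand. The following is a minimal sketch that reads the table output shown above and prints the EXTERNAL-IP column for one Service. The helper name external_ip is hypothetical.

```shell
# Print the EXTERNAL-IP column (4th field) of the named Service,
# reading `kubectl get svc` table output from standard input.
external_ip() {
  awk -v svc="$1" '$1 == svc { print $4 }'
}

# Usage against a live cluster (not executed here):
#   kubectl --context=tkc-1 get svc -n ingress-nginx | external_ip ingress-nginx-controller
# An alternative that avoids parsing the table is kubectl's JSONPath output:
#   kubectl --context=tkc-1 get svc ingress-nginx-controller -n ingress-nginx \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```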
After the environment setup is finished, we can start to install KubeFATE and FATE on TKG.
Download the installation package file.
$ curl -LO https://github.com/FederatedAI/KubeFATE/releases/download/v1.6.0/kubefate-k8s-v1.6.0.tar.gz && tar -xzf ./kubefate-k8s-v1.6.0.tar.gz
$ ls
cluster-serving.yaml cluster-spark.yaml cluster.yaml config.yaml examples kubefate kubefate-k8s-v1.6.0.tar.gz kubefate.yaml rbac-config.yaml
Install the KubeFATE command tool.
$ chmod +x ./kubefate && sudo mv ./kubefate /usr/bin
*The KubeFATE command-line tool is written in Go; you can also compile it yourself.*
Install the KubeFATE service on each of the two TKC clusters.
(PartyA)$ kubectl --context=tkc-1 apply -f ./rbac-config.yaml
(PartyB)$ kubectl --context=tkc-2 apply -f ./rbac-config.yaml
Modify the KubeFATE serviceurl, which is the ingress host in kubefate.yaml.
(PartyA)$ cat ./kubefate.yaml
...
spec:
  rules:
    - host: partya.example.com
      http:
        paths:
...
(PartyB)$ cat ./kubefate.yaml
...
spec:
  rules:
    - host: partyb.example.com
      http:
        paths:
...
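Add both host names to the hosts file. Writing to /etc/hosts requires root privileges, so the entries are appended through sudo tee.
(PartyA)$ echo "192.168.18.131 partya.example.com" | sudo tee -a /etc/hosts
(PartyB)$ echo "192.168.20.135 partyb.example.com" | sudo tee -a /etc/hosts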
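Appending to the hosts file more than once leaves duplicate lines. The sketch below adds an entry only when the host name is not already mapped. The helper name add_host_entry and the optional third argument (the file to edit) are assumptions made for illustration.

```shell
# Append "IP NAME" to a hosts file unless NAME is already mapped there.
# The third argument defaults to /etc/hosts, which needs root to modify.
add_host_entry() {
  ip=$1
  name=$2
  file=${3:-/etc/hosts}
  # Look for an exact match on any host-name field of a non-comment line.
  if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                       END { exit !found }' "$file" 2>/dev/null; then
    echo "$name already present in $file"
  else
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}
```

To edit /etc/hosts itself, the function has to run in a root shell. Pointing it at a scratch copy first is a safe way to try it out.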
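If you use a private image registry, you also need to modify the image-related fields in kubefate.yaml. After editing kubefate.yaml, apply it to each cluster with kubectl apply -f ./kubefate.yaml and the matching --context.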
Modify the serviceurl of config.yaml.
(PartyA)$ cat config.yaml
# TODO
# persistent layer
log:
  level: info
user:
  username: admin
  password: admin
serviceurl: partya.example.com
(PartyB)$ cat config.yaml
# TODO
# persistent layer
log:
  level: info
user:
  username: admin
  password: admin
serviceurl: partyb.example.com
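The serviceurl edit can also be scripted. This is a minimal sketch that rewrites only the top-level serviceurl line and leaves the rest of the file untouched. The helper name set_serviceurl is hypothetical, and the file argument defaults to config.yaml in the current working directory.

```shell
# Replace the value of the top-level serviceurl key in a KubeFATE config file.
set_serviceurl() {
  url=$1
  file=${2:-config.yaml}
  # Write to a temporary file first so the original survives if sed fails.
  sed "s|^serviceurl:.*|serviceurl: ${url}|" "$file" > "${file}.tmp" \
    && mv "${file}.tmp" "$file"
}

# Usage (run inside each party's working directory):
#   set_serviceurl partya.example.com
```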
Use kubefate version to check whether the KubeFATE environment is installed.
(PartyA)$ kubefate version
* kubefate commandLine version=v1.4.1
* kubefate service version=v1.4.1
(PartyB)$ kubefate version
* kubefate commandLine version=v1.4.1
* kubefate service version=v1.4.1
If a line containing kubefate service version= appears, the command-line tool can reach the service and the installation is successful.
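The same check can be automated. The sketch below succeeds only when the output passed to it contains the service version line described above. The helper name service_reachable is hypothetical.

```shell
# Read `kubefate version` output on standard input and succeed only if it
# reports a service version, which is the success criterion in this tutorial.
service_reachable() {
  grep -q 'kubefate service version='
}

# Usage (not executed here):
#   kubefate version | service_reachable && echo "KubeFATE service is reachable"
```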
The KubeFATE environment is now ready for both parties, so FATE can be installed.
Our TKC supports LoadBalancer, so our FATE is exposed through LoadBalancer.
(PartyA)$ cat cluster.yaml
name: fate-9999
namespace: fate-9999
chartName: fate
chartVersion: v1.6.0-a
partyId: 9999
registry: ""
imageTag: ""
pullPolicy:
imagePullSecrets:
- name: myregistrykey
persistence: false
istio:
  enabled: false
podSecurityPolicy:
  enabled: true # The TKC cluster turns on podSecurityPolicy authentication by default, you need to configure true here.
modules:
  - rollsite
  - clustermanager
  - nodemanager
  - mysql
  - python
  - fateboard
  - client
backend: eggroll
rollsite:
  type: LoadBalancer
  nodePort: 30091
(PartyB)$ cat cluster.yaml
name: fate-10000
namespace: fate-10000
chartName: fate
chartVersion: v1.6.0-a
partyId: 10000
registry: ""
imageTag: ""
pullPolicy:
imagePullSecrets:
- name: myregistrykey
persistence: false
istio:
  enabled: false
podSecurityPolicy:
  enabled: true # The TKC cluster turns on podSecurityPolicy authentication by default, you need to configure true here.
modules:
  - rollsite
  - clustermanager
  - nodemanager
  - mysql
  - python
  - fateboard
  - client
backend: eggroll
rollsite:
  type: LoadBalancer
  nodePort: 30101
Create the corresponding namespace before deploying FATE.
(PartyA)$ kubectl --context=tkc-1 create namespace fate-9999
(PartyB)$ kubectl --context=tkc-2 create namespace fate-10000
Then use the kubefate command to deploy.
(PartyA)$ kubefate cluster install -f cluster.yaml
(PartyB)$ kubefate cluster install -f cluster.yaml
Wait for the deployment to succeed.
(PartyA)$ kubefate job describe <jobID> # You will get <jobID> when you install cluster in the previous step
(PartyB)$ kubefate job describe <jobID> # You will get <jobID> when you install cluster in the previous step
Wait for the job status to become Success, which indicates that the deployment succeeded.
You can also use kubefate cluster list and kubefate cluster describe <clusterID> to check the status of the cluster. A status of Running indicates that the deployment is successful.
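Rather than re-running the describe command by hand, the wait can be expressed as a small polling loop. This is a sketch under the assumption that the command output contains the word Success once the job finishes; the exact output layout of kubefate job describe is not shown in this tutorial. The helper name wait_for is hypothetical.

```shell
# Re-run a command until its output contains a pattern, or give up.
# Usage: wait_for PATTERN ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for() {
  pattern=$1
  attempts=$2
  interval=$3
  shift 3
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the remaining arguments as the command and inspect its output.
    if "$@" 2>/dev/null | grep -q "$pattern"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Usage (not executed here): poll every 10 seconds, up to 60 times.
#   wait_for Success 60 10 kubefate job describe <jobID>
```

The same loop works for the cluster status, for example with Running as the pattern and kubefate cluster describe <clusterID> as the command.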
Because the LoadBalancer IP is allocated dynamically by the load-balancer service, each party's address can only be configured on the other side after FATE has been installed:
(PartyA)$ kubectl --context=tkc-1 get svc/rollsite -n fate-9999
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rollsite LoadBalancer 10.103.75.93 192.168.18.132 9370:30091/TCP 12m
(PartyB)$ kubectl --context=tkc-2 get svc/rollsite -n fate-10000
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rollsite LoadBalancer 10.99.31.113 192.168.20.136 9370:30101/TCP 12m
Obtain the LoadBalancer IP of the rollsite for both parties through the above command.
Configure each other's information in cluster.yaml:
(PartyA)$ cat cluster.yaml
...
rollsite:
  type: LoadBalancer
  nodePort: 30091
  partyList:
  - partyId: 10000
    partyIp: 192.168.20.136
    partyPort: 9370
(PartyB)$ cat cluster.yaml
...
rollsite:
  type: LoadBalancer
  nodePort: 30101
  partyList:
  - partyId: 9999
    partyIp: 192.168.18.132
    partyPort: 9370
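The partyList block follows directly from the peer's party ID and rollsite LoadBalancer IP, so it can be generated instead of typed. The sketch below prints the block for one peer. The helper name party_list_entry is hypothetical, and the two-space indentation assumes the block is nested under rollsite as in the listing above.

```shell
# Print a partyList block for one peer, indented to sit under "rollsite:".
# Arguments: peer party ID, peer rollsite IP, optional port (default 9370).
party_list_entry() {
  printf '  partyList:\n  - partyId: %s\n    partyIp: %s\n    partyPort: %s\n' \
    "$1" "$2" "${3:-9370}"
}

# Block that PartyA needs in order to reach PartyB:
party_list_entry 10000 192.168.20.136
```

The output is meant to be reviewed and pasted under the rollsite section of the party's cluster.yaml.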
Then update the configuration information:
(PartyA)$ kubefate cluster update -f cluster.yaml
(PartyB)$ kubefate cluster update -f cluster.yaml
When the job status becomes Success, or the cluster status becomes Running, the update is successful.
Run the toy_example test:
(PartyA)$ kubectl --context=tkc-1 exec -it svc/fateflow -c python -n fate-9999 -- bash
(app-root) bash-4.2# cd /data/projects/fate/examples/toy_example/
(app-root) bash-4.2# python run_toy_example.py 9999 10000 1
...
If the log ends with a line similar to "success to calculate secure_sum, it is 2000.0000000000002", the toy_example interoperability test was successful.
We deployed the fateboard component in the previous deployment, so by default you can view the FATEBoard page at http://party9999.fateboard.example.com for PartyA and http://party10000.fateboard.example.com for PartyB.
Add the host names to the hosts file before viewing:
(PartyA)$ echo "192.168.18.131 party9999.fateboard.example.com" | sudo tee -a /etc/hosts
(PartyB)$ echo "192.168.20.135 party10000.fateboard.example.com" | sudo tee -a /etc/hosts
Then you can view the FATEBoard page through the URL.
The notebook page works the same way as FATEBoard. We also deployed the client component in the previous deployment, so by default you can view the notebook page at http://party9999.notebook.example.com for PartyA and http://party10000.notebook.example.com for PartyB.
Add the host names to the hosts file before viewing:
(PartyA)$ echo "192.168.18.131 party9999.notebook.example.com" | sudo tee -a /etc/hosts
(PartyB)$ echo "192.168.20.135 party10000.notebook.example.com" | sudo tee -a /etc/hosts
Then you can view the notebook page through the URL.
The previous configuration uses the default URLs. Custom URLs can be set through configuration, similar to the following:
(PartyA)$ cat cluster.yaml
...
host:
  fateboard: party9999.fateboard.vmware.com
  client: party9999.notebook.vmware.com
...
(PartyB)$ cat cluster.yaml
...
host:
  fateboard: party10000.fateboard.vmware.com
  client: party10000.notebook.vmware.com
...
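Then add the custom host names, for both FATEBoard and the notebook, to the hosts file:
(PartyA)$ echo "192.168.18.131 party9999.fateboard.vmware.com" | sudo tee -a /etc/hosts
(PartyA)$ echo "192.168.18.131 party9999.notebook.vmware.com" | sudo tee -a /etc/hosts
(PartyB)$ echo "192.168.20.135 party10000.fateboard.vmware.com" | sudo tee -a /etc/hosts
(PartyB)$ echo "192.168.20.135 party10000.notebook.vmware.com" | sudo tee -a /etc/hosts
Then run kubefate cluster update to update the cluster configuration. After the update is complete, you can access the UI through the custom URLs.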