Executing python code on remote server from local installation of PyCharm

PyCharm's professional edition allows you to write python code on a local machine (e.g. laptop), and execute it on a remote machine (e.g. a server with several GPUs) via ssh.

Erik Hallström wrote a post ("Work remotely with PyCharm, TensorFlow and SSH") about this in Nov 2016.

This document updates some aspects of that post, and also explains how to create a tunnel if the GPU server is behind another server.

Note: I'm going to explain what to do in the following particular cases, although I think it'd be easy to adapt my instructions to other cases:

  • Ubuntu Linux locally, and some form of Linux in the servers.
  • You need to go through a login server before you can connect to the server with the GPUs.
  • Conda local environments to install python packages.

In the instructions below, I'm going to use the notation:

  • little: name of your local machine, e.g. laptop (this name doesn't matter).
  • login_server.com: name of a remote login server. This server has no GPUs, and is only used for logins. In this example, you need to ssh to this server before you can ssh to the server with the GPUs.
  • gpu_server: name of a remote server with GPUs. This server sits behind login_server.com, and you cannot ssh directly to it. Instead, you need to ssh to login_server.com, and from there you can ssh to gpu_server.
  • person: username (for simplicity, we assume the same in both servers).

Installing PyCharm

  1. Buy the professional version of PyCharm, or download the trial version for free. If your machine has the program snap installed, you can do this with
sudo snap install pycharm-professional --classic
  2. If you don't have snap, download the tarball, e.g. pycharm-professional-2018.1.4.tar.gz, to some local directory, e.g. ~/Software, and untar it there
tar xvzf pycharm-professional-2018.1.4.tar.gz
  3. You need to install PyCharm both in your local machine (e.g. laptop), and in the server where you want to run your code. If the server doesn't have a browser, you can copy the tarball from your local machine to the server using scp.

  4. Run PyCharm in both the local machine and the server, so that configuration files are created. If you installed PyCharm with snap, you can launch pycharm-professional from the command line. If you installed it with the tarball, run

cd ~/Software/pycharm-2018.1.4/bin/
./pycharm.sh

Setting up an ssh tunnel

Because gpu_server is behind login_server.com, and PyCharm doesn't let you specify jumps from server to server, we need to set up an ssh tunnel that transparently sends traffic from your local machine to gpu_server via login_server.com.

  1. Open a terminal on your local machine, and run (don't kill the terminal or the tunnel will die!)
ssh -N -L 5000:gpu_server:22 person@login_server.com

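This will start a foreground process that directs the traffic sent to port 5000 of localhost to port 22 (default ssh port) of gpu_server, using your credentials on login_server.com. At this point ssh only authenticates you on login_server.com; you don't need to worry about the credentials for gpu_server until you set up the interpreter in PyCharm.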
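To make the parts of the -L argument explicit, here is a minimal Python sketch that assembles the same command from its pieces. The function name build_tunnel_command and its parameter names are my own, not part of ssh or PyCharm; the only ssh options it uses are the -N and -L options shown above.

```python
import shlex


def build_tunnel_command(local_port, target_host, login_host, user, target_port=22):
    """Return the ssh argument list for a local port forward via login_host.

    Hypothetical helper: the forward spec has the form
    local_port:target_host:target_port, as in the command above.
    """
    # Ports below 1024 are privileged on Linux, so insist on a higher one.
    if not 1024 <= local_port <= 65535:
        raise ValueError("choose an unprivileged local port (1024-65535)")
    forward = f"{local_port}:{target_host}:{target_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{login_host}"]


cmd = build_tunnel_command(5000, "gpu_server", "login_server.com", "person")
print(shlex.join(cmd))  # prints: ssh -N -L 5000:gpu_server:22 person@login_server.com
```

The list form can be passed directly to subprocess.Popen if you prefer to start the tunnel from a script rather than from a terminal.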

You can make it a background process instead by adding option -f to the ssh command above.

The local port doesn't need to be 5000. It can be any unused port above 1023 (lower ports require root privileges), but you'll need to remember it when you later set up the remote interpreter in PyCharm.
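Before moving on to PyCharm, it can help to check that the tunnel really answers on the local port. The sketch below is one way to do that, assuming the port 5000 used in this example: an ssh server sends an identification line starting with "SSH-" as soon as a client connects, so reading that line through the tunnel shows that gpu_server is reachable. The function name read_ssh_banner is hypothetical.

```python
import socket


def read_ssh_banner(host="localhost", port=5000, timeout=5.0):
    """Return the first line sent by the server behind host:port.

    Returns None if nothing is listening, or nothing arrives in time.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(256)
    except OSError:  # connection refused, timeout, unreachable host, ...
        return None
    line = data.split(b"\n", 1)[0].strip()
    return line.decode("ascii", errors="replace") or None


banner = read_ssh_banner()
if banner and banner.startswith("SSH-"):
    print("tunnel is up, remote end says:", banner)
else:
    print("no ssh server answered on localhost:5000; is the tunnel running?")
```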

Configuring a python environment in the remote server

You don't need to install your project in the server (that'll be done automatically), but you need to install the python dependencies (e.g. numpy, keras, tensorflow-gpu, etc) that will be needed by your project.

  1. Log in to gpu_server. We are assuming that you cannot get to it directly, so you go through login_server.com first.
ssh -XC person@login_server.com
ssh -XC person@gpu_server
  2. Create a conda environment for your project. Here I assume that the server administrator has made conda available, e.g. by installing Miniconda on the server. Alternatively, some other environment manager may be available, e.g. pyenv, or maybe you can install python packages directly without a local environment. Check with your server administrator.
conda create -n myenv python=3.6
  3. Activate the new environment, and install the python packages your project depends on (note that conda lets you install packages with either conda or pip), e.g.
source activate myenv
pip install keras tensorflow-gpu
conda install -y cudnn h5py
...

This will create a directory on the server that contains the interpreter we'll use remotely from the local machine:

  • the python binary in ~/.conda/envs/myenv/bin/python
  • the python packages in ~/.conda/envs/myenv/lib/python3.6/site-packages/
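While you are still logged in to gpu_server, you can check that the environment's interpreter actually sees the packages you installed. This is only a sketch: the helper name missing_packages is mine, and the package list is taken from the examples in this page (note that the tensorflow-gpu package is imported as tensorflow). Adjust the list to your project.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("interpreter:", sys.executable)
print("missing:", missing_packages(["numpy", "keras", "tensorflow", "h5py"]))
```

Run it with the environment's own binary, e.g. ~/.conda/envs/myenv/bin/python followed by the name of the file you saved it in. An empty "missing" list means all the packages can be found.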

Configuring local PyCharm to execute code on the remote server

  1. Back in your local machine, launch PyCharm Professional (as of this writing, v2018.1.4).

  2. Open or create a new project.

  3. Select File -> Settings -> Project -> Project Interpreter.

  4. On the right hand side, click on the cog wheel icon, and select "Add" so that we can add a remote python interpreter.

  5. Select "SSH Interpreter", "New server configuration": "Host" = localhost, "Port" = 5000, "Username" = person. Click Next.

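  6. For Authentication, you can enter your password (tick "Save password"), or set up a Key pair as described by Hallström. (I did the former.) Note that the tunnel forwards this connection to gpu_server, so this is the password for gpu_server; on clusters with shared accounts it is the same one you use on login_server.com. Click Next.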

  7. If the ssh tunnel is up and running, and your username and password are correct, now you'll be connected to gpu_server via localhost. As interpreter, choose the python binary you previously set up in the server, e.g. ~/.conda/envs/myenv/bin/python.

  8. In "Sync folders", choose as destination a directory that you can easily check, e.g. ~/Software/myproject, and tick the box "Automatically upload project files to the server". You don't need to worry about the code in ~/Software/myproject, because any changes you make to your project in the local machine will be transferred automatically to the server.

  9. Click Finish.

At this point, your local PyCharm project is configured with a remote interpreter. That is, when you run a python script, or run some code on the Python Console of your local PyCharm install, instead of using your local CPU/GPU, the code is executed remotely in gpu_server.
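A quick way to convince yourself that code really runs on the server is to print where the interpreter lives. The sketch below uses only the standard library; the function name describe_runtime is hypothetical. Run from the local PyCharm with the remote interpreter selected, I would expect it to report the host name of gpu_server and the ~/.conda/envs/myenv/bin/python binary rather than your laptop's.

```python
import platform
import socket
import sys


def describe_runtime():
    """Collect a few facts about the machine and interpreter running this code."""
    return {
        "host": socket.gethostname(),
        "python": sys.executable,
        "version": platform.python_version(),
    }


for key, value in describe_runtime().items():
    print(f"{key}: {value}")
```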

Of course, you can set up an additional Python Interpreter on your local machine, so that you can also work when you are not online or have no access to the servers.

Configuring local PyCharm to display plots while running on the remote server

If you try to plot a figure while running a remote interpreter, e.g.

import matplotlib.pyplot as plt
plt.clf()
plt.imshow(im)

PyCharm will return an error like this

Traceback (most recent call last):
  File "<input>", line 108, in <module>
  File "/home/rcasero/.conda/envs/myenv/lib/python3.6/site-packages/matplotlib/pyplot.py", line 686, in clf
    gcf().clf()
  File "/home/rcasero/.conda/envs/myenv/lib/python3.6/site-packages/matplotlib/pyplot.py", line 601, in gcf
    return figure()
  File "/home/rcasero/.conda/envs/myenv/lib/python3.6/site-packages/matplotlib/pyplot.py", line 548, in figure
    **kwargs)
  File "/home/rcasero/.conda/envs/myenv/lib/python3.6/site-packages/matplotlib/backend_bases.py", line 161, in new_figure_manager
    return cls.new_figure_manager_given_figure(num, fig)
  File "/home/rcasero/.conda/envs/myenv/lib/python3.6/site-packages/matplotlib/backend_bases.py", line 167, in new_figure_manager_given_figure
    canvas = cls.FigureCanvas(figure)
  File "/home/rcasero/.conda/envs/myenv/lib/python3.6/site-packages/matplotlib/backends/backend_qt5agg.py", line 24, in __init__
    super(FigureCanvasQTAgg, self).__init__(figure=figure)
  File "/home/rcasero/.conda/envs/myenv/lib/python3.6/site-packages/matplotlib/backends/backend_qt5.py", line 234, in __init__
    _create_qApp()
  File "/home/rcasero/.conda/envs/myenv/lib/python3.6/site-packages/matplotlib/backends/backend_qt5.py", line 125, in _create_qApp
    raise RuntimeError('Invalid DISPLAY variable')
RuntimeError: Invalid DISPLAY variable

The reason is that when PyCharm creates the ssh connection to the remote interpreter, it uses something like

ssh://rcasero@localhost:5000/home/rcasero/.conda/envs/myenv/bin/python -u /home/rcasero/.pycharm_helpers/pydev/pydevconsole.py 0 0

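without the -X option, which enables X11 forwarding and sets the DISPLAY variable on the server. You can find out a value of DISPLAY that works by connecting to gpu_server with ssh -X, as in the login commands above (the value is only valid while that session stays open), and running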

echo $DISPLAY
localhost:11.0

The problem is not the matplotlib backend, and it's fine to have e.g.

import matplotlib.pyplot as plt
plt.get_backend()
'Qt5Agg'

To solve this error there are two options:

  • Option 1: Include code like this in your local python script before you import matplotlib
import os
os.environ['DISPLAY'] = 'localhost:11.0'
  • Option 2: In PyCharm, go to File -> Settings -> Build, Execution, Deployment -> Console -> Python Console, click on "Environment variables" and add a new entry with "Name" = "DISPLAY" and "Value" = "localhost:11.0".
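A slightly more defensive variant of Option 1 is to set DISPLAY only when it isn't already defined, so that the same script still works when you run it from a normal ssh -X session. This is a sketch under that assumption; the helper name ensure_display is mine, and localhost:11.0 is just the value from the example above, which will differ on your system.

```python
import os


def ensure_display(fallback="localhost:11.0"):
    """Set DISPLAY to fallback only if it is unset or empty.

    Returns the value that is in effect afterwards.
    """
    if not os.environ.get("DISPLAY"):
        os.environ["DISPLAY"] = fallback
    return os.environ["DISPLAY"]


# Call this before matplotlib is imported anywhere in the script.
print("DISPLAY =", ensure_display())
```

As with Option 1, the call has to come before the first import of matplotlib.pyplot.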