4a. Troubleshooting: FAQ - EY-Data-Science-Program/2021-Better-Working-World-Data-Challenge GitHub Wiki
Welcome to the FAQ. Here you'll find solutions to common problems with deploying and using your data science environment.
If you don't find what you're looking for here, please contact us and provide as much detail as possible for your problem.
We'll update this page with more solutions as we discover them, so be sure to check back!
In some cases it is useful to connect to the command line shell of a VM in order to diagnose an issue.
If Jupyter is working on your VM, navigate to the Jupyter file tree view, click the "New" drop down item at the top left of the file list table and select Terminal. This will open a terminal to the VM in a new browser window.
If you Cannot Access Jupyter, you can connect to the VM command shell using an SSH client:
When you deployed your VM, you generated an SSH key pair which you will now need.
Locate your SSH key; if you used the default location on creation, you should
find it at either C:\Users\username/.ssh/id_rsa
(for Windows) or
~/.ssh/id_rsa
for MacOS/Linux.
Open a terminal; On Windows, use the start menu to launch either Command Prompt or PowerShell, on MacOS launch Terminal.app, on Linux you should know what you're doing.
From the terminal, connect by running ssh -i /path/to/key ubuntu@VM_IP
;
replacing /path/to/key
with the actual key path (if you are using the
default, you can omit -i /path/to/key
) and replacing VM_IP
with the IP of
your VM found in the Azure console. The ubuntu
username is set on the VM and
doesn't change.
If your VM did not work when deploying, something may have gone wrong during installation of the software. Sometimes this can happen due to, for example, temporary network issues.
First you'll need to access the VM terminal using SSH.
Once connected you can view the log by running this command:
cat /var/log/install-cube.log
This will print the contents of the installation log to the terminal. If there was a problem during installation, there'll likely be some clue in the last few lines of this file.
In some cases our support team may request the installation log to help diagnose issues with your VM.
To download the log file from the VM to your local machine open another terminal window (as described above, e.g. PowerShell), and run this command:
scp ubuntu@VM_IP:/var/log/install-cube.log ~/Downloads
Note that ~/Downloads/
can be changed to whatever path you want to download
the file to. Use just .
for the current folder (where the scp
command is
being executed from). On Windows you may need to use %USERPROFILE%
in place
of ~
.
You can now email the install-cube.log file to the support team!
If your deployment fails with "code":"DeploymentFailed"
and
"code":"ConflictingUserInput"
, have a closer look at the entire error
message. Should the error message contain the following:
Disk DatadiskCubeInABox already exists in resource group EY-DATACHALLENGE.
then an old disk from a previous deployment is left over and Azure refuses to
overwrite it. Simply delete this disk first in the Azure Portal (search for
Disk
and delete the one called DatadiskCubeInABox
) and try again to deploy
a new environment.
If your VM has deployed successfully but you cannot access Jupyter at the VM IP address, please check the following:
-
Do not use a fully numerical password for Jupyter:
When choosing your Secret Password during the Azure deployment, make sure to use a password that has letters and numbers. Number-only passwords are known to cause an error in Jupyter loading. Also avoid special characters such as ;, $, \, (, ), *, {, }. If you have already deployed, you can check your password in the/opt/odc/docker-compose.yml
file on your VM at theservices.jupyter.command
value. Modify this file to change it, then runsudo docker-compose up -d
to recreate the container. -
Make sure the VM is running:
Confirm that the VM is running via the Azure Portal and check that the status of yourCubeInABox
VM isrunning
. If it is not, select the VM and hit theStart
button. Remember; stopping your VM when not in use is a good way to keep costs down. See Cost Management for more information. -
Check that Jupyter is running:
To check that Jupyter is running on the VM, you'll need to access the VM terminal using SSH.Once connected, run
cd /opt/odc
to navigate to the Open Data Cube installation directory, then runsudo docker-compose ps
. This command will print the status of Jupyter (and its database); under "State" we hope to see "Up". If state is not "Up", runsudo docker-compose -d up
. After a couple of minutes Jupyter should be ready.If an error is printed when running the above commands, then something likely went wrong with the installation. You can either manually re-run the installation (read on) or delete and re-create the VM.
-
Manually install Open Data Cube:
To run the installation first download the script with this command:cd # Navigate back to your home directory curl -o install-cube.sh https://raw.githubusercontent.com/EY-Data-Science-Program/2021-Better-Working-World-Data-Challenge/main/install-cube.sh
Then run the installation script (replace
MYPASSWORD
with the password you want to use to log in to Jupyter):chmod +x install-cube.sh # Make the file executable sudo ./install-cube.sh MYPASSWORD
Installation may take a while, so just sit back and wait!
If anything else goes wrong please contact us.