Using BigQuery with older Jupyter notebooks

Google has changed how authentication with BigQuery works. Until some one-off changes are made to each repo's Docker environment, its data cannot be refreshed. Please follow the steps below to get authentication up and running again.

Step 1: clone the repository

I've found it easier to use GitHub Desktop for this, but others may wish to use the command line.

As an example, we're going to use the oat_prescribing repo, which hasn't been updated since the BQ authorisation changes.

Please note: this fix only works within the Docker notebook environments. Any older notebooks will need to be moved into a Docker environment first.

  1. Go to File -> Clone repository in GitHub Desktop (or press Ctrl+Shift+O).
  2. Filter by typing part of the repository name in the search bar, then select the correct repository.
  3. Click on Clone.
  4. Once cloning has finished, click on Fetch origin to make sure the repo is up-to-date on your PC.

Step 2: Obtaining and copying necessary files

  1. Contact the tech team and ask them to set up a BQ service account for you. They will send you a file called bq-service-account.json. Note: this file contains your personal login details for BigQuery and mustn't be shared with anyone else.
  2. Download two files from GitHub: Dockerfile and .gitignore.
  3. Your browser may have renamed them, in which case you will need to rename them back:
  • gitignore.txt -> .gitignore
  • Dockerfile.txt -> Dockerfile
  4. Open the folder where your GitHub repository clone is stored.
  5. Copy the three files you have obtained into this folder. You may be asked if you want to replace existing files; click yes.
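
As a quick sanity check on the copying step, the short Python sketch below reports which of the three files are still missing from a folder. The file names come from the steps above; the helper name missing_files and the idea of running it from the top level of the cloned repository are illustrative assumptions, not part of the original instructions.

```python
from pathlib import Path

# File names taken from the steps above.
REQUIRED_FILES = ("bq-service-account.json", "Dockerfile", ".gitignore")


def missing_files(repo_dir):
    """Return the required files that are not present in repo_dir."""
    repo = Path(repo_dir)
    return [name for name in REQUIRED_FILES if not (repo / name).is_file()]


if __name__ == "__main__":
    # Assumption: this is run from the top level of the cloned repository.
    missing = missing_files(".")
    if missing:
        print("Missing from repository folder:", ", ".join(missing))
    else:
        print("All three files are in place.")
```

If Dockerfile or .gitignore is reported missing, check whether the browser saved it with a .txt extension.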

Step 3: Updating the ebmdatalab package

  1. Run the Docker environment as normal.
  2. When the launcher has opened, select a new Bash console.
  3. Type pip-compile --upgrade-package ebmdatalab and run it by pressing Shift+Enter. If this produces an error message such as ImportError: cannot import name 'BAR_TYPES' from 'pip._internal.cli.progress_bars', run pip install -U pip-tools (again with Shift+Enter) to update pip-tools first.
  4. When this has completed, type pip-sync (and press Shift+Enter).

This should have updated your ebmdatalab package to the latest version (>=0.0.30).

After this you should find that your notebooks will work as normal with BigQuery.
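
To confirm the upgrade took effect, the installed version can be compared against 0.0.30 from a notebook cell or Python console. This is a minimal sketch: it assumes Python 3.8 or later (for importlib.metadata) and a plain dotted version string, and the helper names parse_version and is_new_enough are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (0, 0, 30)  # minimum ebmdatalab version stated above


def parse_version(text):
    """Turn a plain dotted version such as '0.0.30' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_new_enough(installed, minimum=MINIMUM):
    """Compare numerically, so that 0.0.9 counts as older than 0.0.30."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    try:
        installed = version("ebmdatalab")
    except PackageNotFoundError:
        print("ebmdatalab is not installed in this environment.")
    else:
        status = "OK" if is_new_enough(installed) else "too old"
        print(f"ebmdatalab {installed}: {status}")
```

The versions are compared as tuples of integers rather than as strings, because a string comparison would wrongly rank "0.0.9" above "0.0.30".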

WARNING: do not commit your repo to GitHub if you do not have the downloaded version of .gitignore in the correct folder, as this will cause problems with the BQ service account authorisation.
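
One way to double-check before committing is to look for the service account file in .gitignore. The sketch below assumes that the downloaded .gitignore lists bq-service-account.json by name, which this page does not state explicitly. It is only a heuristic: it matches the literal file name and does not interpret wildcard patterns or negations. The helper name key_file_listed is hypothetical.

```python
from pathlib import Path

KEY_FILE = "bq-service-account.json"


def key_file_listed(gitignore_text, key_file=KEY_FILE):
    """Return True if a line in the .gitignore text names key_file literally.

    Heuristic only: wildcard patterns and negations are not interpreted.
    """
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lstrip("/") == key_file:
            return True
    return False


if __name__ == "__main__":
    # Assumption: this is run from the top level of the cloned repository.
    path = Path(".gitignore")
    text = path.read_text() if path.is_file() else ""
    if key_file_listed(text):
        print(f"{KEY_FILE} is listed in .gitignore.")
    else:
        print(f"{KEY_FILE} is NOT listed in .gitignore; do not commit yet.")
```

For git's own answer, git check-ignore bq-service-account.json can be run in the repository folder; it prints the file name if the file is ignored.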