Using BigQuery with older Jupyter notebooks - bennettoxford/openprescribing GitHub Wiki
Google has changed how authentication with BigQuery works, meaning that without some (one-off) changes to each repo's Docker environment, data cannot be refreshed. Please follow the steps below to get authentication up-and-running again.
Step 1: clone the repository
I've found it easier to use GitHub Desktop for this, but others may wish to use the command line.
As an example we're going to use the oat_prescribing repo, which hasn't been updated since the BQ authorisation changes.
Please note, this fix only works within the Docker notebook environments. Any older notebooks will need to be put within a Docker environment first.
- Go to File -> Clone repository in GitHub Desktop (or press Ctrl+Shift+O).
- Filter by typing part of the repository name in the search bar, then select the correct repository.
- Click on Clone.
- Once cloning has finished, click on Fetch origin to make sure the repo is up-to-date on your PC.
Step 2: Obtaining and copying necessary files
- You need to contact the tech team and ask them to set up a BQ service account for you. They will send you a file called bq-service-account.json. Note: this file contains your personal login details for BigQuery, and mustn't be shared with anyone else.
- You need to download two files from GitHub: Dockerfile and .gitignore.
- Your browser may have renamed them, so you will need to rename them back:
gitignore.txt -> .gitignore
Dockerfile.txt -> Dockerfile
- Open the folder where your GitHub repository clone is stored.
- Copy the three files you have obtained into this folder. You may be asked if you want to replace existing files; if so, click yes.
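For anyone who prefers to script the renaming and copying, the steps above can be sketched in Python. This is only an illustration: the function name `copy_setup_files` and both folder arguments are hypothetical, and it assumes all three files are sitting in one download folder.

```python
import shutil
from pathlib import Path

# Browser-renamed downloads mapped back to their intended names.
RENAMES = {"gitignore.txt": ".gitignore", "Dockerfile.txt": "Dockerfile"}
# The three files that need to end up in the repository folder.
REQUIRED = [".gitignore", "Dockerfile", "bq-service-account.json"]


def copy_setup_files(download_dir, repo_dir):
    """Restore the original file names, then copy the three files into the repo.

    Hypothetical helper: both arguments are folders chosen by the caller.
    """
    download_dir, repo_dir = Path(download_dir), Path(repo_dir)
    for wrong, right in RENAMES.items():
        if (download_dir / wrong).exists():
            (download_dir / wrong).replace(download_dir / right)
    missing = [name for name in REQUIRED if not (download_dir / name).exists()]
    if missing:
        raise FileNotFoundError(f"Missing from {download_dir}: {missing}")
    for name in REQUIRED:
        # copy2 overwrites an existing file of the same name in the repo.
        shutil.copy2(download_dir / name, repo_dir / name)
    return [repo_dir / name for name in REQUIRED]


# Example call (both paths are placeholders, adjust to your own machine):
# copy_setup_files(Path.home() / "Downloads", "path/to/oat_prescribing")
```

Overwriting is deliberate here, since it mirrors clicking yes when asked whether to replace existing files.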
Step 3: Updating the ebmdatalab package
- Run the Docker environment as normal.
- When the launcher has opened, select a new Bash console.
- Type pip-compile --upgrade-package ebmdatalab and run it by pressing Shift+Enter. If this leads to an error message such as ImportError: cannot import name 'BAR_TYPES' from 'pip._internal.cli.progress_bars', run the command pip install -U pip-tools (with Shift+Enter) to update pip-tools first, then try again.
- When this has completed, type pip-sync (and press Shift+Enter).
This should have updated your ebmdatalab package to the latest version (>=0.0.30).
After this you should find that your notebooks will work as normal with BigQuery.
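If you want to confirm the upgrade from inside a notebook cell, a minimal check might look like the following. It assumes Python 3.8 or later (for `importlib.metadata`) and a plain dotted version number; the helper names are made up for this sketch. On an older interpreter, running pip show ebmdatalab in the console gives the same information.

```python
from importlib import metadata

MINIMUM = (0, 0, 30)  # the ">=0.0.30" quoted above


def version_tuple(version):
    """Turn a plain dotted version such as '0.0.30' into (0, 0, 30)."""
    return tuple(int(part) for part in version.split("."))


def ebmdatalab_is_current(installed=None):
    """Return True if ebmdatalab is at version 0.0.30 or later.

    Pass a version string to check it directly; otherwise the installed
    package is looked up, and a missing package counts as not current.
    """
    if installed is None:
        try:
            installed = metadata.version("ebmdatalab")
        except metadata.PackageNotFoundError:
            return False
    return version_tuple(installed) >= MINIMUM


print("ebmdatalab up to date:", ebmdatalab_is_current())
```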
WARNING: do not commit your repo to GitHub if you do not have the downloaded version of .gitignore in the correct folder, as this will cause problems with the BQ service account authorisation.
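A quick pre-commit sanity check could be sketched as below. It assumes the downloaded .gitignore is meant to keep bq-service-account.json out of commits, which this page implies but does not state outright. The function `gitignore_covers` is hypothetical and only does simple pattern matching, not full gitignore semantics; running git check-ignore bq-service-account.json inside the repo is the more reliable test.

```python
from fnmatch import fnmatch
from pathlib import Path


def gitignore_covers(repo_dir, filename="bq-service-account.json"):
    """Rough check: does any pattern in the repo's .gitignore match filename?"""
    gitignore = Path(repo_dir) / ".gitignore"
    if not gitignore.exists():
        return False
    for line in gitignore.read_text().splitlines():
        pattern = line.strip()
        if not pattern or pattern.startswith(("#", "!")):
            continue  # skip blank lines, comments and negated patterns
        if fnmatch(filename, pattern.lstrip("/")):
            return True
    return False


# Example call (the path is a placeholder):
# gitignore_covers("path/to/oat_prescribing")
```

If this returns False, it is safer to hold off committing until the downloaded .gitignore is in place.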