Setup: Python - JonasEngstrom/overleaf-article-template GitHub Wiki

Using Python in Rmarkdown

If you are using Python 3.12 or later installed via Homebrew, you will need to set up a virtual environment for use in your Rmarkdown file if you want to use any third-party packages, such as pandas or scikit-learn.

Initialize Virtual Environment

To initialize a new virtual environment, cd into the project directory and run the following command (using the path to the Python version you would like to use):

/opt/homebrew/bin/python3 -m venv .venv

[!NOTE] To find out which Python installation is your default, run which python3 in bash.
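For projects that are set up from a script, the same step can be sketched with Python's standard-library venv module. This is a minimal sketch, not part of the template: the function name create_project_venv is hypothetical, and the returned path assumes the macOS/Linux layout used elsewhere on this page.

```python
import venv
from pathlib import Path


def create_project_venv(project_dir=".", with_pip=True):
    """Create <project_dir>/.venv and return the expected interpreter path.

    Hypothetical helper for illustration only.
    """
    env_dir = Path(project_dir) / ".venv"
    # venv.create is the programmatic counterpart of `python3 -m venv .venv`.
    venv.create(env_dir, with_pip=with_pip)
    # Assumes a POSIX layout (bin/python), as on macOS with Homebrew.
    return env_dir / "bin" / "python"
```

Calling create_project_venv() from the project directory creates .venv there. The environment is based on whichever interpreter runs the script, so the script should be started with the Python version you want to use, for example /opt/homebrew/bin/python3.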

Installing Packages in the Virtual Environment

To activate the virtual environment, run the following command in the project directory:

source .venv/bin/activate

Then install the packages as you usually would, e.g. by running a command similar to the following:

pip3 install numpy pandas scikit-learn matplotlib
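To confirm that the installation worked, a short check can be run with the environment's interpreter. The sketch below uses importlib.util.find_spec from the standard library. The names REQUIRED_MODULES and missing_modules are illustrative and not part of the template.

```python
from importlib.util import find_spec

# Import names, which can differ from the names passed to pip:
# scikit-learn is imported as sklearn.
REQUIRED_MODULES = ["numpy", "pandas", "sklearn", "matplotlib"]


def missing_modules(module_names):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in module_names if find_spec(name) is None]
```

Running print(missing_modules(REQUIRED_MODULES)) while the virtual environment is active should print an empty list if every package was installed.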

To exit the virtual environment use the following command:

deactivate
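Whether a virtual environment is currently in use can also be checked from Python itself, which may help when it is unclear which interpreter a shell or an Rmarkdown chunk is running. This is a minimal sketch; the function names are illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def describe_interpreter():
    """Summarize which interpreter is running and whether it is in a venv."""
    return f"{sys.executable} (virtual environment: {in_virtual_environment()})"
```

With the environment active, print(describe_interpreter()) should show a path inside .venv. After deactivate, it should show the system or Homebrew interpreter instead.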

Set Up RStudio to Use the Virtual Environment

To make RStudio use the virtual environment, add Sys.setenv(RETICULATE_PYTHON = '.venv/bin/python') to the .Rprofile file in the project directory. This can be done by running the following command in bash from the project directory:

echo "Sys.setenv(RETICULATE_PYTHON = '.venv/bin/python')" >> .Rprofile
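Because >> appends unconditionally, running the command above more than once leaves duplicate lines in .Rprofile. If that is a concern, for example in a setup script that may be re-run, the following sketch appends the line only when it is missing. The helper name ensure_rprofile_line is hypothetical and not part of the template.

```python
from pathlib import Path

RPROFILE_LINE = "Sys.setenv(RETICULATE_PYTHON = '.venv/bin/python')"


def ensure_rprofile_line(project_dir=".", line=RPROFILE_LINE):
    """Append `line` to <project_dir>/.Rprofile unless it is already present.

    Returns True if the file was changed, False otherwise.
    """
    rprofile = Path(project_dir) / ".Rprofile"
    existing = rprofile.read_text() if rprofile.exists() else ""
    if line in existing.splitlines():
        return False
    # Start on a fresh line if the file does not end with a newline.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with rprofile.open("a") as handle:
        handle.write(f"{separator}{line}\n")
    return True
```

Calling ensure_rprofile_line() from the project directory has the same effect as the echo command on the first run and does nothing on later runs.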

Transferring Data between R and Python

R to Python

To access R data from a Python block use the r object:

# In an R chunk:
variable_created_in_r <- 'so long and thanks for all the fish'
# In a Python chunk:
print(r.variable_created_in_r)

Python to R

To access Python data from R, the reticulate package needs to be loaded in R, after which the data can be accessed using the py object:

# In an R chunk:
library(reticulate)
# In a Python chunk:
variable_created_in_python = ['all work', 'no play']
# In an R chunk:
print(py$variable_created_in_python)