Configuring Script Element Resources
The Script element in a Bitnobi workflow allows you to execute a Python or R script that reads in data, processes it, and passes it along to the next element.
To provide some isolation between users, Bitnobi launches a new Docker container each time you run a Script element; when the script completes (or the user presses the “Stop” button), the container is removed.
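While a workflow is running, you can list these transient containers from the VM's shell. A minimal sketch, assuming you have SSH access and permission to use the Docker CLI (the `bitnobi-` name prefix matches the container names shown later on this page):

```shell
# List only the Script containers currently running
docker ps --filter "name=bitnobi-"
```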
Because the user is not constrained in the kind of Python or R code that can be written, it is possible to write scripts that loop forever, consume all available CPU, or eat up large amounts of memory. Here is a simple example of such a Python script:
```python
MB = 1024 * 1024
GB = 1024 * MB

# Allocate roughly 900 MB by building a large string
number_of = 900
eat = "a" * (number_of * MB)

# Spin forever, burning CPU
x = 12345.
while True:
    x * x
```
This Python script consumes 900 MB of memory and has an infinite loop that keeps squaring a number, using all the CPU cycles it can get. You can adjust the memory consumption by changing `number_of`, or use `GB` instead of `MB`.
There are a number of configuration variables that control the behaviour of the Script container.
| Variable name | Description | Default value |
|---|---|---|
| `SCRIPT_IMAGE` | Docker image used by the backend to execute Python and R code for the Script workflow element. | `jupyter/datascience-notebook` |
| `SCRIPT_CONTAINER_CPUS` | Maximum number of CPUs to allow for each Script container. | no limit |
| `SCRIPT_CONTAINER_CPU_SHARES` | The Script container's relative CPU weight versus other containers when there is contention. Other Bitnobi containers have a weight of 1024. | 512 |
| `SCRIPT_CONTAINER_MEMORY_LIMIT_MBYTES` | Memory limit for the Script container, in MB. | no limit |
Note that if you change one of these values, you will need to restart Bitnobi to have it take effect.
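How these variables are supplied depends on your deployment; the snippet below is a hypothetical example assuming Bitnobi reads them as environment variables (for instance from an `.env` file used by `docker-compose`). Adjust the values and the mechanism to match your installation:

```shell
# Hypothetical environment-variable configuration (deployment-specific)
SCRIPT_IMAGE=jupyter/datascience-notebook
SCRIPT_CONTAINER_CPUS=0.5
SCRIPT_CONTAINER_CPU_SHARES=512
SCRIPT_CONTAINER_MEMORY_LIMIT_MBYTES=1024
```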
If you can SSH to the VM running Bitnobi, you can use the `docker stats` command to see the CPU usage and memory consumption of all running Docker containers; the display refreshes every second or so. The Script containers have a name that looks like `bitnobi-3f2d8eed-610e-4f8b-ad60-e413db01d5cd`.
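To watch just the columns of interest, `docker stats` accepts a `--format` template, for example:

```shell
# Show container name, CPU percentage, and memory usage/limit
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
```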
If you use the Bitnobi UI to create a workflow with a data source and a Script element, you can use the sample Python script above to observe the resource usage.
If `SCRIPT_CONTAINER_CPUS` is not set, your Script container can consume 100% CPU. If you set it to `0.5`, it will be limited to 50% of one CPU. Note that by default Python will use at most 1 vCPU, so if your VM has 2 vCPUs, the Script container will only use one of them.
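This presumably maps onto Docker's standard CPU limit; that mapping is an assumption, but the effect of `SCRIPT_CONTAINER_CPUS=0.5` would correspond to starting the container with:

```shell
# Assumption: SCRIPT_CONTAINER_CPUS is passed through as Docker's --cpus flag
docker run --cpus=0.5 jupyter/datascience-notebook ...
```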
The `SCRIPT_CONTAINER_CPU_SHARES` value sets how CPU cycles are shared between containers when two or more containers compete for CPU. By default we set it to 512, which means that if the backend container and your Script container are competing for CPU, the backend will get roughly two thirds and the Script container one third. If the backend is idling, the Script container will get 100%.
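The split follows directly from the relative weights; a quick check in Python:

```python
backend_weight = 1024   # default weight of other Bitnobi containers
script_weight = 512     # SCRIPT_CONTAINER_CPU_SHARES default
total = backend_weight + script_weight

print(f"backend: {backend_weight / total:.0%}")  # backend: 67%
print(f"script:  {script_weight / total:.0%}")   # script:  33%
```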
The `SCRIPT_CONTAINER_MEMORY_LIMIT_MBYTES` value sets the maximum memory limit of the Script container. For example, if it is set to 1024 and your Script container attempts to consume 1025 MB, Docker will generate an Out Of Memory signal and terminate the container.
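Again assuming the limit is applied through Docker's standard memory controls, the equivalent flag is `--memory`, and an out-of-memory kill is recorded in the container's state:

```shell
# Assumption: the limit is applied via Docker's --memory option
docker run --memory=1024m jupyter/datascience-notebook ...

# Verify whether a stopped container was OOM-killed
docker inspect --format '{{.State.OOMKilled}}' <container-name>
```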
The sample Python program in the section above cannot be run in a Bitnobi workflow because the “Run” operation in the Script editor never finishes. Here is a modification of the Python sample template that completes normally in the workflow editor, but consumes memory and runs forever when it runs in a workflow:
```python
import sys

# Get and display the preview input from the canvas.
# `io` is the helper object provided by the Bitnobi sample template.
bitnobi_input = io.get_input()
for row in bitnobi_input:
    print(row)

# Write to output
io.write_output(['name', 'id'])
io.write_output(['justin', '1'])
io.write_output(['john', '2'])

# Misbehave only during a real workflow run (the backend passes this input
# path), so the Script editor's "Run" preview still completes normally.
if sys.argv[1] == '/bitnobi-canvas-obj/IO/BackEnd/inputBackend.csv':
    MB = 1024 * 1024
    number_of_chunks = 1500
    eat = "a" * (MB * number_of_chunks)  # allocate ~1500 MB
    x = 12345.
    while True:
        x * x  # infinite loop: burn CPU
```
If you set the script memory limit to 1000 and then run a workflow containing the above Script element, the Script container will be terminated when it reaches 1000 MB and the workflow will fail with an out-of-memory error.