Concepts - SCECcode/ucvm_docker GitHub Wiki
Motivation for Creating UCVM Docker Images
SCEC's UCVM velocity model software is designed to run on Linux computers, and the software must be installed, compiled, and tested on the user's system before routine use. The UCVM software framework presents testing, development, and distribution challenges because it is written in multiple programming languages and new codes are added frequently. It also has many additional capabilities that may be configured.
To simplify development, we selected a scientific Linux computing environment as our target platform. We develop and test in this environment because that is where our main users, the HPC modelers, tend to run UCVM. We do not maintain support for multiple computing platforms.
We believe we can use virtualization to help users avoid the long UCVM installation process. We have tried to meet the needs of Mac and Windows users through virtualization, first with Virtual Box and now with Docker. This GitHub repo contains code and documentation for a prototype version of UCVM distributed as Docker images.
Many scientific computing groups now offer Docker images, and Docker is becoming an alternative to Virtual Box. Some groups have converted their Docker images into Singularity images, which can run on supercomputers.
Previous use of Virtual Box
In 2017, SCEC created Virtual Box versions of our UCVM and Broadband Platform software. The images we created for those workshops are posted online. However, Virtual Box has since been updated, and the older images no longer run. We have not diagnosed the problems.
An important limitation of Virtual Box is that all needed disk space must be allocated when the image is defined. It is difficult for users to know in advance how much disk space they will need.
Moving files in and out of the Virtual Box image was also a common problem. Virtual Box had file-sharing capabilities, but they required extensions and were difficult to configure reliably.
Virtual Machine and Docker Image
We call files with executable commands "images". We have posted UCVM images to Docker Hub, where they are publicly accessible. When we start up an image and run it, we call the running image a "container". The container includes SCEC's scientific software (UCVM), as well as the operating system and other files needed to run the software. We want our Docker image to be as close to our UCVM development environment as possible. We based our image on Rocky Linux (a CentOS-like distribution), then added required software packages including gcc, python, automake, and others. An important benefit of the Docker versions of UCVM is that they can be run, using Docker, on most modern computers with no additional installation.
Challenges using Docker
To use the Docker version of UCVM, users must be comfortable using UCVM from the command line. Once the UCVM container is running, users issue standard command-line UCVM commands, such as "ucvm_query".
In addition, users must be comfortable invoking the container on their system. The invocation expects a particular directory structure: there should be a subdirectory in the location where the command is issued. This subdirectory is used to share files between the user's local host and the running Docker container. In this documentation, we have designated the 'target' subdirectory as the common subdirectory. The Docker invocation also includes parameters with single and double dashes (--rm and -it), shell command substitution such as $(pwd), and operating-system-level terms (e.g. mount, bind). Although the invocation looks complex, once the container is running, it works much like a standard UCVM installation.
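The pieces of such an invocation can be laid out one argument at a time. The following is a minimal sketch rather than the project's actual command: the image name and the container-side path are hypothetical placeholders, and it assumes the shared directory is attached with Docker's `--mount type=bind` option. The function only assembles the argument list; it does not start a container.

```python
import shlex
from pathlib import Path


def build_docker_run(image: str, host_dir: Path, container_dir: str) -> list[str]:
    """Assemble a `docker run` argument list that shares host_dir with the container."""
    # A bind mount needs an absolute host path; resolve() plays the role
    # that $(pwd) plays when the command is typed in a shell.
    source = host_dir.resolve()
    return [
        "docker", "run",
        "--rm",  # remove the container when it exits
        "-it",   # interactive session attached to a terminal
        "--mount", f"type=bind,source={source},destination={container_dir}",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image name and container path, for illustration only.
    cmd = build_docker_run(
        "example/ucvm_image:latest", Path.cwd() / "target", "/app/target"
    )
    print(shlex.join(cmd))
```

Printing the command instead of running it makes it easy to check the resolved 'target' path before starting a container.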
Distribution Format
To help avoid large Docker images, we put one CVM into each image. The rationale is that you can retrieve as many images as you like, but you do not get all UCVM models in a single Docker image, because that image would be excessively large.
Potential Benefits to using UCVM Docker Images
- If a full UCVM software installation is not needed, running UCVM Docker images is simple.
- UCVM software is now portable to previously unsupported operating systems including Mac and Windows.
- Docker images with individual models require less disk space on users' computers. Users can retrieve and/or remove images easily.
Potential Limitations
- Users must work within the limits of the images and their local computers. UCVM is used on supercomputers, for example, to build simulation meshes. The Docker version of UCVM may not work for this purpose on laptops. There may be a limit on the number of input points that an image can query. Users should therefore understand that large-scale usage will probably require a native installation of UCVM on Linux systems with MPI.
- Users cannot "tile" seismic velocity models, because each image contains a single model. UCVM tiling allows users to specify a list of velocity models (e.g. -m cvms5,cvmh,cvms), and UCVM will query them in order until it finds material properties defined for a given point. All models referenced in the -m model list must be installed for this UCVM command to work. Users who wish to tile multiple seismic velocity models may need to do a full UCVM installation on a Linux system and install all needed models.
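The tiling behavior described in the last item amounts to an ordered fallback lookup. The sketch below illustrates that idea only and is not UCVM's implementation: the two "models" are toy functions with made-up coverage and values, and the `query_tiled` name is hypothetical.

```python
from typing import Callable, Optional

# (vp, vs, density); a model returns None where it defines no properties.
Properties = tuple[float, float, float]
Model = Callable[[float, float, float], Optional[Properties]]


def query_tiled(
    models: list[Model], lon: float, lat: float, depth: float
) -> Optional[Properties]:
    """Return properties from the first model in the list that covers the point."""
    for model in models:
        props = model(lon, lat, depth)
        if props is not None:
            return props
    return None  # no model in the list covers this point


# Toy stand-ins, not real velocity models.
def regional_model(lon: float, lat: float, depth: float) -> Optional[Properties]:
    # Covers only a narrow longitude band (made-up extent and values).
    return (5000.0, 2900.0, 2600.0) if -119.0 <= lon <= -117.0 else None


def background_model(lon: float, lat: float, depth: float) -> Optional[Properties]:
    # Defined everywhere, so it acts as the final fallback.
    return (6000.0, 3500.0, 2700.0)
```

With `[regional_model, background_model]`, a point inside the regional band is answered by the first model and any other point falls through to the second. Every model in the list has to be callable, which mirrors the requirement that all models named in the -m list be installed together.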