# Artifact: Overview (prl-julia/julia-type-stability GitHub Wiki)
## Getting Started
We show here the simplest way to do a (dry) run of the artifact, which is via Docker. The alternatives are explained in the comments on setting up the environment. No single step should take longer than 5 minutes.
- Extract the artifact tarball somewhere on disk and open a shell session in its root directory. Running `ls` (or `dir` on Windows) should show:

  ```
  Stability  pkgs  start  README.md  shell.nix  top-1000-pkgs.txt
  ```
- From here, run Docker as follows:

  ```
  docker run --rm -it -v "$PWD":/artifact nixos/nix sh -c "cd artifact; nix-shell --pure"
  ```

  Note: running Docker may require root privileges. The other notes from the Steps To Setup section also apply.
- Make sure (e.g. with `ls`) that you are in the artifact root (e.g. `ls` shows the `Stability` directory among others). Pull in the dependencies of our code:

  ```
  JULIA_PROJECT=Stability julia -e 'using Pkg; Pkg.instantiate()'
  ```
- To run our type stability analysis on one simple Julia package, `Multisets`, do the following:

  a. Enter the `start` directory and process the package:

  ```
  cd start
  ../Stability/scripts/proc_package.sh "Multisets 0.4.12"
  ```

  There should be a number of new files with raw data in `start/Multisets`, e.g. `stability-stats-per-instance.csv`.

  b. Convert the raw data into a CSV file:

  ```
  ../Stability/scripts/pkgs-report.sh
  ```

  There should be a new file `start/report.csv`.

  c. Generate tables:

  ```
  JULIA_PROJECT=../Stability ../Stability/scripts/tables.jl
  ```

  You should see stability data about the Multisets package as output:

  ```
  Table 1
  3×3 DataFrame
   Row │ Stats      Stable   Grounded
       │ String     Float64  Float64
  ─────┼──────────────────────────────
     1 │ Mean          73.0      55.0
     2 │ Median        73.0      55.0
     3 │ Std. Dev.      0.0       0.0

  Table 2
  1×7 DataFrame
   Row │ package    Methods  Instances  Inst/Meth  Varargs (%)  Stable (%)  Grounded (%)
       │ String     Int64    Int64      Float64    Int64        Int64       Int64
  ─────┼─────────────────────────────────────────────────────────────────────────────────
     1 │ Multisets        7         11        1.6            0          73            55
  ```

  Note that the format is the same as in Tables 1 and 2 of the paper; only here the numbers are computed over one small package.
  d. Generate plots:

  ```
  JULIA_PROJECT=../Stability julia -L ../Stability/scripts/plot.jl -e 'plot_pkg("Multisets")'
  ```

  There should be a group of figures created under `Multisets/figs`. The ones that correspond to Figure 10 of the paper are:

  ```
  Multisets/figs/Multisets-size-vs-stable.pdf
  Multisets/figs/Multisets-size-vs-grounded.pdf
  ```

  You should be able to browse the figures using your host system's PDF viewer (all files created inside the Docker container under the `/artifact` directory are visible in your host file system). The figures should be similar to the ones in `start/Multisets/figs-ref`.

- To shut down the Docker session, run `exit`.
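For repeated experimentation, steps a-d above can be wrapped into one helper. The function below is only our sketch, not a script shipped with the artifact; it assumes you are inside the Nix shell, in the `start` directory, with the same relative paths as in the steps above:

```shell
# Sketch: run steps a-d of the walkthrough for a single package.
# Assumes the current directory is `start` (not part of the artifact).
analyze_pkg() {
    spec=$1              # package spec, e.g. "Multisets 0.4.12"
    name=${spec%% *}     # strip the version to get the bare package name
    ../Stability/scripts/proc_package.sh "$spec" &&
    ../Stability/scripts/pkgs-report.sh &&
    JULIA_PROJECT=../Stability ../Stability/scripts/tables.jl &&
    JULIA_PROJECT=../Stability julia -L ../Stability/scripts/plot.jl \
        -e "plot_pkg(\"$name\")"
}
```

Usage: `analyze_pkg "Multisets 0.4.12"` reproduces the tables and figures for `Multisets` in one go.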
## List of Claims From Paper Supported by Artifact
This artifact aims to do two things:

- give exact directions on how to reproduce the experiments reported in Section 5 (Empirical Study) of the Type Stability in Julia paper submitted to OOPSLA '21; in particular:
  - Table 1,
  - Table 2,
  - Figure 10;
- show how to get stability metrics for Julia code (be it a function, module, or package) using our framework in interactive mode -- this supports the claim from Section 5.4 that the framework "can be employed by package developers to study where in their code instability concentrates, as well as check for regressions".
## Step By Step Instructions
There are four parts to the rest of this document:

- Metadata about our artifact and experiment, as suggested by the AEC Guidelines.
- Discussion of environment setup.
- Reproducing results from Section 5 of the paper.
- Some examples of interactive usage.
## Metadata

### Hardware and Timings
The AEC guidelines suggest adding approximate timings for all operations.

First, our hardware specs. We ran the experiments in a virtualized environment on server hardware:

- 32 Intel (Skylake) CPUs -- these speed up our experiments considerably (nearly linearly, thanks to GNU parallel);
- 64 GB RAM -- should not be critical to the experiment, but the more CPUs you have, the more memory you need (our 32 CPUs consumed 15 GB at some point);
- 1.5 TB disk space -- as with RAM, should not be critical, but be sure to provision several GB per processor.

Using that hardware, we get:

- Reproducing Table 1 (1K packages): 7 hrs
- Reproducing Table 2 (10 packages): 20 mins
- Reproducing Fig. 10 (heat maps): TODO (< 5 minutes?)

These numbers, again, depend heavily on the number of CPUs available on the system.
### Expected Warnings
The AEC guidelines suggest mentioning expected warnings in the README.

- During package analysis (e.g. when reproducing Tables 1 and 2), Julia warns about us overriding its test-suite functionality. This is fine: overriding turned out to be the most convenient way to gather the statistics we need.

  ```
  WARNING: Method definition gen_test_code##kw(Any, typeof(Pkg.Operations.gen_test_code), String) in module Operations at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Operations.jl:1316 overwritten in module Stability at /home/artem/stability/repo/Stability/src/pkg-test-override.jl:6.
  ** incremental compilation may be fatally broken for this module **
  ```

- GNU parallel wants to be cited and prints the following message unless you run it once with the `--citation` flag first. It is safe to ignore.

  ```
  Academic tradition requires you to cite works you base your article on.
  If you use programs that use GNU Parallel to process data for an article
  ...
  Come on: You have run parallel 13 times. Isn't it about time
  you run 'parallel --citation' once to silence the citation notice?
  ```

- The main body of the experiment for Tables 1 and 2 uses the Julia package manager to fetch and run the tests of various Julia packages. The package manager, as well as the test suites, are rather verbose, so expect a lot of text on the screen. Some tests will fail, but this is expected: the idea is that no matter the outcome of a test, some Julia code is compiled while it runs, and that code we can analyze anyway.

  Some test suites among the 1000 packages considered will fail fatally for various reasons (usually related to dependencies). As noted in Section 5.1 (Methodology) of the paper, over seven hundred test suites produced results that we could analyze.
## Comments on Setting Up Environment
Our artifact has modest dependencies and could, in principle, be tested by installing them manually, but we of course provide two other essential pieces of infrastructure: (a) automatic dependency management (via the Nix package manager), and (b) isolation from the host system (via Docker). Still, both of these pieces are optional.
We first expand a little more on the setup and then provide the sequence of steps that get you there.
### Dependencies

- Bash shell
- Julia v1.5.4
- GNU parallel
- `timeout` from GNU coreutils
### Dependency Management via Nix and Reproducibility
To make sure the versions of the dependencies are the same as the ones we used, and also to automate the process of getting them, we use the Nix package manager -- a general package manager available on Linux, macOS, and Windows (via WSL).
The `shell.nix` file in the root of the artifact describes our dependencies. A user who has Nix installed and does not need isolation (e.g. trusts that we won't `rm -rf` their home directory) can use it directly without Docker and skip the next section.
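For orientation, a `shell.nix` for this kind of setup typically has the following shape. This is a hypothetical sketch, not the artifact's actual file, which pins the exact versions listed under Dependencies:

```nix
# Hypothetical sketch of a shell.nix; the artifact's own file is authoritative.
{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
  # Tools listed in the Dependencies section above.
  buildInputs = [ pkgs.bash pkgs.julia pkgs.parallel pkgs.coreutils ];
}
```

Running `nix-shell --pure` against such a file drops you into a Bash session where exactly these tools are on `$PATH`.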
There is another kind of dependency, though: the analyzed Julia packages, which are not covered by Nix or any other general package manager known to us. To get those, we use the Julia package manager `Pkg` (part of the Julia dependency). To keep the figures produced during the experiment as close to the paper as possible, we supply metadata about the versions of the packages we analyzed. These metadata are stored in the `pkgs` and `start` directories of the artifact. Each has a set of subdirectories named after the particular packages we analyzed, and every package subdirectory contains two files related to `Pkg`: `Project.toml` and `Manifest.toml`. Supplying these ensures that a reproduction uses the same package versions as we did, but it does not guarantee 100% reproducibility of the numbers, because the test suites of some analyzed packages are (naturally) non-deterministic.
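As a quick check of this layout, one can verify that every package subdirectory indeed carries both files. The helper below is an illustration only; the function name and the `/tmp/pkgs-demo` directory are made up for the demo:

```shell
# Sketch: check that each package subdirectory pins its versions with
# both Project.toml and Manifest.toml (illustration, not artifact code).
check_pins() {
    ok=0
    for pkg in "$1"/*/; do
        for f in Project.toml Manifest.toml; do
            [ -f "$pkg$f" ] || { echo "missing: $pkg$f"; ok=1; }
        done
    done
    return "$ok"
}

# Demo on a throwaway layout mimicking pkgs/<Package>/:
mkdir -p /tmp/pkgs-demo/Multisets
touch /tmp/pkgs-demo/Multisets/Project.toml /tmp/pkgs-demo/Multisets/Manifest.toml
check_pins /tmp/pkgs-demo && echo "all packages pinned"
```

Pointing `check_pins` at the artifact's `pkgs` or `start` directory instead of the demo layout reports any package whose version pin is incomplete.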
### Isolation Plus Nix Dependency via Docker
We suggest using Docker to avoid unexpected interactions with the host system. This also gets you Nix. In theory, you could use a Docker image without Nix and fetch all dependencies there manually, or we could supply a custom Docker image with them preinstalled (at a cost of several hundred megabytes of the artifact's real estate). But we do not discuss these configurations because Docker+Nix seems like a strictly better option.
Note that inside Docker the commands run as the root user. This makes life easier in one respect: the Julia package manager stores a lot of metadata and makes some of it inaccessible for cleanup afterwards. This is no problem under root, but when running outside Docker and without root privileges, it will eat up disk space proportional to the number of packages analyzed. For 1000 packages this can add up to hundreds of gigabytes. If you don't have that much disk space, run either in Docker or as root.
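To keep an eye on that space, you can check the size of the Julia depot where `Pkg` stores its data. The default location `~/.julia`, overridden by `JULIA_DEPOT_PATH` when set, is our assumption here:

```shell
# Sketch: report how much disk the Julia depot currently occupies.
depot=${JULIA_DEPOT_PATH:-$HOME/.julia}
if [ -d "$depot" ]; then
    du -sh "$depot"
else
    echo "no Julia depot at $depot yet"
fi
```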
### Steps To Setup
We mark the first two steps as optional, but the simplest and most likely to succeed path is to apply both of them.
- [Optional, apply if Isolation is desired] Start a Docker container with Nix installed by executing the following in the root directory of the artifact:

  ```
  docker run --rm -it -v "$PWD":/artifact nixos/nix sh -c "cd artifact"
  ```

  Note 1: running the `docker` command may require root privileges on your system. We assume that you know how to use Docker on your system.

  Note 2: `"$PWD"` expands to the path of the current directory in most shells. You can check it with `echo "$PWD"`. If it does not, please use the equivalent command for your shell or type in the path manually.

  Note 3, for Windows and macOS users: Docker is sometimes reported to be slow at file operations involving bind mounts (that's the `-v` flag) on these platforms. If you use one of them and want to reproduce the experiment on 1000 packages, you may be better off following this manual inside a virtual machine with some Linux installation. Getting a Linux VM is not covered here, as it is assumed to be common knowledge by now.
- [Optional, apply if Dependency Management is desired] Make sure the artifact files are in the current directory of your shell session (e.g. by executing `ls`). From there, enter the Nix shell by executing `nix-shell --pure`. The Nix shell is a Bash shell that has all the dependencies specified in `shell.nix` visible (i.e. in `$PATH`).
- [Sanity Checking] Make sure you have Julia and GNU parallel up: `julia --version && parallel --version`.
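A slightly more verbose variant of that sanity check, a sketch of ours that names each missing tool instead of failing on the first one:

```shell
# Sketch: report which of the artifact's dependencies are visible on $PATH.
missing=0
for tool in bash julia parallel timeout; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found: $tool"
    else
        echo "MISSING: $tool"
        missing=1
    fi
done
if [ "$missing" -eq 0 ]; then
    echo "all dependencies present"
fi
```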
## Reproducing Results from Section 5
- Table 2 (the 10-packages table from Section 5):

  - the package list (TODO: either a separate file, or the 1K list plus enhancing `proc_package_parallel.sh` with an additional parameter saying how many packages to take)

  ```
  scripts/proc_package_parallel.sh ../top-10.txt
  scripts/pkgs-report.sh
  ```

  The output is in `report.csv`, which is still not quite Table 2, as it needs:

  - percentages (instead of absolute numbers) for the stable/grounded/varargs columns;
  - the nospec and (maybe?) fail columns removed;
  - a calculated instances-per-method column added.
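The post-processing listed above is not automated yet; as an illustration of what it involves, here is a sketch that computes the percentage and instances-per-method columns from absolute counts. The column order assumed for `report.csv` (package, methods, instances, varargs, stable, grounded) is hypothetical and has to be adjusted to the file's real header:

```shell
# Sketch: turn absolute stable/grounded/varargs counts into percentages of
# instances and add an instances-per-method column (assumed column order!).
to_table2() {
    awk -F, 'NR == 1 {
        print "package,Methods,Instances,Inst/Meth,Varargs (%),Stable (%),Grounded (%)"
        next
    }
    {
        printf "%s,%d,%d,%.1f,%d,%d,%d\n",
               $1, $2, $3, $3 / $2, 100*$4/$3, 100*$5/$3, 100*$6/$3
    }' "$1"
}

# Demo on a tiny made-up report:
printf 'package,methods,instances,varargs,stable,grounded\nDemo,2,4,1,2,1\n' > /tmp/report-demo.csv
to_table2 /tmp/report-demo.csv
```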
- Table 1 of Sec 5 (data about 1K packages). This should be similar to the previous item: `proc_package_parallel.sh`.

- Fig. 10 (stable/grounded heat map from Sec 5): `scripts/plot.sh` ??? (TODO)