CLI Reference
Having gone over the basics of the command-line interface (CLI),
let us provide a more complete picture of the capabilities of the CLI, along
with more details about the general structure of CodaLab. Note that many
(though not all) of the commands work in the web terminal (the CodaLab>
prompt at the top of the web interface).
For a complete list of CLI commands, type:
cl help -v
Here is the output:
CodaLab CLI version 0.1.8
Usage: cl <command> <arguments>
Commands for bundles:
upload (up):
Create a bundle by uploading an existing file/directory.
upload <path> : Upload contents of file/directory <path> as a bundle.
upload <path> ... <path> : Upload one bundle whose directory contents contain <path> ... <path>.
upload -c <text> : Upload one bundle whose file contents is <text>.
upload <url> : Upload one bundle whose file contents is downloaded from <url>.
upload : Open file browser dialog and upload contents of the selected file as a bundle (website only).
Most of the other arguments specify metadata fields.
Arguments:
path Paths (or URLs) of the files/directories to upload.
-c, --contents Specify the string contents of the bundle.
-L, --follow-symlinks Always dereference (follow) symlinks.
-x, --exclude-patterns Exclude these file patterns.
-g, --git Path is a git repository, git clone it.
-p, --pack If path is an archive file (e.g., zip, tar.gz), keep it packed.
-w, --worksheet-spec Upload to this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
-n, --name Short variable name (not necessarily unique); must conform to ^[a-zA-Z_][a-zA-Z0-9_\.\-]*$.
-d, --description Full description of the bundle.
--tags Space-separated list of tags used for search (e.g., machine-learning).
--license The license under which this program/dataset is released.
--source-url URL corresponding to the original source of this bundle.
-e, --edit Show an editor to allow editing of the bundle metadata.
make:
Create a bundle by combining parts of existing bundles.
make <bundle>/<subpath> : New bundle's contents are copied from <subpath> in <bundle>.
make <key>:<bundle> ... <key>:<bundle> : New bundle contains file/directories <key> ... <key>, whose contents are given.
Arguments:
target_spec [<key>:](<uuid>|<name>)[/<subpath within bundle>]
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
-n, --name Short variable name (not necessarily unique); must conform to ^[a-zA-Z_][a-zA-Z0-9_\.\-]*$. (for makes)
-d, --description Full description of the bundle. (for makes)
--tags Space-separated list of tags used for search (e.g., machine-learning). (for makes)
--allow-failed-dependencies Whether to allow this bundle to have failed dependencies. (for makes)
-e, --edit Show an editor to allow editing of the bundle metadata.
run:
Create a bundle by running a program bundle on an input bundle.
Arguments:
target_spec [<key>:](<uuid>|<name>)[/<subpath within bundle>]
command Arbitrary Linux command to execute.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
-n, --name Short variable name (not necessarily unique); must conform to ^[a-zA-Z_][a-zA-Z0-9_\.\-]*$. (for runs)
-d, --description Full description of the bundle. (for runs)
--tags Space-separated list of tags used for search (e.g., machine-learning). (for runs)
--allow-failed-dependencies Whether to allow this bundle to have failed dependencies. (for runs)
--request-docker-image Which docker image (e.g., codalab/ubuntu:1.9) we wish to use. (for runs)
--request-time Amount of time (e.g., 3, 3m, 3h, 3d) allowed for this run. (for runs)
--request-memory Amount of memory (e.g., 3, 3k, 3m, 3g, 3t) allowed for this run. (for runs)
--request-disk Amount of disk space (e.g., 3, 3k, 3m, 3g, 3t) allowed for this run. (for runs)
--request-cpus Number of CPUs allowed for this run. (for runs)
--request-gpus Number of GPUs allowed for this run. (for runs)
--request-queue Submit run to this job queue. (for runs)
--request-priority Job priority (higher is more important). (for runs)
--request-network Whether to allow network access. (for runs)
-e, --edit Show an editor to allow editing of the bundle metadata.
-W, --wait Wait until run finishes.
-t, --tail Wait until run finishes, displaying stdout/stderr.
-v, --verbose Display verbose output.
edit (e):
Edit an existing bundle's metadata.
edit : Popup an editor.
edit -n <name> : Edit the name metadata field (same for other fields).
Arguments:
bundle_spec (<uuid>|<name>|^<index>)
-n, --name Change the bundle name (format: ^[a-zA-Z_][a-zA-Z0-9_\.\-]*$).
-d, --description New bundle description.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
detach (de):
Detach a bundle from this worksheet, but doesn't remove the bundle.
Arguments:
bundle_spec (<uuid>|<name>|^<index>)
-n, --index Specifies which occurrence (1, 2, ...) of the bundle to detach, counting from the end.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
rm:
Remove a bundle (permanent!).
Arguments:
bundle_spec (<uuid>|<name>|^<index>)
--force Delete bundle (DANGEROUS - breaking dependencies!)
-r, --recursive Delete all bundles downstream that depend on this bundle (DANGEROUS - could be a lot!).
-d, --data-only Keep the bundle metadata, but remove the bundle contents on disk.
-i, --dry-run Perform a dry run (just show what will be done without doing it).
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
search (s):
Search for bundles on a CodaLab instance (returns 10 results by default).
search <keyword> ... <keyword> : Match name and description.
search name=<name> : More targeted search of using metadata fields.
search size=.sort : Sort by a particular field.
search size=.sort- : Sort by a particular field in reverse.
search size=.sum : Compute total of a particular field.
search .mine : Match only bundles I own.
search .floating : Match bundles that aren't on any worksheet.
search .count : Count the number of bundles.
search .limit=10 : Limit the number of results to the top 10.
Arguments:
keywords Keywords to search for.
-a, --append Append these bundles to the current worksheet.
-u, --uuid-only Print only uuids.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
ls:
List bundles in a worksheet.
Arguments:
-u, --uuid-only Print only uuids.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
info (i):
Show detailed information for a bundle.
Arguments:
bundle_spec (<uuid>|<name>|^<index>)
-f, --field Print out these comma-separated fields.
-r, --raw Print out raw information (no rendering of numbers/times).
-v, --verbose Print top-level contents of bundle, children bundles, and host worksheets.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
cat:
Print the contents of a file/directory in a bundle.
Note that cat on a directory will list its files.
Arguments:
target_spec (<uuid>|<name>)[/<subpath within bundle>]
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
wait:
Wait until a run bundle finishes.
Arguments:
target_spec (<uuid>|<name>)[/<subpath within bundle>]
-t, --tail Print out the tail of the file or bundle and block until the run bundle has finished running.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
download (down):
Download bundle from a CodaLab instance.
Arguments:
target_spec (<uuid>|<name>)[/<subpath within bundle>]
-o, --output-path Path to download bundle to. By default, the bundle or subpath name in the current directory is used.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
mimic:
Creates a set of bundles based on analogy with another set.
mimic <run> : Rerun the <run> bundle.
mimic A B : For all run bundles downstream of A, rerun with B instead.
mimic A X B -n Y : For all run bundles used to produce X depending on A, rerun with B instead to produce Y.
Arguments:
bundles Bundles: old_input_1 ... old_input_n old_output new_input_1 ... new_input_n ((<uuid>|<name>|^<index>)).
-n, --name Name of the output bundle.
-d, --depth Number of parents to look back from the old output in search of the old input.
-s, --shadow Add the newly created bundles right after the old bundles that are being mimicked.
-i, --dry-run Perform a dry run (just show what will be done without doing it)
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
-W, --wait Wait until run finishes.
-t, --tail Wait until run finishes, displaying stdout/stderr.
-v, --verbose Display verbose output.
macro:
Use mimicry to simulate macros.
macro M A B <=> mimic M-in1 M-in2 M-out A B
Arguments:
macro_name Name of the macro (look for <macro_name>-in1, ..., and <macro_name>-out bundles).
bundles Bundles: new_input_1 ... new_input_n ((<uuid>|<name>|^<index>))
-n, --name Name of the output bundle.
-d, --depth Number of parents to look back from the old output in search of the old input.
-s, --shadow Add the newly created bundles right after the old bundles that are being mimicked.
-i, --dry-run Perform a dry run (just show what will be done without doing it)
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
-W, --wait Wait until run finishes.
-t, --tail Wait until run finishes, displaying stdout/stderr.
-v, --verbose Display verbose output.
kill:
Instruct the appropriate worker to terminate the running bundle(s).
Arguments:
bundle_spec (<uuid>|<name>|^<index>)
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
write:
Instruct the appropriate worker to write a small file into the running bundle(s).
Arguments:
target_spec (<uuid>|<name>)[/<subpath within bundle>]
string Write this string to the target file.
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
Commands for worksheets:
new:
Create a new worksheet.
Arguments:
name Name of worksheet (^[a-zA-Z_][a-zA-Z0-9_\.\-]*$).
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
add:
Append text items, bundles, or subworksheets to a worksheet (possibly on a different instance).
Bundles that do not yet exist on the destination instance will be copied over.
Arguments:
item_type Type of item(s) to add {text, bundle, worksheet}.
item_spec Item specifications, with the format depending on the specified item_type.
text: (<text>|%%<directive>)
bundle: ((<uuid>|<name>|^<index>)|(<alias>|<address>)::(<uuid>|<name>))
worksheet: [<alias>::|<address>::](<uuid>|<name>)
dest_worksheet Worksheet to which to add items ([<alias>::|<address>::](<uuid>|<name>)).
-d, --copy-dependencies If adding bundles, also add dependencies of the bundles.
wadd:
Append all the items of the source worksheet to the destination worksheet.
Bundles that do not yet exist on the destination service will be copied over.
The existing items on the destination worksheet are not affected unless the -r/--replace flag is set.
Arguments:
source_worksheet_spec [<alias>::|<address>::](<uuid>|<name>)
dest_worksheet_spec [<alias>::|<address>::](<uuid>|<name>)
-r, --replace Replace everything on the destination worksheet with the items from the source worksheet, instead of appending (does not delete old bundles, just detaches).
work (w):
Set the current instance/worksheet.
work <worksheet> : Switch to the given worksheet on the current instance.
work <alias>:: : Switch to the home worksheet on instance <alias>.
work <alias>::<worksheet> : Switch to the given worksheet on instance <alias>.
Arguments:
-u, --uuid-only Print only the worksheet uuid.
worksheet_spec [<alias>::|<address>::](<uuid>|<name>)
print (p):
Print the rendered contents of a worksheet.
Arguments:
worksheet_spec [<alias>::|<address>::](<uuid>|<name>)
-r, --raw Print out the raw contents (for editing).
wedit (we):
Edit the contents of a worksheet.
See https://github.com/codalab/codalab-worksheets/wiki/User_Worksheet-Markdown for the markdown syntax.
wedit -n <name> : Change the name of the worksheet.
wedit -T <tag> ... <tag> : Set the tags of the worksheet (e.g., paper).
wedit -o <username> : Set the owner of the worksheet to <username>.
Arguments:
worksheet_spec [<alias>::|<address>::](<uuid>|<name>)
-n, --name Changes the name of the worksheet (^[a-zA-Z_][a-zA-Z0-9_\.\-]*$).
-t, --title Change title of worksheet.
-T, --tags Change tags (must appear after worksheet_spec).
-o, --owner-spec Change owner of worksheet.
--freeze Freeze worksheet to prevent future modification (PERMANENT!).
-f, --file Replace the contents of the current worksheet with this file.
wrm:
Delete a worksheet.
To be safe, you can only delete a worksheet if it has no items and is not frozen.
Arguments:
worksheet_spec [<alias>::|<address>::](<uuid>|<name>)
--force Delete worksheet even if it is non-empty and frozen.
wls (wsearch, ws):
List worksheets on the current instance matching the given keywords.
wls tag=paper : List worksheets tagged as "paper".
wls .mine : List my worksheets.
Arguments:
keywords Keywords to search for.
-a, --address (<alias>|<address>)
-u, --uuid-only Print only uuids.
Commands for groups and permissions:
gls:
Show groups to which you belong.
Arguments:
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
gnew:
Create a new group.
Arguments:
name Name of new group (^[a-zA-Z_][a-zA-Z0-9_\.\-]*$).
grm:
Delete a group.
Arguments:
group_spec Group to delete ((<uuid>|<name>|public)).
ginfo:
Show detailed information for a group.
Arguments:
group_spec Group to show information about ((<uuid>|<name>|public)).
uadd:
Add a user to a group.
Arguments:
user_spec Username to add.
group_spec Group to add user to ((<uuid>|<name>|public)).
-a, --admin Give admin privileges to the user for the group.
urm:
Remove a user from a group.
Arguments:
user_spec Username to remove.
group_spec Group to remove user from ((<uuid>|<name>|public)).
perm:
Set a group's permissions for a bundle.
Arguments:
bundle_spec (<uuid>|<name>|^<index>)
group_spec (<uuid>|<name>|public)
permission_spec ((n)one|(r)ead|(a)ll)
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
wperm:
Set a group's permissions for a worksheet.
Arguments:
worksheet_spec [<alias>::|<address>::](<uuid>|<name>)
group_spec (<uuid>|<name>|public)
permission_spec ((n)one|(r)ead|(a)ll)
chown:
Set the owner of bundles.
Arguments:
user_spec Username to set as the owner.
bundle_spec (<uuid>|<name>|^<index>)
-w, --worksheet-spec Operate on this worksheet ([<alias>::|<address>::](<uuid>|<name>)).
Commands for users:
uinfo:
Show user information.
Arguments:
user_spec Username or id of user to show [default: the authenticated user]
uedit:
Edit user information.
Note that password and email can only be changed through the web interface.
Arguments:
user_spec Username or id of user to update [default: the authenticated user]
--first-name First name
--last-name Last name
--affiliation Affiliation
--url Website URL
-t, --time-quota Total amount of time allowed (e.g., 3, 3m, 3h, 3d)
-d, --disk-quota Total amount of disk allowed (e.g., 3, 3k, 3m, 3g, 3t)
Other commands:
help:
Show usage information for commands.
help : Show brief description for all commands.
help -v : Show full usage information for all commands.
help <command> : Show full usage information for <command>.
Arguments:
command name of command to look up
-v, --verbose Display all options of all commands.
status (st):
Show current client status.
alias:
Manage CodaLab instance aliases.
alias : List all aliases.
alias <name> : Shows which instance <name> is bound to.
alias <name> <instance> : Binds <name> to <instance>.
Arguments:
name Name of the alias (e.g., main).
instance Instance to bind the alias to (e.g., https://codalab.org/bundleservice).
-r, --remove Remove this alias.
server:
Start an instance of the CodaLab bundle service.
Arguments:
--watch Restart the server on code changes.
rest-server:
Start an instance of a CodaLab bundle service with a REST API.
Arguments:
--watch Restart the server on code changes.
-p, --processes Number of processes to use. A production deployment should use more than 1 process to make the best use of multiple CPUs.
-t, --threads Number of threads to use. The server will be able to handle (--processes) x (--threads) requests at the same time.
-d, --debug Run the development server for debugging.
logout:
Logout of the current session.
bs-add-partition:
Add another partition for storage (MultiDiskBundleStore only)
Arguments:
name The name you'd like to give this partition for CodaLab.
path The target location you would like to use for storing bundles. This directory should be underneath a mountpoint for the partition you would like to use. You are responsible for configuring the mountpoint yourself.
bs-rm-partition:
Remove a partition by its number (MultiDiskBundleStore only)
Arguments:
partition The partition you want to remove.
bs-ls-partitions:
List available partitions (MultiDiskBundleStore only)
bs-health-check:
Perform a health check on the bundle store, garbage collecting bad files in the store. Performs a dry run by default, use -f to force removal.
Arguments:
-f, --force Perform all garbage collection and database updates instead of just printing what would happen
-d, --data-hash Compute the digest for every bundle and compare against data_hash for consistency
-r, --repair When used with --force and --data-hash, repairs incorrect data_hash in existing bundles
This section describes bundles in more detail. Recall that a bundle consists of (i) metadata and (ii) contents. The metadata consists of a set of key-value pairs, some of which can be edited by the user, and some automatically generated. The contents is an immutable file or directory, which can store programs, datasets, results, etc.
There are three types of bundles (metadata field bundle_type):
- dataset: any bundle that is uploaded by the user.
- run: any bundle that is created in CodaLab as a result of executing a command.
- make: any bundle that is created in CodaLab by combining parts of multiple bundles.
Each bundle has a uuid, which is unique and immutable. It looks like this:
0xa7da173bc0474344b326b406307dd1c7
Each bundle also has a name (a generally short string consisting of letters, digits, underscores, periods, and dashes), description (which is generally longer and arbitrary), and tags (a list of strings). These fields need not be unique and can be changed:
cl edit <bundle> --name <new name>
cl edit <bundle> --description <new description>
cl edit <bundle> --tags <tag_1> ... <tag_n>
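The help output above gives the pattern that names must conform to. As a minimal sketch, a candidate name can be checked locally before it is passed to cl edit. The helper is_valid_name is hypothetical and not part of the CodaLab CLI; only the pattern itself comes from the help text.

```python
import re

# Pattern copied from the -n/--name entry in the `cl help -v` output.
NAME_PATTERN = r"^[a-zA-Z_][a-zA-Z0-9_\.\-]*$"


def is_valid_name(name: str) -> bool:
    """Return True if `name` matches the documented name pattern.

    Hypothetical helper for illustration; the CLI does its own validation.
    """
    return re.fullmatch(NAME_PATTERN, name) is not None


if __name__ == "__main__":
    for candidate in ["sort-run", "a.txt", "2fast", "my bundle"]:
        print(candidate, is_valid_name(candidate))
```

Names must start with a letter or underscore, so 2fast is rejected, and spaces are never allowed.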
Each bundle has an owner, which can also be changed:
cl chown <user> <bundle>
A bundle also has data_size, which is how much space the (unzipped) bundle contents take up.
Each run bundle has a command, which is a (bash) shell command that is executed to produce the contents of the run bundle. When a run bundle is created, one can specify other options:
--request-docker-image tensorflow/tensorflow:0.8.0-gpu
--request-time 2d
--request-memory 5g
--request-network # by default disabled
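The help text shows request values such as 3, 3m, 3h, 3d for time and 3, 3k, 3m, 3g, 3t for memory and disk. The sketch below shows one plausible way to read these values. It assumes that a bare number means seconds (or bytes) and that size suffixes are powers of 1024; neither assumption is confirmed by the source, and parse_time and parse_size are hypothetical names.

```python
# Assumed multipliers; the CLI's actual parsing rules may differ.
TIME_UNITS = {"": 1, "m": 60, "h": 3600, "d": 86400}
SIZE_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def _parse(value: str, units: dict) -> int:
    """Split a value like '2d' into number and suffix and apply the multiplier."""
    value = value.strip().lower()
    suffix = value[-1] if value and value[-1].isalpha() else ""
    number = value[: len(value) - len(suffix)]
    if suffix not in units or not number.isdigit():
        raise ValueError(f"cannot parse {value!r}")
    return int(number) * units[suffix]


def parse_time(value: str) -> int:
    """Return the requested time in seconds (under the assumptions above)."""
    return _parse(value, TIME_UNITS)


def parse_size(value: str) -> int:
    """Return the requested size in bytes (under the assumptions above)."""
    return _parse(value, SIZE_UNITS)


if __name__ == "__main__":
    print(parse_time("2d"))
    print(parse_size("5g"))
```

Note that the suffix m means minutes for time but megabytes for memory and disk, which is why the two unit tables are kept separate.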
Each run has a state, which evolves through the following values:
- created: initial state
- staged: for run bundles, meaning dependencies are ready
- waiting_for_worker_startup: we launched a worker just for this run, waiting for it
- running: a worker is running the command
- ready/failed: terminal states corresponding to a successful or unsuccessful run
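The state progression can be modelled as a small lookup, which is handy when writing scripts that poll a run. This is a sketch only: the state names come from the list above, while wait_until_done and its poll_state argument are hypothetical (in practice cl wait does this for you).

```python
# Non-terminal states in the order listed above.
PROGRESSION = ["created", "staged", "waiting_for_worker_startup", "running"]
# A run ends in exactly one of these.
TERMINAL = {"ready", "failed"}


def is_terminal(state: str) -> bool:
    """Return True once a run can no longer change state."""
    return state in TERMINAL


def wait_until_done(poll_state, max_polls: int = 100) -> str:
    """Call poll_state() until it reports a terminal state.

    poll_state is a hypothetical callable that returns the current state
    string, e.g. by reading the state field of a bundle.
    """
    for _ in range(max_polls):
        state = poll_state()
        if is_terminal(state):
            return state
    raise TimeoutError("run did not finish within max_polls polls")


if __name__ == "__main__":
    # Simulate a run that walks through every state and succeeds.
    states = iter(PROGRESSION + ["ready"])
    print(wait_until_done(lambda: next(states)))
```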
In addition, the run_status field is filled in by the worker that is running the job to provide more information (downloading dependencies, running the job, uploading results, etc.).
As the run is in progress, several other metadata fields are updated (this is not the full set):
- remote: machine where this bundle is running
- time: time spent on this run so far
- memory: memory spent on this run so far
- exitcode: set when command terminates
- failure_message: if job ended badly, here's the reason
Run and make bundles have dependencies (think parents in a graph). Each dependency has the form <key>:<target>. For example:
lib:lib
train.json:dataset3/train.json
The key is like an alias or local variable that points to the target, which can be any path inside a bundle, or just the bundle itself. For a run bundle, these dependencies are not part of the bundle contents, but only present when the run is executing.
cl run lib:lib train.json:dataset3/train.json <command>
For a make bundle, these dependencies are copied to form the contents of the bundle:
cl make lib:lib train.json:dataset3/train.json
A special case of a make bundle is when there is a single dependency with a null key. Then the contents are simply copied from the target:
cl make dataset3/train.json
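To make the <key>:<target> form concrete, here is a minimal sketch that splits a dependency string into its key, bundle, and subpath, following the target_spec shape [<key>:](<uuid>|<name>)[/<subpath within bundle>] from the help output. The function parse_target_spec is hypothetical and is not the CLI's own parser; it does not handle instance prefixes such as main::, and it leaves an empty key (as in :sort.py) uninterpreted.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    key: Optional[str]      # None when no "<key>:" prefix is given
    bundle: str             # bundle name or uuid
    subpath: Optional[str]  # path inside the bundle, if any


def parse_target_spec(spec: str) -> Target:
    """Split '[<key>:]<bundle>[/<subpath>]' into its three parts."""
    key = None
    rest = spec
    if ":" in spec:
        key, rest = spec.split(":", 1)
    bundle, _, subpath = rest.partition("/")
    return Target(key, bundle, subpath or None)


if __name__ == "__main__":
    for spec in ["lib:lib", "train.json:dataset3/train.json", "dataset3/train.json"]:
        print(spec, "->", parse_target_spec(spec))
```

The last case has no key, which corresponds to the single-dependency make described above.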
Each bundle has a set of children bundles, which are simply those that depend on the bundle.
To show all this verbose information about a bundle:
cl info -v <bundle>
A worksheet consists of (i) metadata and (ii) contents. The metadata includes:
- uuid: globally unique, automatically assigned
- name: unique across the CodaLab server (unlike bundle names)
- title: any textual description
The contents is a list of items, where each item is one of the following:
- text (e.g., Here is some *text*)
- bundle reference ({<bundle_spec>})
- worksheet reference ({{<worksheet_spec>}})
- directive (e.g., % display table s1)
Worksheets are modified by editing the markdown source directly (see worksheet markdown reference) or by running commands that remove/add bundles to the worksheet.
cl wedit <worksheet>
We can display the contents of a worksheet from the CLI as follows:
cl print # Shows all items (bundles, text, and worksheets)
cl ls # Only shows the bundles
Unlike bundles, worksheets are mutable until they are frozen (this cannot be undone!):
cl wedit --freeze
One useful analogy is to think of a worksheet as a directory and the bundles in the worksheet as the files in that directory. But unlike a directory, a worksheet remembers the order of the bundles (which can be changed), which is useful for organization, and other text and directives, which is useful for presenting the worksheet.
Two special symbols denoting worksheets are:
- . refers to the current worksheet
- / refers to your home worksheet (e.g., home-pliang)
We can add items (text, bundle references, or worksheet references) to the current worksheet, which appends to the end:
cl add text "Here's a simple bundle:" .
cl add bundle sort.py .
cl add worksheet home-pliang .
We can create a worksheet as follows (try to use the <username>-<name> convention for naming worksheets):
cl new pliang-scratch
We can switch back and forth between worksheets:
cl work pliang-scratch
cl work /
To remove a worksheet:
cl wrm pliang-scratch
So far, we have referred to bundles by their names. In a large CodaLab system with many users, names are not unique, possibly not even within the same worksheet. A bundle_spec is a string that identifies a bundle, interpreted in the context of the current worksheet.
There are a number of ways to reference bundles:
- UUID (0x3739691aef9f4b07932dc68f7db82de2): this should match at most one bundle. You can use a prefix (an error is thrown if the prefix doesn't resolve uniquely).
- Name (foo): matches the last bundle on the current worksheet with the given name. You can use foo% to match bundles that begin with foo or %foo% to match bundles that contain foo (SQL LIKE syntax). You can use w1/foo to refer to a bundle by name on worksheet w1.
- Ordering (^, ^2, ^3): returns the first, second, and third bundles from the end of the current worksheet.
- Named ordering (foo^, foo^2, foo^3): returns the first, second, and third bundles from the end with the given name.
- You can refer to a range of bundles: ^1-3 resolves to ^1 ^2 ^3.
- In the worksheet interface, if you press 'u', then this will paste the UUID of the current bundle into the command. This is a very convenient way of mixing command-line and graphical interfaces.
In practice, ^ and ^2 are used frequently because future operations tend to depend on the bundles you just created.
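Two of the reference forms above are easy to illustrate in isolation: range expansion (^1-3 resolving to ^1 ^2 ^3) and name matching with the % wildcard. The sketch below is an illustration under assumptions, not CodaLab's resolver. Both function names are hypothetical, and only the % wildcard of SQL LIKE is modelled.

```python
import re


def expand_index_range(spec: str) -> list:
    """Expand '^1-3' into ['^1', '^2', '^3']; return other specs unchanged."""
    match = re.fullmatch(r"\^(\d+)-(\d+)", spec)
    if match is None:
        return [spec]
    start, end = int(match.group(1)), int(match.group(2))
    return [f"^{i}" for i in range(start, end + 1)]


def like_match(pattern: str, name: str) -> bool:
    """Match `name` against a pattern where '%' stands for any run of characters."""
    # Escape the literal pieces, then join them with a regex wildcard.
    pieces = [re.escape(piece) for piece in pattern.split("%")]
    return re.fullmatch(".*".join(pieces), name) is not None


if __name__ == "__main__":
    print(expand_index_range("^1-3"))
    print(like_match("foo%", "foobar"), like_match("%foo%", "a-foo-b"))
```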
Warning: ordering references are not stable. For example, if you run:
cl ls
cl rm ^1
cl rm ^2
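This does not delete the last and second-to-last bundles, but rather the last and third-to-last! After the first rm, the bundle that used to be ^2 becomes ^1, so ^2 now points at what used to be ^3. The intended behavior is: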
cl rm ^1 ^2
Also, if someone else is adding to your worksheet while you're editing it, you might end up referring to the wrong bundle.
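The instability of ordering references can be reproduced with a plain list standing in for the worksheet. This is a simulation for illustration only; resolve and rm are hypothetical stand-ins for how ^<index> and cl rm behave as described above.

```python
def resolve(worksheet: list, index: int) -> str:
    """Return the bundle that ^<index> refers to (^1 is the last bundle)."""
    return worksheet[-index]


def rm(worksheet: list, *indices: int) -> list:
    """Resolve every index against the same worksheet, then remove the targets."""
    targets = [resolve(worksheet, i) for i in indices]
    return [bundle for bundle in worksheet if bundle not in targets]


if __name__ == "__main__":
    worksheet = ["a", "b", "c", "d"]

    # Two separate commands: ^2 is resolved after ^1 has already been removed.
    sequential = rm(rm(worksheet, 1), 2)
    print(sequential)  # ['a', 'c']: d and b were removed, c survived

    # One command: both references are resolved before anything is removed.
    together = rm(worksheet, 1, 2)
    print(together)  # ['a', 'b']: d and c were removed
```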
Mimic and macros are the most advanced features of CodaLab; they leverage the fact that we have the full dependency graph. They allow you to rerun many commands at once with newer versions of code or alternative datasets.
Let us return to our sorting example. Suppose we have run the following commands, which sort a file and extract the sorted output into its own bundle.
cl run :sort.py input:a.txt 'python sort.py < input' -n sort-run
cl make sort-run/stdout -n a-sorted.txt
Now suppose we upload a new file b.txt. Can we easily apply the same steps to it? CodaLab macros allow you to do this, although understanding this concept requires us to take a step back.
In CodaLab, bundles form a directed acyclic graph (DAG), where nodes are bundles and a directed edge from A to B means that B depends on A. Imagine we have created some runs that produce an output bundle O from an input bundle I; I is an ancestor of O in the DAG. Now suppose we have a new input bundle I'. How can we produce the analogous O'? The mimic command does exactly this.
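The idea of replaying a path through the DAG with a substituted input can be sketched in a few lines. This is a toy model under stated assumptions, not CodaLab's implementation: bundles are entries in a dictionary, dependencies map keys to bundle names (subpaths are omitted), and the -mimic suffix for intermediate bundles is invented for the example.

```python
def mimic(bundles: dict, old_input: str, old_output: str,
          new_input: str, new_output_name: str) -> str:
    """Recreate old_output with old_input replaced by new_input.

    bundles maps a bundle name to {"command": ..., "deps": {key: bundle name}}.
    New bundles are added to `bundles`; the name of the new output is returned.
    """
    replaced = {old_input: new_input}

    def rebuild(name: str) -> str:
        if name in replaced:
            return replaced[name]
        bundle = bundles[name]
        new_deps = {key: rebuild(dep) for key, dep in bundle["deps"].items()}
        if new_deps == bundle["deps"]:
            # This bundle does not depend on old_input, so reuse it as is.
            replaced[name] = name
            return name
        new_name = new_output_name if name == old_output else name + "-mimic"
        bundles[new_name] = {"command": bundle["command"], "deps": new_deps}
        replaced[name] = new_name
        return new_name

    return rebuild(old_output)


if __name__ == "__main__":
    bundles = {
        "sort.py": {"command": None, "deps": {}},
        "a.txt": {"command": None, "deps": {}},
        "b.txt": {"command": None, "deps": {}},
        "sort-run": {"command": "python sort.py < input",
                     "deps": {"sort.py": "sort.py", "input": "a.txt"}},
        "a-sorted.txt": {"command": None, "deps": {"": "sort-run"}},
    }
    print(mimic(bundles, "a.txt", "a-sorted.txt", "b.txt", "b-sorted.txt"))
    print(bundles["sort-run-mimic"]["deps"])
```

Bundles that do not depend on the old input (here sort.py) are shared between the old and new paths rather than duplicated.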
First, recall that we have created a.txt (I) and, via sort-run, a-sorted.txt (O). Let us create another file called b.txt:
6
3
8
and upload it:
cl upload b.txt
Now we can apply the same thing to b.txt that we did to a.txt:
cl mimic a.txt a-sorted.txt b.txt -n b-sorted.txt
We can check that b-sorted.txt contains the desired sorted result:
cl cat b-sorted.txt
Normally, in a programming language, we define macros as abstractions. In CodaLab though, notice that we've started instead by creating a concrete example, and then used an analogy to re-apply this. A positive side-effect is that every macro automatically comes with an example of how it is used!
We can make the notion of a macro even more explicit. Let's rename a.txt to sort-in1 and a-sorted.txt to sort-out:
cl edit a.txt -n sort-in1
cl edit a-sorted.txt -n sort-out
Then we can use the following syntactic sugar:
cl macro sort b.txt -n b-sorted.txt
In CodaLab, macros are not defined ahead of time, but are constructed on the fly from the bundle DAG.
CodaLab implements the following permissions model:
- Users belong to groups.
- Each group has access to some bundles and worksheets.
There are three levels of access or permission:
- none: You can't even see that the worksheet exists.
- read: You can read/download, but not edit.
- all: You can do anything (edit/delete/etc.).
Notes:
- There is a designated public group to which all users implicitly belong. If you want to make a worksheet world-readable, give the public group read permission.
- There is a designated root user (codalab) that has all permission to all bundles and worksheets.
- Each user has all permission to all bundles and worksheets that he/she owns.
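The rules above can be combined into a single lookup for a user's effective permission on a bundle or worksheet. This is a sketch, not CodaLab's actual check. In particular, it assumes that when a user belongs to several groups the highest granted level wins, which the text does not state; effective_permission is a hypothetical name.

```python
# Permission levels from weakest to strongest.
LEVELS = {"none": 0, "read": 1, "all": 2}
ROOT_USER = "codalab"


def effective_permission(user: str, owner: str, user_groups: list,
                         group_permissions: dict) -> str:
    """Return 'none', 'read', or 'all' for `user` on one bundle or worksheet.

    group_permissions maps a group name to the level granted to that group.
    """
    # The root user and the owner always have full access.
    if user == ROOT_USER or user == owner:
        return "all"
    # Every user implicitly belongs to the public group.
    groups = set(user_groups) | {"public"}
    # Assumption: the highest level among the user's groups applies.
    best = max(LEVELS[group_permissions.get(group, "none")] for group in groups)
    return next(name for name, level in LEVELS.items() if level == best)


if __name__ == "__main__":
    grants = {"g1": "all", "public": "read"}
    print(effective_permission("u1", "pliang", ["g1"], grants))
    print(effective_permission("u3", "pliang", [], grants))
```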
To grant/revoke permissions:
cl perm <bundle> <group> <(n)one|(r)ead|(a)ll>
cl wperm <worksheet> <group> <(n)one|(r)ead|(a)ll>
For example:
cl perm bundle1 public r # grant read permission
cl perm bundle1 public a # grant all permission
cl perm bundle1 public n # revoke permissions
We can transfer ownership (and therefore permissions) of bundles and worksheets:
cl chown <username> <bundle-1> ... <bundle-n>
cl wedit <worksheet> -o <username>
To make a worksheet w1 writable by everyone in your research group, first create a group g1, add users u1 and u2 to it, and then give the group all access:
cl gnew g1
cl uadd u1 g1
cl uadd u2 g1
cl wperm w1 g1 all
All bundles created on w1 will initially inherit the permissions of that worksheet, but these permissions can be changed independently.
To list the groups that you've created or belong to:
cl gls
To show more information about a given group g1:
cl ginfo g1
The cl search command allows us to find bundles and compute various statistics over them. The search performs a conjunction over keywords.
cl search <keyword-1> ... <keyword-n>
Some initial examples:
cl search mnist # bundles whose name or uuid contains `mnist`
cl search e342f # bundles whose name or uuid contains `e342f`
cl search type=program # program bundles
cl search name=mnist # bundles whose name is exactly `mnist`
cl search state=running # all running bundles
cl search command=%python% # bundles whose command contains `python`
cl search dependency=0xa11% # bundles that depend on the given bundle
cl search worksheet=0xfdd% # bundles that are on the given worksheet
cl search owner=codalab # bundles that are owned by the given user name
cl search =%python% # match any field
You can combine search terms:
cl search type=program owner=codalab # programs owned by user `codalab`
You can change the number and ordering of results:
cl search .offset=50 .limit=100 # bundles 50-149
cl search size=.sort # sort by increasing size
cl search size=.sort- # sort by decreasing size
There are some special commands:
cl search .mine # show bundles that the current user owns
cl search .last # bundles in reverse order of creation
cl search .floating # bundles that aren't on any worksheet
Operations that return a single number rather than a list of bundles:
cl search .count # return total number of bundles in the system
cl search size=.sum # return total number of bytes (nominal)
cl search size=.sum data_hash=0x% # return total number of bytes (actual, where we only count bundles with data)
We can combine these keywords to yield the following handy queries:
cl search .mine .last # bundles that you just created
cl search .mine .floating # bundles that are floating (probably want to delete these periodically)
cl search .mine size=.sort- # what are the biggest bundles I own?
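The search keywords shown so far fall into a few syntactic shapes: plain text, field=value filters, field=.operator forms such as size=.sort-, and dot-prefixed special keywords such as .mine or .limit=10. The sketch below sorts a keyword into one of these shapes. The category labels and the function classify_keyword are hypothetical and do not reflect how the server parses queries.

```python
def classify_keyword(keyword: str) -> tuple:
    """Return (category, field, value) for one search keyword."""
    if keyword.startswith("."):
        # Special keywords: .mine, .floating, .count, .limit=10, .offset=50
        name, _, value = keyword[1:].partition("=")
        return ("special", name, value or None)
    if "=" in keyword:
        field, _, value = keyword.partition("=")
        if value.startswith("."):
            # Operators on a field: size=.sort, size=.sort-, size=.sum
            return ("operator", field, value[1:])
        # Field filters: name=mnist, or =%python% to match any field
        return ("field", field or None, value)
    # Anything else is matched as plain text.
    return ("text", None, keyword)


if __name__ == "__main__":
    for keyword in ["mnist", ".mine", ".limit=10", "size=.sort-", "=%python%"]:
        print(keyword, "->", classify_keyword(keyword))
```

Because a search is a conjunction over its keywords, a query such as .mine size=.sort- is simply the list of its classified parts applied together.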
The search returns a list of bundles. We can use -u to just get the uuids. This can be piped into other commands:
cl search .mine .floating -u | xargs cl rm # delete the floating bundles
cl search mnist -u | xargs -I{} cl add bundle {} . # add mnist to the current worksheet
We can list and search worksheets in a similar fashion:
cl wsearch # all worksheets
cl wsearch .mine # my worksheets
cl wsearch .last .limit=3 # last worksheets created
cl wsearch name=.sort # worksheets sorted by name
cl wsearch bundle=0x3bb% # worksheets containing this bundle
cl wsearch owner=codalab # worksheets owned by `codalab`
cl wsearch =%Hello% # worksheets containing 'Hello'
When you're using the web interface, you are connected to one particular CodaLab instance (e.g., worksheets.codalab.org). If you're using the CLI, you can connect to multiple CodaLab instances and copy information between them.
Suppose you have set up a local instance in addition to the official instance:
http://localhost:2800
https://worksheets.codalab.org/bundleservice
To save typing, you can create an alias for instances:
cl alias # shows all aliases
cl alias localhost http://localhost:2800
cl alias main https://worksheets.codalab.org/bundleservice
At any point in time, your CodaLab session (identified usually by your shell) is pointing to a particular worksheet on a particular instance.
cl work # Show the current instance and worksheet
cl work localhost::
cl work localhost::home-pliang
cl work main::
cl work main::home-pliang
The general form is:
cl work <instance>::<worksheet>
cl work <instance>:: # Defaults to <worksheet>=/
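To make the general form concrete, here is a small sketch that splits a spec into its instance and worksheet parts using plain shell parameter expansion. It only illustrates the syntax described above and is not the CLI's actual parser; split_spec and the `<current>` placeholder are hypothetical names introduced for this example.

```shell
# Illustration only: split "<instance>::<worksheet>" into its two parts.
# This is not how the CodaLab CLI itself parses specs.
split_spec() {
    case "$1" in
        *::*)
            spec_instance=${1%%::*}   # text before the first "::"
            spec_worksheet=${1#*::}   # text after the first "::"
            ;;
        *)
            spec_instance=""          # no instance given
            spec_worksheet=$1
            ;;
    esac
    # An empty worksheet part means the default worksheet "/".
    [ -n "$spec_worksheet" ] || spec_worksheet="/"
    printf '%s %s\n' "${spec_instance:-<current>}" "$spec_worksheet"
}
```

For example, split_spec localhost::home-pliang prints "localhost home-pliang", and split_spec main:: prints "main /".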
Just as in Git, you sometimes want to work locally and, once things are ready, push them to the main server. You can do the same with CodaLab. The difference is that bundles are atomic and immutable, so there is no merging.
Suppose we are on the home worksheet of localhost:
cl work localhost::
To copy a bundle a.txt from localhost to main, do the following:
cl add bundle a.txt main::
If the bundle a.txt (identified by UUID) already exists on main, then
nothing will be copied. Otherwise, the contents of the bundle are copied from
localhost to main. In either case, a reference to the bundle is appended to
the home worksheet on main.
You can also copy a bundle from main to localhost:
cl add bundle main::a.txt .
By default, cl add bundle does not copy the dependencies of a bundle. If you
want to copy the dependencies (for example, to reproduce a run on another
machine), then use cl add bundle -d. The dependencies of the dependencies are
not copied, since only the immediate dependencies are required to execute a run.
In general, the add bundle command is as follows:
cl add bundle [<address>::]<bundle> [<address>::]<worksheet>
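When several bundles need to move to another instance together with their immediate dependencies, the command above can be wrapped in a loop. This is a minimal sketch under stated assumptions: copy_bundles_to is a hypothetical helper name, and the -d flag is placed directly after add bundle, following the usage described above.

```shell
# Hypothetical helper: copy each named bundle, with its immediate
# dependencies, to a destination worksheet (e.g. "main::").
copy_bundles_to() {
    copy_dest=$1
    shift
    for copy_bundle in "$@"; do
        # Stop at the first failure so a partial copy is easy to notice.
        cl add bundle -d "$copy_bundle" "$copy_dest" || {
            echo "failed to copy: $copy_bundle" >&2
            return 1
        }
    done
}
```

For example, copy_bundles_to main:: a.txt b.txt copies both bundles from the current instance to the home worksheet on main.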
To copy all the items from a worksheet (except nested worksheets) to another:
cl wadd [<address>::]<worksheet> [<address>::]<worksheet>
Note that worksheets themselves are not copied, just the items within a worksheet. Any bundles that don't exist on the destination CodaLab instance are copied over.
The following describes some common tips and tricks to make the most out of CodaLab.
Delete the last five bundles (remember this also removes all other instances of these bundles on the current worksheet):
cl rm ^1-5
To kill the last bundle:
cl kill ^
Most CodaLab commands generate one or more bundle UUIDs. These can be piped to further commands. To kill all running bundles (be careful!):
cl search state=running -u | xargs cl kill
To delete all floating bundles, i.e., bundles that do not appear on any worksheet (be careful!):
cl search .floating -u | xargs cl rm
To run a bundle and create another bundle that depends on it:
cl make $(cl run date)/stdout -n stdout
To wait for the last bundle to finish and then print out its output:
cl run 'sleep 10; date'
cl cat $(cl wait ^)/stdout
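The two steps above can be combined so that the run, the wait, and the output display refer to the same bundle by UUID rather than by ^. This is a sketch, assuming that cl run prints the UUID of the new bundle (as the cl make example above relies on) and that cl wait accepts a UUID just as it accepts ^. The name cl_run_and_show is hypothetical.

```shell
# Hypothetical helper: start a run, wait for it to finish, print its stdout.
cl_run_and_show() {
    # Assumption: `cl run` prints the UUID of the bundle it creates.
    run_uuid=$(cl run "$@") || return 1
    # `cl wait` blocks until the bundle finishes and prints its UUID.
    cl cat "$(cl wait "$run_uuid")/stdout"
}
```

For example: cl_run_and_show 'sleep 10; date'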
To find out what happened to the last bundle (e.g., why it failed):
cl info -v ^
To rerun the last bundle (-f args prints out the command that was used to
generate the bundle):
cl info -f args ^ | xargs cl
To put the command of a bundle back on the command-line for editing, define this handy function in bash:
clhist() {
history -s cl $(cl info -f args $1)
}
Dependent bundles are read-only during a run, so to change files or add to a dependent directory, everything must first be copied. Example of compiling a source tree as a run bundle:
cl run :src 'cp -r src src-build && cd src-build && make'
To compare two worksheets:
vimdiff <(cl print -r worksheet1) <(cl print -r worksheet2)
To replace the contents of worksheet2 with those of worksheet1 (be careful when doing this, since all the contents of worksheet2 are removed, although the bundles themselves are not deleted and will be floating):
cl wedit -f /dev/null -w worksheet2
cl wadd worksheet1 worksheet2
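Because the first command clears the target worksheet, it can be worth saving its raw contents first. The sketch below combines cl print -r with the two commands above. replace_worksheet and the .ws.bak suffix are hypothetical choices made for this example.

```shell
# Hypothetical helper: back up the destination worksheet to a local file,
# then replace its contents with the items of the source worksheet.
replace_worksheet() {
    replace_src=$1
    replace_dst=$2
    # Save the raw worksheet text so it can be restored with `cl wedit -f`.
    cl print -r "$replace_dst" > "$replace_dst.ws.bak" || return 1
    # Clear the destination, then copy the source items into it.
    cl wedit -f /dev/null -w "$replace_dst" && cl wadd "$replace_src" "$replace_dst"
}
```

For example, replace_worksheet worksheet1 worksheet2 leaves the old contents of worksheet2 in worksheet2.ws.bak in the current directory.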
To change the metadata of a worksheet (e.g., rename or change the owner):
cl wedit <worksheet> -n <new name>
cl wedit <worksheet> -o <new owner>
To change the metadata of a bundle (e.g., rename or change the description):
cl edit <bundle> -n <new name>
cl edit <bundle> -d <new description>
By default, you will use cl wedit to edit worksheets. However, it is
convenient to just keep a text editor open. Here's one way to do this:
- Save the contents of a worksheet to a local file:
cl print -r codalab > codalab.ws
- Edit codalab.ws.
- Save the worksheet back into CodaLab:
cl wedit codalab -f codalab.ws
It is useful to define editor macros to execute the first and third commands.
For example, in vim, you could define a save and load command by adding the
following two lines to your .vimrc:
map mk :wa<CR>:!cl wedit % -f %<CR>
map mr :wa<CR>:!cl print -r % > %<CR>
The file that you load is in general not identical to the one you save (because references get interpreted and commands get executed), so it's a good idea to load right after you save.
Also, if you add bundles to the worksheet on the CLI, then you should reload the worksheet before you make edits or else you will lose those changes.
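The same save-then-reload cycle can be scripted outside the editor. The following is a minimal sketch that uses only the two commands shown above. ws_load and ws_save are hypothetical names, and the local file is assumed to be named after the worksheet with a .ws suffix.

```shell
# Hypothetical helpers for editing a worksheet in a local file NAME.ws.

# Load: write the raw worksheet contents to NAME.ws.
ws_load() {
    cl print -r "$1" > "$1.ws"
}

# Save: push NAME.ws back to CodaLab, then reload immediately, because the
# saved worksheet is generally not identical to the file that was sent.
ws_save() {
    cl wedit "$1" -f "$1.ws" && ws_load "$1"
}
```

For example, run ws_load codalab, edit codalab.ws, then run ws_save codalab.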
To update to the newest version of CodaLab, cd into codalab-cli and run:
git pull
When you do this, the database schema might have changed, in which case you need to perform a database migration. To be on the safe side, first back up your database. Then run:
./setup.sh client
venv/bin/alembic upgrade head
Additionally, note that if you run your own worker, it will upgrade itself automatically. To avoid having to type in your username and password after a worker upgrades, you can pass in a file containing your credentials using the --password-file flag.
For reference, your CodaLab settings are stored here:
~/.codalab/config.json
The session state and authentication tokens are stored here:
~/.codalab/state.json
By default, the metadata is stored in a SQLite database (you should switch to a real database such as MySQL if you're going to do anything serious):
~/.codalab/bundle.db
If you're running a server, all the bundles are stored here:
~/.codalab/partitions
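Given these locations, backing up the database before a migration can be as simple as copying the SQLite file. The sketch below applies only to the default SQLite setup. backup_codalab_db is a hypothetical helper name, and the timestamped .bak file name is an arbitrary choice. Stop any running server first, since copying a SQLite file while it is being written can produce an inconsistent copy.

```shell
# Hypothetical helper: copy the default SQLite metadata database to a
# timestamped backup file next to it, and print the backup's path.
backup_codalab_db() {
    # Optional first argument overrides the default ~/.codalab directory.
    codalab_home=${1:-"$HOME/.codalab"}
    codalab_db="$codalab_home/bundle.db"
    if [ ! -f "$codalab_db" ]; then
        echo "no SQLite database at $codalab_db" >&2
        return 1
    fi
    codalab_backup="$codalab_db.$(date +%Y%m%d-%H%M%S).bak"
    # -p preserves the file's mode and timestamps.
    cp -p "$codalab_db" "$codalab_backup" && echo "$codalab_backup"
}
```

If you have switched to MySQL or another database, use that database's own backup tooling instead.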