Justin Bassett edited this page Jul 14, 2021 · 12 revisions

Welcome to the OptSched wiki!

This is a place for helpful tips and tricks for working with OptSched as a project developer.


SSH Configuration

Rather than typing your password every time you ssh onto the machines, use an SSH key to authenticate; it is both more convenient and arguably more secure.

Generate an SSH Key

Use ssh-keygen (see man ssh-keygen for documentation). GitHub's guide is also quite straightforward: https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent

Warning: never share the private key of the pair (stored as ~/.ssh/id_rsa, as opposed to the public key ~/.ssh/id_rsa.pub). Do not store any private key on any machine other than your own, including Athena.

Copy SSH Key to Athena

Use ssh-copy-id:

ssh-copy-id -i ~/.ssh/the_key.pub <username>@athena.ecs.csus.edu

After this, ssh <username>@athena.ecs.csus.edu should connect without prompting you for your password.

SSH "Tunneling"

To access our machines, you have to ssh to Athena and then ssh from there into the desired machine. SSH can do this for you if you set up your configuration (~/.ssh/config) correctly:

Host optimizer2
User <username-on-optimizer2>
ForwardAgent yes
ProxyCommand  ssh -W %h:%p  <ecs-username>@athena.ecs.csus.edu

If you do this, ssh optimizer2 will go directly to optimizer2 via Athena without requiring you to ssh to Athena manually first. Furthermore, because ForwardAgent yes is set, if you repeat the ssh-copy-id command to copy your public key to optimizer2, you will not be prompted for a password at all.

More convenient Athena access

You may find it helpful to set up a similar config entry for Athena itself:

Host athena
User <ecs-username>
ForwardAgent yes
HostName athena.ecs.csus.edu

This allows you to write ssh athena instead of ssh <ecs-username>@athena.ecs.csus.edu.

Compare files side by side

You could open both files and compare them however you want, but there is a tool made specifically for this: vimdiff. vimdiff shows the differences between the two files, and scrolling one file scrolls both simultaneously. To exit, type :qa ("quit all").
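When comparing many result files, it can save time to skip pairs that are byte-for-byte identical. The following is a minimal sketch, not part of OptSched; vimdiff_if_changed is a hypothetical helper name, and it assumes cmp is available on the machine.

```shell
#!/usr/bin/env bash
# vimdiff_if_changed (hypothetical helper): open vimdiff only when the two
# files actually differ; otherwise just report that they are identical.
vimdiff_if_changed() {
  if cmp -s "$1" "$2"; then
    # cmp -s is silent and exits 0 when the files have the same contents.
    echo "identical: $1 $2"
  else
    vimdiff "$1" "$2"
  fi
}

# Usage:
#   vimdiff_if_changed a/0.log b/0.log
```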

Running a long-running command over SSH

Building the benchmark suites takes a long time. There are also several other long-running tasks which we may need to do. For these, the screen command can be useful.

In its simplest form, run screen -q followed by the command to run (e.g. screen -q ninja), or screen -q bash -c '...' for more complex scripts which need redirection or pipes. Ctrl+A followed by D detaches the screen session and leaves it running in the background, so you can then log out of your ssh session. screen -r resumes it.

Building a task queue with screen

More generally, you can have screen work as a task queue and queue up several commands to run one after another. See this SuperUser answer:

startqueue: starts the queuing system.

#!/usr/bin/env bash
screen -d -m -S queue

queue: enqueue a command

#!/usr/bin/env bash
screen -S queue -X stuff "$*^M"

Where the ^M is a single special character (a carriage return). In vim in insert mode, produce it by typing Ctrl+V followed by Ctrl+M. You may want to :set list to show the special character.
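If typing a literal ^M into the script is inconvenient, the carriage return can instead be generated with printf. This is a minimal sketch under the assumption that the screen session is named queue as above; queue_payload and enqueue are hypothetical helper names, not part of the original scripts.

```shell
#!/usr/bin/env bash
# queue_payload (hypothetical helper): join all arguments into one string
# and append a carriage return, the character that ^M stands for.
queue_payload() {
  printf '%s\r' "$*"
}

# enqueue (hypothetical helper): type the command into the "queue" session.
# Command substitution strips trailing newlines only, so the \r survives.
enqueue() {
  screen -S queue -X stuff "$(queue_payload "$@")"
}

# Usage, once startqueue has created the session:
#   enqueue ninja
```

Building the string in a separate function keeps the control character out of the source file, so the script can be edited in any editor.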

viewqueue: Look at the screen.

#!/usr/bin/env bash
screen -S queue -r

A more advanced task queue

With a little work, we can make it possible to view what has started/ended in the queue:

startqueue: starts the queuing system.

#!/usr/bin/env bash
screen -d -m -S queuerun
screen -d -m -S queueview
screen -S queueview -X stuff "while true; do read; done >/dev/null^M"

queue: enqueue a command

#!/usr/bin/env bash
tmstmp=$(date '+%Y-%m-%d %r-%Z')
screen -S queueview -X stuff "Queue: $tmstmp  $*^M"
screen -S queuerun -X stuff "screen -S queueview -X stuff \"Start:   \$(date '+%Y-%m-%d %r-%Z')  $*\n\"^M"
screen -S queuerun -X stuff "$*^M"
screen -S queuerun -X stuff "screen -S queueview -X stuff \"End:   \$(date '+%Y-%m-%d %r-%Z')  $*\n\"^M"

viewqueue: Look at the queue.

#!/usr/bin/env bash
screen -S queueview -r

viewqueueout: Look at the terminal on which everything is being run.

#!/usr/bin/env bash
screen -S queuerun -r

Again, each ^M is a single special character.

Run over corresponding files across mirrored directories

We often have results from several differently configured runs in separate directories. As an example, consider a/ and b/, both containing 0.log and 1.log. How do we run a tool over the corresponding files a/0.log b/0.log and a/1.log b/1.log?

$ paste -d ' \n' <(find a/ -type f) <(find b/ -type f) | xargs -L1 echo
a/0.log b/0.log
a/1.log b/1.log

You could also just use paste -d '\n' and xargs -L2 instead.

paste -d ' \n' takes the output of the two process substitutions (<(...)) and zips the lines together; with just '\n', the lines are interleaved. xargs uses stdin as the command-line arguments of the command that follows it, and -L# tells it to use # lines per command invocation.
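Note that find does not promise any particular output order, so the two listings may not line up on every filesystem. The sketch below sorts both listings before zipping them. pair_files is a hypothetical helper name, and the sketch assumes both directories contain the same set of relative paths and that no filename contains whitespace.

```shell
#!/usr/bin/env bash
# pair_files (hypothetical helper): print one "dir1/file dir2/file" pair per
# line for two mirrored directories. Sorting keeps the pairing stable.
pair_files() {
  paste -d ' ' <(find "$1" -type f | sort) <(find "$2" -type f | sort)
}

# Usage: run a tool once per pair of corresponding files.
#   pair_files a b | xargs -L1 echo
```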

Nanosecond Timing

If the millisecond timing used by OptSched in its logs is not precise enough, you can switch to nanosecond (or microsecond) timing by editing utilities.h to change std::chrono::duration<double, std::milli> to std::chrono::duration<double, std::nano>. Note that the function's return type is still named Milliseconds, even though the values it returns are then in nanoseconds:

inline Milliseconds Utilities::GetProcessorTime() {
  auto currentTime = std::chrono::high_resolution_clock::now();
  std::chrono::duration<double, std::nano> elapsed = currentTime - startTime;
  return elapsed.count();
}

Storage management

Prevent your files from taking up too much space.

How much space is left?

df -h .

What is taking up space in this directory?

du -sh * | sort -h

Delete all of my files from the CPU2006/results directory

Warning: you cannot undo this command.

find CPU2006/results -type f -user <MY_USER> -exec rm {} \;
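Because the deletion cannot be undone, it may help to list the matching files first. This is a minimal sketch; preview_cleanup is a hypothetical helper name, and it uses the same find tests as the command above with -print in place of -exec rm.

```shell
#!/usr/bin/env bash
# preview_cleanup (hypothetical helper): print, but do not delete, every
# regular file under directory $1 that is owned by user $2.
preview_cleanup() {
  find "$1" -type f -user "$2" -print
}

# Usage: inspect the list, then run the real delete command if it looks right.
#   preview_cleanup CPU2006/results "$(id -un)"
```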

Compress these files

Replace *.log with your pattern. You may want to use -name '*.log' -not -name 'somespecific.log' or -maxdepth 1 -name '*.log' or other variants. See man find for more info.

To uncompress, use bunzip2.

On one thread, in the background

tsp find . -type f -name '*.log' -exec bzip2 {} \;

Alternatively:

find . -type f -name '*.log' -exec bzip2 {} \; &

On all threads

find . -type f -name '*.log' -print0 | xargs -0 -L1 -P$(nproc) bzip2
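To reverse the compression for a whole directory tree, the same find pattern can be pointed at the compressed files. This is a minimal sketch, assuming the files were compressed with bzip2 as above and therefore end in .log.bz2; decompress_logs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# decompress_logs (hypothetical helper): run bunzip2 on every *.log.bz2 file
# under the given directory (defaults to the current directory).
decompress_logs() {
  find "${1:-.}" -type f -name '*.log.bz2' -exec bunzip2 {} \;
}

# Usage:
#   decompress_logs CPU2006/results
```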