monitor_dbs - biologyguy/RD-MCL GitHub Wiki

Display a dynamic output of the worker queue

Computationally, RD-MCL tends to be much heavier early in a run, and then tapers off as many of the largest clusters are pre-analyzed and stored in the database. With this in mind, it's a good idea to start with a lot of workers and then release them as the run progresses. You can get a feel for how busy the queue is using this program.

Generalized usage

$: monitor_db <args>

If no arguments are passed in, the program will look for the worker databases in the current working directory.

args: All flagged arguments are explained in detail below.

Arguments

-wdb, --workdb ( path )

Specify the directory where worker databases are located

$: monitor_db -wdb "/home/worker_dir/"

Example with expected output

$: monitor_db -wdb '/home/rdmcl_workers'

Press return to terminate.
#Master  AveMhb   #Worker  AveWhb   #queue   #subq   #proc   #subp   #comp   #ProcComp #HashWait #IdWait  ConnectTime
6        23.7     4        14.7     2        24      3       4       12      0         5         5        0.025

Heading	Description
#Master	The number of RD-MCL threads that are talking to the database. Each instance of RD-MCL has one 'master' thread, and every MCMCMC chain adds a thread if it needs to feed a job into the queue.
AveMhb	Average master thread heartbeat. By default, each master thread pings the database once per minute to let everyone know it's still alive. If this number starts to creep up over 60 seconds, a master thread has died somewhere.
#Worker	The number of worker nodes currently monitoring the queue.
AveWhb	Average worker heartbeat. Similar to AveMhb, just for workers.
#queue	Number of new jobs waiting in the queue. These are placed by master threads.
#subq	Number of subjobs waiting in the queue. These are placed by workers, which break up large jobs to spread the load.
#comp	Number of complete jobs waiting for their master threads to pick them up. This number includes complete subjobs, so it can grow larger than the number of master threads currently waiting.
#ProcComp	Number of complete jobs being processed by master threads. This usually happens pretty fast, so expect this value to mostly sit at zero unless something has gone wrong.
#HashWait	Number of unique jobs currently being waited on. Multiple master threads may submit the same job to the queue, but the workers are smart enough to recognize this and only process the job once.
#IdWait	Number of master threads currently waiting around. Related to #HashWait, there may be multiple masters waiting on the same job.
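
The heartbeat check described for AveMhb can be scripted against a captured line of output. The following is a minimal sketch, not part of RD-MCL: the function names `parse_monitor_line` and `stale_heartbeats` are hypothetical, and it assumes the heading and value rows are whitespace-separated exactly as in the example above. Only the 60 second limit for AveMhb comes from the description; pass other column names at your own discretion.

```python
def parse_monitor_line(header: str, values: str) -> dict:
    """Pair each column heading with the numeric value printed beneath it."""
    names = header.split()
    fields = values.split()
    if len(names) != len(fields):
        raise ValueError("heading and value rows have different column counts")
    return {name: float(field) for name, field in zip(names, fields)}


def stale_heartbeats(stats: dict, limit: float = 60.0, columns=("AveMhb",)) -> list:
    """Return the heartbeat columns whose average exceeds `limit` seconds."""
    # Hypothetical helper: 60 s is the documented master heartbeat interval.
    return [name for name in columns if stats.get(name, 0.0) > limit]


# Heading and value rows copied from the example output above.
header = ("#Master  AveMhb   #Worker  AveWhb   #queue   #subq   #proc   #subp   "
          "#comp   #ProcComp #HashWait #IdWait  ConnectTime")
values = ("6        23.7     4        14.7     2        24      3       4       "
          "12      0         5         5        0.025")

stats = parse_monitor_line(header, values)
print(stale_heartbeats(stats))
```

With the example values, AveMhb is 23.7, so nothing is flagged. A non-empty list suggests that a master thread may have died.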