monitor_dbs - biologyguy/RD-MCL GitHub Wiki
Computationally, RD-MCL tends to be much heavier early in a run, and then tapers off as many of the largest clusters are pre-analyzed and stored in the database. With this in mind, it's a good idea to start with a lot of workers and then release them as the run progresses. You can get a feel for how busy the queue is using this program.
$: monitor_db <args>
If no arguments are passed in, the program will look for the worker databases in the current working directory.
args: All flagged arguments are explained in detail below.
Specify the directory where worker databases are located
$: monitor_db -wdb "/home/worker_dir/"
$: monitor_db -wdb '/home/rdmcl_workers'
Press return to terminate.
#Master AveMhb #Worker AveWhb #queue #subq #proc #subp #comp #ProcComp #HashWait #IdWait ConnectTime
6 23.7 4 14.7 2 24 3 4 12 0 5 5 0.025
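If you want to log or post-process these numbers, the heading line and the value line can be paired up by position. The following is a minimal sketch, assuming the two lines have been captured as whitespace-separated text exactly as shown above; `parse_status` is a hypothetical helper and not part of RD-MCL.

```python
def parse_status(header_line, value_line):
    """Pair each column heading with its value.

    Assumes both lines are whitespace-separated and have the same
    number of columns, as in the sample output above.
    """
    headings = header_line.split()
    values = value_line.split()
    if len(headings) != len(values):
        raise ValueError("header and value lines have different column counts")

    parsed = {}
    for heading, value in zip(headings, values):
        # Counts are integers; heartbeats and ConnectTime are decimals
        try:
            parsed[heading] = int(value)
        except ValueError:
            parsed[heading] = float(value)
    return parsed


header = ("#Master AveMhb #Worker AveWhb #queue #subq #proc #subp "
          "#comp #ProcComp #HashWait #IdWait ConnectTime")
values = "6 23.7 4 14.7 2 24 3 4 12 0 5 5 0.025"

status = parse_status(header, values)
print(status["#queue"], status["AveMhb"])
```

Keying the values by heading rather than by position keeps later checks readable, at the cost of assuming the headings stay unique.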
Heading | Description |
---|---|
#Master | The number of RD-MCL threads that are talking to the database. Each instance of RD-MCL will have one 'master' thread, and every MCMCMC chain will add a thread if it needs to feed a job into the queue. |
AveMhb | Average master thread heartbeat. By default, each master thread pings the database once per minute to let everyone know it's still alive. If this number starts to creep up over 60 seconds, you know that a master thread has died somewhere. |
#Worker | The number of Worker nodes currently monitoring the queue |
AveWhb | Average worker heartbeat. Similar to AveMhb, just for workers. |
#queue | Number of new jobs waiting in the queue. These are placed by master threads |
#subq | Number of subjobs waiting in the queue. These are placed by workers, which break up large jobs to spread the load. |
#comp | Number of complete jobs waiting for their master threads to pick them up. This number includes complete subjobs, so it can grow larger than the number of master threads currently waiting. |
#ProcComp | Number of complete jobs being processed by master threads. This usually happens pretty fast, so expect this value to mostly sit at zero unless something has gone wrong. |
#HashWait | Number of unique jobs currently being waited on. Multiple master threads may submit the same job to the queue, but the workers are smart enough to recognize this and only process the job once. |
#IdWait | Number of master threads currently waiting around. Related to #HashWait, there may be multiple masters waiting on the same job. |
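
The rules of thumb in the table can be turned into a quick check. This is only a sketch, assuming the monitor values have already been read into a dict keyed by heading. The `assess` function, its messages, and the idea of treating an empty queue as a hint to release workers are illustrative assumptions; the 60 second limit is taken from the default master heartbeat described above.

```python
def assess(status, heartbeat_limit=60.0):
    """Return a list of plain-text notes about the current queue state.

    `status` is assumed to map column headings (e.g. "AveMhb") to numbers.
    """
    notes = []

    # Masters ping once per minute by default, so a larger average
    # suggests that at least one master thread has stopped responding.
    if status["AveMhb"] > heartbeat_limit:
        notes.append("master heartbeat is high; a master thread may have died")

    # Nothing waiting in either queue: the workers have little to do.
    if status["#queue"] + status["#subq"] == 0:
        notes.append("queue is empty; consider releasing workers")

    return notes


busy = {"AveMhb": 23.7, "#queue": 2, "#subq": 24}
quiet = {"AveMhb": 75.0, "#queue": 0, "#subq": 0}

print(assess(busy))
print(assess(quiet))
```

With the sample values shown earlier the list comes back empty, since the master heartbeat is under the limit and jobs are still queued.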