JobSplitting Algorithms - dmwm/WMCore GitHub Wiki

Job splitting aims to size each job so that it uses resources efficiently (about 8 hours of running time by default). Several parameters are used to estimate the job length, most importantly TimePerEvent.

The main parameter used for job splitting is "events_per_job" (in the splitting algorithm). It is set in the spec (EventsPerJob) or in the splitting algorithm parameters. If this value is not set, it is calculated from "TimePerEvent" with the formula below. If "events_per_job" is specified, "TimePerEvent" is ignored for job splitting (it is only used for the estimated job time).

events_per_job = int((8.0 * 3600.0) / timePerEvent)
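
The fallback rule can be sketched in Python as follows. This is a minimal illustration, not the WMCore code: the function names `events_per_job_from_time` and `resolve_events_per_job` and the `target_hours` parameter are hypothetical, and the 8-hour target is taken from the formula above.

```python
def events_per_job_from_time(time_per_event, target_hours=8.0):
    """Number of events that fit in a job of roughly target_hours.

    time_per_event is in seconds; target_hours is a hypothetical
    parameter standing in for the 8-hour default in the formula.
    """
    if time_per_event <= 0:
        raise ValueError("time_per_event must be positive")
    return int((target_hours * 3600.0) / time_per_event)


def resolve_events_per_job(events_per_job=None, time_per_event=None):
    """Use an explicit events_per_job if given, else derive it from TimePerEvent."""
    if events_per_job is not None:
        # An explicit value wins; TimePerEvent is ignored for splitting.
        return int(events_per_job)
    return events_per_job_from_time(time_per_event)
```

For example, with 12 seconds per event the derived value is 28800 / 12 = 2400 events per job, while an explicit `events_per_job=500` is returned unchanged.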

EventAwareLumiBased algorithm (code)

  1. It converts events_per_job to lumis_per_job, then creates jobs by iterating through files in the same location.

    • If a job cannot span multiple input files ("halt_job_on_file_boundaries == True"),
    lumisPerJob is calculated separately for each file f.
    If the file contains events, lumisPerJob is calculated with the formula below:
    
    f['avgEvtsPerLumi'] = round(float(f['events'])/f['lumiCount'])
    lumisPerJob  = events_per_job / f['avgEvtsPerLumi']
    
    If the file has 0 events:
    lumisPerJob = f['lumiCount']
    
    • If a job can span multiple input files,
    files are added until the number of events in the job reaches events_per_job (events_per_job is also converted to lumisPerJob in this case).
    [(code)](https://github.com/dmwm/WMCore/blob/1.1.3.pre2/src/python/WMCore/JobSplitting/EventAwareLumiBased.py#L182)
    

    When lumisInJob reaches lumisPerJob, create one job. (code)

  2. In one case a job is created but is made to fail right away.

    If an input file has only one lumi and avgEvtsPerLumi (events in the file / lumis in the file) is larger than max_events_per_lumi (default 20K), the job is failed on creation.
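
The per-file conversion and the fail-on-creation check described above can be sketched like this. It is an illustration under stated assumptions rather than the actual EventAwareLumiBased code: the function names are hypothetical, the file is modelled as a plain dict with the `events` and `lumiCount` keys used in the formulas, and the integer division plus the clamping to at least 1 are assumptions added so the sketch always returns a usable value.

```python
def avg_events_per_lumi(file_info):
    """Average events per lumi, rounded as in the formula above."""
    return round(float(file_info["events"]) / file_info["lumiCount"])


def lumis_per_job_for_file(file_info, events_per_job):
    """Convert events_per_job into a lumi count for one file."""
    if file_info["events"] == 0:
        # A file without events is taken as a whole.
        return file_info["lumiCount"]
    # Assumption: guard against an average that rounds down to 0.
    avg = max(avg_events_per_lumi(file_info), 1)
    # Assumption: integer division, with at least one lumi per job.
    return max(int(events_per_job // avg), 1)


def should_fail_on_creation(file_info, max_events_per_lumi=20000):
    """True for a single-lumi file whose lumi exceeds max_events_per_lumi."""
    return (file_info["lumiCount"] == 1
            and avg_events_per_lumi(file_info) > max_events_per_lumi)
```

With `events_per_job = 2400`, a file of 1000 events in 10 lumis averages 100 events per lumi and so gets 24 lumis per job. A file with 50000 events in a single lumi is flagged by `should_fail_on_creation`, while the same 50000 events spread over 5 lumis is not.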
    

EventBased algorithm (code)

  1. If a file (a fake file for MC) contains more events than events_per_job:

    The job is created on part of the file (using a mask).
    
  2. If a file (a fake file for MC) contains fewer events than events_per_job:

    More files are added until the number of events in the job reaches events_per_job.
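
The two EventBased cases can be sketched as below. This is a simplified illustration, not the WMCore implementation: `split_event_based` is a hypothetical name, files are modelled as `(name, n_events)` tuples, and a mask is represented as a `(name, first_event, n_events)` tuple. How the real algorithm handles the remainder of a large file, or a group of small files that overshoots events_per_job, is not stated above, so the choices made here are assumptions.

```python
def split_event_based(files, events_per_job):
    """Group (name, n_events) files into jobs of about events_per_job events.

    Each job is a list of (name, first_event, n_events) mask entries.
    """
    jobs = []
    pending = []          # mask entries for a job built from small files
    pending_events = 0
    for name, n_events in files:
        if n_events > events_per_job:
            # Case 1: slice the large file into partial-file jobs.
            first = 0
            while first < n_events:
                count = min(events_per_job, n_events - first)
                jobs.append([(name, first, count)])
                first += count
        else:
            # Case 2: keep adding whole files until the job is full.
            pending.append((name, 0, n_events))
            pending_events += n_events
            if pending_events >= events_per_job:
                jobs.append(pending)
                pending, pending_events = [], 0
    if pending:
        # Assumption: leftover small files form one last, shorter job.
        jobs.append(pending)
    return jobs
```

For instance, with `events_per_job = 100`, a 250-event file yields three masked jobs of 100, 100, and 50 events, and small files of 40 and 70 events are combined into a single job.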