WeSearch_ChartPruning - delph-in/docs GitHub Wiki

Chart Pruning Investigation

We tried a range of chart pruning options over both the cb and sc01 profiles, and compared performance and coverage against two control cases:

none180: no chart pruning, timeout=180 (configuration used to produce gold profiles)
none60: no chart pruning, more realistic timeout of 60 seconds

Along with the standard coverage and average parse times, the statistics were are interested in are:

how often the top parse is lost due to pruning; and
how often the gold parse is lost due to pruning

All the chart pruning parse runs used the CPU, with the 1111 version of the ERG

  (pvm:make-cpu
          :host (short-site-name)
          :spawn (logon-file "bin" "cheap" :string)
              :options (list "-tsdb" "-packing"
                "-repp" "-tagger" "-cm" "-default-les=all"
                "-memlimit=1024" "-timeout=60"
                "-cp=${cpopt}"
                (registry:lookup :erg "~a~a~a" :ln :rt :cp))
           :class :tempcpu :grammar (registry:lookup :erg "ERG (~a)" :vn) 
           :name "pet-cp"
           :task '(:parse)  :flags '(:generics t)
           :wait 300 :quantum 180)

Results are available on Google Docs here. One subtlety in these numbers is the treatment of items which had results, but also had a "timed out" error. Since we are interested in getting the exact right result, these items are treated as if they produced zero readings when calculating coverage etc.