# echidna meeting - 2015-04-15
## Session 1: Introduction and current status of echidna
### Brief overview of echidna
Ashley gave a walk-through of the code on GitHub, highlighting the main parts:
- Core data structure
- Creating spectra from [ntuples](https://github.com/snoplusuk/echidna/blob/master/echidna/scripts/dump_spectra_ntuple.py) and writing to hdf5s using the `store` method
- Limit setting:
  - Chi-squared calculations
  - Limit-setting [algorithm](https://github.com/snoplusuk/echidna/blob/master/echidna/limit/limit_setting.py#L201)
  - "Book-keeping" class `LimitConfig`
  - `SystAnalyser` class for offline analysis
  - Example limit-setting scripts
- Documentation
- Unit tests
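For orientation, the sketch below shows a generic binned Poisson-likelihood chi-squared of the kind used in such limit setting. This is the textbook Baker-Cousins form, not necessarily echidna's exact implementation:

```python
import numpy

def poisson_chi_squared(observed, expected):
    """Baker-Cousins Poisson likelihood chi-squared between binned spectra.

    A generic sketch for orientation; echidna's own chi-squared module
    may use different conventions. Assumes expected > 0 in every bin.
    """
    observed = numpy.asarray(observed, dtype=float)
    expected = numpy.asarray(expected, dtype=float)
    chi_squared = 2.0 * numpy.sum(expected - observed)
    filled = observed > 0  # the O*ln(O/E) term vanishes for empty bins
    chi_squared += 2.0 * numpy.sum(
        observed[filled] * numpy.log(observed[filled] / expected[filled]))
    return chi_squared
```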
### Comments/questions
- Jack: does the answer you get depend on the order the backgrounds are included?
  - TODO (Ashley): check if we get the same limit reversing the order B8 and 2nu2B are added
- Jeanne: at the moment you have to edit parameters within the code; seems easy to make a mistake
  - Should move more towards a config file/database for parameters
- Matt: do you ever need to scale up?
  - When you first create a `Spectra` instance, you supply the number of simulated events that spectrum should represent (i.e. if creating from ntuple(s), how many events were simulated by rat in producing those ntuples). Then when scaling, just input the number of simulated events you would like the `Spectra` to represent now and it is scaled accordingly. Currently you can input any number of events, including a number larger than that used to create the `Spectra`. Could add a warning here? (See the sketch after this list.)
- Matt: complained that the documentation section comes before the instructions on how to run echidna in the README
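The scaling behaviour discussed above, as a minimal sketch. The class layout and attribute names here are illustrative, not echidna's actual `Spectra` implementation, and include the suggested warning:

```python
import numpy

class Spectra(object):
    """Illustrative sketch only -- not echidna's actual class."""

    def __init__(self, name, num_decays):
        self._name = name
        self._num_simulated = float(num_decays)  # events simulated by rat
        self._num_decays = float(num_decays)     # events now represented
        self._data = numpy.zeros(100)            # binned spectrum

    def scale(self, num_decays):
        """Rescale the spectrum to represent `num_decays` events."""
        if num_decays > self._num_simulated:
            # The suggested warning: scaling beyond simulated statistics.
            print("Warning: scaling above the %d simulated events"
                  % self._num_simulated)
        self._data *= num_decays / self._num_decays
        self._num_decays = float(num_decays)
```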
### A few recent updates
See #56 for the most relevant updates, made by James and Ashley for the collaboration meeting and IOP.
- TODO (Ashley): check `decay.py`; looks like an older version with g_a still included
### Current status
#### Open pull requests
- #45 - root crashes when using the `--help` option. Ashley was assigned
  - TODO (Ashley): review PR #45
- #51 - non-graphical option for batch-farm running. Evelina was assigned
  - TODO (Evelina): review PR #51
- #56 - changes from the collaboration meeting and IOP. Evelina was assigned
  - TODO (Evelina): review PR #56
### Current goals/milestones
- James: Josh is keen to look at higher light yields --> echidna can do that through smearing
- James: AV position --> have we optimised the position of the FV correctly? --> higher backgrounds from the HD ropes
  - Follows on from work that James S was doing
- Jeanne: if you move the FV, note that the systematics are not given as a function of R
### Solar signal fitting (Stefanie)
- Log-likelihood fit in solar region
- Could use log-likelihood calculation already in place
- More of a fit than limit-setting
- TODO (Stefanie): write a simple fitting script with a few backgrounds
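As a possible starting point for that script, a minimal extended binned log-likelihood for one signal plus floating backgrounds. All names, the two-background model and the optimiser choice here are hypothetical:

```python
import numpy
from scipy.optimize import minimize

def negative_log_likelihood(norms, signal_pdf, background_pdfs, observed):
    """Extended binned Poisson negative log-likelihood.

    `norms` is [n_signal, n_bkg1, ...]; all PDFs are binned, unit-area
    arrays. Drops the constant ln(observed!) term. Assumes the total
    expectation is positive in every bin.
    """
    expected = norms[0] * signal_pdf
    for norm, pdf in zip(norms[1:], background_pdfs):
        expected = expected + norm * pdf
    return numpy.sum(expected - observed * numpy.log(expected))

# Hypothetical usage with toy unit-area spectra b8_pdf, bi210_pdf and
# observed counts `data`:
# result = minimize(negative_log_likelihood, x0=[100.0, 1000.0],
#                   args=(b8_pdf, [bi210_pdf], data), method="Nelder-Mead")
```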
### Timing information
Current problem --> need to come up with a solution.
- James: just applying different weights --> need to separate this out from the ntuple code
- Evelina: will the file size be larger with timing weights applied?
  - Would be slightly larger but not noticeably; not the main motivation for the change
### Energy resolution fitting
- Evelina: lots of loops; want to fit each background separately, with the other backgrounds (Bi, Po210 and 2nu) --> internal lines
- Pile-up can affect the 2nu shape --> could develop one technique for both analyses
- Other energy systematics too, not just smearing
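For reference, the smearing mentioned here (and in the light-yield point under current goals) is typically a Gaussian convolution whose width is set by photon-counting statistics. A sketch under that assumption, with an illustrative light yield, not echidna's actual smearing code:

```python
import numpy

def smear_energies(energies, light_yield=200.0, rng=None):
    """Gaussian-smear an array of true energies (MeV).

    Assumes NHit = light_yield * E with Poisson statistics, giving
    sigma_E = sqrt(E / light_yield). Both the functional form and the
    default light yield (NHit/MeV) are illustrative. A higher light
    yield gives a narrower resolution, as in Josh's suggestion.
    """
    rng = rng or numpy.random.RandomState()
    energies = numpy.asarray(energies, dtype=float)
    sigmas = numpy.sqrt(energies / light_yield)
    return rng.normal(energies, sigmas)
```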
## Session 2: SNO+ sig-ex and other analyses --> where can echidna help?
### SNO+ sig-ex: current thoughts
Jack's report:
- Understanding what people typically do; the task has really been to work out what already exists
- Probably looking more at likelihood
- Just playing around with toy models for the moment
  - e.g. only B8 background with a signal --> 2D likelihood space (see the toy sketch after this list)
- Check Andy's document (docdb-2266), which explains where these likelihood formulae come from
- Andy's code looks at the likelihood space to find the most likely parameters for the PDF --> doesn't look at the parameter space around the minima --> possible improvement
- Josh suggested as a first step to fit in both energy and PSD
- Correlated systematics --> some sort of interface where you could mix two highly correlated parameters
  - e.g. if energy and radius were correlated you could form some new energy-radius parameter
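A self-contained toy of that 2D likelihood space, including a look at the region around the minimum; the shapes, binning and normalisations are all made up:

```python
import numpy

rng = numpy.random.RandomState(42)

# Toy unit-area PDFs: a narrow signal peak on a falling B8-like spectrum.
centres = numpy.linspace(2.0, 4.0, 50)
signal_pdf = numpy.exp(-0.5 * ((centres - 2.5) / 0.1) ** 2)
signal_pdf /= signal_pdf.sum()
b8_pdf = numpy.exp(-centres)
b8_pdf /= b8_pdf.sum()
data = rng.poisson(50.0 * signal_pdf + 1000.0 * b8_pdf)

def nll(n_sig, n_b8):
    expected = n_sig * signal_pdf + n_b8 * b8_pdf
    return numpy.sum(expected - data * numpy.log(expected))

# Scan the 2D likelihood space in the two normalisations.
sig_norms = numpy.linspace(0.0, 200.0, 81)
b8_norms = numpy.linspace(500.0, 1500.0, 81)
grid = numpy.array([[nll(s, b) for b in b8_norms] for s in sig_norms])

# Best fit, plus the Delta(NLL) < 0.5 region -- the "parameter space
# around the minima" noted above as a possible improvement.
i, j = numpy.unravel_index(numpy.argmin(grid), grid.shape)
print("best fit: n_sig=%.1f, n_b8=%.1f" % (sig_norms[i], b8_norms[j]))
one_sigma_region = grid - grid[i, j] < 0.5
```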
### Path ahead for sig-ex and echidna - how can echidna help?
A few things to think about and check in echidna, to make sure it is suitable for potentially including a likelihood fit:
- Jack: how does timing scale as you add in more backgrounds?
  - Roughly exponential; need to investigate further
- Jeanne: parallelisation and optimisation
- James: currently limit-setting tries to do too much --> should compartmentalise more
  - Various different options to try
  - Could experiment looking at other ROIs, sidebands
  - Could also look at what happens if you swap the order of 2nu and B8 smearing in x, y, z
- Matt: side-band fit outside the FV (~4 m)
- Matt: have you looked at low-background, zero-bin effects?
### Goals for echidna
- Jeanne: two main types of goal
  - Thesis goals: Ashley and James' thesis analyses are quite well defined
  - SNO+ goals: how can we use echidna to make competitive analysis software --> how does this fit in with Jack's goals?
- UK perspective --> good to have software that new students can quickly get involved in
- Also want a main SNO+ analysis --> at least two rigorous analysis frameworks to cross-check each other
- Robustness to biases: not sensitive to binning choices, order of parameters etc.
**Next meeting:** mid to end May
## Session 3: Key issues for next echidna release
### Timing model
- Jeanne's suggestion --> remove timing from the histogram and just use an analytic function for the appropriate timing model
  - Multiply by the analytic function each time you return the number of events
- TODO (Evelina): assigned to implement these analytic functions #28
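A minimal sketch of that suggestion; the functional form (simple exponential decay over the livetime) and all names are illustrative, pending the actual models in #28:

```python
import math

def timing_weight(livetime, half_life):
    """Fraction of a background's decays falling within `livetime`
    (same units as `half_life`) -- one possible analytic timing model.
    For half_life >> livetime this tends to livetime * ln(2) / half_life.
    """
    return 1.0 - math.exp(-math.log(2.0) * livetime / half_life)

# Rather than carrying a time axis in the histogram, multiply by the
# analytic function whenever the number of events is returned, e.g.:
# n_events = spectrum_integral * timing_weight(livetime, half_life)
```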
### Analysis framework
- Jeanne: iRODS server
  - TODO (Jeanne): email Francesca/Alex about setting up iRODS for SNO+ #57
### Optimisation
- First goal should be benchmarking with higher dimensionality, to see what we need to aim for
- TODO (James): remove recursive file calling for backgrounds you don't want to float
- TODO (Ashley and James): benchmarking --> do we actually need to optimise?
- Then we can look at different lines of optimisation:
  - Improving the actual algorithms --> more sophisticated than grid search
  - Parallelisation
- Re-visit next time
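On the benchmarking point: a grid search evaluates every combination of floated parameters, so the cost grows as points-per-parameter to the power of the number of floated backgrounds, which is the roughly exponential timing scaling noted in Session 2. A quick illustration (numbers arbitrary):

```python
n_points = 100  # values tried per floated background (arbitrary)
for n_dims in range(1, 6):
    evaluations = n_points ** n_dims
    print("%d floated backgrounds -> %.0e evaluations" % (n_dims, evaluations))
# 1 -> 1e+02 ... 5 -> 1e+10: why algorithms more sophisticated than grid
# search, and/or parallelisation, may be needed.
```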
### Data structure
- List of possible parameters:
  - L^{Cosmo}
  - R --> should bin in (R/R_{AV})^3; also, do we want to store (x, y, z)?
  - Alphaness, alpha PID
  - Directionality
- Change to a paradigm where we dynamically assign the variables we want to store on reading from the ntuple (see the sketch after this list)
  - Provide a config when reading from ntuples
  - Store the config file along with the spectra in the hdf5
  - Don't want to change the DS once it is saved to hdf5
- TODO (DR MATT MOTTRAM): assigned to start looking into this #58
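One possible shape for the config-driven paradigm; the parameter names, binning and `fill` hook are placeholders for discussion, not a design:

```python
import numpy

# Hypothetical config listing the variables to histogram when reading an
# ntuple; could be stored in the hdf5 alongside the data so each spectrum
# is self-describing. Order fixes the axis order of the array.
config = [
    ("energy", {"bins": 100, "low": 0.0, "high": 10.0}),
    ("radius_cubed", {"bins": 50, "low": 0.0, "high": 1.0}),  # (R/R_AV)^3
]

data = numpy.zeros(tuple(par["bins"] for _, par in config))

def fill(event):
    """Increment the bin matching each configured parameter of `event`
    (a mapping from parameter name to value)."""
    indices = []
    for name, par in config:
        width = (par["high"] - par["low"]) / par["bins"]
        indices.append(int((event[name] - par["low"]) / width))
    data[tuple(indices)] += 1.0

# e.g. fill({"energy": 2.5, "radius_cubed": 0.4})
```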
## AOB
- Ice cream!!!
## Action items summary
- TODO (Ashley): check if we get the same limit reversing the order B8 and 2nu2B are added
- TODO (Ashley): check `decay.py`; looks like an older version with g_a still included
- TODO (Ashley): review PR #45
- TODO (Evelina): review PR #51
- TODO (Evelina): review PR #56
- TODO (Stefanie): write a simple fitting script with a few backgrounds
- TODO (Evelina): implement the analytic timing functions #28
- TODO (Jeanne): email Francesca/Alex about setting up iRODS for SNO+ #57
- TODO (James): remove recursive file calling for backgrounds you don't want to float
- TODO (Ashley and James): benchmarking --> do we actually need to optimise?
- TODO (DR MATT MOTTRAM): start looking into the data structure changes #58