Meeting 2014 10 31 - NCEAS/commdyn GitHub Wiki
Commdyn Weekly Meeting
Date: 31 October 2014 (Halloween!)
Participants: Matt, Lauren W., Peter, Syd, Chris, Lauren H.
Agenda and notes:
- Package codyn development issues
  - TODO: Matt to add build and run instructions to the README
  - Andrew and Lauren discussing the Taylor metrics
  - Testing and documentation still needed
  - Matt: remove dependency on reshape
- Review the recordr design
  - https://github.com/DataONEorg/sem-prov-design/blob/master/docs/PROV-capture/Run-manager-API.rst
  - Peter is going through a demo
  - Questions/comments
    - Matt: can we have a 'localId' for referring to numbered runs?
    - Lauren W.: what determines the order in which listRuns() lists the runs? They seem to be out of chronological order.
      - Peter: the listing will be ordered chronologically, but that isn't working for this demo.
      - OK, thanks.
    - Also, I would think the Start/End time would be listed right after the Script name rather than Published Time (maybe just a personal preference).
      - Peter: yes, the listing should be useful to you, so we can change the order/content as necessary.
    - So listRuns() could take an "orderBy" parameter?
      - Yes.
      - OK, cool.
    - Matt: can we 'tag' the runs so users can differentiate them? How else do they differentiate them?
      - Maybe name it when you call record(), and after listRuns() as well.
      - What would the tag be based on?
      - The tag could be entered with record(rc, scriptName, "tag text").
      - Lauren W.: anything the user wants, right?
      - Chris: yeah, the Run should probably support arbitrary comments to jog the user's memory of what the run entailed.
    - Lauren H.: can run recording span multiple script executions, especially for runs that take a lot of time?
      - Peter: yes
      - Matt: could call startRecord(), do a bunch of stuff, then endRecord()
      - Matt: alternatively, record("someLongExpensiveRun.R"), then record("analysis1.R") and record("analysis2.R")
    - Chris: the run cache could get large. Maybe need to purge the cache.
      - add API method for deleteRun(runid)
      - add API for deleteRuns(runFilter), where runFilter might select runs older than a certain date
      - add API method for
    - Syd: how many of the runs should be recorded?
      - Peter: up to the person using the package
    - Syd: how can the person running the analysis indicate which objects are important to keep, especially for people who don't know the package well?
      - Chris: ability to edit a package to prune objects; add API method for pruning objects from the resulting data package
    - Syd: it is intimidating that every movement has to be deliberate; it helps if you can run that run again, and delete runs
    - Matt: need to provide more context in the 'View' output, including better section headers, lists of all tracked products, and parameters used in the run
    - Syd: can you 'pause' the recording, even if it loses some provenance relationships?
    - Syd: ability to add a comment when starting a run or after a run is done
      - similar to the Sumatra tool's reason flag (smt run --reason="Testing grid size of .05")
      - similar to the 'tagging' approach
    - Matt: Is this useful? Would you use it in your work? Would you publish your analyses with this?
      - Lauren H.: would probably wait until the end and record everything at once, then share the process, rather than do any recording while building the analysis.
        - coding during the exploratory phase would probably not use the run management
        - when working on a paper, there is a flow: work on the first hypothesis until it's complete; that is then a segment of code; iterate internally on that code segment; push to GitHub when that segment is done; then move on to hypothesis 2, possibly using data artifacts that came out of the first analysis segment
      - Syd: similar to Lauren H.; if she knew she wouldn't be working with the data for a while, she would record it so that details weren't lost later; it is hard to remember details over time; the main reason to do it would be to share it with others eventually
    - Peter: will incorporate into design docs
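
A minimal Python sketch of the listing, tagging, and purging ideas raised in the questions/comments, assuming a simple in-memory store. `RunStore`, `Run`, and `older_than` are hypothetical names used only for illustration; this is not the recordr API (recordr targets R scripts, and its design lives in the linked Run-manager-API document). The methods loosely mirror the calls mentioned in the notes: record() with a tag, listRuns() with an "orderBy" parameter, deleteRun(runid), and deleteRuns(runFilter).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, List


@dataclass
class Run:
    run_id: str            # stands in for the 'localId' idea from the notes
    script_name: str
    start_time: datetime
    end_time: datetime
    tag: str = ""          # free-text label chosen by the user


class RunStore:
    """Hypothetical in-memory run cache; not the recordr implementation."""

    def __init__(self) -> None:
        self._runs: Dict[str, Run] = {}
        self._next_id = 0

    def record(self, script_name: str, start_time: datetime,
               end_time: datetime, tag: str = "") -> Run:
        # Assign a simple sequential id so runs can be referred to by number.
        self._next_id += 1
        run = Run(str(self._next_id), script_name, start_time, end_time, tag)
        self._runs[run.run_id] = run
        return run

    def list_runs(self, order_by: str = "start_time",
                  descending: bool = False) -> List[Run]:
        # order_by names any Run field, e.g. "start_time" or "script_name".
        return sorted(self._runs.values(),
                      key=lambda r: getattr(r, order_by),
                      reverse=descending)

    def delete_run(self, run_id: str) -> bool:
        # Returns True if a run with that id existed and was removed.
        return self._runs.pop(run_id, None) is not None

    def delete_runs(self, run_filter: Callable[[Run], bool]) -> int:
        # Remove every run the filter selects; return how many were removed.
        doomed = [r.run_id for r in self._runs.values() if run_filter(r)]
        for run_id in doomed:
            del self._runs[run_id]
        return len(doomed)


def older_than(cutoff: datetime) -> Callable[[Run], bool]:
    """Build a filter selecting runs that ended before the cutoff."""
    return lambda run: run.end_time < cutoff


# Usage: record three runs out of chronological order, list, then purge.
store = RunStore()
t0 = datetime(2014, 10, 1, 9, 0)
store.record("analysis2.R", t0 + timedelta(days=20),
             t0 + timedelta(days=20, hours=1), tag="final figures")
store.record("someLongExpensiveRun.R", t0, t0 + timedelta(hours=6),
             tag="baseline")
store.record("analysis1.R", t0 + timedelta(days=10),
             t0 + timedelta(days=10, hours=1))

ordered = [r.script_name for r in store.list_runs(order_by="start_time")]
removed = store.delete_runs(older_than(datetime(2014, 10, 15)))
remaining = [r.script_name for r in store.list_runs()]
print(ordered, removed, remaining)
```

Passing the filter as a function keeps deleteRuns-style purging open-ended: an age cutoff is one filter, and a tag match or script name could be another without changing the store.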