Competition - HallettLab/usda-climvar GitHub Wiki

Notes on Competition data cleaning

Started 2019-04-29 by CTW to help organize the workflow: 1) documenting the existing cleaning scripts and stating their purpose, and 2) reworking the cleaning scripts with the updated seeding key so analyses can move forward

Pertinent datasets:

Specimen seeds and dry weights (for allometric seeding)
Phytometer counts and biomass
Background competitor density counts and biomass
Seeding key
Maybe: Post-germination competition stem counts from Dec 2016 (but stem counts overestimated [i.e. quality issues that can't be resolved with that dataset])

Overview:

We counted all stems (number of individual plants) for phytometers and background competitors, either across the entire competition plot (when sparse) or in a subsample (when dense). Biomass samples were clipped for phytometers and background. At least 20 specimens per competition experiment species were clipped in fall dry and fall wet plots, selected along a size gradient from small to large, and seed count and dry mass were recorded (seeds were not developed enough on ESCA and TRHI to count, so there is no seed data for those species). Using the specimen dataset, fit the relationship between individual biomass and fecundity per species in wet and dry. Using that relationship, estimate fecundity and biomass for background competitors at the competition plot level, and fecundity and biomass for phytometers (phytometer ANPP is sometimes the amount clipped and sometimes more than that, if not all individuals were clipped; we know how many individuals grew at phytometer positions).

Once compiled, questions to address:

What is the pairwise competitive relationship between species? Both in ANPP and fecundity.
Are some species relatively advantaged or disadvantaged in their competitive ability by background density? Both in ANPP and fecundity.
Can functional traits explain any of the above relationships? (e.g. we selected species by their ordination in 2D trait space)

Workflow for competition data cleaning:

Considerations: Pct_green was generally recorded only for AVFA, sometimes VUMY; CTW noted it to monitor phenology informally. Stalks were generally counted only for AVFA (also counted informally by CTW; stalks = main stem + tillers), and sometimes exist for LACA (stalks = main stem + branches). Second samples were generally taken in May, just for forbs, to compare April weights vs May weights. # flowers was recorded for all LACA phyto stems by LMH in April (plus a few TRHI and ESCA if they had flowers, but not many did). # flowers/buds for ANPP plants was noted informally by CTW while weighing (mostly LACA only, and May ESCA where available).

(1) Clean and prep ANPP (phytometers and background competitors):

Notes: Pct green recorded for VUMY and AVFA (CTW clipped, recorded for informal phenology data); LACA flowers/buds, also broken stems (how to treat this?); AVFA stalks (tillers/culms).

There are a good number of late-season samples for ESCA, TRHI, and LACA background competitors. Maybe worth creating a script to sensitivity check those?

Since ANPP is cleaned in the same way for each dataset, can write a for loop that iterates through each dataset and:

Excludes sample2 = 1 (for information only)
Calculates mean individual ANPP (mass/stems)
Produces table of biomass, stems, per individual mass, and keeps cols on disturbance and notes (so in data analysis can decide whether to apply further treatment or exclusion before analyzing)
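
The loop described above could look roughly like the following. This is a minimal sketch in plain Python, not the project's actual script: the column names (`sample2`, `mass_g`, `stems`, `disturbed`, `notes`) and the example values are hypothetical stand-ins for whatever the entered datasets use.

```python
def summarize_anpp(rows):
    """Drop second samples and add mean per-individual mass to each record."""
    cleaned = []
    for row in rows:
        if row["sample2"] == 1:  # second sample, kept for information only
            continue
        stems = row["stems"]
        # Leave per-individual mass empty when no stems were counted
        per_ind = row["mass_g"] / stems if stems else None
        # Keep every original column (including disturbance and notes)
        cleaned.append({**row, "mass_per_ind_g": per_ind})
    return cleaned


# Hypothetical datasets; values are made up for illustration only
datasets = {
    "background": [
        {"plot": 1, "species": "AVFA", "sample2": 0, "mass_g": 12.0,
         "stems": 24, "disturbed": 0, "notes": ""},
        {"plot": 1, "species": "LACA", "sample2": 1, "mass_g": 3.0,
         "stems": 6, "disturbed": 0, "notes": "May resample"},
    ],
    "phytometer": [
        {"plot": 1, "species": "VUMY", "sample2": 0, "mass_g": 0.9,
         "stems": 3, "disturbed": 1, "notes": "browsed"},
    ],
}

cleaned = {}
for name, rows in datasets.items():
    cleaned[name] = summarize_anpp(rows)
```

Carrying the disturbance and notes columns through unchanged leaves the decision about exclusions to the analysis stage, as intended above.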

(2) Clean and prep stem counts (phytometers and background competitors):

Notes: Pct green recorded for VUMY and AVFA (CTW clipped, recorded for informal phenology data); LACA flowers -- for phytometers, dataset denotes whether flowers counted in whole plot or subsample (even if stems only counted in subsample)

Background stem counts: project stems to plot level, keep cols on disturbance and notes
Phytometer stems... not sure what we're doing with that yet? Number of seeds planted per plot per position not meted out nor recorded (initial plan was to plant roughly by weight to get some to germinate and then weed down to one phytometer, but later decided to keep all that grew), so don't think can compare recruitment across or by species.
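
For the background stem projection, a simple area-scaling sketch follows. It assumes the target is the 50x50cm competition plot mentioned under step (4) and that the counted area is known for each subsample; the function name and the 25x25cm subsample in the example are hypothetical.

```python
PLOT_AREA_CM2 = 50 * 50  # competition plot scale (50x50cm); assumed target area


def project_stems(stems_counted, area_counted_cm2, plot_area_cm2=PLOT_AREA_CM2):
    """Scale a stem count from the counted area up to the whole plot."""
    if area_counted_cm2 <= 0:
        raise ValueError("counted area must be positive")
    return stems_counted * plot_area_cm2 / area_counted_cm2


# Dense plot counted in a hypothetical 25x25cm subsample
dense = project_stems(30, 25 * 25)

# Sparse plot counted in full: projection leaves the count unchanged
sparse = project_stems(18, PLOT_AREA_CM2)
```

If a different reference area is used for density, it can be passed as `plot_area_cm2` without changing the function.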

(3) Derive allometric relationships:

Notes: Some specimens had more than 20 individuals collected (e.g. forbs in late-season). Seeds only counted for VUMY, AVFA, LACA and BRHO.

By species by treatment (e.g. fall wet, fall dry), fit curve to seeds ~ biomass
Extract model slope and intercept into data frame for each species and write out data table
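
One way to sketch the two steps above is an ordinary least-squares fit per species and treatment. The straight-line form is an assumption (the notes only say "fit curve"), and the field names and specimen values below are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two specimens to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all biomass values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_allometry(specimens):
    """Fit seeds ~ biomass separately for each (species, treatment) group."""
    groups = defaultdict(list)
    for s in specimens:
        groups[(s["species"], s["treatment"])].append((s["mass_g"], s["seeds"]))
    table = []
    for (species, treatment), pairs in sorted(groups.items()):
        xs, ys = zip(*pairs)
        slope, intercept = fit_line(xs, ys)
        table.append({"species": species, "treatment": treatment,
                      "slope": slope, "intercept": intercept, "n": len(pairs)})
    return table


# Hypothetical specimen records; values are made up for illustration only
specimens = [
    {"species": "AVFA", "treatment": "fall_dry", "mass_g": 1.0, "seeds": 10},
    {"species": "AVFA", "treatment": "fall_dry", "mass_g": 2.0, "seeds": 20},
    {"species": "AVFA", "treatment": "fall_dry", "mass_g": 3.0, "seeds": 30},
    {"species": "AVFA", "treatment": "fall_wet", "mass_g": 1.0, "seeds": 15},
    {"species": "AVFA", "treatment": "fall_wet", "mass_g": 2.0, "seeds": 27},
    {"species": "AVFA", "treatment": "fall_wet", "mass_g": 3.0, "seeds": 39},
]

allometry = fit_allometry(specimens)
```

The resulting list of rows (one per species and treatment) is the table that would be written out for use in step (4).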

(4) Combine all (i.e. project fecundity for phytometers and background competitors):

Goal is a data table that contains plot, treatment info, background competitor, density treatment (low/high), phytometer species, phytometer individual weight, background competitor density at competition plot scale (50x50cm), and a disturbance binary. Maybe include both phytometer disturbance notes and competitor disturbance notes, and keep disturbance to competitor and disturbance to phytometer separate, since the two did not always occur together (e.g. phytometer was browsed but background competitor fine).

Using cleaned background and phytometer biomass, stem counts, and allometric relationships, project fecundity for phytometers and background competitors: biomass (g) per individual x seeds/g x # individuals counted.
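
The projection above can be written as a small function. This is a sketch under stated assumptions: `seeds_per_g` stands for the fitted allometric slope, and the optional `intercept` argument is an addition for the case where the fitted intercept is kept; with the default of 0 it reduces to the product given above. All names and numbers are hypothetical.

```python
def project_fecundity(mass_per_ind_g, n_individuals, seeds_per_g, intercept=0.0):
    """Projected seeds = (intercept + seeds/g * g per individual) * individuals."""
    seeds_per_individual = intercept + seeds_per_g * mass_per_ind_g
    return seeds_per_individual * n_individuals


# 0.5 g per individual, 4 individuals, 10 seeds per g of biomass
simple = project_fecundity(0.5, 4, 10.0)

# Same plants, using a fitted line with slope 12 and intercept 3
with_intercept = project_fecundity(0.5, 4, 12.0, intercept=3.0)
```

Whether to carry the intercept through is a modeling choice; including it applies the intercept once per individual rather than once per plot.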

Questions/ideas from data cleaning:

Number of LACA flowers and buds present was noted during weighing (not intentionally measured; CTW noted it more from curiosity re: phenology). If we use flowers for anything, should # flowers and # buds be summed?
Since flower info is available for LACA phytometers and background, could compare biomass projected fecundity (via allometric relationship) to count flowers/buds projected fecundity (via avg seed count per head? consider range of seeds per LACA head in specimen dataset)

Notes on data cleaning in practice:

CTW ended up cleaning background stems and ANPP in one script, and phytometer stems and ANPP in another. Not all ANPP in one and all stem counts in another.
The written-out combined dataset does not include the type of disturbance to background or phytometer (e.g. munched, rain disturbance, trampling). When analyzing the dataset, it may be good to consider the specific type of disturbance, especially if the plant was browsed.
Future needs: compare background stem density using 900cm^2 to projected density using area covered (only taken for grasses). CTW ran out of time to do this, but using area covered may help with low survivorship values (or one AVFA value that exceeds 100%?).