How to compile the results?

Main parts of the output

For most analyses, the output should preferably include the following components:

  • General information
    • Name of the analysis
    • The variables and their measurement levels involved in the analysis
    • Preconditions of the analysis, if any of them are violated (see the additional docstring in cogstat.py)
    • CogStat should handle variables with any measurement level. When this is not possible, add a message that the analysis is not available. If the analysis does not make sense for the chosen variable with the given measurement level, that should be mentioned in the Preconditions above (in the GUI version, in the longer term, these settings should disable the OK button in the dialog). If the analysis makes sense but is unavailable in CogStat, add an explicit message that the analysis is not implemented in CogStat yet.
  • Raw data
    • (If several variables are given (e.g., several grouping variables, repeated measures variables), keep the order of those variables in the output)
    • (Display the group levels alphabetically)
    • Number of observed and missing cases. Missing cases are dropped here, and only observed cases are used in the rest of the analysis.
    • Graphically display raw data without any additional information
  • Sample properties
    • (See the variable/value order related viewpoints in the Raw data section)
    • Descriptives numerically
      • Add standardized effect sizes here
    • Graph displaying the descriptives
      • Add raw data: graph with individual data
  • Population properties
    • (See the variable/value order related viewpoints in the Raw data section)
    • Assumptions
      • Interval estimates and hypothesis tests may require assumptions to be met. One possibility is to include those assumption checks here, or they could be included in the specific estimation or test part. If the assumptions are common, it may be more parsimonious to include them only once, in this subsection.
      • When the assumptions are violated, but an alternative solution is not available (either in CogStat or, more generally, in the literature), add a warning message that the inferential statistics may be biased.
      • In fact, all assumptions are properties of the sample/population that may be of interest in themselves. For example, differences in the standard deviations of the groups or the normality of variables can be essential in themselves in some cases. From that viewpoint, it wouldn't make sense to have an assumptions subsection; instead, different properties should be listed, where some properties may be assumptions for other properties' calculation methods. On the other hand, most research/researchers are interested only in the common properties, e.g., the difference of the means. To reflect this latter viewpoint and make the output similar to what can be seen in papers, textbooks, etc., assumptions are handled in a separate subsection of the CogStat output at the moment (but we may change this in later releases).
      • (Assumptions are not relevant for the properties of the sample, since those indexes can be interpreted in a more flexible way.)
    • Point and interval estimations (confidence interval or credible interval, issue #20, issue #28)
      • Display them in a table: point and interval estimations are the columns, and parameters are the rows (see the table sketch after this list).
      • Add standardized effect sizes here
    • Graphs displaying the population estimations
    • Hypothesis test results with checking the appropriate assumptions
      • Hypothesis tests should follow the interval estimations, because interval estimations can most probably be interpreted more easily
      • Sensitivity power analysis: what is the effect size that has appropriate power with the current sample size? (issue #120)
      • Use the following steps (see the hypothesis-test sketch after this list):
        • First, explicitly state what property/situation is tested, forming the null hypothesis possibly in everyday terms
        • Second, specify what test will be used, or what tests could be used, depending on the assumptions. Print the main steps and reasons (variable type, assumptions, etc.) why the specific test was chosen
        • Third, if needed, run the assumption check(s). Be explicit about what belongs to the assumption check (issue #75).
        • Fourth, if an assumption check was applied, summarize the assumptions and print the test name.
        • Fifth, print the test result.
        • Sixth, if a post-hoc test is needed/available, print the post-hoc test name and the results.
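
As an illustration of the estimation table layout, here is a minimal sketch using pandas; the parameter names and all numbers are hypothetical, and pandas is only one possible way to build such a table.

```python
import pandas as pd

# Hypothetical values, only to illustrate the layout:
# parameters as rows, point and interval estimations as columns.
estimations = pd.DataFrame(
    {"Point estimation": [5.2, 1.8],
     "95% CI (low)": [4.6, 1.4],
     "95% CI (high)": [5.8, 2.4]},
    index=["Mean", "Standard deviation"],
)
print(estimations)
```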
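
The six hypothesis-test steps could be printed along the lines of the following sketch for a one-sample design. This is a hedged illustration, not CogStat's actual code: the function name is made up, and scipy's Shapiro–Wilk test, one-sample t-test, and Wilcoxon signed-rank test are used only as example choices.

```python
from scipy import stats

def one_sample_hypothesis_test(data, test_value=0, alpha=0.05):
    """Sketch of the six reporting steps for a single interval variable."""
    # 1. State what is tested, phrasing the null hypothesis in everyday terms.
    print(f"Testing whether the population mean equals {test_value}.")

    # 2. Specify which tests could be used, depending on the assumptions.
    print("If the variable is normally distributed, a one-sample t-test is used; "
          "otherwise, a Wilcoxon signed-rank test.")

    # 3. Run the assumption check, stating explicitly that it is an assumption check.
    normality_p = stats.shapiro(data).pvalue
    print(f"Assumption check - Shapiro-Wilk normality test: p = {normality_p:.3f}")

    # 4. Summarize the assumption check and print the chosen test's name.
    if normality_p >= alpha:
        print("Normality is not rejected, so a one-sample t-test is run.")
        result = stats.ttest_1samp(data, test_value)
        test_name = "One-sample t-test"
    else:
        print("Normality is rejected, so a Wilcoxon signed-rank test is run.")
        result = stats.wilcoxon([value - test_value for value in data])
        test_name = "Wilcoxon signed-rank test"

    # 5. Print the test result.
    print(f"{test_name}: statistic = {result.statistic:.2f}, p = {result.pvalue:.3f}")

    # 6. A post-hoc test is not needed in a one-sample design, so nothing is printed.

one_sample_hypothesis_test([4.8, 5.1, 5.6, 4.9, 5.3, 4.7, 5.4, 5.2], test_value=5)
```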

Numerical results

  • APA format: When relevant, results should be in APA format
  • Tables: When possible, and when it gives a denser presentation, tables should be used
  • Precision
    • Descriptive data and parameter estimations: The precision, defined as the number of decimal places (digits to the right of the decimal point) of the results, depends on the precision of the imported data. Usually, the number of decimal places of the results is the number of decimal places of the source data plus one. This is in line with the recommendation "Do not report statistics to a greater precision than is supported by your data simply because they are printed that way by the program." (Wilkinson, 1999, para. 47), because CogStat does not display precision that is not supported by the data. (See the formatting sketch after this list.)
    • When the nature of the variable is known, specific precision could be used independent of the data.
      • For p values, in line with APA style, if the value is >= 0.001, it is displayed with three decimal places and without a leading zero; otherwise, p < .001 is reported.
      • Test statistics of hypothesis tests and correlations are displayed with two decimal places. TBA: standardized effect sizes in general, reliability indices, etc.
      • In behavioral data diffusion analysis, three decimal places are used for error rates, reaction times, and diffusion parameter values.
  • Table row name sorting: In pivot tables (including behavioral data diffusion analysis), row names should be ordered in a case-insensitive way to be consistent with spreadsheet software packages (see the sorting sketch after this list).
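
A minimal formatting sketch for the precision rules above; the helpers format_estimate and format_p are hypothetical names invented for this example, not CogStat functions.

```python
def format_estimate(value, source_decimals):
    """Show a result with one more decimal place than the source data."""
    return f"{value:.{source_decimals + 1}f}"

def format_p(p):
    """APA-style p value: three decimals without a leading zero, or '< .001'."""
    if p >= 0.001:
        return "= " + f"{p:.3f}".lstrip("0")
    return "< .001"

print(format_estimate(12.3456, source_decimals=1))  # '12.35'
print(format_p(0.0312))   # '= .031'
print(format_p(0.00004))  # '< .001'
```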
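
And a minimal sketch of the case-insensitive row ordering, assuming the pivot table is a pandas DataFrame; the row and column names are made up.

```python
import pandas as pd

# Hypothetical pivot-table-like result with mixed-case row names.
table = pd.DataFrame({"Mean RT (s)": [0.52, 0.61, 0.48]},
                     index=["apple", "Banana", "cherry"])
# A plain sort_index() is case-sensitive and would put 'Banana' first;
# sorting by the lowercased names matches spreadsheet software packages.
table = table.sort_index(key=lambda index: index.str.lower())
print(table)  # rows in the order: apple, Banana, cherry
```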

Choosing effect size indices for sensitivity power analysis

  • We should prefer effect sizes that are in line with our effect size results. If several effect sizes are calculated elsewhere in CogStat, sensitivity analysis may include several effect sizes too.
  • Effect sizes that are used in other popular packages (e.g., in G*Power) could also be added.
  • Note that statsmodels may not always find the effect size.
  • For sensitivity power analyses, Python modules offer various solutions (see the sketch after this list).
  • Missing power analyses: #120
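
As an example, a sensitivity analysis for an independent-samples t-test could rely on statsmodels as in the sketch below; the sample size, alpha, and power values are made up for illustration.

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the effect size (Cohen's d) that reaches 80% power with the
# current sample size (here: two groups of 25) at alpha = .05.
detectable_d = TTestIndPower().solve_power(effect_size=None, nobs1=25, ratio=1.0,
                                           alpha=0.05, power=0.8,
                                           alternative='two-sided')
print(f"Smallest effect size with appropriate power (Cohen's d): {detectable_d:.2f}")
# The numerical solver behind solve_power() may fail for some parameter
# combinations, which is why statsmodels cannot always find the effect size.
```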

Order of the categories/conditions in an analysis

These viewpoints apply to both figures and relevant tables.

  • Group levels should be sorted alphabetically.
  • Levels of repeated measures factors should follow the order of the levels as specified by the user.

Figures

How to create figures?

References

Wilkinson, L. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54(8), 594–604. https://doi.org/10.1037/0003-066X.54.8.594