02 11 2020 Meeting - euronite/ConversationalBrowser GitHub Wiki
02/11/2020 Meeting
Questions
- Next steps now that duration and occurrences visualisation is done
Discussion
Chi-square: split the data into two or more categories. Does one group show the variable of interest x more than the other? Test it. For each category, take (observed count - expected count) squared and divide by the expected count, then sum over all categories to get the chi-square statistic. Calculating the expected counts: category 1 may have 100 observations and category 2 only 50, but category 2 talks more than category 1. The expectation therefore needs to take the duration of the observations into account, i.e. the total time each category has in the overall dataset. We might find, for example, that category 2 covers 80% of the dataset's total time. Use the total interview time per category to get each category's percentage of the overall duration, and expect that percentage of the observations to fall in that category.
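The duration-weighted expectation above can be sketched in Python as follows. This is a minimal illustration, not the project's code: the function name, the category labels and the durations in seconds are all assumptions, with the counts (100 and 50) and the 80% time share taken from the example in the notes.

```python
def duration_weighted_chi_square(observed, durations):
    """Chi-square statistic with expected counts proportional to time share.

    observed:  dict mapping category -> number of observed cue occurrences
    durations: dict mapping category -> total interview time for that category
    """
    total_obs = sum(observed.values())
    total_time = sum(durations.values())
    expected = {}
    chi2 = 0.0
    for category, obs in observed.items():
        # Share of the overall dataset duration that belongs to this category.
        share = durations[category] / total_time
        # Expected count if occurrences were spread evenly over time.
        exp = total_obs * share
        expected[category] = exp
        chi2 += (obs - exp) ** 2 / exp
    return chi2, expected


# Illustrative numbers only: category 2 has 80% of the time but fewer cues.
observed = {"category_1": 100, "category_2": 50}
durations = {"category_1": 600.0, "category_2": 2400.0}  # seconds (assumed)

chi2, expected = duration_weighted_chi_square(observed, durations)
dof = len(observed) - 1   # degrees of freedom = number of categories - 1
CRITICAL_0_05_DOF_1 = 3.841  # standard chi-square table value, alpha = 0.05

print(f"expected counts: {expected}")
print(f"chi-square = {chi2:.2f}, dof = {dof}")
print("reject null hypothesis" if chi2 > CRITICAL_0_05_DOF_1 else "cannot reject")
```

With these made-up numbers the expected counts are 30 and 120, so the statistic is far above the critical value and the two categories would be judged to differ.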
T-test: divide the corpus into two parts at the median of the trait, then test whether there is a difference between the two samples. The null hypothesis is that there is no difference; the aim is to see whether it can be rejected. For the t statistic, get the mean and the sample variance of each side. Run the test once per coefficient: take the mean of the first coefficient on each side, do the t-test, then repeat for each remaining coefficient. Find the critical value to decide whether to accept or reject the null hypothesis.
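A minimal Python sketch of the median split and t statistic for a single coefficient is given below. The notes do not say which t-test variant to use, so this assumes Welch's unequal-variance form, which uses the mean and sample variance of each side. The function name and the data are hypothetical, and speakers whose trait equals the median are placed in the lower group as an arbitrary choice.

```python
import math
import statistics


def median_split_t(trait, values):
    """Split `values` at the median of `trait` and compute Welch's t.

    trait:  one trait score per speaker
    values: one coefficient value per speaker, in the same order
    Returns (t statistic, Welch-Satterthwaite degrees of freedom).
    """
    cut = statistics.median(trait)
    # Assumption: scores equal to the median go into the lower group.
    low = [v for t, v in zip(trait, values) if t <= cut]
    high = [v for t, v in zip(trait, values) if t > cut]

    mean_low, mean_high = statistics.mean(low), statistics.mean(high)
    # statistics.variance is the sample variance (divides by n - 1).
    se_low = statistics.variance(low) / len(low)
    se_high = statistics.variance(high) / len(high)

    t_stat = (mean_high - mean_low) / math.sqrt(se_low + se_high)
    df = (se_low + se_high) ** 2 / (
        se_low ** 2 / (len(low) - 1) + se_high ** 2 / (len(high) - 1)
    )
    return t_stat, df


# Hypothetical data for one coefficient; repeat the call per coefficient.
trait = [1, 2, 3, 4, 5, 6]
coefficient = [2.0, 3.0, 4.0, 6.0, 7.0, 8.0]

t_stat, df = median_split_t(trait, coefficient)
CRITICAL_0_05_DF_4 = 2.776  # two-sided t table value, alpha = 0.05, df = 4

print(f"t = {t_stat:.3f}, df = {df:.1f}")
print("reject null hypothesis" if abs(t_stat) > CRITICAL_0_05_DF_4 else "cannot reject")
```

The critical value is hard-coded for this example's degrees of freedom; for real data it would be looked up for the computed `df` and the chosen significance level.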
Think about the UI for next week. For the histograms, plot the duration of every individual occurrence of the cue, not the total duration.