The following conclusions come from Anika's manual check of the general sentences:
we don't pick up goal and method descriptions, e.g. "to determine sth, we
calculated ..."
goals expressed with "to study", "To study the ... ", "To investigate
... ", "To determine", "We were interested in ... ", "The presented
study concerned/examined ... "
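
A minimal sketch of how the goal cues above could be matched, assuming a plain regular expression over the raw sentence is good enough as a first pass. The names `GOAL_CUES` and `has_goal_cue` are hypothetical and not part of the existing pipeline.

```python
import re

# Cue phrases copied from the notes above; the regex itself is an assumption.
GOAL_CUES = re.compile(
    r"\b(?:to (?:study|investigate|determine)\b"
    r"|we were interested in\b"
    r"|the presented study (?:concerned|examined)\b)",
    re.IGNORECASE,
)


def has_goal_cue(sentence: str) -> bool:
    """Return True if the sentence contains one of the listed goal cues."""
    return GOAL_CUES.search(sentence) is not None
```

Note that "to study" also fires inside phrases such as "related to study design", so a real rule would probably need POS or position information on top of this.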
noun phrases ending in "test" or "score" or "performance" or
"examination" --> should be named entities
findings indicated with "were observed", "significantly
decreased/increased/different", "differed significantly", "this
indicates", "indicating", "correlated with", "(patients) showed",
"patients experienced", "was/were seen" in conjunction with brain
regions, "patients/subjects (...) reported", "was/were, as expected, ...
", "was/were abnormally", "correlation between", "were greater in ...
than in ... ", "our results show", "compared with", "significant (...)
correlations ...", "our main prediction", "was associated with",
"clusters were identified in", "these findings suggest", "was related to"
some sentences seem to have a heading merged into them, which can then
cause problems with parsing and probably pushes them over the threshold
of 500 characters
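
A rough sketch of a pre-parsing check for this problem. The 500-character threshold is taken from the note above; the heading list and the rule "heading word followed by a capitalised word" are assumptions for illustration only, and the function names are hypothetical.

```python
import re

MAX_SENTENCE_CHARS = 500  # threshold mentioned in the notes

# Assumption: a hypothetical list of headings that may get glued to a sentence.
HEADINGS = ("Introduction", "Materials and Methods", "Methods", "Results",
            "Discussion", "Conclusions")

# A heading counts as "merged" if it is directly followed by a capitalised word.
_HEADING_PREFIX = re.compile(
    r"^(?:" + "|".join(re.escape(h) for h in HEADINGS) + r")\s+(?=[A-Z])"
)


def split_heading(sentence: str) -> tuple[str, str]:
    """Return (heading, rest); heading is '' when no merged heading is found."""
    match = _HEADING_PREFIX.match(sentence)
    if match is None:
        return "", sentence
    return match.group(0).strip(), sentence[match.end():]


def exceeds_threshold(sentence: str) -> bool:
    """True if the sentence is longer than the 500-character threshold."""
    return len(sentence) > MAX_SENTENCE_CHARS
```

The capital-letter lookahead keeps ordinary sentences such as "Results showed a decrease." intact, but it will still miss headings that are not in the list.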
we seem to miss a lot of named entities that are brain regions
methods information for patients/study subjects/cohorts gets missed
"patients with", "patients were identified with", "??? (n= *)", "control
subjects", "were presented", "were required",
concatenated named entities e.g. tests, "the task was ...", "we assume
that", "that requires patients", "were classified as ... (according to)
.... (criteria)"
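
A small sketch of how the cohort cues could be picked up, assuming that the "*" in "(n= *)" stands for an integer and that simple substring matching is enough for the fixed phrases. `COHORT_CUES`, `SIZE` and `cohort_info` are hypothetical names.

```python
import re

# Fixed phrases taken from the notes above.
COHORT_CUES = (
    "patients with",
    "patients were identified with",
    "control subjects",
    "were presented",
    "were required",
    "were classified as",
)

# Assumption: group sizes are written as "(n = 16)" or "(n=16)".
SIZE = re.compile(r"\(\s*n\s*=\s*(\d+)\s*\)", re.IGNORECASE)


def cohort_info(sentence: str) -> dict:
    """Collect the cohort cues and group sizes found in one sentence."""
    lowered = sentence.lower()
    return {
        "cues": [cue for cue in COHORT_CUES if cue in lowered],
        "sizes": [int(n) for n in SIZE.findall(sentence)],
    }
```

For example, "Patients with aphasia (n = 16) and control subjects (n=20) were presented with words." yields the cues "patients with", "control subjects" and "were presented" together with the sizes 16 and 20.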
the Stanford parser seems to have a problem assigning the "CD" tag
when numbers are written as words: in "16 individuals", "16" can be
recognised as a cardinal number, but in "sixteen individuals", "sixteen"
is annotated with "NN" instead of "CD"
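
One possible workaround is a post-processing pass over the tagger output. The sketch below does not call the Stanford parser; it assumes the output is already available as (token, tag) pairs and only retags tokens that were labelled "NN". The word list and the function name are assumptions.

```python
# Assumption: a hand-written list of English number words.
NUMBER_WORDS = {
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen", "twenty", "thirty",
    "forty", "fifty", "sixty", "seventy", "eighty", "ninety", "hundred",
    "thousand",
}


def fix_cardinal_tags(tagged):
    """Retag number words that were labelled "NN" as "CD".

    `tagged` is a list of (token, tag) pairs; a new list is returned.
    """
    fixed = []
    for word, tag in tagged:
        # Hyphenated forms such as "twenty-one" are checked part by part.
        parts = word.lower().split("-")
        if tag == "NN" and all(part in NUMBER_WORDS for part in parts):
            tag = "CD"
        fixed.append((word, tag))
    return fixed
```

"one" is ambiguous (it can be a pronoun, as in "the larger one"), so it may be safer to drop it from the list or to check the following token before retagging.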