Corplist XML - czcorpus/kontext GitHub Wiki
Corplist.XML
The corplist.xml file contains the definitions for a specific corpus. It defines some default behaviour as well as keywords and related interfaces.
Spoken corpora
A corpus with both audio and overlapping speech
<corpus
sentence_struct="sp"
speaker_id_attr="sp.oznacenishody"
speech_segment="seg.soundfile"
speech_overlap_attr="sp.prekryv"
speech_overlap_val="ano"
ident="ORAL2013" />
A corpus without audio and without overlapping speech
(...but we still want a speech-based KWIC detail rendering)
<corpus
sentence_struct="sp"
speech_segment="sp."
speaker_id_attr="sp.num"
ident="ORAL2008"
tagset="pp_tagset" />
Please note the dot in "sp." assigned to "speech_segment". It tells KonText that there are speeches defined but there are no attributes defining speech audio.
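To make the "structure.attribute" convention easier to see, here is a minimal Python sketch of how a `speech_segment` value could be read. It is an illustration only, not KonText's actual implementation: the function name `speech_config` and the keys of the returned dictionary are hypothetical, and the only assumption taken from this page is that an empty part after the dot means "no audio attribute".

```python
import xml.etree.ElementTree as ET


def speech_config(corpus_xml: str) -> dict:
    """Summarize the speech-related attributes of a single <corpus> element.

    Hypothetical helper for illustration; not part of KonText.
    """
    elem = ET.fromstring(corpus_xml)
    # "seg.soundfile" -> ("seg", "soundfile"); "sp." -> ("sp", "")
    struct, _, attr = elem.get("speech_segment", "").partition(".")
    return {
        "ident": elem.get("ident"),
        # structure that delimits a speech, or None if nothing is configured
        "speech_struct": struct or None,
        # attribute holding the audio reference, or None if absent
        "audio_attr": attr or None,
        "has_audio": bool(attr),
        # overlap is considered configured when speech_overlap_attr is set
        "has_overlap": elem.get("speech_overlap_attr") is not None,
    }


if __name__ == "__main__":
    oral2008 = (
        '<corpus sentence_struct="sp" speech_segment="sp." '
        'speaker_id_attr="sp.num" ident="ORAL2008" tagset="pp_tagset" />'
    )
    print(speech_config(oral2008))
```

With the ORAL2008 definition above, the part after the dot is empty, so the sketch reports a speech structure ("sp") with no audio attribute; with ORAL2013 it would report "seg" and "soundfile" respectively.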