Recogito Tutorial: Advanced Bulk Mode

If you want to have more control over how the annotations are re-applied, you can use the bulk “expert” mode, which has three options:

  • Apply this annotation if:

    • Require full word match: This is the default in Recogito’s bulk mode. It means that your annotation will be applied only to occurrences of the exact word “Roma”. In machine-readable terms, the string “Roma” has to be preceded and followed by either white space or a punctuation mark. This tells Recogito not to apply the annotation for “Roma” to words such as “Roman”, “Romans” or “Aromatic”.
    • Allow any string match: If, on the contrary, you want Recogito to be less strict in the re-application of the annotation (e.g. you are looking for both the place-name and the related adjective), you can choose this other option. You can check your annotations manually and remove those that are incorrect at any time.
    • Annotation status: When you re-apply an annotation, you may be modifying existing annotations in your text that have either been created by a user and verified, or automatically generated and unverified (we will discuss Automatic Annotations further in another section). By default, Recogito doesn’t care about this distinction, and will apply the annotation to all occurrences of the word in the text. But, if it is relevant to your annotation process, you can limit Recogito to either verified or unverified existing annotations.
  • Apply this change to: If you are creating your annotations manually, or partly manually and partly automatically, there might be some occurrences of the same word that are already annotated and some that are not. By default, Recogito will re-apply the annotation to all occurrences. But in the advanced bulk mode you can decide, according to your annotation purposes, if you want to limit this function to include only:

    • Annotated matches: occurrences of the same word that have already been annotated
    • Unannotated matches: occurrences of the same word that have not been annotated yet.

To make this clear, the advanced options interface of Recogito’s bulk mode will always tell you how many occurrences your choice will affect, and how many of them are annotated or unannotated.
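The matching and filtering options above can be sketched in a few lines of Python. This is only an illustration of the idea, not Recogito’s actual implementation: the function names `find_matches` and `split_by_annotation` are hypothetical, “full word” is approximated as “no word character directly before or after the term”, and case-insensitive matching is assumed only so that the “Aromatic” example from the text is reproduced.

```python
import re

def find_matches(text, term, full_word=True):
    """Return (start, end) offsets of every occurrence of term in text."""
    pattern = re.escape(term)
    if full_word:
        # Assumption: a "full word" has no word character on either side,
        # i.e. it is bounded by white space, punctuation or the text edges.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    # Assumption: matching ignores case (Recogito may behave differently).
    return [m.span() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]

def split_by_annotation(matches, annotated_spans):
    """Partition matches into already annotated and not yet annotated."""
    annotated = [m for m in matches if m in annotated_spans]
    unannotated = [m for m in matches if m not in annotated_spans]
    return annotated, unannotated

text = "Roma was the Roman capital. Aromatic herbs came to Roma."

full = find_matches(text, "Roma")                     # "Require full word match"
loose = find_matches(text, "Roma", full_word=False)   # "Allow any string match"

# Hypothetical state: only the first "Roma" has been annotated so far.
annotated_spans = {(0, 4)}
annotated, unannotated = split_by_annotation(full, annotated_spans)

print(f"{len(full)} matches: {len(annotated)} annotated, "
      f"{len(unannotated)} unannotated")
```

With the full word option, only the two standalone occurrences of “Roma” are found; with the looser option, the fragments inside “Roman” and “Aromatic” are found as well. The final count mirrors the summary that the bulk interface shows before you confirm.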

  • How to merge changes: This feature lets you control how much information is replicated and/or replaced. You have the option to:
    • Append changes: this option will leave the existing annotations as they are, with all their tags, comments and gazetteer matches, but will append any new information you are adding. For example, in a text about the Roman Empire you may have annotated some occurrences of the word “Rome” as a match with ancient Rome in the Pleiades Gazetteer, and other occurrences, where Rome is mentioned as a conference venue, as a match with the modern capital of Italy in the contemporary gazetteer GeoNames. You may now want to add a tag “City” to all your annotations of “Rome” but retain the different gazetteer matches: to do this, you would choose the “append changes” option. All the annotations will keep their original gazetteer match (as well as any existing comments or tags) but will also acquire the new tag “City”.
    • Replace annotations: with this option, all occurrences of the word will replicate exactly the annotation that you are creating, and anything pre-existing will be overwritten. To help you understand what all your annotations will look like, the left bar of the bulk interface provides a summary of the annotation that you are re-applying, including the gazetteer match and any tags. All those elements, and only those elements, will feature in all the occurrences of the same word after you re-apply the annotation. So, say you match the word “Rome” with the ancient city in the Pleiades Gazetteer of the Ancient World and choose the “replace annotations” option: all the occurrences of “Rome” in the text will be matched to Rome in the Pleiades gazetteer, even if some of them had previously been matched to Rome in GeoNames.
    • Mixed: this option will enable you to keep the current tags and comments of the existing annotations, append new comments and tags, but replace the entity matches if they are different. For example, you may have annotated the word “Rome” and matched it to the ancient city in the Pleiades Gazetteer. Then you have proceeded to tag all the occurrences of “Rome” according to their grammatical function as “subject” or “complement”. On reflection, you decide that a match with GeoNames would be more appropriate, but you want to keep all the grammatical tags you have assigned. In this case, the “mixed” feature is the right option for you. You will replace the existing Pleiades match with the new GeoNames match, but will retain all the “subject” and “complement” tags. You can also add new comments or tags. Say, for example, you also added the new tag “place-name”: that too would now be appended to all your annotations of this word. As a result, in some of the annotations for “Rome” you will have a match with GeoNames plus the tags “place-name” and “subject”; in other annotations you will have the match with GeoNames plus the tags “place-name” and “complement”.
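To compare the three merge options side by side, here is a minimal Python sketch. It is a simplified model, not Recogito’s data format or code: the `Annotation` class, the `merge` function and the match labels are hypothetical, and it assumes each annotation carries at most one gazetteer match, which “append” keeps whenever one already exists.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Annotation:
    match: Optional[str] = None                   # gazetteer match (placeholder label)
    tags: List[str] = field(default_factory=list)
    comments: List[str] = field(default_factory=list)

def _union(old, new):
    """Keep the old items in order and add new ones that are not present yet."""
    return old + [item for item in new if item not in old]

def merge(existing, change, mode):
    """Combine an existing annotation with the change being re-applied."""
    if mode == "replace":
        # Everything pre-existing is overwritten by the new annotation.
        return Annotation(change.match, list(change.tags), list(change.comments))
    tags = _union(existing.tags, change.tags)
    comments = _union(existing.comments, change.comments)
    if mode == "append":
        # Assumption: an existing match is kept; the new one only fills a gap.
        match = existing.match if existing.match is not None else change.match
    elif mode == "mixed":
        # Tags and comments are kept, but the match is replaced.
        match = change.match if change.match is not None else existing.match
    else:
        raise ValueError(f"unknown merge mode: {mode}")
    return Annotation(match, tags, comments)

existing = Annotation(match="Pleiades: Roma", tags=["subject"])
change = Annotation(match="GeoNames: Rome", tags=["place-name"])

for mode in ("append", "replace", "mixed"):
    print(mode, merge(existing, change, mode))
```

In this model, “append” keeps the Pleiades match and adds the new tag, “replace” leaves only the GeoNames match and the tag “place-name”, and “mixed” switches the match to GeoNames while keeping both tags, as in the “subject”/“complement” example above.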