Data processing: remove intermediate diffs - JetBrains-Research/task-tracker-post-processing GitHub Wiki

Description

This module removes intermediate diffs from the files. That is, it deletes all intermediate code snapshots collected while a code fragment is being written.

For example, if we have three consecutive snapshots:

  • ...
  • prin
  • print
  • print(5)
  • …,

we would like to delete the first two snapshots because they are not final states. The final state is a completed line entered by the user.
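
The idea above can be illustrated with a minimal sketch. It is not the module's actual implementation: the function name drop_intermediate_snapshots and the rule it uses (a snapshot counts as intermediate when the next snapshot has the same number of line breaks, meaning the user is still typing within the same lines) are assumptions chosen only to reproduce the example.

```python
from typing import List


def drop_intermediate_snapshots(snapshots: List[str]) -> List[str]:
    """Keep only snapshots that look like final states.

    Hypothetical heuristic (an assumption, not the project's rule):
    a snapshot is intermediate if the snapshot that follows it contains
    the same number of newline characters.
    """
    kept = []
    for index, current in enumerate(snapshots):
        is_last = index == len(snapshots) - 1
        # The last snapshot has nothing after it, so it is always kept.
        if is_last or snapshots[index + 1].count("\n") != current.count("\n"):
            kept.append(current)
    return kept


if __name__ == "__main__":
    # The three snapshots from the example above.
    print(drop_intermediate_snapshots(["prin", "print", "print(5)"]))
    # → ['print(5)']
```

A real implementation would likely need a stricter definition of a "completed line", for instance to handle edits made in the middle of existing code.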

Usage

Use the remove_intermediate_diffs method from intermediate_diffs_removing.py.

Argument                  Description
path                      Path to a directory with files in a single language.
output_directory_prefix   Prefix of the output directory name. The default value is remove_intermediate_diffs.

An example of the root input directory structure before usage:

-root
  --python
   ---task1
    ----user_N1_files
  --cpp
   ---task1
    ----user_N2_files
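
Since path must point to a directory with files in a single language, it would correspond to a directory such as root/python in the layout above. The following sketch shows one way to enumerate the files under such a directory. The helper name iter_task_files is hypothetical, and it assumes that each task directory holds the user files directly or in nested folders; it is not part of the module.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_task_files(language_dir: Path) -> Iterator[Tuple[str, Path]]:
    """Yield (task name, file path) pairs for one language directory.

    Hypothetical helper: assumes the layout language_dir/<task>/<user files>.
    """
    task_dirs = sorted(p for p in language_dir.iterdir() if p.is_dir())
    for task_dir in task_dirs:
        # rglob also finds files stored in nested per-user folders.
        for file_path in sorted(p for p in task_dir.rglob("*") if p.is_file()):
            yield task_dir.name, file_path


if __name__ == "__main__":
    # "root/python" mirrors the example layout; adjust to a real location.
    language_dir = Path("root") / "python"
    if language_dir.is_dir():
        for task, file_path in iter_task_files(language_dir):
            print(task, file_path)
```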