Using API to create datasets - SPRACE/track-ml GitHub Wiki

Configuration parameters

lib_path:

Path to the track-ml library from Kaggle.

input_dir:

Path to the Kaggle track-ml dataset directory (e.g. /data/trackMLDB/train/train_2/).

output_dir:

Path to the output directory where the filtered datasets are saved.

begin_id:

ID of the first file to be filtered. Check the IDs of the files inside input_dir to find the initial one.

end_id:

ID of the last file to be filtered. The last ID among the files inside input_dir can be used.
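A minimal sketch of how suitable values for begin_id and end_id could be found. It assumes the files follow the Kaggle TrackML naming scheme (e.g. event000001000-hits.csv) and that the IDs are given as plain integers; the helper names are hypothetical and are not part of this API.

```python
import re
from pathlib import Path

# Assumption: file names look like "event000001000-hits.csv" (Kaggle TrackML layout).
EVENT_RE = re.compile(r"^event(\d+)-hits\.csv$")


def event_id_range(file_names):
    """Return (smallest, largest) event ID found in an iterable of file names."""
    ids = [int(m.group(1)) for name in file_names if (m := EVENT_RE.match(name))]
    if not ids:
        raise ValueError("no event files found")
    return min(ids), max(ids)


def event_id_range_in_dir(input_dir):
    """Same as event_id_range, but reads the file names from a directory."""
    return event_id_range(p.name for p in Path(input_dir).iterdir())
```

For example, calling event_id_range_in_dir on input_dir gives a pair that could be used as (begin_id, end_id) to filter every event in the directory.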

n_cores:

Number of files to be processed at the same time in each round. This parameter should not be greater than the number of cores in the host machine.

round_time:

The maximum time (in seconds) of each round. Each round processes n_cores files at the same time.
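To illustrate how n_cores shapes the rounds, the sketch below splits the file IDs into groups of at most n_cores. It assumes that end_id is inclusive and that IDs are consecutive integers; plan_rounds is a hypothetical helper, not a function of this API.

```python
def plan_rounds(begin_id, end_id, n_cores):
    """Split the inclusive ID range into rounds of at most n_cores IDs each."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    ids = list(range(begin_id, end_id + 1))
    # Each slice is one round: its files would be processed at the same time.
    return [ids[i:i + n_cores] for i in range(0, len(ids), n_cores)]


rounds = plan_rounds(1000, 1006, 3)
```

With these example values there would be three rounds, the last one holding a single file. Under this reading, each round would also be limited to round_time seconds; what happens to files that exceed the limit is not described here.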

n_hits_range_min and n_hits_range_max:

Only tracks with a number of hits >= n_hits_range_min and <= n_hits_range_max will pass the filter and be saved in the output_dir. These parameters need to be integers greater than 0.
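The hit-count filter can be pictured as follows. This is only a sketch: it assumes each hit carries the ID of the particle (track) it belongs to, as in the particle_id column of the TrackML truth files, and select_tracks_by_hits is a hypothetical name.

```python
from collections import Counter


def select_tracks_by_hits(particle_ids, n_hits_range_min, n_hits_range_max):
    """Return the IDs of tracks whose hit count lies in the inclusive range."""
    counts = Counter(particle_ids)  # number of hits per track
    return sorted(
        pid for pid, n in counts.items()
        if n_hits_range_min <= n <= n_hits_range_max
    )


# One entry per hit: track 1 has 3 hits, track 2 has 2 hits, track 3 has 5 hits.
hits = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
kept = select_tracks_by_hits(hits, 3, 4)
```

Here only track 1 is kept, because it is the only one with between 3 and 4 hits.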

eta_range_min and eta_range_max:

Only tracks whose last hit has an ETA value between eta_range_min and eta_range_max will pass the filter and be saved in the output_dir. These parameters need to be integers.

phi_range_min and phi_range_max:

Only tracks whose last hit has a PHI value between phi_range_min and phi_range_max will pass the filter and be saved in the output_dir. These parameters need to be integers.
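As a reminder of what ETA and PHI mean for a hit, the sketch below computes the pseudorapidity and the azimuthal angle from a hit position (x, y, z) using the standard definitions. Whether this API computes them in exactly this way, and whether PHI is expected in radians or degrees, is not stated here, so treat both as assumptions.

```python
import math


def eta_phi(x, y, z):
    """Return (eta, phi) of a point; phi is in radians, in (-pi, pi]."""
    r = math.hypot(x, y)  # distance from the beam (z) axis
    if r == 0.0:
        raise ValueError("eta is undefined for a point on the z axis")
    phi = math.atan2(y, x)
    # asinh(z / r) is equivalent to -ln(tan(theta / 2)), theta being the polar angle.
    eta = math.asinh(z / r)
    return eta, phi


eta, phi = eta_phi(1.0, 1.0, 0.0)  # a hit in the transverse plane
```

A hit in the transverse plane (z = 0) has eta equal to 0, and eta grows in magnitude as the hit moves closer to the beam axis direction.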

pt_range_min and pt_range_max:

Only tracks with a pt value between pt_range_min and pt_range_max will pass the filter and be saved in the output_dir. These parameters need to be integers.
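For reference, pt (transverse momentum) is conventionally the momentum component perpendicular to the beam axis. The sketch below assumes momentum components px and py as found in the TrackML particles files, and assumes that the range bounds are inclusive; both function names are hypothetical.

```python
import math


def transverse_momentum(px, py):
    """Return pt = sqrt(px^2 + py^2), in the same units as px and py."""
    return math.hypot(px, py)


def in_range(value, range_min, range_max):
    """Inclusive range check (assumed; the exact bound handling is not documented)."""
    return range_min <= value <= range_max


pt = transverse_momentum(3.0, 4.0)
keep = in_range(pt, 1, 10)  # example bounds standing in for pt_range_min / pt_range_max
```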

silent:

If False, the generator function will not show the partial results on the screen.
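The constraints described above can be checked before launching a run. The sketch below assumes the parameters are held in a plain Python dict; how this API actually receives them is not shown here, and validate_config as well as every value in example_cfg (other than the input_dir example path) are hypothetical placeholders.

```python
import os


def validate_config(cfg, cpu_count=None):
    """Return a list of problems found in cfg (an empty list means none found)."""
    cpu_count = cpu_count or os.cpu_count() or 1
    problems = []
    if cfg["begin_id"] > cfg["end_id"]:
        problems.append("begin_id is greater than end_id")
    if not 1 <= cfg["n_cores"] <= cpu_count:
        problems.append(f"n_cores must be between 1 and {cpu_count}")
    if cfg["round_time"] <= 0:
        problems.append("round_time must be positive")
    if cfg["n_hits_range_min"] < 1:
        problems.append("n_hits_range_min must be greater than 0")
    for name in ("n_hits", "eta", "phi", "pt"):
        if cfg[f"{name}_range_min"] > cfg[f"{name}_range_max"]:
            problems.append(f"{name}_range_min is greater than {name}_range_max")
    return problems


example_cfg = {
    "lib_path": "/path/to/trackml-library",  # placeholder path
    "input_dir": "/data/trackMLDB/train/train_2/",
    "output_dir": "/path/to/output",  # placeholder path
    "begin_id": 1000, "end_id": 1099,  # placeholder IDs
    "n_cores": 4, "round_time": 60,
    "n_hits_range_min": 5, "n_hits_range_max": 20,
    "eta_range_min": -2, "eta_range_max": 2,
    "phi_range_min": -3, "phi_range_max": 3,
    "pt_range_min": 1, "pt_range_max": 10,
    "silent": True,
}
```

Calling validate_config(example_cfg) on a machine with at least 4 cores returns an empty list; swapping a min/max pair or setting n_cores above the core count adds a message to the list.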