Using the API to create datasets
Configuration parameters
lib_path:
Path to the track-ml library from Kaggle.
input_dir:
Path to the Kaggle track-ml datasets directory (e.g. /data/trackMLDB/train/train_2/).
output_dir:
Path to the output directory where the filtered datasets are saved.
begin_id:
ID of the first file to be filtered. Check the files inside input_dir to find the initial ID.
end_id:
ID of the last file to be filtered. The last ID among the files inside input_dir can be used.
n_cores:
Number of files processed at the same time in each round. This value should not be greater than the number of cores in the host machine.
round_time:
The maximum time (in seconds) of each round. Each round processes n_cores files at the same time.
n_hits_range_min and n_hits_range_max:
Only tracks with a number of hits >= n_hits_range_min and <= n_hits_range_max are kept by the filter and saved in output_dir. These parameters need to be integers greater than 0.
eta_range_min and eta_range_max:
Only tracks whose ETA value at the last hit is between eta_range_min and eta_range_max are kept by the filter and saved in output_dir. These parameters need to be integers.
phi_range_min and phi_range_max:
Only tracks whose PHI value at the last hit is between phi_range_min and phi_range_max are kept by the filter and saved in output_dir. These parameters need to be integers.
pt_range_min and pt_range_max:
Only tracks whose pt value is between pt_range_min and pt_range_max are kept by the filter and saved in output_dir. These parameters need to be integers.
silent:
If False, the generator function will not show the partial results on the screen.
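The parameters above can be collected and sanity-checked before a run. The following is only a minimal sketch, not the project's actual API: the `FilterConfig` class, the `validate` and `plan_rounds` helpers, and every value except the example input_dir path are hypothetical. It also assumes that end_id is inclusive and that file IDs are consecutive integers, neither of which is stated on this page.

```python
import os
from dataclasses import dataclass, replace


@dataclass
class FilterConfig:
    """Hypothetical container mirroring the parameters described above."""
    lib_path: str
    input_dir: str
    output_dir: str
    begin_id: int
    end_id: int
    n_cores: int
    round_time: int  # seconds
    n_hits_range_min: int
    n_hits_range_max: int
    eta_range_min: int
    eta_range_max: int
    phi_range_min: int
    phi_range_max: int
    pt_range_min: int
    pt_range_max: int
    silent: bool = False


def validate(cfg):
    """Return a list of problems found in cfg; an empty list means it looks sane."""
    problems = []
    if cfg.begin_id > cfg.end_id:
        problems.append("begin_id must not be greater than end_id")
    if cfg.n_cores < 1 or cfg.n_cores > (os.cpu_count() or 1):
        problems.append("n_cores must be between 1 and the number of host cores")
    if cfg.round_time <= 0:
        problems.append("round_time must be a positive number of seconds")
    if cfg.n_hits_range_min < 1 or cfg.n_hits_range_max < 1:
        problems.append("n_hits range limits must be integers greater than 0")
    for name in ("n_hits", "eta", "phi", "pt"):
        low = getattr(cfg, f"{name}_range_min")
        high = getattr(cfg, f"{name}_range_max")
        if low > high:
            problems.append(f"{name}_range_min must not exceed {name}_range_max")
    return problems


def plan_rounds(cfg):
    """Group file IDs into rounds of at most n_cores files (end_id assumed inclusive)."""
    ids = list(range(cfg.begin_id, cfg.end_id + 1))
    return [ids[i:i + cfg.n_cores] for i in range(0, len(ids), cfg.n_cores)]


# Illustrative values only; the paths other than input_dir and all IDs are made up.
cfg = FilterConfig(
    lib_path="/path/to/track-ml",
    input_dir="/data/trackMLDB/train/train_2/",
    output_dir="/path/to/output",
    begin_id=1000, end_id=1004,
    n_cores=1, round_time=60,
    n_hits_range_min=5, n_hits_range_max=20,
    eta_range_min=-2, eta_range_max=2,
    phi_range_min=-3, phi_range_max=3,
    pt_range_min=1, pt_range_max=10,
)

print(validate(cfg))
print(plan_rounds(replace(cfg, n_cores=2)))
```

Checking the ranges up front avoids spending a full round (up to round_time seconds) on a configuration that can never select any track, such as one where a minimum exceeds its maximum.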