Job execution using manual steps - PyProphet/pyprophet-cli GitHub Wiki

Alternatively, if pyprophet-brutus-driver is not available or for integration with other workflow managers, it is also possible to execute all steps independently. In the following example, 3 example runs are used:

1. Prepare data

pyprophet-cli prepare --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder=/tmp/pyprophet_work/ --separator="tab" --extra-group-column="ProteinName"

2. Subsample

pyprophet-cli subsample --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --separator="tab" --job-number 1 --job-count 3 --sample-factor=0.4 &
pyprophet-cli subsample --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --separator="tab" --job-number 2 --job-count 3 --sample-factor=0.4 &
pyprophet-cli subsample --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --separator="tab" --job-number 3 --job-count 3 --sample-factor=0.4 &

3. Semi-supervised learning

pyprophet-cli learn --work-folder="/tmp/pyprophet_work/" --separator="tab" --ignore-invalid-scores

4. Scoring

pyprophet-cli apply_weights --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --separator="tab" --job-number 1 --job-count 3 &
pyprophet-cli apply_weights --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --separator="tab" --job-number 2 --job-count 3 &
pyprophet-cli apply_weights --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --separator="tab" --job-number 3 --job-count 3 &

5. Statistical validation

Run-specific context

pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_run_specific" --separator="tab" \
--job-number 1 --job-count 3 --lambda=0.4 --statistics-mode=run-specific --overwrite-results &
pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_run_specific" --separator="tab" \
--job-number 2 --job-count 3 --lambda=0.4 --statistics-mode=run-specific --overwrite-results &
pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_run_specific" --separator="tab" \
--job-number 3 --job-count 3 --lambda=0.4 --statistics-mode=run-specific --overwrite-results &

Experiment-wide context

pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_experiment_wide" --separator="tab" \
--job-number 1 --job-count 3 --lambda=0.4 --statistics-mode=experiment-wide &
pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_experiment_wide" --separator="tab" \
--job-number 2 --job-count 3 --lambda=0.4 --statistics-mode=experiment-wide &
pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_experiment_wide" --separator="tab" \
--job-number 3 --job-count 3 --lambda=0.4 --statistics-mode=experiment-wide &

Global context

pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_global" --separator="tab" \
--job-number 1 --job-count 3 --lambda=0.4 --statistics-mode=global &
pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_global" --separator="tab" \
--job-number 2 --job-count 3 --lambda=0.4 --statistics-mode=global --overwrite-results &
pyprophet-cli score --data-folder="/tmp/openswath_results/" --data-filename-pattern="*.tsv" \
--work-folder="/tmp/pyprophet_work/" --result-folder="/tmp/pyprophet_result_global" --separator="tab" \
--job-number 3 --job-count 3 --lambda=0.4 --statistics-mode=global --overwrite-results &