Cross validation (nodes or CSV files) - clumsyspeedboat/Decision-Tree-Neo4j GitHub Wiki

Steps: cross-validation from nodes.

  1. Query data - map Neo4j nodes.
  2. Run the cross-validation procedures for nodes.

Cross-validation from nodes for InfoGain.

RETURN main.cvIG("targetAttribute","number_of_folds")

This procedure displays the cross-validation time for data from the graph database using InfoGain. "targetAttribute" is the target attribute of the dataset. "number_of_folds" is the number of folds used for the cross-validation.
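As background on the criterion named here, the following is a minimal Python sketch of information gain under its usual definition: the entropy of the target minus the weighted entropy after splitting on an attribute. It is not taken from the plugin's source; the function names, the toy rows, and the "play" target are hypothetical stand-ins.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical toy data; "play" stands in for targetAttribute.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
gain_outlook = information_gain(rows, "outlook", "play")
gain_windy = information_gain(rows, "windy", "play")
print(gain_outlook, gain_windy)
```

In this toy data "outlook" separates the target perfectly while "windy" carries no information about it, so a tree built with this criterion would split on "outlook" first.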

Cross-validation from nodes for GainRatio.

RETURN main.cvGR("targetAttribute","number_of_folds")

This procedure displays the cross-validation time for data from the graph database using GainRatio. "targetAttribute" is the target attribute of the dataset. "number_of_folds" is the number of folds used for the cross-validation.
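For readers unfamiliar with gain ratio, here is a small Python sketch of its usual definition: information gain divided by the split information, which is the entropy of the attribute's own value distribution. This is an illustration under that textbook definition, not the plugin's code; the helper names and the toy rows (including the "id" column) are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def gain_ratio(rows, attribute, target):
    """Information gain normalised by the entropy of the attribute's values."""
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    gain = entropy([r[target] for r in rows]) - sum(
        len(g) / len(rows) * entropy(g) for g in groups.values()
    )
    split_info = entropy([r[attribute] for r in rows])
    # An attribute with a single value gives no split; report 0 instead of dividing by zero.
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: "id" is unique per row, "outlook" has two values.
rows = [
    {"id": "1", "outlook": "sunny", "play": "no"},
    {"id": "2", "outlook": "sunny", "play": "no"},
    {"id": "3", "outlook": "rainy", "play": "yes"},
    {"id": "4", "outlook": "rainy", "play": "yes"},
]
ratio_id = gain_ratio(rows, "id", "play")
ratio_outlook = gain_ratio(rows, "outlook", "play")
print(ratio_id, ratio_outlook)
```

Both attributes separate the target perfectly, so plain information gain cannot tell them apart. The normalisation penalises the many-valued "id" column, which is the usual motivation for preferring gain ratio.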

Cross-validation from nodes for Gini index.

RETURN main.cvGI("targetAttribute","number_of_folds")

This procedure displays the cross-validation time for data from the graph database using GiniIndex. "targetAttribute" is the target attribute of the dataset. "number_of_folds" is the number of folds used for the cross-validation.
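The Gini index can be sketched in the same way. The Python below assumes the common definition, one minus the sum of squared class proportions, with a split scored by the size-weighted impurity of its groups. It is an illustration only; the function names and toy rows are hypothetical and not drawn from the plugin.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def gini_split(rows, attribute, target):
    """Size-weighted Gini impurity of the groups produced by splitting on `attribute`."""
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())

# Hypothetical toy data; "play" stands in for targetAttribute.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
impurity_before = gini([r["play"] for r in rows])
impurity_outlook = gini_split(rows, "outlook", "play")
impurity_windy = gini_split(rows, "windy", "play")
print(impurity_before, impurity_outlook, impurity_windy)
```

Lower is better here: splitting on "outlook" leaves pure groups, while splitting on "windy" leaves the impurity unchanged.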

Steps: cross-validation from CSV files.

  1. Run the cross-validation procedures for CSV files.

Cross-validation time for data from CSV file for InfoGain.

RETURN main.cvIGcsv("csvPath","targetAttribute","number_of_folds")

This procedure displays the cross-validation time for data from a CSV file using InfoGain. "csvPath" is the path to the CSV file. "targetAttribute" is the target attribute of the dataset. "number_of_folds" is the number of folds used for the cross-validation.
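To make the three arguments concrete, the Python sketch below reads CSV rows, picks a target column, and partitions the rows into folds. It assumes a plain round-robin split with no shuffling or stratification, which may differ from what the plugin does. The inline CSV text, the "play" column, and `k_fold_indices` are hypothetical. The fold count is converted from a string only because the signature above shows it in quotes.

```python
import csv
import io

def k_fold_indices(n_rows, k):
    """Yield (train, test) index lists; fold i tests every k-th row starting at i."""
    for i in range(k):
        test = list(range(i, n_rows, k))
        train = [j for j in range(n_rows) if j % k != i]
        yield train, test

# Hypothetical CSV content standing in for the file at csvPath.
csv_text = (
    "outlook,windy,play\n"
    "sunny,no,no\n"
    "sunny,yes,no\n"
    "rainy,no,yes\n"
    "rainy,yes,yes\n"
    "overcast,no,yes\n"
    "overcast,yes,yes\n"
)
rows = list(csv.DictReader(io.StringIO(csv_text)))
target = "play"              # stands in for targetAttribute
number_of_folds = int("3")   # assumed to arrive as a string, as in the call above

folds = list(k_fold_indices(len(rows), number_of_folds))
for train, test in folds:
    test_labels = [rows[j][target] for j in test]
    print(f"train={train} test={test} test_labels={test_labels}")
```

Each row lands in exactly one test fold, so every record is used for evaluation once and for training in the remaining folds.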

Cross-validation time for data from CSV file for GainRatio.

RETURN main.cvGRCsv("csvPath","targetAttribute","number_of_folds")

This procedure displays the cross-validation time for data from a CSV file using GainRatio. "csvPath" is the path to the CSV file. "targetAttribute" is the target attribute of the dataset. "number_of_folds" is the number of folds used for the cross-validation.

Cross-validation time for data from CSV file for Gini Index.

RETURN main.cvGICsv("csvPath","targetAttribute","number_of_folds")

This procedure displays the cross-validation time for data from a CSV file using GiniIndex. "csvPath" is the path to the CSV file. "targetAttribute" is the target attribute of the dataset. "number_of_folds" is the number of folds used for the cross-validation.