evaluate_contrastive.py - cmikke97/Automatic-Malware-Signature-Generation GitHub Wiki

On this page

Imported Modules


  • import baker - easy, powerful access to Python functions from the command line - baker documentation
  • import mlflow - open source platform for managing the end-to-end machine learning lifecycle - mlflow documentation
  • import numpy as np - the fundamental package for scientific computing with Python - numpy documentation
  • import pandas as pd - flexible and easy-to-use open source data analysis and manipulation tool - pandas documentation
  • import psutil - used for retrieving information on running processes and system utilization - psutil documentation
  • import torch - tensor library like NumPy, with strong GPU support - pytorch documentation
  • from logzero import logger - robust and effective logging for Python - logzero documentation

  • from nets.Contrastive_Model_net import Net
  • from nets.generators.fresh_generators import get_generator
  • from utils.ranking_metrics import mean_reciprocal_rank
  • from utils.ranking_metrics import mean_average_precision
  • from utils.ranking_metrics import max_reciprocal_rank
  • from utils.ranking_metrics import min_reciprocal_rank
  • from utils.ranking_metrics import max_average_precision
  • from utils.ranking_metrics import min_average_precision

Back to top

Classes and functions

compute_ranking_scores(rank_per_query) (function) - Compute ranking scores (MRR and MAP) and a selection of per-query statistics (max/min reciprocal rank and average precision) to save to file, starting from a list of ranks (see the sketch after this list).

  • rank_per_query (arg) - List of ranks computed by the model evaluation procedure
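
Not the repository's implementation (which delegates to utils.ranking_metrics), but a minimal sketch of how MRR and MAP can be derived from per-query results, assuming each entry of rank_per_query can be reduced to a binary relevance array (1 where the ranked sample shares the queried family, 0 otherwise):

```python
import numpy as np

def mrr_and_map(rank_per_query):
    """Compute mean reciprocal rank and mean average precision from a list of
    per-query binary relevance arrays (1 = same family as the query, 0 = not).

    Illustrative only: the real script also records max/min reciprocal rank
    and max/min average precision per query.
    """
    reciprocal_ranks = []
    average_precisions = []
    for relevance in rank_per_query:
        relevance = np.asarray(relevance)
        hits = np.flatnonzero(relevance)
        if hits.size == 0:
            reciprocal_ranks.append(0.0)
            average_precisions.append(0.0)
            continue
        # reciprocal rank: 1 / (1-based position of the first relevant result)
        reciprocal_ranks.append(1.0 / (hits[0] + 1))
        # average precision: mean of precision@i taken at each relevant position
        precisions = np.cumsum(relevance)[hits] / (hits + 1)
        average_precisions.append(precisions.mean())
    return float(np.mean(reciprocal_ranks)), float(np.mean(average_precisions))
```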

normalize_results(labels, predictions) (function) - Normalize results so they are easier to save to file (see the sketch after this list).

  • labels (arg) - Array-like (tensor or numpy array) object containing the ground truth labels
  • predictions (arg) - Array-like (tensor or numpy array) object containing the model predictions
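
A minimal sketch of what such a normalization can look like, assuming the goal is a flat dict of plain Python lists that pandas can write to file; the column names and flattening are assumptions, not the script's exact behavior:

```python
import numpy as np
import torch

def normalize_results(labels, predictions):
    """Turn tensors / numpy arrays into plain lists keyed by column name,
    ready to be written out with pandas (sketch only)."""
    def to_list(values):
        if isinstance(values, torch.Tensor):
            values = values.detach().cpu().numpy()  # move off the GPU first
        return np.asarray(values).ravel().tolist()

    return {
        'labels': to_list(labels),
        'predictions': to_list(predictions),
    }
```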

evaluate_network(fresh_ds_path, checkpoint_path, training_run, train_split_proportion, valid_split_proportion, test_split_proportion, batch_size, rank_size, knn_k_min, knn_k_max, random_seed, workers) (function, baker command) - Evaluate the model on both the family prediction task and the family ranking task (see the sketch after the parameter list).

  • fresh_ds_path (arg) - Path of the directory where to find the fresh dataset (containing .dat files)
  • checkpoint_path (arg) - Path to the model checkpoint to load
  • training_run (arg) - Training run identifier (default: 0)
  • train_split_proportion (arg) - Train subsplit proportion value (default: 7)
  • valid_split_proportion (arg) - Validation subsplit proportion value (default: 1)
  • test_split_proportion (arg) - Test subsplit proportion value (default: 2)
  • batch_size (arg) - How many samples per batch to load (default: 250)
  • rank_size (arg) - Size (number of samples) of the ranking to produce (default: 20)
  • knn_k_min (arg) - Minimum value of k to use when applying the k-nn algorithm (default: 1)
  • knn_k_max (arg) - Maximum value of k to use when applying the k-nn algorithm (default: 11)
  • random_seed (arg) - If provided, seed random number generation with this value (default: None, no seeding)
  • workers (arg) - How many workers (threads) the dataloader uses (default: 0 -> use multiprocessing.cpu_count())
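
The knn_k_min/knn_k_max parameters imply a sweep over k for the family prediction task. The following is a self-contained sketch (not the repository's code) of k-NN family prediction over learned embeddings, assuming anchor embeddings with known family labels and Euclidean distance:

```python
import torch

def knn_family_prediction(query_emb, anchor_emb, anchor_labels, k):
    """Predict a family label for each query embedding by majority vote
    among its k nearest anchor embeddings. Hypothetical helper for
    illustration only; names and distance metric are assumptions."""
    # pairwise distances between queries and anchors: (n_queries, n_anchors)
    distances = torch.cdist(query_emb, anchor_emb)

    # indices of the k closest anchors for each query
    _, knn_idx = distances.topk(k, dim=1, largest=False)

    # labels of those anchors, then majority vote per query
    knn_labels = anchor_labels[knn_idx]             # (n_queries, k)
    predictions, _ = torch.mode(knn_labels, dim=1)  # most frequent label
    return predictions


# Example sweep over k, matching the knn_k_min=1 / knn_k_max=11 defaults
# (dummy data; in the script the embeddings would come from the model)
queries = torch.randn(8, 32)
anchors = torch.randn(100, 32)
labels = torch.randint(0, 10, (100,))
for k in range(1, 12):
    preds = knn_family_prediction(queries, anchors, labels, k)
```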

__main__ (main) - Start baker so the script can be run from the command line, exposing function names and parameters as optparse-style options
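
The entry point follows the standard baker pattern described above; a short sketch, with the resulting command-line invocation shown as a comment (paths and option values are placeholders):

```python
import baker

@baker.command
def evaluate_network(fresh_ds_path, checkpoint_path, training_run=0, batch_size=250):
    ...  # evaluation logic (omitted)

if __name__ == '__main__':
    # baker turns each decorated function into a sub-command, e.g.:
    #   python evaluate_contrastive.py evaluate_network <fresh_ds_path> \
    #       <checkpoint_path> --batch_size 250 --training_run 0
    baker.run()
```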


Back to top
