evaluate_contrastive.py - cmikke97/Automatic-Malware-Signature-Generation GitHub Wiki
import configparser - implements a basic configuration language for Python programs - configparser documentation
import json - JSON encoder and decoder - json documentation
import os - provides a portable way of using operating system dependent functionality - os documentation
import sys - system-specific parameters and functions - sys documentation
import tempfile - used to create temporary files and directories - tempfile documentation
import time - provides various time-related functions - time documentation
from copy import deepcopy - creates a new object and recursively copies the original object's elements - copy documentation
import baker - easy, powerful access to Python functions from the command line - baker documentation
import mlflow - open source platform for managing the end-to-end machine learning lifecycle - mlflow documentation
import numpy as np - the fundamental package for scientific computing with Python - numpy documentation
import pandas as pd - a flexible and easy-to-use open source data analysis and manipulation tool - pandas documentation
import psutil - used for retrieving information on running processes and system utilization - psutil documentation
import torch - a tensor library like NumPy, with strong GPU support - pytorch documentation
from logzero import logger - robust and effective logging for Python - logzero documentation
from nets.Contrastive_Model_net import Net
from nets.generators.fresh_generators import get_generator
from utils.ranking_metrics import mean_reciprocal_rank
from utils.ranking_metrics import mean_average_precision
from utils.ranking_metrics import max_reciprocal_rank
from utils.ranking_metrics import min_reciprocal_rank
from utils.ranking_metrics import max_average_precision
from utils.ranking_metrics import min_average_precision
compute_ranking_scores(rank_per_query) (function) - Compute ranking scores (MRR and MAP), together with a selection of notable ranks to save to file, from a list of ranks.
rank_per_query(arg) - List of ranks computed by the model evaluation procedure
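The function above aggregates per-query ranks into Mean Reciprocal Rank (MRR) and Mean Average Precision (MAP). A minimal sketch of that aggregation, assuming each element of `rank_per_query` is a binary relevance vector (1 = result belongs to the query's family) ordered by predicted similarity (the function name and input layout here are illustrative, not the repository's exact implementation):

```python
import numpy as np

def compute_ranking_scores_sketch(rank_per_query):
    """Hypothetical aggregation of per-query binary relevance rankings
    into MRR and MAP."""
    rr = []   # reciprocal rank of the first relevant result, per query
    ap = []   # average precision, per query
    for rels in rank_per_query:
        rels = np.asarray(rels)
        hits = np.flatnonzero(rels)          # positions of relevant results
        if hits.size == 0:                   # no relevant result retrieved
            rr.append(0.0)
            ap.append(0.0)
            continue
        rr.append(1.0 / (hits[0] + 1))       # ranks are 1-based
        # precision at each relevant position, averaged over relevant items
        precisions = np.cumsum(rels)[hits] / (hits + 1)
        ap.append(precisions.mean())
    return {'MRR': float(np.mean(rr)), 'MAP': float(np.mean(ap))}
```

For example, two queries with rankings `[1, 0, 1, 0]` and `[0, 1, 0, 0]` give MRR = (1 + 1/2) / 2 = 0.75 and MAP = (5/6 + 1/2) / 2 = 2/3.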
normalize_results(labels, predictions) (function) - Normalize results to make them easier to save to file.
labels(arg) - Array-like (tensor or numpy array) object containing the ground truth labels
predictions(arg) - Array-like (tensor or numpy array) object containing the model predictions
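Since the inputs may be either torch tensors or numpy arrays, normalization here amounts to flattening both into plain Python lists that `json` or `pandas` can serialize. A sketch under that assumption (function name and return layout are hypothetical):

```python
import numpy as np

def normalize_results_sketch(labels, predictions):
    """Hypothetical helper: turn tensors / numpy arrays into flat Python
    lists so they can be written out with json or pandas."""
    def to_list(x):
        if hasattr(x, 'detach'):             # torch.Tensor -> numpy array
            x = x.detach().cpu().numpy()
        return np.asarray(x).ravel().tolist()
    return {'labels': to_list(labels), 'predictions': to_list(predictions)}
```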
evaluate_network(fresh_ds_path, checkpoint_path, training_run, train_split_proportion, valid_split_proportion, test_split_proportion, batch_size, rank_size, knn_k_min, knn_k_max, random_seed, workers) (function, baker command) - Evaluate the model on both the family prediction task and the family ranking task.
fresh_ds_path(arg) - Path of the directory containing the fresh dataset (.dat files)
checkpoint_path(arg) - Path to the model checkpoint to load
training_run(arg) - Training run identifier (default: 0)
train_split_proportion(arg) - Train subsplit proportion value (default: 7)
valid_split_proportion(arg) - Validation subsplit proportion value (default: 1)
test_split_proportion(arg) - Test subsplit proportion value (default: 2)
batch_size(arg) - How many samples per batch to load (default: 250)
rank_size(arg) - Size (number of samples) of the ranking to produce (default: 20)
knn_k_min(arg) - Minimum value of k to use when applying the k-nn algorithm (default: 1)
knn_k_max(arg) - Maximum value of k to use when applying the k-nn algorithm (default: 11)
random_seed(arg) - If provided, seed random number generation with this value (default: None, no seeding)
workers(arg) - How many workers (threads) the dataloader uses (default: 0 -> use multiprocessing.cpu_count())
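The three split-proportion arguments are relative weights (7/1/2 by default), not percentages. A minimal sketch of how such weights could be turned into index subsplits (the helper name and exact rounding are illustrative assumptions, not the repository's code):

```python
import numpy as np

def split_indices(n, proportions=(7, 1, 2), seed=None):
    """Hypothetical illustration of a weighted train/valid/test subsplit
    of n sample indices."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)                  # shuffle the sample indices
    weights = np.asarray(proportions)
    # integer arithmetic keeps the cut points exact for integer weights
    bounds = (np.cumsum(weights) * n) // weights.sum()
    train, valid, test = np.split(idx, bounds[:-1])
    return train, valid, test
```

With `n=10` and the default 7/1/2 weights this yields subsplits of 7, 1, and 2 samples.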
__main__ (main) - Start baker so that the script can be run from the command line, using function names and parameters as the command line interface (with optparse-style options)