evaluate_contrastive.py - cmikke97/Automatic-Malware-Signature-Generation GitHub Wiki

On this page

Imported Modules


  • import baker - easy, powerful access to Python functions from the command line - baker documentation
  • import mlflow - open source platform for managing the end-to-end machine learning lifecycle - mlflow documentation
  • import numpy as np - the fundamental package for scientific computing with Python - numpy documentation
  • import pandas as pd - flexible and easy-to-use open source data analysis and manipulation tool - pandas documentation
  • import psutil - used for retrieving information on running processes and system utilization - psutil documentation
  • import torch - tensor library like NumPy, with strong GPU support - pytorch documentation
  • from logzero import logger - robust and effective logging for Python - logzero documentation

  • from nets.Contrastive_Model_net import Net
  • from nets.generators.fresh_generators import get_generator
  • from utils.ranking_metrics import mean_reciprocal_rank
  • from utils.ranking_metrics import mean_average_precision
  • from utils.ranking_metrics import max_reciprocal_rank
  • from utils.ranking_metrics import min_reciprocal_rank
  • from utils.ranking_metrics import max_average_precision
  • from utils.ranking_metrics import min_average_precision

Back to top

Classes and functions

compute_ranking_scores(rank_per_query) (function) - Compute ranking scores (MRR and MAP) and a selection of per-query statistics (max/min reciprocal rank and average precision) to save to file, starting from a list of ranks (see the sketch after this list).

  • rank_per_query (arg) - List of ranks computed by the model evaluation procedure
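
Not the repository's implementation (which delegates to utils.ranking_metrics), but a minimal sketch of how MRR and MAP can be derived from per-query results, assuming each entry of rank_per_query can be reduced to a binary relevance array (1 where the ranked sample shares the queried family, 0 otherwise):

```python
import numpy as np

def mrr_and_map(rank_per_query):
    """Compute mean reciprocal rank and mean average precision from a list of
    per-query binary relevance arrays (1 = same family as the query, 0 = not).

    Illustrative only: the real script also records max/min reciprocal rank
    and max/min average precision per query.
    """
    reciprocal_ranks = []
    average_precisions = []
    for relevance in rank_per_query:
        relevance = np.asarray(relevance)
        hits = np.flatnonzero(relevance)
        if hits.size == 0:
            reciprocal_ranks.append(0.0)
            average_precisions.append(0.0)
            continue
        # reciprocal rank: 1 / (1-based position of the first relevant result)
        reciprocal_ranks.append(1.0 / (hits[0] + 1))
        # average precision: mean of precision@i taken at each relevant position
        precisions = np.cumsum(relevance)[hits] / (hits + 1)
        average_precisions.append(precisions.mean())
    return float(np.mean(reciprocal_ranks)), float(np.mean(average_precisions))
```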

normalize_results(labels, predictions) (function) - Normalize results so they are easier to save to file (see the sketch after this list).

  • labels (arg) - Array-like (tensor or numpy array) object containing the ground truth labels
  • predictions (arg) - Array-like (tensor or numpy array) object containing the model predictions
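
A minimal sketch of what such a normalization can look like, assuming the goal is a flat dict of plain Python lists that pandas can write to file; the column names and flattening are assumptions, not the script's exact behavior:

```python
import numpy as np
import torch

def normalize_results(labels, predictions):
    """Turn tensors / numpy arrays into plain lists keyed by column name,
    ready to be written out with pandas (sketch only)."""
    def to_list(values):
        if isinstance(values, torch.Tensor):
            values = values.detach().cpu().numpy()  # move off the GPU first
        return np.asarray(values).ravel().tolist()

    return {
        'labels': to_list(labels),
        'predictions': to_list(predictions),
    }
```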

evaluate_network(fresh_ds_path, checkpoint_path, training_run, train_split_proportion, valid_split_proportion, test_split_proportion, batch_size, rank_size, knn_k_min, knn_k_max, random_seed, workers) (function, baker command) - Evaluate the model on both the family prediction task and the family ranking task (see the sketch after the parameter list).

  • fresh_ds_path (arg) - Path of the directory where to find the fresh dataset (containing .dat files)
  • checkpoint_path (arg) - Path to the model checkpoint to load
  • training_run (arg) - Training run identifier (default: 0)
  • train_split_proportion (arg) - Train subsplit proportion value (default: 7)
  • valid_split_proportion (arg) - Validation subsplit proportion value (default: 1)
  • test_split_proportion (arg) - Test subsplit proportion value (default: 2)
  • batch_size (arg) - How many samples per batch to load (default: 250)
  • rank_size (arg) - Size (number of samples) of the ranking to produce (default: 20)
  • knn_k_min (arg) - Minimum value of k to use when applying the k-nn algorithm (default: 1)
  • knn_k_max (arg) - Maximum value of k to use when applying the k-nn algorithm (default: 11)
  • random_seed (arg) - If provided, seed random number generation with this value (default: None, no seeding)
  • workers (arg) - How many workers (threads) the dataloader uses (default: 0 -> use multiprocessing.cpu_count())
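
The knn_k_min/knn_k_max parameters imply a sweep over k for the family prediction task. The following is a self-contained sketch (not the repository's code) of k-NN family prediction over learned embeddings, assuming anchor embeddings with known family labels and Euclidean distance:

```python
import torch

def knn_family_prediction(query_emb, anchor_emb, anchor_labels, k):
    """Predict a family label for each query embedding by majority vote
    among its k nearest anchor embeddings. Hypothetical helper for
    illustration only; names and distance metric are assumptions."""
    # pairwise distances between queries and anchors: (n_queries, n_anchors)
    distances = torch.cdist(query_emb, anchor_emb)

    # indices of the k closest anchors for each query
    _, knn_idx = distances.topk(k, dim=1, largest=False)

    # labels of those anchors, then majority vote per query
    knn_labels = anchor_labels[knn_idx]             # (n_queries, k)
    predictions, _ = torch.mode(knn_labels, dim=1)  # most frequent label
    return predictions


# Example sweep over k, matching the knn_k_min=1 / knn_k_max=11 defaults
# (dummy data; in the script the embeddings would come from the model)
queries = torch.randn(8, 32)
anchors = torch.randn(100, 32)
labels = torch.randint(0, 10, (100,))
for k in range(1, 12):
    preds = knn_family_prediction(queries, anchors, labels, k)
```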

__main__ (main) - Start baker so the script can be run from the command line, exposing function names and parameters as optparse-style options
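
The entry point follows the standard baker pattern described above; a short sketch, with the resulting command-line invocation shown as a comment (paths and option values are placeholders):

```python
import baker

@baker.command
def evaluate_network(fresh_ds_path, checkpoint_path, training_run=0, batch_size=250):
    ...  # evaluation logic (omitted)

if __name__ == '__main__':
    # baker turns each decorated function into a sub-command, e.g.:
    #   python evaluate_contrastive.py evaluate_network <fresh_ds_path> \
    #       <checkpoint_path> --batch_size 250 --training_run 0
    baker.run()
```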


Back to top
