Machine Learning Journal - GeorgeIniatis/Blood_Brain_Barrier_Drug_Prediction GitHub Wiki
Experiment 1: Try to build Classification models
Classification models will make use of the Class as the label
Two different model categories:
Category 1: Models with just the Chemical Descriptors used as features
Category 2: Models with Chemical Descriptors, Side Effects and Indications used as features (does the addition of Side Effects and Indications to the Chemical Descriptors improve our predictive performance?)
Training sets:
For category 1 the whole dataset will be used, excluding the entries used in the Test set
For category 2 a subset of the dataset will be used, those entries that have Side Effects and Indications available, again excluding the entries used in the Test set
Test set:
Will be used to compare the models against each other
20% subset of the dataset entries that have Chemical Descriptors, Side Effects and Indications. This allows us to compare the performance of the two different categories of models using the same test set (see the splitting sketch below)
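A minimal splitting sketch using scikit-learn, assuming a pandas DataFrame `dataset` with a `Class` column and a hypothetical `Has_Side_Effects_And_Indications` flag marking the subset; the column names are assumptions, not the repository's actual schema:

```python
# Sketch: shared, stratified 20% test set for both model categories.
# Assumes `dataset` is a pandas DataFrame with the columns noted above.
from sklearn.model_selection import train_test_split

# Entries with Side Effects and Indications available (hypothetical flag column)
subset = dataset[dataset["Has_Side_Effects_And_Indications"] == 1]

# Stratified 20% test set drawn from that subset, shared by both categories
subset_train, test_set = train_test_split(
    subset, test_size=0.2, stratify=subset["Class"], random_state=42
)

# Category 1 trains on the whole dataset minus the test set entries
category_1_train = dataset.drop(index=test_set.index)
# Category 2 trains only on the subset, minus the test set entries
category_2_train = subset_train
```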
Experiment 2: Try to build Regression models
Regression models will make use of the LogBB as the label
Training set:
A subset of the dataset will be used, those entries that have LogBB available, again excluding the entries used in the Test set
Test set:
Will be used to compare the models against each other
20% subset of the dataset entries that have LogBB available (see the splitting sketch below)
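A minimal sketch of the regression split, assuming `dataset` carries a `LogBB` column with NaN where the value is unavailable (the column name is an assumption):

```python
# Sketch: restrict to entries with LogBB, then hold out 20% for testing.
from sklearn.model_selection import train_test_split

logbb_subset = dataset[dataset["LogBB"].notna()]
regression_train, regression_test = train_test_split(
    logbb_subset, test_size=0.2, random_state=42
)
```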
Experiment 3: Try to find the most relevant Side Effects and Indications
Using RFECV
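A minimal RFECV sketch with scikit-learn, assuming a feature DataFrame `X` whose columns are the Side Effects and Indications, and binary labels `y`; the Logistic Regression estimator and F1 scoring are illustrative choices, not settled decisions:

```python
# Sketch: recursive feature elimination with cross-validation to rank
# Side Effects / Indications by relevance.
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),  # illustrative base estimator
    step=1,                                       # drop one feature per iteration
    cv=StratifiedKFold(5),
    scoring="f1",
)
selector.fit(X, y)

# Columns surviving elimination are the most relevant features
selected_features = X.columns[selector.support_]
```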
Models
Classification:
Dummy Classifier
Logistic Regression
Support Vector Classifier
K-Nearest Neighbour Classifier
Random Forest Classifier
Decision Tree Classifier
Stochastic Gradient Descent Classifier
Regression:
Dummy Regressor
Linear Regression
Support Vector Regression
K-Nearest Neighbour Regressor
Random Forest Regressor
Decision Tree Regressor
Stochastic Gradient Descent Regressor
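The listed models all map directly to scikit-learn estimators; a sketch instantiating them with default hyperparameters (tuning happens later via cross-validation):

```python
# Sketch: the candidate models as scikit-learn estimators, keyed by name.
from sklearn.dummy import DummyClassifier, DummyRegressor
from sklearn.linear_model import (LogisticRegression, LinearRegression,
                                  SGDClassifier, SGDRegressor)
from sklearn.svm import SVC, SVR
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

classifiers = {
    "Dummy": DummyClassifier(strategy="stratified"),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "KNN": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(),
    "Decision Tree": DecisionTreeClassifier(),
    "SGD": SGDClassifier(),
}

regressors = {
    "Dummy": DummyRegressor(strategy="mean"),
    "Linear Regression": LinearRegression(),
    "SVR": SVR(),
    "KNN": KNeighborsRegressor(),
    "Random Forest": RandomForestRegressor(),
    "Decision Tree": DecisionTreeRegressor(),
    "SGD": SGDRegressor(),
}
```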
Metrics
Will not rely on a single metric, as doing so can lead to seriously misleading conclusions about a model's performance
Classification Models
Sensitivity/Recall:
How many of the actual positives are labelled as positive by our model
tp / (tp + fn)
Precision:
How many of the positive predictions were actually positive
tp / (tp + fp)
F1 Score:
Harmonic mean of precision and recall
Other versions (F-beta scores) add more or less weight to precision or recall
Matthews correlation coefficient
Others that could be used:
ROC curve & AUC
PR curve (Better for class imbalance)
What do we care about most? False Positives or False Negatives?
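A sketch computing these metrics with scikit-learn, assuming a fitted classifier `clf` that exposes `predict_proba` and a held-out test set `(X_test, y_test)`; all names are placeholders:

```python
# Sketch: report several classification metrics side by side rather than
# relying on any single one.
from sklearn.metrics import (
    recall_score, precision_score, f1_score, fbeta_score,
    matthews_corrcoef, roc_auc_score, average_precision_score,
)

y_pred = clf.predict(X_test)
y_scores = clf.predict_proba(X_test)[:, 1]  # assumes predict_proba is available

print("Recall:", recall_score(y_test, y_pred))        # tp / (tp + fn)
print("Precision:", precision_score(y_test, y_pred))  # tp / (tp + fp)
print("F1:", f1_score(y_test, y_pred))                # harmonic mean
print("F2 (recall-weighted):", fbeta_score(y_test, y_pred, beta=2))
print("MCC:", matthews_corrcoef(y_test, y_pred))
print("ROC AUC:", roc_auc_score(y_test, y_scores))
# Average precision summarises the PR curve (better under class imbalance)
print("Average Precision:", average_precision_score(y_test, y_scores))
```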
Regression Models
Negated Mean Absolute Error
R2
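A sketch reporting both regression metrics via cross-validation, assuming a regressor `reg` and training data `(X_train, y_train)` (placeholder names):

```python
# Sketch: negated MAE and R2 via scikit-learn's built-in scorers.
from sklearn.model_selection import cross_val_score

neg_mae = cross_val_score(reg, X_train, y_train, cv=5,
                          scoring="neg_mean_absolute_error")
r2 = cross_val_score(reg, X_train, y_train, cv=5, scoring="r2")
print("Negated MAE:", neg_mae.mean(), "R2:", r2.mean())
```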
Evaluation
Dummy models
Test set for each experiment
Permutation testing for model robustness
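A sketch of permutation testing with scikit-learn's `permutation_test_score`, assuming an estimator `clf` and data `(X, y)` (placeholder names); a small p-value suggests the model has learned a real relationship rather than fitting noise:

```python
# Sketch: compare the true score against scores obtained on shuffled labels.
from sklearn.model_selection import permutation_test_score

score, perm_scores, p_value = permutation_test_score(
    clf, X, y, scoring="f1", cv=5, n_permutations=100, random_state=42
)
print(f"Score: {score:.3f}, p-value: {p_value:.4f}")
```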
Common Practices
Data will be scaled
Some data exploration will be performed
Cross validation will be used to find the best hyperparameters for our models
Multiple metrics will be reported for each of our models
The models will take the class imbalance into account
The test sets will be stratified, preserving the class imbalance, so that the conclusions drawn from them are sound (a combined sketch follows below)
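A sketch combining these practices, assuming training data `(X_train, y_train)`; the SVC and the parameter grid are illustrative only:

```python
# Sketch: scaling inside a pipeline (so each CV fold is scaled independently),
# stratified cross-validation, hyperparameter search, and class weights to
# account for class imbalance.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, StratifiedKFold

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("svc", SVC(class_weight="balanced")),  # weights offset class imbalance
])

param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(pipeline, param_grid, cv=StratifiedKFold(5), scoring="f1")
search.fit(X_train, y_train)
print("Best hyperparameters:", search.best_params_)
```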