Report 8 - GeorgeIniatis/Blood_Brain_Barrier_Drug_Prediction GitHub Wiki

Datalore Static Copy
Created a Logistic Regression model for chemical descriptors
Created 3D visualisations for PCA, t-SNE and UMAP
Questions/Topics to discuss
- When using cross-validation, do I still need to evaluate my model on an independent test set?
- Which metric should I try to optimise: precision, recall, F1 score or AUC?
- Should I add a class weight to the models?
- Are there any scikit-learn best practices I should be aware of?
- Are there any models that are likely to work best and that I should definitely try?
- Was thinking of splitting my models into 4 categories:
  - LogBB Regression
  - Chemical Descriptors Classification
  - Side Effects and Indicators Classification
  - Chemical Descriptors + Side Effects and Indicators Classification
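
To make the cross-validation, metric and class-weight questions above more concrete, here is a minimal sketch of one common setup, not the project's actual code. It assumes a synthetic, imbalanced dataset from `make_classification` as a stand-in for the chemical descriptors, and the split sizes, fold count and `class_weight="balanced"` setting are illustrative choices only. A test set is held out first, cross-validation runs only on the training portion and reports all four candidate metrics, and the held-out set is scored once at the end.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_validate, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the descriptor matrix (assumption, not project data).
X, y = make_classification(
    n_samples=600, n_features=20, weights=[0.8, 0.2], random_state=0
)

# Hold out a test set that is never used during model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)

# Stratified folds keep the class ratio roughly constant across folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["precision", "recall", "f1", "roc_auc"]
cv_results = cross_validate(model, X_train, y_train, cv=cv, scoring=scoring)

for metric in scoring:
    scores = cv_results[f"test_{metric}"]
    print(f"{metric}: {scores.mean():.3f} +/- {scores.std():.3f}")

# Final check on the held-out set, done once after model selection.
model.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"held-out ROC AUC: {test_auc:.3f}")
```

Comparing the cross-validated scores with the held-out score is one way to see whether model selection has overfitted to the folds; which metric to prioritise still depends on whether false positives or false negatives are more costly for the task.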