Report 8 - GeorgeIniatis/Blood_Brain_Barrier_Drug_Prediction GitHub Wiki

  • Datalore Static Copy
  • Created a Logistic Regression model for chemical descriptors
  • Created 3D visualisations for PCA, t-SNE and UMAP
  • Questions/Topics to discuss
    • When using cross-validation, do I still need to evaluate the model on an independent test set?
    • Which metric should I try to optimise: precision, recall, F1 score, or AUC?
    • Should I add a class weight to the models?
    • Any scikit-learn best practices I should be aware of?
    • Are there any models that are likely to work best and that I should definitely try?
    • I was thinking of splitting my models into 4 categories:
      • LogBB Regression
      • Chemical Descriptors Classification
      • Side Effects and Indicators Classification
      • Chemical Descriptors + Side Effects and Indicators Classification
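
To make the questions on cross-validation, metrics and class weights more concrete, here is a minimal sketch of one common scikit-learn setup. It is not the project's actual code: the data is a synthetic stand-in generated with `make_classification`, and the class ratio, fold count and variable names (`cv_means`, `test_auc`) are assumptions chosen for illustration. It holds out a stratified test set, cross-validates a class-weighted Logistic Regression on the remaining data while reporting all four candidate metrics, and only then scores the held-out set once.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_validate, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, mildly imbalanced stand-in for the chemical descriptor table
# (assumption: the real features and labels would be loaded here instead).
X, y = make_classification(
    n_samples=500, n_features=20, weights=[0.7, 0.3], random_state=0
)

# Hold out a stratified test set that cross-validation never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Keeping the scaler inside the pipeline means it is re-fitted on each
# training fold, so no information leaks from the validation folds.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)

# Report several metrics side by side instead of committing to one up front.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["precision", "recall", "f1", "roc_auc"]
cv_results = cross_validate(model, X_train, y_train, cv=cv, scoring=scoring)
cv_means = {metric: cv_results[f"test_{metric}"].mean() for metric in scoring}

# Final check: fit on the full training split, score the held-out set once.
model.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print(cv_means)
print(f"Held-out ROC AUC: {test_auc:.3f}")
```

Swapping `LogisticRegression` for another estimator leaves the rest of the sketch unchanged, which may make it easier to compare the four proposed model categories on the same footing.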