Meeting 7 - GeorgeIniatis/Blood_Brain_Barrier_Drug_Prediction GitHub Wiki

Meeting Minutes

Mainly discussed the issues raised by the status report and did some brainstorming

Briefly discussed the Springer API findings; the API wasn't particularly easy to use because of its incomplete documentation

I then mentioned that we are probably not going to need a relationship extraction library since I am done with all my searches. The supervisor then mentioned that he will let me know if he has some code that is worth pursuing further

We then agreed that I should focus on the ML aspect of the project and I mentioned that I would also like to do some visualisations of the data. The supervisor then pointed me to some useful dimensionality reduction methods/tools that I could use to spot any clusters/groupings:

t-SNE (A bit controversial because it almost always finds clusters)
PCA

However, he mentioned that strong conclusions should not be drawn from them. Rather, they should be used as a sanity check to see if anything weird is going on with the data and to spot any potential outliers or errors
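
As a rough illustration of that sanity check, a minimal sketch with scikit-learn might look like the following. The data here is synthetic, and the array names, the deliberately corrupted row, and the perplexity value are all assumptions made for illustration, not the project's actual dataset or settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the descriptor table: 60 compounds x 5 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
X[0] = 25.0  # one deliberately corrupted row, to mimic a data-entry error

# Descriptors usually live on different scales, so standardise before projecting.
X_scaled = StandardScaler().fit_transform(X)

# PCA: linear projection onto the two directions of largest variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# t-SNE: non-linear embedding; perplexity must be smaller than the sample count.
tsne = TSNE(n_components=2, perplexity=15, random_state=0)
X_tsne = tsne.fit_transform(X_scaled)

# Sanity check only: the row farthest from the PCA origin is worth inspecting by hand.
distances = np.linalg.norm(X_pca, axis=1)
suspect = int(np.argmax(distances))
print("explained variance ratio:", pca.explained_variance_ratio_)
print("row to inspect:", suspect)
```

Both `X_pca` and `X_tsne` are two-column arrays that can be passed straight to a scatter plot. In line with the supervisor's caution, the flagged row is only a prompt to look at the raw record, not evidence of a real grouping or error.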

He then pointed me to some plotting libraries:

Plotly (Interactive)
Matplotlib
Seaborn

We then briefly discussed R: how it could also be used for ML, and its drawbacks

Finally, the supervisor mentioned that we could always add more chemical descriptors if needed and I added that I chose only a small number of them due to the Zhao et al. paper. We then agreed that it would be interesting to see if we can confirm those results, dispute them, or even improve upon them.

Action Plan

scikit-learn tutorials
Visualisations of the dataset