Meeting 7 - GeorgeIniatis/Blood_Brain_Barrier_Drug_Prediction GitHub Wiki
Meeting Minutes
We mainly discussed the issues raised by the status report and did some brainstorming
We briefly discussed the Springer API findings and that the API wasn't particularly easy to use because of its incomplete documentation
I then mentioned that we are probably not going to need a relationship extraction library since I am done with all my searches. The supervisor then mentioned that he will let me know if he has some code that is worth pursuing further
We then agreed that I should focus on the ML aspect of the project and I mentioned that I would also like to do some visualisations of the data. The supervisor then pointed me to some useful dimensionality reduction methods/tools that I could use to spot any clusters/groupings:
- t-SNE (A bit controversial because it almost always finds clusters)
- PCA
However, he mentioned that strong conclusions should not be drawn from them. Rather, they should be used as a sanity check to see if anything weird is going on with the data and to spot any potential outliers or errors
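As an illustration of this kind of sanity check, here is a minimal sketch using scikit-learn. Everything in it is an assumption made for the example: the data is a synthetic random table with one planted outlier (not the project dataset), and the variable names and the t-SNE perplexity value are arbitrary choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a descriptor table (rows: compounds, columns: descriptors).
# This is NOT the project data, just random numbers for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
X[0] = 15.0  # deliberately planted outlier row

# Scale first so that no single descriptor dominates because of its units.
X_scaled = StandardScaler().fit_transform(X)

# PCA: linear projection onto the two directions of largest variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Crude outlier check: the row furthest from the origin in PCA space.
dist = np.linalg.norm(X_pca, axis=1)
suspect = int(np.argmax(dist))
print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Row furthest from the centre:", suspect)

# t-SNE: non-linear embedding. Apparent clusters should be read with caution.
tsne = TSNE(n_components=2, perplexity=10, random_state=0)
X_tsne = tsne.fit_transform(X_scaled)
print("t-SNE embedding shape:", X_tsne.shape)
```

The two-column arrays `X_pca` and `X_tsne` can then be passed to any of the plotting libraries as x/y coordinates for a scatter plot.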
He then pointed me to some plotting libraries:
- Plotly (Interactive)
- Matplotlib
- Seaborn
Then we briefly discussed R: how it could also be used for ML, but also its drawbacks
Finally the supervisor mentioned that we could always add more chemical descriptors if needed and I added that I chose only a small number of them due to the Zhao et al. paper. We then agreed that it would be interesting to see if we can confirm those results, dispute them, or even improve upon them.
Action Plan
- scikit-learn tutorials
- Visualisations of the dataset