Retrieved all the chemical descriptors I wanted from the PubChem API. No need for RDKit as far as I can see
Retrieved side effects from the SIDER dataset
Current size of the dataset: 2107, after removing duplicates and unknown compounds. Available in the repo (Dataset.xlsx) along with the code used to create it (modify_dataset.py)
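For reference, a minimal sketch of a by-name descriptor lookup through PubChem's PUG REST interface. This is not the code in modify_dataset.py. The property list is only an illustrative guess at the descriptors, and the function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Illustrative choice only; the notes do not say which descriptors were retrieved.
PROPERTIES = ["MolecularWeight", "XLogP", "TPSA",
              "HBondDonorCount", "HBondAcceptorCount"]


def property_url(drug_name, properties=PROPERTIES):
    """Build the PUG REST URL for a by-name property lookup (hypothetical helper)."""
    name = quote(drug_name, safe="")  # escape spaces, slashes, etc.
    return f"{PUG_REST}/compound/name/{name}/property/{','.join(properties)}/JSON"


def parse_properties(payload):
    """Return the first property record in a PUG REST response, or None if empty."""
    records = payload.get("PropertyTable", {}).get("Properties", [])
    return records[0] if records else None


def fetch_properties(drug_name):
    """Network call. An unrecognised name makes urlopen raise HTTPError."""
    with urlopen(property_url(drug_name), timeout=30) as response:
        return parse_properties(json.load(response))
```

Calling fetch_properties("aspirin") would return one dict of descriptors keyed by property name. When looping over the whole SIDER name list, a short pause between requests is advisable to stay within PubChem's usage limits.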
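A rough sketch of the clean-up step, to make clear what "removing duplicates and unknown compounds" could mean. It assumes each row is a dict and that an "unknown" compound is one with no resolved PubChem CID; the keys "cid" and "name" are hypothetical, and modify_dataset.py may do this differently.

```python
def clean_rows(rows, id_key="cid"):
    """Drop rows without an identifier, then keep the first row per identifier."""
    seen = set()
    cleaned = []
    for row in rows:
        cid = row.get(id_key)
        if cid is None or cid == "":
            continue  # assumed meaning of "unknown compound": no CID resolved
        if cid in seen:
            continue  # duplicate of a compound already kept
        seen.add(cid)
        cleaned.append(row)
    return cleaned


example = [
    {"name": "aspirin", "cid": 2244},
    {"name": "acetylsalicylic acid", "cid": 2244},  # same compound, other name
    {"name": "mystery compound", "cid": None},      # could not be resolved
    {"name": "caffeine", "cid": 2519},
]
print([row["name"] for row in clean_rows(example)])
```

Deduplicating on the identifier rather than the name catches synonyms that point to the same compound.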
Had a brief look at automated Google searches
Not entirely familiar with web scraping
Will search for a tutorial/resources
Thinking of using the drug names in the SIDER dataset to run the Google searches and look up BBB (blood-brain barrier) permeability, so we have a larger set of drugs with known side effects
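A possible starting point for the searches, sketched under the assumption that Google's Custom Search JSON API would be an acceptable alternative to scraping result pages. The query wording and function names are made up for illustration, and the API key and search engine ID (cx) are placeholders that would have to be created first. Nothing here is tested against the real service.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def bbb_query(drug_name):
    """Search phrase for one drug (hypothetical wording, to be tuned)."""
    return f'"{drug_name}" blood-brain barrier permeability'


def search_url(drug_name, api_key, engine_id):
    """Build the request URL; api_key and engine_id are placeholders."""
    params = {"key": api_key, "cx": engine_id, "q": bbb_query(drug_name)}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Building the URLs needs no network access, so the queries can be
# inspected before any request is sent.
for name in ["aspirin", "caffeine"]:
    print(search_url(name, api_key="YOUR_KEY", engine_id="YOUR_CX"))
```

The search results would still need to be read or parsed to decide whether a drug is BBB permeable, so this only covers the query step.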
Questions/Topics to discuss:
General feedback about the project so far. Anything to improve?