Google Summer Of Code - deepchem/deepchem GitHub Wiki
DeepChem is hosting Google Summer of Code 2018 students through the Open Chemistry organization!
- The student will be mentored by at least one experienced DeepChem core developer and also receive support from Google.
- The student will contribute code to DeepChem.
Start by reading the guidelines for contributing to DeepChem, and then come back here.
The application window opens on March 12, 2018 at 12:00 (PDT) and closes on March 27, 2018.
Eligibility
Eligibility requirements are described in the GSoC FAQ:
- You must be at least 18 years of age.
- You must be a full or part-time student at an accredited university (or have been accepted as of May 1, 2018).
- You must be eligible to work in the country you will reside in during the program.
- You must not have already been accepted as a student in GSoC more than once.
- You must reside in a country that is not currently embargoed by the United States. See Program Rules for more information.
How to Contact DeepChem
You are more than welcome to contact us before submitting your application. We will be happy to advise you on most aspects of the process. Getting in touch first is especially recommended if you are planning to apply to work on an original idea rather than one of our suggestions.
If you want to introduce yourself or discuss ideas or your application, post on our Gitter.
Project Ideas
Transfer Learning Framework
Brief explanation: Create easy-to-use tools for common transfer learning scenarios.
Expected results: ChemNet describes a powerful, model-independent transfer learning protocol. We want to reproduce its results and be able to apply the protocol to arbitrary TensorGraph models. Jupyter notebook tutorials and blog posts will be expected over the course of the summer.
Mentor: Karl Leswing
Data Interfaces
Brief explanation: Transition deepchem.data.Dataset to tf.data.
Expected results: DeepChem data objects were created before tf.data existed. We need to make our existing Featurizers, Transformers, and Models work over tf.data objects. Jupyter notebook tutorials and blog posts on how to use the new interfaces will be expected.
Mentor: Karl Leswing
Model Visualization
Brief explanation: Node Importance Visualizations from Graph Models
Expected results: An argument often used against deep learning methods is that they are not understandable. This project is to implement visual neural graph fingerprints in DeepChem. Stretch goals are to implement DeepLIFT or masking techniques for atom-level visualizations.
Mentor: Karl Leswing
Imaging Tools
Brief explanation: Enable chemical image segmentation and property prediction.
Expected results: We want implementations of U-Net and ResNet inside the DeepChem framework. We want both networks pre-trained on problems of chemical importance and the image data augmentation techniques used to create the models.
Mentor: Bharath Ramsundar
Full List of Open Chemistry Project Ideas
Am I experienced enough?
We think that an interested and motivated student who is willing to learn is more valuable than anything else. We also value general software design skills above knowledge of specific library APIs or algorithmic expertise.
Our Expectations from Students
We expect you to be an engaged member of the DeepChem community. In particular, by being a GSoC student for DeepChem you agree to obey and uphold our Code of Conduct.
Communication
- Write a short report for us once a week
- Commit early and commit often! Push to GitHub so that we can see and review your work.
- Actively work on our project timeline and communicate with us during the community bonding period
- Communicate every working day with your mentor. Just say "Hello" if you like. It can be via email, Skype, GitHub comments, etc.
- If there is a reason why you can't work or can't contact us on a regular basis, please make us aware of it.
- If you don't communicate with us regularly, we will fail you.
Midterm and Final evaluations
- Set a realistic goal for midterm. If you fail to meet your own goal, we are more likely to fail you in the evaluations
- Have some code merged into our develop branch at the end of the summer to pass the final evaluation
- The last point is a hard requirement. Make sure that your time plan includes it.
How to write a great Application
First, think carefully about your choice of project. You're going to be doing it for a couple of months, so it's important that you choose something you're going to enjoy. Once you've made your mind up:
- Make sure you've thought about the project and understand what it entails
- Don't be afraid to come up with original solutions to the problem
- Don't be afraid to give us lots of detail about how you would approach the project
- Contact us early! The earlier you contact us the earlier you will be able to get feedback from us to improve your application
Overall, your application should make us believe that you are capable of completing the project and delivering the functionality to our users. If you aren't sure about anything, get in touch with us. We're happy to advise you.
- Introduce yourself on the Gitter. Tell us what you plan to work on during the summer or what you have already done with DeepChem
- Check out the source and run the tests from the source code
- Work through the tutorials to understand how DeepChem is being used.
- Author at least one pull request to DeepChem. For instance, you could
  - fix an easy bug
  - write a new test
  - update our documentation
Talk to us on the Gitter if you need suggestions.
Dividing your project
The best way to have code merged by the end of the summer is to divide your project into small, self-contained subprojects and plan to merge at least one of them around midterm.
Your application should include answers to the following questions.
- Why are you interested in working with us?
- Have you used DeepChem for your research already?
- Do you have any experience programming?
- Do you have any exams during GSoC or plan a vacation during the summer?
How to estimate time needed for development
To get a feeling for our code and some experience with it, tackle some of our easy bugs. Look at the code that you want to change and check whether it follows our coding guidelines. Research the APIs you want to use, and plan which classes you will add and what their public API will look like. Write down your algorithms in pseudocode. The better your research and the further you plan ahead, the easier it will be to judge how long a given task will take. In your time estimates, account for getting less done during exams, and try to be a bit conservative. If you have never done anything like GSoC before, you will tend to underestimate the time needed to complete a task. We know that giving these estimates is not easy and that even professionals struggle with it. Having a good plan and knowing its weak and strong points will help a lot.
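As a rough illustration of the advice above, here is a minimal Python sketch of one way to turn per-task guesses into a conservative timeline. It is not a DeepChem tool or a required format. The task names, the hour figures, the 1.5 safety factor, and the weekly capacities are all assumptions made up for this example; substitute your own numbers.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    hours: float  # your best-guess estimate, in hours


def padded_hours(tasks, buffer=1.5):
    """Total estimated hours, inflated by a safety factor.

    The default buffer of 1.5 is an arbitrary assumption, not a rule.
    """
    return sum(t.hours for t in tasks) * buffer


def weeks_needed(total_hours, weekly_capacity):
    """Count the weeks required to cover total_hours.

    weekly_capacity lists the hours available in each week, in order,
    so exam or vacation weeks can be given a smaller number.
    Returns None if the plan does not fit in the listed weeks.
    """
    remaining = total_hours
    for week, capacity in enumerate(weekly_capacity, start=1):
        remaining -= capacity
        if remaining <= 0:
            return week
    return None


# Hypothetical subprojects and numbers, for illustration only.
tasks = [
    Task("fix an easy bug to learn the code base", 10),
    Task("first self-contained subproject", 80),
    Task("tests and documentation", 30),
]
# Four normal weeks, one exam week with reduced hours, then normal weeks.
capacity = [30, 30, 30, 30, 10, 30, 30, 30]

total = padded_hours(tasks, buffer=1.5)
print(total, weeks_needed(total, capacity))
```

With these made-up numbers, 120 estimated hours become 180 after padding, and the exam week pushes completion to week 7. If `weeks_needed` returns `None`, the plan does not fit and the scope should be cut before you submit the application.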