Google Summer of Code

→ Open Library's 2025 Call For Proposals (CFP)

Welcome
- History, Your Chances, and Advice for Contributors
Drafting a Proposal
- Requirements, Walkthrough, Sample, and Tips
Selection Process
- Fellowship Qualities, Fellowship Checklist

Welcome

Google Summer of Code (GSoC) is a global, paid mentorship program focused on bringing more developers into open source software development. Contributors work with an open source organization on a 3-month programming project during their break from school. You can read more on Google's GSoC website. The program is run by Google, which selects organizations to mentor contributors on projects.

Candidates (see eligibility requirements) apply by submitting an application to one or more participating mentoring organizations. Open Library GSoC projects may span everything from writing bots that import or organize books, to new crowd-sourced programs that allow patrons to fund and sponsor books they love for the library.

History

Internet Archive has participated in Google Summer of Code for 3+ years. Open Library has participated twice (2018, 2020). In 2019, the Internet Archive did not receive enough slots for Open Library to participate and so we designed and ran our own Internet Archive Summer of Code (IASoC) internship.

Your chances

Each year, the Internet Archive applies as an organization to participate in Google Summer of Code, and most years we're accepted. However, we don't know in advance how many seats Google will award us. We first endorse certain proposals, Google tells us how many seats the entire organization gets, and the Wayback Machine, Open Library, and Archive.org divide the available seats. Some years this means Open Library may not get a seat even if the Internet Archive is participating, but most years Open Library has received a seat (our team has worked with ~5 GSoC applicants).

Assuming Open Library is given a GSoC seat: typically, hundreds of candidates email us to inquire about GSoC. About 25 set up the Open Library code base, join the community Slack channel, make contributions to the project, and submit applications. So if you're very dedicated and invest time meeting mentors and working with them to understand the codebase and the project's needs and problems, your chances are ~4% (1/25).

Advice for Contributors

An essay on the topic by @mekarpeles: https://www.facebook.com/michael.karpeles/posts/10103690294172760

We love when folks are passionate and eager to participate! I'd like to offer some tips on how one might use eagerness to their advantage (because it's also possible for eagerness to work against us).

It can be tempting to demonstrate enthusiasm by showing we can jump into many different issues at once. However, one secret is: context switching across multiple projects is incredibly costly—for both contributors & staff.

Example: Imagine being the only chef in a restaurant where 800 patrons want to be fed different meals from a menu with 10 recipes. You could offer to cook for every patron at the same time, but then you would need to know 10 different recipes and switch between chopping, simmering, sautéing, baking, and plating, all while making sure that no one waits too long, that tables get their meals together, that no sauce gets burned, that no ingredient gets missed, that no dish is cold, and that patrons are happy with their meal (because otherwise it may get sent back and disrupt all the other dishes you're working on).

In my experience (both in terms of making a bigger impact and gaining more experience), being given the opportunity to focus on a specific part of a project can both help a contributor better understand how each part integrates with the others and allow us to align & stack our victories together to achieve greater impact.

We get a lot of confidence when a contributor:

asks to work on an issue because it is part of a thought-out plan that they are able to execute
asks a clarifying question because the requirements are unclear or something seems strange about the approach

One issue done well shows us a contributor:

has the ability to prioritize and strategically plan: to evaluate and identify which issues (and their parts) are important to the project and achievable to them
takes the time upfront to understand the issue and clarify questions before developing a solution that doesn't achieve the desired outcome
respects staff time by trying to make issues easier to review: following instructions, including screenshots, testing their code, and maybe even asking ChatGPT for feedback on their code before submitting (using the tools available to them)

Asking for Feedback

How to not ask for advice: One of the biggest mistakes I see from applicants is inundating staff with frequent questions like, "can you please review my updated proposal".

This forces each mentor to work out what may have changed in 50+ applications and why. It also does not tell mentors what type of feedback you want, forcing them to figure out what each applicant uniquely needs. It's neither scalable nor sustainable for staff, and it's not an effective way for applicants to get feedback. We understand applicants want their effort and edits to be reviewed and dignified by staff and we're very happy to help.

The best way to ask for advice or feedback is to leave a concise message letting us know what changed and exactly what type of feedback you need, with links to the right sections. Batch as many of your questions together as you can and put time into communicating clearly. The process by which you [thoughtfully] ask for feedback is also part of how we evaluate applicants and whether they will work well with the team and community. You may find this essay has some good stories and tips for reaching people in successful ways.

Identify a problem or opportunity staff mutually believes in
- Good: "Based on the 2025 roadmap, I have reason to believe XXX is important. I don't want to develop too much of a proposal for an idea that is not needed. The elevator pitch is: building ABC that can be used by XYZ and may have ZZZ impact"
- Not great: "Can you help me come up with an idea? Is one of these good?"
- Worse: "I have submitted an entire proposal for an idea that I didn't check with staff and now I want feedback"
Propose a detailed solution that staff mutually endorses
Demonstrate that you are capable of programming (usually via links to github projects where you've done something similar), predicting problems (like performance, scale, other risks), and implementing the solutions you've identified
- Therefore you might ask... "Is it a problem that XXX? Will the system need to scale to YYY?"
  - Adding 5 paragraphs to your proposal that predict every possible problem is not going to help you; it's going to force staff to read a lot of unrelated material. This may be a good opportunity for a conversation, where a single question to staff may save them from having to read those paragraphs. That's thoughtful and a good use of time.
    - Not Great: 5 paragraphs enumerating possible unlikely challenges and complex solutions
    - Good: Asking staff if XXX is a problem, and if so, proposing a solution (and offering to write up the solution if it's a good fit)
Show that, if you have questions about the problem or the solution, you take the time to ask in a thoughtful and productive way:
- Good: "The Call For Proposals asks for Flask, but there are some cases where the application may benefit from responding asynchronously to requests. For this reason, would a proposal be considered that proposes FastAPI? Otherwise, if staff understands the limitations, I will happily use Flask"
- Not Great: "Can you please review my proposal again and tell me if it's ok now"

Drafting a Proposal

We're not looking for you to have all the answers. We're looking for honesty, integrity, and well-reasoned ideas that are achievable to implement and that demonstrate you have these fellowship qualities.

Requirements

Your proposal should be focused, thematic, realistic, timely, high-impact, and metric-driven:

e.g.

I believe right now the Internet Archive's Open Library needs to focus on [Theme] because [Impact]...
[Theme] is best measured by metric X because...
Now is the right time [timely] to work on [Theme] because [impact]
I have a [realistic] idea on how to increase X by Y%
Here's my supporting evidence this idea will increase X by Y%
Here is my proposal for increasing X by Y% (diagrams? wireframes? architecture overview? features?)
Here are the risks (what could go wrong)
Here are my open questions (for mentors)

Walkthrough

What Does a Good Proposal Look Like?

A good proposal:

Defines a very specific, focused problem and justifies its value in measurable terms
Demonstrates the value of Open Library and the problem it helps solve
Identifies open questions, risks, and concerns a mentor may have (it's okay for there to be risks and valuable for us to know you are someone who thinks about and can identify risks!)
Proposes a detailed & feasible step-by-step plan, with justification behind design decisions, and directly addresses the risks & questions
Is specific enough that this can be handed to someone else and they'd be able to make progress towards the desired outcome
Demonstrates engineering knowledge of the codebase
Demonstrates product & design knowledge of the critical pieces required to produce a working product/prototype, as well as original ideas
Shows the mentor how they will know if the plan is successful

A good proposal addresses the following 5 questions (replace Open Library with whatever project you're applying for):

What unique opportunity is Open Library missing and what is the potential impact?

What's more important than the impact being impressive is that it's realistic, well calculated, and that the type of value is well aligned with our library mission.

Today, Open Library's catalog is limited to books the Internet Archive decides to acquire and make available through their library program. But there’s an opportunity to democratize Open Library’s bookshelves and extend this power to patrons so every reader in the world may be empowered to Sponsor books of their choosing. There are 10M patrons using Open Library and if even 0.5% (half a percent) donated a book, that would add 50,000 books. Some sources suggest an average paperback book costs ~$10 USD so this program could generate $0.5M of book value for the community.
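The estimate in this example is the kind of calculation worth showing explicitly in a proposal. Here is a minimal sketch of it, taking "half a percent" as 0.5% and assuming one sponsored book per donating patron; the function and variable names are illustrative only.

```python
# Back-of-the-envelope estimate using the figures quoted in the example above.
PATRONS = 10_000_000       # "10M patrons using Open Library"
DONATION_RATE = 0.005      # half a percent, i.e. 0.5%
AVG_BOOK_PRICE_USD = 10    # "~$10 USD" for an average paperback


def estimate_sponsorship(patrons, donation_rate, price_usd):
    """Return (books_added, total_value_usd).

    Assumption: each donating patron sponsors exactly one book.
    """
    books_added = round(patrons * donation_rate)
    return books_added, books_added * price_usd


books, value = estimate_sponsorship(PATRONS, DONATION_RATE, AVG_BOOK_PRICE_USD)
print(f"{books:,} books, ${value:,.0f} of book value")
```

Writing the inputs out this way makes it easy for a mentor to challenge a single assumption (for example, the donation rate) without re-deriving the whole estimate.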

Why is now the ideal time for this opportunity (as opposed to another time)?

Give us confidence by showing us evidence: have other organizations succeeded at doing something similar? Has some discovery or change in the world made something new possible?

Last year, Open Library added a Want to Read button to their website enabling patrons to tell us which books we are missing from the library. To our surprise, over 400k unique patrons clicked this button since we added it, teaching us that patrons are eager to tell us which books they want. I believe some percentage of these patrons may also be willing to make a monetary donation to sponsor the accessioning of these desirable books.

What does the solution look like?

Start with an initial paragraph that provides a very clear, concise, short overview of the solution before going into the details. Flowcharts or architectural diagrams, designs, and other aids may be helpful.

For GSoC, I propose adding a Sponsor Book button that will show up on the 23M books on Open Library that are not yet readable. This button will connect to a minimal UX flow for sponsoring a book that I've [diagramed]. I've also [diagramed] approximately what the code flow will look like to implement this, with relevant services, data stores, APIs, URLs, files, and functions referenced.

Here's a micro-proposal with a more detailed example of a good "solution" breakdown:

Context: Open Library is an important platform because it helps underserved learners read library books online for free -- many municipalities don't have libraries as well funded as NYPL and BPL, and so the availability of digital reading options is a critical consideration. The Open Library catalog, however, is currently missing hundreds of high quality born-digital educational web books (for instance, [this book on the Rust programming language](https://github.com/rust-lang/book) published on GitHub) that are difficult to discover because they're not indexed or catalogued in many places online. This book has 12k :star: and there are many books like it, signifying there's a large audience (likely tens of thousands of patrons) interested in the subject matter. Open Library already has a large patron-base and so adding support for books like these could have a big impact both for patrons and the authors of these high quality web books.

Proposal: A low-cost and effective program that will allow librarians on Open Library to propose linking books to URLs for open access readable editions online:

1. The /addbook page will be updated so any patron can submit the URL of a web book, along with some basic metadata to help reviewers assess the quality of the submission.
2. A privileged librarian will go to a new page called /review where (similar to the /merges UI) they can evaluate the book's quality. Here is a Figma link to how this /review page might look:
   design showing a table where the fields are: (ol_edition_key, web_book_url, reviewer=None, status="approved")
3. In order to ensure only high-quality books are approved, there will be a standard checklist guide written that librarians can use to see if the web book meets all the criteria. This guide will be one of the deliverables.
4. Today, Open Library book pages have a read or borrow button when a readable edition is available in the library. I propose we make this button more flexible to include web_books as an option and possibly convert it into a dropdown button (see img) because this will allow us to offer many different options, such as external links to the web_book.
5. Finally, we would make sure that when a web_book is approved by a librarian, it will enter the solr search engine so that patrons can facet/search by web_books specifically or see when a web_book happens to be available in the catalog. For this I will refer to the [video tutorial on adding fields to solr](https://archive.org/details/openlibrary-tour-2020/2021-10-26-OpenLibrary-Community-Celebration.mp4) and make the necessary changes to the main backend [search code](https://github.com/internetarchive/openlibrary/blob/master/openlibrary/plugins/worksearch/code.py) and the [search UI template](https://github.com/internetarchive/openlibrary/blob/master/openlibrary/templates/work_search.html).
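A proposal like this can be made more concrete with a small data-model sketch. The following is one possible shape for a row in the /review table, using the four fields named in step 2. Everything else here is an assumption made for illustration: the class name, the "pending" and "rejected" statuses, the `resolve` and `searchable` helpers, and the placeholder keys and URLs. None of it is existing Open Library code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical status values; the proposal above only mentions "approved".
ALLOWED_STATUSES = {"pending", "approved", "rejected"}


@dataclass
class WebBookReview:
    """One row of the hypothetical /review table."""
    ol_edition_key: str
    web_book_url: str
    reviewer: Optional[str] = None
    status: str = "pending"

    def __post_init__(self):
        if self.status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")

    def resolve(self, reviewer, approved):
        """Record a librarian's decision on a pending submission."""
        if self.status != "pending":
            raise ValueError(f"already resolved as {self.status!r}")
        self.reviewer = reviewer
        self.status = "approved" if approved else "rejected"


def searchable(reviews):
    """Only approved web books would be handed on for search indexing (step 5)."""
    return [r for r in reviews if r.status == "approved"]


# Placeholder keys and URLs, for illustration only.
queue = [
    WebBookReview("OL1M", "https://example.org/book-a"),
    WebBookReview("OL2M", "https://example.org/book-b"),
]
queue[0].resolve(reviewer="example_librarian", approved=True)
print([r.ol_edition_key for r in searchable(queue)])
```

A sketch at this level of detail gives mentors something specific to react to, such as whether a rejected submission can be resubmitted or whether a reviewer should be recorded for pending rows.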

What gives you confidence your solution is possible and achievable? What are the risks and challenges and how might we address them?

Show that you have the relevant experience or skills, understand the risks, and can problem solve. Identifying risks is only a good thing because the risk will exist whether acknowledged or not. Risks that are missed or brought up later in the process are some of the biggest threats to the success of your proposal.

In a previous job, I engineered a marketplace checkout flow and based on the flow I'm proposing, this will be very similar. A patron will click a button to sponsor a book, be brought to Amazon to buy the book, and then enter the Internet Archive as their shipping address. One challenge will be whether patrons get confused by the user experience and if they may forget to change the address. I propose we first prioritize integration of a minimal working system and then get feedback from a few patrons to see if this is an issue, and then consider whether a tighter integration may be a better approach (e.g. perhaps the patron donates the funds and then we try to automate submitting of these orders on their behalf). This also has risks, depending on when the donation is made, if there are chargebacks, or if the book price changes between time of donation and purchase. We may also want to consider whether there's a review process for sponsoring a book, if certain books shouldn't be eligible for sponsorship, and if there are any policy considerations we may wish to discuss (e.g. should we restrict to certain date ranges or topics?).

Evaluating success: If your proposal is successful, how will you know? What does success look like [as a metric] and what positive change does it bring?

Success means establishing a new distribution channel to receive donations, promoting long-term sustainability. We also hope to improve and democratize our holdings by empowering thousands of patrons to participate in book sponsorship. Success looks like 1000 new books that were previously unavailable becoming available for the world and patrons having a way to crowdfund books that may otherwise be inaccessible to them. It hopefully inspires other libraries to try similar programs where patrons have the ability to more directly impact library holdings, which strengthens the whole library ecosystem. We will want analytics in place to see how often the Sponsor button is clicked versus how often it converts, as a baseline for how much time we invest supporting the feature.
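The click-versus-conversion baseline mentioned in this example could be computed along these lines. This is only a sketch: the function name and the sample counts are made up for illustration, and a real proposal would say where these counts come from.

```python
def conversion_rate(clicks, completed):
    """Fraction of Sponsor button clicks that end in a completed sponsorship."""
    if clicks < 0 or completed < 0:
        raise ValueError("counts must be non-negative")
    if completed > clicks:
        raise ValueError("completed sponsorships cannot exceed clicks")
    if clicks == 0:
        return 0.0  # no clicks yet, so nothing to convert
    return completed / clicks


# Hypothetical numbers, not real Open Library analytics.
rate = conversion_rate(clicks=2000, completed=50)
print(f"{rate:.1%} of Sponsor clicks converted")
```

Stating the metric as a formula up front makes it clear what needs to be logged (clicks and completed sponsorships) before the feature launches.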

Sample

You can view a sample proposal template here.

Tips

Your application should not just be a collection of issues. Group issues into logical themes. For instance, "my theme for phase one of my roadmap is making Open Library more accessible to people in non-US countries".
Ask lots of questions -- some of your ideas may be easy in principle but hard in practice, because Open Library's code base can be difficult to work with and an issue may take longer than expected! Ask mentors how long they think an issue will take; it's one of the most important pieces of value they can offer.
Consider creating Open Library issues proposing features you'd like to include in your application. Get feedback from the community and see whether it's something they also value and will support.
Have an idea of what success means / looks like from the very beginning. How will you know if you've won? Brewster suggests one strategy is, "start with the aspirational blog post and work backwards".

Selection Process

When evaluating GSoC candidates, we create a table and individually rank proposals on a scale of 1-5. We black out our scores so that other mentors are not influenced by them while performing their own independent evaluations. When all staff have voted, we tally up the results and discuss the top candidates.
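As a rough illustration of the tallying step, here is a minimal sketch. It assumes the tally is a simple sum of each mentor's 1-5 score per proposal, which is a guess at the mechanics and not a description of the actual spreadsheet; the mentor and proposal names are placeholders.

```python
from collections import defaultdict


def tally(scores_by_mentor):
    """Sum each proposal's 1-5 scores across mentors and rank the totals.

    Assumption: the tally is a plain sum. Ties are broken alphabetically
    so the ordering is deterministic.
    """
    totals = defaultdict(int)
    for mentor, scores in scores_by_mentor.items():
        for proposal, score in scores.items():
            if not 1 <= score <= 5:
                raise ValueError(f"{mentor} gave {proposal} an out-of-range score: {score}")
            totals[proposal] += score
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Placeholder scores, collected independently from each mentor.
ranked = tally({
    "mentor_a": {"proposal_1": 4, "proposal_2": 5},
    "mentor_b": {"proposal_1": 5, "proposal_2": 3},
})
print(ranked)
```

The totals only decide which proposals get discussed; the final choice still comes from the conversation among staff.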

While evaluating, staff considers aspects like whether the applicant demonstrates Fellowship Qualities and satisfies the Fellowship Checklist:

Fellowship Qualities

When selecting fellows, we try to identify individuals who demonstrate:

Initiative: proactively moving a project forward by doing what one can, even if there are blockers.
Strategy: choosing issues that are both impactful for the project and also part of some thoughtful greater plan.
Ability to Prioritize: discerning which elements of an issue are critical to invest time in versus which just need to get done (or asking if it's not clear), not taking on too much, and factoring in how long things will take.
Problem Solving: Figuring out how things work
Communication: asking as soon as one is blocked or needs help understanding how something works, with the right context and steps one has taken, and taking notes so others can also learn.

Fellowship Checklist

Follows code of conduct and demonstrates commitment to the open source ethos
Demonstrates ability to complete technical tasks mentioned in their proposal
Has produced a roadmap that is defensible, specific, focused, practical, impactful, and achievable
Has performed research and produced effective aids such as schemas, mockups, designs, or diagrams to demonstrate how their solution works and that their solution will work
Demonstrates understanding of or sensitivity to real-world limitations, risks, and costs of potential solutions (e.g. budget, compute needs, accuracy of models, potential bias of solutions, etc), as well as potential ways to address them
Demonstrates ability to organize, strategize, prioritize, and time manage
Respects staff's time by communicating effectively and not adding lots of "filler" paragraphs
Provides concrete solutions, not generic, indefensible, broad-sweeping, unqualified solutions like "will be solved using machine learning"

Good luck!

Google Summer of Code - internetarchive/openlibrary GitHub Wiki
