Lab 2

Due Date

Friday Sept 20 by Midnight.

Overview

This week we are going to practice contributing to repos we don't own by submitting Pull Requests. This lab will help you gain experience doing the following:

forking and cloning other projects
creating branches to work on new features and fix bugs
working on code you didn't write, trying to maintain the original style and not break things
learning more about working with LLMs
creating pull requests
collaborating with other developers on GitHub
reviewing code changes
updating your pull requests to include fixes for review comments

Step 1. Pick another Student Project

Pick another student's project from the Lab 1 Submissions - "Repo you Reviewed (URL)" list. You can work on any project other than your own, and you do not need to work with the same partner as last week. If possible, choose a repo that no one else is working on (one student per repo is ideal for this lab, but not a requirement). You can start by messaging the owner on Slack.

Step 2. Add a New Feature: Token Info

When programming with LLMs it is necessary to understand how many tokens you are sending, receiving, and being billed for with a given request/response. In addition, all models have a fixed context length (i.e., how many tokens they can process), so it is important to stay within a given token budget.

To better understand this, you are asked to add a new command-line flag: --token-usage or -t. When the program is run with the --token-usage/-t flag set, extra information will be reported to stderr about the number of tokens that were sent in the prompt and returned in the completion.
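As a rough illustration of the flag described above, here is a minimal sketch using Python's standard argparse module. The project you pick may use a different language or argument parser, and the program name example-tool and the files argument are hypothetical placeholders, not part of any real project.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "example-tool" and the "files" argument are placeholders for illustration.
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument("files", nargs="*", help="input files to process")
    # store_true makes the flag a boolean that defaults to False when omitted.
    parser.add_argument(
        "-t",
        "--token-usage",
        action="store_true",
        help="report prompt and completion token counts to stderr",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
args = build_parser().parse_args(["--token-usage", "notes.txt"])
print(args.token_usage, args.files)  # True ['notes.txt']
```

Because the flag defaults to False, existing invocations of the tool keep behaving exactly as before unless the user opts in.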

A typical OpenAI-style chat completion response looks like this:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The 2020 World Series was played in Texas at Globe Life Field in Arlington.",
        "role": "assistant"
      },
      "logprobs": null
    }
  ],
  "created": 1677664795,
  "id": "chatcmpl-7QyqpwdfhqwajicIEznoc6Q47XAyW",
  "model": "gpt-4o-mini",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 17,
    "prompt_tokens": 57,
    "total_tokens": 74
  }
}

It includes a usage property, which holds information about the number of tokens used by the request, including:

completion_tokens: the number of tokens in the generated completion (i.e., the response)
prompt_tokens: the number of tokens in the prompt
total_tokens: the number of tokens used in the prompt and completion (i.e., completion_tokens + prompt_tokens)

Being able to obtain this information easily is useful when debugging.
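One possible way to report these numbers is sketched below in Python, using a trimmed copy of the sample response above. The function names format_token_usage and report_token_usage and the output wording are assumptions made for illustration; match whatever naming and output style the project you are contributing to already uses.

```python
import json
import sys

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "model": "gpt-4o-mini",
  "usage": {"completion_tokens": 17, "prompt_tokens": 57, "total_tokens": 74}
}
"""


def format_token_usage(response: dict) -> str:
    # Missing fields fall back to 0 so a response without usage data does not crash the tool.
    usage = response.get("usage") or {}
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    total = usage.get("total_tokens", prompt + completion)
    return f"Token usage: prompt={prompt} completion={completion} total={total}"


def report_token_usage(response: dict, enabled: bool) -> None:
    # Diagnostics go to stderr so stdout still contains only the model's output.
    if enabled:
        print(format_token_usage(response), file=sys.stderr)


# Writes "Token usage: prompt=57 completion=17 total=74" to stderr.
report_token_usage(json.loads(SAMPLE_RESPONSE), enabled=True)
```

Writing to stderr means a user can still redirect the tool's normal output to a file without the token counts being mixed in.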

Step 3. File an Issue

Search through the existing Issues to make sure no one has filed an Issue for this feature yet. If there is one already, move on to another project repo.

If there isn't, file an Issue to add this new feature. Describe what you want to do in detail, and mention that you'd like to work on this. Give enough information for the project owner to understand what you plan on doing, and give feedback about how they want it done.

Step 4. Fork, Clone, Branch

Fork the other student's project on GitHub, then clone your fork. Next, create a branch for your work. If you filed Issue #5, name your new branch issue-5:

git checkout -b issue-5

Do all of your changes (i.e., all of your commits) on this branch, not the main branch.

Step 5. Write the Code

First, read the existing code. Get a sense of how it's organized (files, classes, functions), and make sure you can run it. If the code is broken before you begin making changes, it will be hard to test your work. If you're unclear about how something is written, ask the owner for tips. Remember, open source is a team sport. You don't need to struggle on your own in silence. Use the community to discuss your work and get help.

You will need to make the following changes to your partner's repo:

Determine which LLM provider(s) they currently support, and research how to get token usage information from the completion response. Different providers report it in slightly different ways.
Find the code where your partner handles command-line flags and try to understand where you'll add --token-usage/-t.
Find the code where your partner parses the LLM response, and figure out where you'll add the token usage extraction.
Find the code where your partner outputs their response and diagnostic info, and figure out how you'll integrate the token usage.
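To illustrate why the provider matters, the sketch below normalizes a few response shapes into one structure. Only the OpenAI-style names come from the sample response in Step 2. The input_tokens/output_tokens and usageMetadata field names are assumptions about how Anthropic-style and Gemini-style responses look, so check the provider's documentation before relying on them. TokenUsage and extract_usage are hypothetical names.

```python
from typing import NamedTuple, Optional


class TokenUsage(NamedTuple):
    prompt: int
    completion: int

    @property
    def total(self) -> int:
        return self.prompt + self.completion


def extract_usage(response: dict) -> Optional[TokenUsage]:
    usage = response.get("usage")
    if isinstance(usage, dict):
        # OpenAI-style names, as in the sample response from Step 2.
        if "prompt_tokens" in usage:
            return TokenUsage(usage["prompt_tokens"], usage.get("completion_tokens", 0))
        # Assumed Anthropic-style names; verify against the provider's docs.
        if "input_tokens" in usage:
            return TokenUsage(usage["input_tokens"], usage.get("output_tokens", 0))
    meta = response.get("usageMetadata")
    if isinstance(meta, dict):
        # Assumed Gemini-style names; verify against the provider's docs.
        return TokenUsage(
            meta.get("promptTokenCount", 0),
            meta.get("candidatesTokenCount", 0),
        )
    # Unknown shape: let the caller decide how to report missing usage data.
    return None


print(extract_usage({"usage": {"prompt_tokens": 57, "completion_tokens": 17}}))
# TokenUsage(prompt=57, completion=17)
```

Returning None for an unrecognized response keeps the decision about what to print (or whether to print anything) in the code that handles output.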

Once you understand the existing code, start making changes in order to implement your feature. Make sure you write your code as closely to the style of the original author as possible. Make it look like the same person wrote all the code. Pay attention to how they name things, how they do formatting, where they put things, etc. You're not trying to rewrite their code in your style, but write new code in their style.

Try to change as little as possible in the existing code. Don't start rewriting everything because you like a different style. Don't touch code that is unrelated to your changes. Don't fix bugs unrelated to your current work (that should be done in another issue, pull request, branch). Be focused! Touch only the code you need to in order to make your changes work. Write as little code as possible, while still making sure the feature works. NOTE: if you do find other bugs while you are working, feel free to file additional issues in the other student's repo.

As you work, commit changes to your branch. For example, you might start by adding support for the --token-usage and -t flags. Once that's written, you should commit your code before proceeding further. Your commits should be small, and tell a story: "Add --token-usage and -t flags", "Update response parsing to extract token usage information", etc.

Make sure your changes don't break the original code. Test, test, test, and test again. When you are satisfied that things are working, proceed to Step 6.

Step 6. Update the Docs or Other Files

Because your code is adding a feature, it's likely that you need to update other non-code files as well. For example, the docs (README) will need to be updated with info about this new feature, including what the reported numbers mean and how to interpret them. There could be other files that need to be updated as well.

Making changes to a project often involves updating code, tests, dependencies, etc. Make sure you look for all the places you need to update things. Include all of these related changes in your branch.

Step 7. Create a Pull Request

When you've finished Steps 1-6, create a Pull Request. Start by pushing your branch to your fork on GitHub (i.e., your origin). Assuming you were working on a branch called issue-5:

git push origin issue-5

Replace issue-5 with the actual branch name you are using.

Follow these steps to create a Pull Request from your branch. Pay attention to the following:

Pick the correct branches: your branch (e.g., issue-5) in your fork as the source, and master or main in the original repo as the target. You want your work to get merged into the original project's master or main branch eventually.
Write a complete title for your pull request. For example, "Add support for --token-usage/-t flag"
Write a complete description of what you did, including info that this Fixes #5 (or whatever Issue number you are fixing). GitHub will automatically link an Issue and Pull Request for you if you use the correct syntax. In your description, talk about what you changed in the code, how you did it, explain why you made certain choices, and discuss any problems you encountered or bugs you know about. Make sure the project owner can understand why and what you want to change with your pull request. Be detailed!

Step 8. Get Feedback and Update your Pull Request

Find the original repo's owner on Slack, and politely ask them to review your Pull Request. It is almost guaranteed that they will ask you to make changes (NOTE: if you are reviewing someone else's changes to your repo, please ask them to change something so they can practice this part, even if it's small).

When you are asked to make changes, go back to your code and make sure you are on the same branch that you submitted. For example, git checkout issue-5 to get on the issue-5 branch.

Edit the code to address the reviewer's comments. Make sure you deal with all of them! When you're done, add another commit to this branch:

git checkout issue-5
git add file1
git commit -m "Updating x, y, and z based on review feedback"
git push origin issue-5

Again, change issue-5 to whatever branch you are working on. Once you've done this, go back to the Pull Request on GitHub and leave a comment telling the reviewer you have completed all their changes, and what you did to accomplish them.

Repeat this cycle as many times as necessary for the project owner to Approve your changes and merge your work.

NOTE: if you merge another student's work into your main branch on GitHub, make sure you pull these changes to your local machine afterward (assuming you are working on the main branch):

git checkout main
git pull origin main

This will bring all of the new code changes into the repo on your local machine so that you can build on top of them. If you forget to do this, the changes will be included in your repo on GitHub but not in your locally cloned repo.

Also, make sure the original Issue gets closed once the Pull Request is merged. It might have happened automatically, depending on whether or not the Pull Request description included the text Fixes #5 (or whatever the issue number is).

Step 9. Write a Blog Post

Write a blog post about the process of contributing a code change to another project. In your post, include links to everything you discuss (e.g., the project repo, your issue, your pull request). Discuss what you did, the changes you made for your feature, and the process of getting your work accepted. What problems did you have? What did you learn? What would you do differently next time?

If your repo received a pull request, please also talk about what it was like to get a submission. How much did you need the author to change? How did that process go?

Submission

When you have completed all the requirements above, please add your details to the table below.

Name	Blog Post (URL)	Your PR (URL)	PR(s) to Your Repo, if any (URL)
Majd Al Mnayer	Open Source Adventures: First Contributions and Collaborations	GENEREADME, PR13	OptimizeIt, PR13
Lily Huang	https://vriskaserket2.wordpress.com/2024/09/16/contributing-to-open-source/	https://github.com/brokoli777/RefactorCode/pull/4	https://github.com/lilyhuang-github/rat-assistant/pull/12
Uday Rana	https://dev.to/udayrana/pulling-our-weight-contributing-to-each-others-projects-2eej	https://github.com/mayank-Pareek/dev-mate-cli/pull/12, https://github.com/mayank-Pareek/dev-mate-cli/pull/13	https://github.com/uday-rana/codeshift/pull/11, https://github.com/uday-rana/codeshift/pull/13, https://github.com/uday-rana/codeshift/pull/14
Abdullah Al Mamun Fahim	https://dev.to/aamfahim/making-acontribution-b40	https://github.com/SychAndrii/infusion/pull/21	https://github.com/aamfahim/explainer.js/pull/23
Bregwin Jogi	https://dev.to/bregwin/so-how-does-pull-requests-work-again-osd6003-19o3	https://github.com/lilyhuang-github/rat-assistant/pull/12	https://github.com/brokoli777/RefactorCode/pull/4
Cleo Buenaventura	My first open source contribution	chat-minal, PR9	GENEREADME, PR13
Nonthachai Plodthong	First Contribute	OptimizeIT, PR	ChatMinal, PR
Aldrin Fernandez	https://dev.to/aldrin312/first-collaboration-3c0d	https://github.com/tasbi03/ReadCraft/pull/7	https://github.com/aldrin312/AutoCommentingTool/pull/8
Peter Wan	https://dev.to/peterdanwan/improving-my-personal-records-with-pull-requests-59kl	https://github.com/Kannav02/DialectMorph/pull/17, https://github.com/Kannav02/DialectMorph/pull/20	https://github.com/peterdanwan/gimme_readme/pull/25
Amir Mullagaliev	https://dev.to/amullagaliev/learn-new-things-everyday-first-pull-request-548m	mastermind, PR12	PolyglotCode, PR6
Aryan Khurana	https://aryank1511.hashnode.dev/my-open-source-journey-week-02	ResumeEnhancer, PR	Harshil's PR
Theo	https://dev.to/theoforger/making-contributions-p3p	mulla028/PolyglotCode - #6	theoforger/mastermind - #12
Krinskumar Vaghasia	https://dev.to/krinskumar/first-open-source-pr-merged-2499	https://github.com/Elisassa/Code-Formatter-Advisor/pull/2	https://github.com/KrinsKumar/Scrappy/pull/5
Anh Chien Vu	https://dev.to/anhchienvu/creating-pull-requests-to-external-repositories-5183	https://github.com/KrinsKumar/Scrappy/pull/5	https://github.com/AnhChienVu/VShell/pull/12
Henrique Sagara	https://dev.to/htsagara/adding-new-features-to-an-open-source-project-2iph	https://github.com/MadhurSaluja/Release-0.1/pull/6	https://github.com/HTSagara/readme_genie/pull/5
Harshil Patel	https://dev.to/harshil_patel/my-first-open-source-pull-request-1dmb	https://github.com/AryanK1511/github-echo/pull/42	https://github.com/hpatel292-seneca/ResumeEnhancer/pull/17
Hyunjin Shin	https://dev.to/jinger-ale/osd600-lab2-2m6a	https://github.com/AnhChienVu/VShell/pull/12	https://github.com/gitdevjin/code-mage/pull/8
Kannav Sethi	https://dev.to/kannav02/effective-prs-code-reviews-and-issue-fixes-4hm3	gimme_readme PR	DialectMorph PR1 DialectMorph PR2
Rong Chen	https://dev.to/elisassa/ccp-lab2-2915	TBD	https://github.com/Elisassa/Code-Formatter-Advisor
Arina Kolodeznikova	https://dev.to/arilloid/code-contributions-making-pull-requests-25mj	https://github.com/gitdevjin/code-mage/pull/8	N/A
Ajo George	https://dev.to/ajogseneca/contributions-and-prs-lab-02-10ep	https://github.com/mpa-LHutchinson/Auto-README/pull/5	https://github.com/ajogseneca/DocBot/pull/6
Liam Hutchinson	https://dev.to/mpalhutchinson/week-3-lab-2-pull-request-38pj	https://github.com/ajogseneca/DocBot/pull/6	https://github.com/mpa-LHutchinson/Auto-README/pull/5
Andrii Sych	https://dev.to/sych_andrii/first-pull-requests-ever-4mhk	https://github.com/aamfahim/explainer.js/pull/23	https://github.com/SychAndrii/infusion/pull/21
Christian Duarte	https://dev.to/cduarte3/week-3-lab-v01-218h	https://github.com/Add00/DocBot/pull/9	https://github.com/cduarte3/f2read/pull/10
Tasbi Tasbi	https://dev.to/tasbi03/from-code-to-collaboration-my-journey-adding-token-usage-to-autocomment-2k0b	https://github.com/aldrin312/AutoCommentingTool/pull/8	https://github.com/tasbi03/ReadCraft/pull/7
Vinh Nhan	https://dev.to/vinhyan/my-contribution-to-the-addcom-cli-tool-538f	https://github.com/arilloid/addcom/pull/5	N/A
Madhur Saluja	https://dev.to/msaluja/contributing-to-open-source-my-experience-with-pull-requests-and-collaboration-1opb	https://github.com/HTSagara/readme_genie/pull/5	https://github.com/MadhurSaluja/Release-0.1/pull/6
Adam Davis	https://dev.to/add00_3/learning-getting-my-git-pr-3gd3	Added token information	added token-usage feature
Mayank Kumar	https://dev.to/mayankpareek/creating-first-pull-request-3j30	https://github.com/uday-rana/codeshift/pull/11, https://github.com/uday-rana/codeshift/pull/13, https://github.com/uday-rana/codeshift/pull/14	https://github.com/mayank-Pareek/dev-mate-cli/pull/12, https://github.com/mayank-Pareek/dev-mate-cli/pull/13
Fahad Ali Khan	https://dev.to/fahadalikhanca/my-first-open-source-contribution-adding-a-token-usage-feature-to-tailor4job-3em	https://github.com/InderParmar/Tailor4Job/pull/5	https://github.com/Fahad-Ali-Khan-ca/DPS909_Release_0.1/pull/6
Inderpreet Singh Parmar	https://dev.to/inderpreet/my-first-open-source-contribution-adding-token-usage-feature-to-a-cli-project-1gni	https://github.com/Fahad-Ali-Khan-ca/DPS909_Release_0.1/pull/6	https://github.com/InderParmar/Tailor4Job/pull/5