Home - CSC4790-Fall2024-Org/Grocery-Receipt-App GitHub Wiki

1. Final Artifacts

1.1 Final Abstract

Abstract

1.2 Final Poster

Poster

1.3 Lightning Talk

Lightning Talk

1.4 Project Management

Project Management

1.5 Final Demo Video

Demo Video

2. Requirements

2.1 Key User Stories

| User Story | Status |
| --- | --- |
| As a person living with roommates, I want to be able to upload a photo of my receipt, select items by roommate, and accurately split the bill between everyone, including tax. | Met (shallow trench) |
| As a user, I want to assign items to individuals and split items multiple ways. | Met |
| As a developer, I want the code to be well documented and maintainable so that I can update the code/find issues easily. | Deferred, need to redocument code |
| As a user, I want the OCR to be able to read receipts of up to 100 items. | Met |
| As a college student, I want to be able to easily send my share of the bill via Venmo or Zelle to my roommates. | Met, Venmo integrated |

2.2 Requirements Analysis Artifact

Interview with Stakeholder

3. Scope Adjustments

One of the things we were most unsure about when we started our project was scope. Since many of the technologies we planned to use were new to us, we were not certain what we could get done in the few months we had. Our initial scope had users uploading a receipt and splitting it among however many people they wanted. We thought that adding Venmo integration would be an additional step. However, once we had Amazon Textract integrated properly and our Google AI prompt perfected, we added Venmo back into our shallow trench and

4. Risks

Risks

5. Design

5.1 Design Narrative

For our final marketecture diagram, we switched from a vertical layout to a horizontal one, with origin and destination arrows on each major component to show the flow of our mobile app. The app starts when our React Native code is connected to Expo Go through a command-line instruction. A photo of a grocery receipt is then sent to an S3 bucket (divvybucket1), from which AWS Textract, an OCR tool, pulls the text off the receipt and sends it back to the app. From there, the raw text is stored in DynamoDB and sent to Google AI, which outputs a JSON-like object. Our React Native code processes that object so that the user can see the list of items, assign responsibility, and then go to the Venmo payment page. On the Venmo payment page, they choose who paid, enter their Venmo recipient's address, and choose the method of sending. We then send a message through the phone's default messaging app with a link to pay on Venmo. When the user clicks the link, it opens Venmo on their phone with the entire payment process pre-filled, so all they have to do is hit pay.
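The "assign responsibility" step in this flow can be sketched as follows. This is a minimal illustration, not the app's actual code: the function name `splitReceipt`, the `assignedTo` field, and the choice to allocate tax in proportion to each person's pre-tax share are all assumptions. The `itemname`, `pricename`, and `discountamount` fields follow the JSON structure requested from Google AI.

```javascript
// Hypothetical helper: split a parsed receipt among the people assigned to each item.
// All arithmetic is done in integer cents to avoid floating-point drift.
function splitReceipt(items, tax) {
  const toCents = (dollars) => Math.round(dollars * 100);

  const shares = {}; // pre-tax cents owed per person
  let subtotal = 0;

  for (const item of items) {
    const net = toCents(item.pricename) - toCents(item.discountamount || 0);
    const people = item.assignedTo; // assumed field: names sharing this item
    const base = Math.floor(net / people.length);
    let leftover = net - base * people.length;

    for (const person of people) {
      // Hand out leftover cents one at a time so the shares sum to `net`.
      const extra = leftover > 0 ? 1 : 0;
      leftover -= extra;
      shares[person] = (shares[person] || 0) + base + extra;
    }
    subtotal += net;
  }

  // Allocate tax in proportion to each pre-tax share; the last person
  // absorbs any rounding difference so the tax adds up exactly.
  const names = Object.keys(shares);
  const taxCents = toCents(tax);
  let taxLeft = taxCents;
  const totals = {};

  names.forEach((name, index) => {
    const ratio = subtotal > 0 ? shares[name] / subtotal : 1 / names.length;
    const isLast = index === names.length - 1;
    const personTax = isLast ? taxLeft : Math.round(taxCents * ratio);
    taxLeft -= personTax;
    totals[name] = (shares[name] + personTax) / 100; // back to dollars
  });

  return totals;
}

const exampleTotals = splitReceipt(
  [
    { itemname: 'Milk', pricename: 4.0, discountamount: 1.0, assignedTo: ['Ana', 'Ben'] },
    { itemname: 'Bread', pricename: 3.0, discountamount: 0, assignedTo: ['Ana'] },
  ],
  0.6
);
console.log(exampleTotals); // { Ana: 4.95, Ben: 1.65 }
```

Working in cents and giving the rounding remainder to one person keeps the individual totals summing exactly to the receipt total, which matters when each amount is later pre-filled into a payment request.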

5.2 Design Diagram

Diagram

6. Implementation

6.1 Development Environment

Tech Stack

6.2 Implementation Challenges

One of the biggest obstacles we ran into during implementation was getting the styling to work across different devices. All three of our mobile devices have different widths, and the styling worked as intended on only some of them. We had to rewrite the styling and architecture of the summary page to account for this. Another problem was integrating Amazon Textract. None of us had worked with AWS before, and using Lambda functions to integrate Textract was difficult. Tutorials helped us a lot, but it still took considerable time to learn the ins and outs of AWS.
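One way to avoid width-specific breakage is to derive sizes from the screen width instead of hard-coding pixel values. The helper below is a hypothetical sketch, not the project's styling code: `widthPercent` and `summaryRowStyles` are illustrative names, and the percentages are made up. In a React Native component, the screen width would typically come from `Dimensions.get('window').width` or the `useWindowDimensions` hook.

```javascript
// Hypothetical helper: convert a percentage of the screen width into pixels.
function widthPercent(percent, screenWidth) {
  return Math.round((percent / 100) * screenWidth);
}

// Illustrative style objects for a summary row that scales with the device.
// The 4/60/32 split is an assumption chosen only for this example.
function summaryRowStyles(screenWidth) {
  return {
    row: { flexDirection: 'row', paddingHorizontal: widthPercent(4, screenWidth) },
    itemName: { width: widthPercent(60, screenWidth) },
    price: { width: widthPercent(32, screenWidth), textAlign: 'right' },
  };
}

// The same layout on a narrow and a wide phone.
console.log(summaryRowStyles(320).itemName.width); // 192
console.log(summaryRowStyles(430).itemName.width); // 258
```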

6.3 Code Samples

Prompt engineering was a very important part of our project. This is what we asked Google AI to do with the raw text from Textract:

```javascript
const prompt = `"extractedText":"${extractedText}"

        Using extractedText, generate a JSON object with the following structure:

        \`\`\`json
        {
          "items": [
            {"itemname": "string", "pricename": number, "discountamount": number},
            {"itemname": "string", "pricename": number, "discountamount": number},
            // ... more items
          ],
          "subtotal": number,
          "tax": number,
          "total": number
        }
        \`\`\`

        Use the provided \`extractedText\` to extract item details.  The \`itemname\` should be the product name. The \`pricename\` should be the item's original price. The \`discountamount\` should be the discount applied to that item (0 if no discount).  Accurately identify discounts, even if they are on a separate line like "2.90-". Associate discounts with the correct item based on their proximity in the text.  The \`subtotal\`, \`tax\`, and \`total\` should reflect the final calculated amounts from the receipt. There should only be ONE itemname, price, and discountamount per object.

        Do not include any explanatory text or code; only provide the JSON output.
        `;
```
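Even with an instruction to return only JSON, a model response may still arrive wrapped in a Markdown code fence, so it usually needs light cleanup before parsing. The following is a minimal sketch under that assumption. `parseReceiptResponse` is a hypothetical name and the validation rules are illustrative, not the project's actual parsing code.

```javascript
// Three backticks, built indirectly so this listing stays easy to read.
const FENCE = '`'.repeat(3);

// Hypothetical helper: turn the raw model response into a validated receipt object.
function parseReceiptResponse(responseText) {
  let cleaned = responseText.trim();

  if (cleaned.startsWith(FENCE)) {
    // Drop the opening fence line (fence plus an optional "json" tag)...
    cleaned = cleaned.slice(cleaned.indexOf('\n') + 1);
    // ...and everything from the closing fence onward.
    const end = cleaned.lastIndexOf(FENCE);
    if (end !== -1) cleaned = cleaned.slice(0, end);
  }

  const receipt = JSON.parse(cleaned); // throws SyntaxError on invalid JSON

  if (!Array.isArray(receipt.items)) {
    throw new Error('Expected "items" to be an array');
  }
  for (const item of receipt.items) {
    if (typeof item.itemname !== 'string' || typeof item.pricename !== 'number') {
      throw new Error('Malformed item: ' + JSON.stringify(item));
    }
    // Treat a missing discount as 0, matching the convention in the prompt.
    if (typeof item.discountamount !== 'number') item.discountamount = 0;
  }
  return receipt;
}

const sampleResponse =
  FENCE + 'json\n' +
  '{"items":[{"itemname":"Milk","pricename":3.5}],"subtotal":3.5,"tax":0.2,"total":3.7}\n' +
  FENCE;
console.log(parseReceiptResponse(sampleResponse).items[0].discountamount); // 0
```

Failing loudly on malformed items is one possible design choice. An app could instead skip bad items and ask the user to confirm the list.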

Another code snippet that is important is the Venmo link generation:

```javascript
const createVenmoLink = (amount, note, recipient) => {
  let formattedUsername = recipient;

  // Treat the recipient as a phone number if it contains any digits
  if (/\d/.test(recipient)) {
    // Remove +1 prefix if it exists
    formattedUsername = formattedUsername.replace(/^\+1-?/, '');

    // Remove leading 1 if it exists
    formattedUsername = formattedUsername.replace(/^1-?/, '');

    // If at least 10 digits remain, keep the last 12 characters
    // (10 digits plus up to two dashes between them)
    const digits = formattedUsername.replace(/\D/g, '').slice(-10);
    if (digits.length === 10) {
      formattedUsername = formattedUsername.slice(-12);
    }
  }
  console.log(formattedUsername);
  return `venmo://paycharge?txn=pay&recipients=${formattedUsername}&amount=${amount}&note=${encodeURIComponent(note)}`;
};
```
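Once a Venmo link exists, it still has to reach the roommate through the phone's messaging app. The sketch below shows one way to build the `sms:` URL that carries it. This is an assumption-laden illustration rather than the project's code: `createSmsLink` is a hypothetical name, the example phone number and Venmo handle are made up, and the `&body=` (iOS) versus `?body=` (Android) separator reflects common practice that should be verified on target devices. In React Native, the platform would typically come from `Platform.OS` and the result would be opened with `Linking.openURL`.

```javascript
// Hypothetical helper: build an sms: URL with a pre-filled message body.
function createSmsLink(phoneNumber, message, platform) {
  // Assumption: iOS expects "&body=", other platforms "?body=".
  const separator = platform === 'ios' ? '&' : '?';
  return `sms:${phoneNumber}${separator}body=${encodeURIComponent(message)}`;
}

// Illustrative values only; the handle and amount are not from the project.
const venmoLink =
  'venmo://paycharge?txn=pay&recipients=roommate-handle&amount=12.5&note=Groceries';
const smsLink = createSmsLink('5551234567', `Your share: ${venmoLink}`, 'android');
console.log(smsLink.startsWith('sms:5551234567?body=')); // true
```

Encoding the whole message with `encodeURIComponent` keeps the `&` and `?` characters inside the Venmo link from being read as part of the `sms:` URL itself.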

6.4 User Interface

Photos of user interface

7. Testing

7.1 Test Scenarios

Test Scenarios

7.2 Alpha Test Results

Alpha Tests

7.3 Known Issues

  • Radio button bug
  • Fake item generation bug

8. Future Plans

We hope to release this app to the app store. Luke will spend extra time on this in the spring, researching what we need to change to get it ready for release. One thing we will look into changing is how we use Textract and Google AI, since both become paid services once we exceed a certain number of uses per minute or per month. We may also look into training our own model on the images we have been testing with, which are all stored in our S3 bucket along with the responses Google AI gave us, so that in the future we do not need to rely on Google AI and pay for its services.

9. Project Retrospective

9.1 Reflection

  • What went well: We finished everything we included in our limited yet ambitious scope and built an end-to-end working mobile app.
  • What could be improved for a future project: Less reliance on services we may have to pay for in the future (Google AI/AWS services).
  • What "lessons learned" will you carry forward with you from this experience: Before you start coding, flesh out how both the styling and the architecture will work on a given page. Work in chunks: we got the upload functionality working first, then AWS Textract, then Google AI. Doing the necessary research into external tools was pivotal to our group's success. When we first uploaded a receipt to ChatGPT it failed, but we kept looking for alternative solutions. Since we worked on a mobile app, we also learned to factor in different screen sizes (use % in styling, not fixed values!). Dribbble is a great source for UI ideas.

9.2 Curriculum

  • Which topics were most useful for your senior project: Platform Based Computing (React Native Development/Expo Go/Passing Info Page to Page), Software Engineering (Agile Development/Teamwork), Applied Machine Learning (Implementing AI into Code)
  • Are there any additional topics you recommend for inclusion in the course: Researching and Connecting APIs to your Mobile App

9.3 Advice

  • Before you decide on a project idea, find one that really pulls you in (don't do it if you're only half sold).
  • If you intend to use a service that offers an AI tool that knows its docs, definitely use it (we wouldn't have been able to connect AWS as easily without Amazon Q).
  • Before you code any data structure or commit to a styling path, flesh out all possible options.