SRS_2 - COS301-SE-2025/Hands-Up GitHub Wiki
Imagine you are in a busy shopping centre and someone tries to get your attention, not with words, but through a series of hand gestures. You watch, trying to decipher the movements, but remain unsure. Gradually, you recognize it as sign language and find yourself wishing you understood, even just a little. You wish there were a way to translate their gestures. As TMKDT, we aim to turn those wishes into reality with Hands UP, a powerful application that helps you one sign at a time!
Hands UP is an innovative application that bridges the communication gap between signers and non-signers. Using advanced AI technology, the application detects and translates sign language in real-time through the device's camera, converting signs into both text and spoken language without significant delays. Beyond translation, it also serves as an interactive learning platform with structured lessons and immediate feedback on signing accuracy.

Note: A user refers to anyone using the application, while a learner specifically refers to someone who has created an account.
User Story | Acceptance Criteria | Definition of Done |
---|---|---|
Registration<br>As a user, I want to create an account with my name, username, email and password<br>So that I can access the sign language learning curriculum | - All input fields are required<br>- Email address is validated for correct format<br>- Password must be at least 8 characters and include a special character<br>- Duplicate usernames and emails are not allowed<br>- Terms and Conditions are accepted | My account has been created successfully and I am automatically logged in and redirected to the learning homepage. |
Login<br>As a learner, I want to log in to my account using my email and password<br>So that I can continue with the sign language learning curriculum | - All input fields are required<br>- User is authenticated against stored credentials<br>- Appropriate error message is shown for invalid login attempts | I have successfully logged in and am automatically redirected to the learning homepage. |
Forgot Password<br>As a learner, I want to reset my password via a link sent to my email<br>So that I can regain access to my account if I have forgotten my password | - A reset link is sent to the email address if it exists<br>- Link redirects to a secure page to set a new password<br>- User receives confirmation of password change | My password is updated and I can now log in with the new credentials. |
Manage Profile<br>As a learner, I want to update my name, username, email and password<br>So that my account information remains valid and up to date | - All fields are editable<br>- Input is validated before saving<br>- Confirmation message is shown before updating | My changes are saved and I can see the updated details. |
Learning Curriculum<br>As a learner, I want to have a structured learning plan<br>So that I can learn more effectively | - There is a clear, well-organized curriculum in place<br>- Lessons are mapped to specific goals and topics<br>- Users can view their position in the learning path | I can view a clear, structured curriculum that guides me from basic to advanced topics. |
Curriculum Lessons by Category<br>As a learner, I want to access lessons grouped by category<br>So that I can focus on specific topics systematically | - Lessons are organized under clearly defined categories<br>- Each lesson covers a specific goal or skill within the category<br>- Users can track their progress within each lesson<br>- Categories are locked until the previous category is completed | I can access and complete lessons one category at a time, unlocking the next only after finishing the previous, to ensure structured and progressive learning. |
Quiz at the End of Each Category<br>As a learner, I want to take a quiz after completing each category<br>So that I can assess my understanding of what I have learned | - Each category ends with a quiz covering the key concepts<br>- Quiz questions are relevant and vary in format<br>- Users receive immediate feedback and scores<br>- Users can review correct answers and explanations | I can complete a quiz after each category to test my knowledge and identify areas for improvement. |
Skill Adaptation<br>As a learner, I want the application to adapt to my level of signing experience<br>So that I can learn at a comfortable pace and skip content I already know | - Learners are asked to indicate their signing experience and, unless they are beginners, take a placement test<br>- If the learner is a beginner, the placement test is not required and the learner is placed at the start of the curriculum<br>- If the learner has prior knowledge, lessons covering already known content are skipped<br>- After placement, learners must follow the structured curriculum from their assigned level onward without skipping future lessons | I start learning from a level that suits me, without wasting time on things I already know. |
Learning Progress<br>As a learner, I want to view my achievements, day streak and total XP<br>So that I can track my progress and stay motivated | - Achievements, day streak and total XP are displayed<br>- Data is updated after a task is completed | I am able to view my learning progress. |
Feedback<br>As a learner, I want to receive feedback on my signing accuracy and improvements I can make<br>So that I can sign more accurately | - Feedback must be valid and specific to the current learning scenario<br>- Feedback must be provided immediately<br>- User must understand what mistake was made and how to correct it<br>- Suggestions must include correct gestures and movements | The system provides accurate and timely feedback when I perform a sign, clearly explaining what was wrong, why it was incorrect and how to fix it. |
Reminders<br>As a learner, I want to receive reminders to continue learning<br>So that I can maintain my streak and stay motivated | - Learner receives daily or scheduled reminders via notifications<br>- Notifications can be turned off | I receive friendly reminders to continue learning and maintain my daily streak. |
Review Lessons<br>As a learner, I want to review lessons that I have completed<br>So that I can revise what I have already learnt | - There is a dedicated and easily accessible review page<br>- Review content is quick, focused and does not affect lesson progress | I can quickly find specific content I want to revise without going through the full lessons again. |
Camera Input for Translation<br>As a user, I want to input signs using my device's camera<br>So that they can be translated into spoken and written language in real time | - Camera access is granted and working<br>- Signs are detected and captured | I have successfully entered a sign via the camera for translation. |
Upload Input<br>As a user, I want to upload images and videos from my device<br>So that they can be translated into spoken and written language in real time | - User has uploaded a supported file format | The media I have chosen has been successfully uploaded and is now being processed. |
Translate Fingerspelling<br>As a user, I want to sign individual letters<br>So that they can be translated into spoken and written language in real time | - System recognises the individual signed letters from the input<br>- Signs are translated accurately<br>- Signs are translated in real time<br>- Translated signs are output as text and audio<br>- Letters appear in the correct order if multiple are signed | The letters I have signed are accurately translated and displayed immediately. |
Translate Words<br>As a user, I want to sign individual words<br>So that they can be translated into spoken and written language in real time | - System recognises each signed word from the input<br>- Words are translated accurately<br>- Words are translated in real time<br>- Translated words are output as text and audio<br>- Output matches the intended meaning | The words I have signed are accurately translated and displayed immediately. |
Translate Phrases<br>As a user, I want to sign short and commonly used phrases<br>So that they can be translated into spoken and written language in real time | - System recognises the signs from the input<br>- Signs are translated accurately<br>- Signs are translated in real time<br>- Translated signs are output as text and audio<br>- Output matches the intended meaning | The phrases I have signed are accurately translated and displayed immediately. |
Translate Sentences<br>As a user, I want to sign full sentences<br>So that they can be translated into spoken and written language in real time | - System recognises the signs from the input<br>- Signs are translated accurately<br>- Signs are translated in real time<br>- Translated signs are output as text and audio<br>- Output matches the intended meaning | The sentences I have signed are accurately translated and displayed immediately. |
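
The Registration acceptance criteria above can be expressed as a small validation routine. The following is a minimal sketch only: the function name, the field names, the error wording, the simplified email pattern and the case-insensitive duplicate check are all assumptions made for illustration, not part of the specification.

```python
import re

# Deliberately simple format check (an assumption): one "@", no spaces, a dot in the domain.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Any character that is not a letter, digit or whitespace counts as "special" in this sketch.
SPECIAL_CHAR_PATTERN = re.compile(r"[^A-Za-z0-9\s]")


def validate_registration(form, taken_usernames, taken_emails):
    """Return a list of error messages; an empty list means the form is acceptable."""
    errors = []
    # All input fields are required.
    for field in ("name", "username", "email", "password"):
        if not form.get(field, "").strip():
            errors.append(f"{field} is required")
    # Email address is validated for correct format.
    email = form.get("email", "").strip()
    if email and not EMAIL_PATTERN.match(email):
        errors.append("email format is invalid")
    # Password must be at least 8 characters and include a special character.
    password = form.get("password", "")
    if password and (len(password) < 8 or not SPECIAL_CHAR_PATTERN.search(password)):
        errors.append("password must be at least 8 characters and include a special character")
    # Duplicate usernames and emails are not allowed (compared in lower case, an assumption).
    if form.get("username", "").strip().lower() in taken_usernames:
        errors.append("username is already taken")
    if email.lower() in taken_emails:
        errors.append("email is already registered")
    # Terms and Conditions are accepted.
    if not form.get("accepted_terms"):
        errors.append("Terms and Conditions must be accepted")
    return errors
```

Returning every failed criterion at once, rather than stopping at the first, lets the interface show all problems with the form in a single pass.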

FR 1: Application must allow user profile creation and customization
FR 1.1: Users must be able to create and manage a personal account
FR 1.2: Users must be able to log into their personal account
FR 1.3: Users must be able to set learning goals and preferences
FR 2: Application must allow users to input visual data
FR 2.1: The system must allow users to provide visual input via device cameras
FR 2.2: The system must allow users to upload media (images/videos)
FR 3: Application must be able to translate sign language
FR 3.1: Application must support translation at varying linguistic levels
FR 3.1.1: Application must support translation of fingerspelling
FR 3.1.2: Application must support translation of individual words
FR 3.1.3: Application must support translation of phrases
FR 3.1.4: Application must support translation of full sentences
FR 3.2: Application must provide text output
FR 3.3: Application must provide audio output
FR 4: Application must provide a structured curriculum
FR 4.1: There should be an overview of the entire course
FR 4.1.1: The course overview must clearly indicate completed, in-progress and locked lessons
FR 4.2: Lessons should be organized progressively, from basic to advanced topics
FR 4.2.1: Each lesson must unlock only after prerequisites are completed
FR 4.2.2: Categories should include key areas such as fingerspelling (letters), vocabulary (words and phrases) and sentence construction
FR 4.2.3: There must be thematic levels (e.g., greetings, food, directions) for contextual learning
FR 4.3: Each lesson should include clear objectives, interactive content and practice exercises
FR 5: Application should accommodate learners of all signing experience
FR 5.1: Beginner learners should be introduced to signing at a steady pace
FR 5.2: Advanced learners should be able to skip beginner lessons
FR 6: Application must support real-time feedback and correction
FR 6.1: Users must receive immediate feedback on incorrect signs
FR 6.2: Application must suggest correct hand gestures or movements
FR 7: Users should see their learning progress
FR 7.1: Application should provide comprehensive progress analysis
FR 7.1.1: The app must show the user's daily streak, total XP and achievements
FR 7.2: Progress should be presented in a simple, graphical manner
FR 8: Application must provide a separate review section for previously learned content
FR 8.1: Review lessons must be organized by category for quick reference
FR 8.2: Review lessons must be concise and focused on reinforcement, not full re-teaching
FR 8.3: Review lessons must not affect learner progress
FR 9: Application should support multiple dialects of sign language
FR 9.1: Users must be able to change their preferred dialect in settings
FR 9.2: Content, translation and feedback should adapt based on the selected dialect
FR 10: Application must provide a built-in game to test sign language knowledge apart from lessons
FR 10.1: The game should include challenges
FR 10.2: Scores awarded should contribute to overall learner progress
FR 10.3: Users should be able to replay the game at any time
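
The progressive unlocking described in FR 4.1.1 and FR 4.2.1 could be computed along the following lines. This is an illustrative sketch under stated assumptions: the function name, the state labels and the idea of a single linear category order are hypothetical and not prescribed by the requirements.

```python
def category_states(ordered_categories, completed):
    """Label each category as 'completed', 'in-progress' or 'locked'.

    ordered_categories: category names in curriculum order.
    completed: set of category names the learner has finished.
    """
    states = {}
    unlocked = True  # the first category is always available
    for name in ordered_categories:
        if unlocked and name in completed:
            states[name] = "completed"
        elif unlocked:
            states[name] = "in-progress"
            unlocked = False  # everything after the first unfinished category stays locked
        else:
            states[name] = "locked"
    return states
```

For example, a learner who has finished only the first of three categories would see the second as in-progress and the third as locked.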
Design Strategy: Decomposition
Justification: Our system focuses on three core functionalities: real-time sign language translation, interactive learning with feedback and progress tracking. The decomposition design strategy allows us to break down these core functionalities into smaller, manageable and well-defined components. This approach enhances clarity, simplifies implementation and makes the overall system easier to test and maintain over time.
Note: At this stage of the project, our top two architectural style choices are Layered Architecture and Hexagonal Architecture. A final decision between the two will be made as development progresses.
Layered Architecture
This architecture structures the system into separate layers that are each responsible for a specific concern.
Within the Hands UP system, the Presentation Layer would handle the user interface, including displaying translation results and managing the sign language learning curriculum.
The Application Layer would process the core logic such as handling user inputs, communicating with the AI model for real-time sign translation, and managing the lesson flow for the learning curriculum.
The Data Access Layer would be responsible for storing and retrieving data, including user details and learning progress records.
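
As a rough illustration of how the three layers might depend on one another, the sketch below has each layer call only the layer directly beneath it. The class and function names are hypothetical, and an in-memory dictionary stands in for the real data store.

```python
class ProgressRepository:
    """Data Access Layer: stores and retrieves learner progress (in memory for this sketch)."""

    def __init__(self):
        self._completed = {}

    def save_completed(self, learner_id, lesson_id):
        self._completed.setdefault(learner_id, set()).add(lesson_id)

    def completed_lessons(self, learner_id):
        return set(self._completed.get(learner_id, set()))


class LessonService:
    """Application Layer: lesson-flow logic; talks only to the layer below it."""

    def __init__(self, repository):
        self._repository = repository

    def complete_lesson(self, learner_id, lesson_id):
        self._repository.save_completed(learner_id, lesson_id)
        return len(self._repository.completed_lessons(learner_id))


def render_progress(service, learner_id, lesson_id):
    """Presentation Layer: turns the application result into text for the user."""
    total = service.complete_lesson(learner_id, lesson_id)
    return f"Lessons completed: {total}"
```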
Hexagonal Architecture
This architecture allows loosely-coupled, interchangeable components to interact with the system through ports and adapters.
Within the Hands UP system, the core application would include the main logic for both the sign language translation and the learning curriculum.
The primary ports would provide interfaces for handling camera input, accessing the remote AI model, managing user progress, handling session management and interacting with the user interface.
The corresponding adapters would include the browser-based camera API adapter, the AI model adapter, the UI adapter, the database adapter, and the session storage adapter (for managing cookies).
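
To make the ports-and-adapters idea concrete, here is a minimal sketch of one port (access to the AI model) with a stand-in adapter. All names are hypothetical; a real adapter would call the remote model instead of looking values up in a table.

```python
from typing import Protocol


class SignRecognizer(Protocol):
    """Port: what the core needs from any sign-recognition model."""

    def recognize(self, frame: bytes) -> str: ...


class TranslationService:
    """Core logic: depends on the port, never on a concrete model or HTTP client."""

    def __init__(self, recognizer: SignRecognizer) -> None:
        self._recognizer = recognizer

    def translate(self, frames: list[bytes]) -> str:
        # Ask the port for one letter per frame and join the results in order.
        return "".join(self._recognizer.recognize(frame) for frame in frames)


class FakeRecognizer:
    """Adapter for tests (hypothetical): maps known byte strings to letters."""

    def __init__(self, table: dict[bytes, str]) -> None:
        self._table = table

    def recognize(self, frame: bytes) -> str:
        return self._table[frame]
```

Because the core only sees the `SignRecognizer` port, the remote AI model adapter can be swapped for a fake in tests without touching the translation logic.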
Note: Quality requirements are ordered from highest to lowest priority.
Usability focuses on how intuitive, efficient and supportive the system is for users across all interactions: helping them learn, translate, navigate, and complete tasks with confidence and minimal confusion.
To ensure system usability, it will follow these parameters:
- System Learning Features
  - Requirement: When a user requires assistance in any page of the application, the system shall provide a clearly visible and accessible help section that can be opened within one click or tap.
  - Pass Criteria: At least 95% of users can successfully locate and access the help page with no more than one click or tap within 1 minute during usability testing.
  - Test Method: End-to-end testing to verify that the Help icon is visible and accessible on all pages and is clickable or tappable. Additionally, the test measures the number of interactions required to open the Help page as well as the time taken to load it.
- System Efficiency
  - Requirement: When a user is performing a translation and navigates to any other page (e.g., Home, Learn, Profile or Help) to perform a different task, the system shall preserve the translation state so that when returning to the Translate page, the user can view their translations without data loss.
  - Pass Criteria: At least 98% of users can resume their translation without data loss after navigating away and returning during usability testing.
  - Test Method: End-to-end testing to simulate a user performing a translation, navigating to another page (Home, Learn, Profile or Help) and returning to the Translate page. The test verifies that the translation state is preserved without data loss or reset.
- Minimising User Errors
  - Requirement: When a user attempts to navigate away from an incomplete lesson or submits invalid input, the system shall provide clear, informative warnings or error messages and require explicit user confirmation before proceeding with any action that could cause data loss.
  - Pass Criteria: At least 95% of users must be able to recognise, correctly respond to and regard the warnings or error messages as clear and helpful during usability testing.
  - Test Method: Combination of unit testing, end-to-end testing and user surveys. Unit testing to verify that warning and error handling logic triggers under appropriate conditions. End-to-end testing to ensure that the correct warning or error messages are actually displayed in the user interface. User surveys to confirm that the messages are clear, informative and helpful.
- Adapting System to User Needs
  - Requirement: When a user wants to start their next lesson, the system shall allow them to access it directly from the Home page using no more than one click or tap, while still providing the option to navigate through the Learn page to select lessons manually.
  - Pass Criteria: At least 99.9% of users must be able to start their next lesson directly from the Home page within one click or tap during usability testing.
  - Test Method: End-to-end testing to ensure that users can start their next lesson directly from the Home page within one click or tap. The test verifies that the navigation leads to the correct lesson and records the number of interactions to get to this page.
- Confidence and Satisfaction
  - Requirement: When a user is engaged in a lesson, the system shall provide real-time feedback and update the user’s progress and correctness after the lesson is completed.
  - Pass Criteria: At least 99.9% of users must receive real-time feedback during lessons and have their progress correctly updated after lesson completion during usability testing.
  - Test Method: Combination of end-to-end testing and user surveys. End-to-end testing to confirm that real-time feedback is provided during lessons and that user progress is correctly updated after lesson completion. User surveys to assess subjective confidence and satisfaction levels.
Performance will be based on how well resources are controlled and managed.
To ensure system performance, it will follow these parameters:
- Real-Time Translation Response: An average translation request must complete within 500 ms.
- System Responsiveness Under Load: When multiple users are accessing the system simultaneously, the system shall maintain normal response times for all user interactions including page loads, translation and lesson navigation.
- Memory and Resource Management: When the system processes sign language translations and manages user sessions, it shall not consume more than 512 MB of device memory or 50% of available CPU resources.
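
One way the average-response-time parameter could be checked is to time repeated calls and compare the mean against the budget. The sketch below is an assumption about how such a check might look: the function names and the number of runs are hypothetical, and `operation` stands in for a real translation request.

```python
import statistics
import time


def average_latency_ms(operation, runs=20):
    """Call `operation` repeatedly and return the mean wall-clock time in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


def within_budget(latency_ms, budget_ms=500.0):
    """True when the measured latency meets the budget (500 ms by default)."""
    return latency_ms <= budget_ms
```

Note that a mean can hide slow outliers, so a percentile-based check may be a useful complement if individual slow requests matter.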
Availability focuses on how well the system detects faults, how well it recovers from faults and how well it prevents faults from taking place.
To ensure system availability, it will follow these parameters:
- Uptime: The system has at least 99% uptime.
- Monitoring and Alerting: When system health indicators deviate from normal parameters, 99% of issues must automatically trigger alerts within 1 minute of issue detection.
- Data Backup and Recovery: When system data needs to be recovered due to failure or corruption, the system shall restore user data with zero data loss for completed changes.
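
To put the uptime target in concrete terms, the downtime it permits can be worked out directly. The helper below is an illustrative calculation only; the function name is hypothetical.

```python
def allowed_downtime_hours(uptime_target, period_days):
    """Hours of downtime permitted over a period for a given uptime fraction."""
    return (1.0 - uptime_target) * period_days * 24.0
```

With a 99% target, `allowed_downtime_hours(0.99, 30)` comes to roughly 7.2 hours of permitted downtime in a 30-day month.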
Security focuses on how well the system detects and resists attacks.
To ensure system security, it will follow these parameters:
- Security Monitoring and Incident Response: When security threats are detected, the system shall log all security events and trigger appropriate response measures within 5 minutes of detection.
- Data Encryption and Communication Security: When data is transmitted between the client and server, the system shall use HTTPS with TLS 1.3 encryption for all communications including API calls and file uploads.
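
As a sketch of how a TLS 1.3 floor might be enforced by a client that calls the API (for example, an automated test), Python's standard `ssl` module allows a minimum protocol version to be set. The function name is hypothetical, and how the server side enforces the same floor depends on the hosting setup, which this document does not specify.

```python
import ssl


def tls13_client_context():
    """Client-side SSL context that refuses anything older than TLS 1.3."""
    context = ssl.create_default_context()  # certificate and hostname checks stay on
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context
```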
Maintainability focuses on how easily the system can be updated, debugged, and enhanced while maintaining code quality and deployment efficiency.
To ensure system maintainability, it will follow these parameters:
- Code Quality and Coverage: When new code is added to the system, it shall maintain at least 80% test coverage and pass all static code analysis checks.
- Deployment Efficiency: When deploying system updates, the deployment process shall complete within 10 minutes and include automated rollback capabilities in case of failure.
- Bug Resolution: When bugs are reported, critical issues shall be resolved within 2 days and non-critical issues shall be resolved within 7 days of confirmation.
- Progressive Web App (PWA): Must support offline functionality and provide an app-like user experience.
- Real-Time Translation: Response time must not exceed 200 ms.
- Cross-Device Compatibility: Accessible via all modern browsers on desktop and mobile devices.
- Zero Setup: No additional installation or configuration required; the app must be usable immediately upon access.
- Budget Restriction: Must use free-tier or open-source solutions.
- Cloud-First Approach: All services must be remotely hosted.
- High Availability: App must maintain at least 99% uptime.
- Web-Only Platform: No native mobile apps; browser-based only.
- Camera Access: Exclusively through browser-based APIs.
- AI/ML Model: Must be hosted remotely.
- Security Compliance: Must adhere to industry-standard security and privacy regulations.
- Performance: Must support concurrent usage by multiple users without performance degradation.
A relational (SQL) database is needed to store structured data such as user details, learning curriculum, lessons, user progress and user results for each lesson. A relational database ensures data consistency through relationships and constraints, making it easier to manage and query complex data.
PostgreSQL | MySQL | Oracle Database |
---|---|---|
Pros: Open source, can handle complex queries and relationships, strong ACID compliance.<br>Cons: Slightly more setup required than MySQL. | Pros: Easy to set up and learn, good performance for reading data, well supported on most hosting services.<br>Cons: Slightly weaker transaction handling compared to PostgreSQL. | Pros: Very powerful and reliable for huge systems, ACID compliant.<br>Cons: Expensive and requires licensing, less commonly used with JavaScript. |
Final Choice: PostgreSQL
Justification: Handles structured and relational data very well, which fits the basic need. It has strong ACID compliance to ensure that all data stays consistent and reliable. Integrates easily with Node.js and Python, making data access and management smooth. It is widely supported by various hosting platforms, which makes deployment, scaling, and maintenance easier in the long run.
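
The kind of relationships and constraints mentioned above can be illustrated with a small schema. This is a sketch under loud assumptions: the table and column names and the 0 to 100 score range are hypothetical, and Python's built-in `sqlite3` module is used purely as a stand-in so the example runs without a PostgreSQL server. The DDL uses only constructs that PostgreSQL also accepts.

```python
import sqlite3

# Hypothetical schema: users, lessons, and one result per user per lesson.
SCHEMA = """
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    email    TEXT NOT NULL UNIQUE
);
CREATE TABLE lessons (
    id       INTEGER PRIMARY KEY,
    category TEXT NOT NULL,
    title    TEXT NOT NULL
);
CREATE TABLE lesson_results (
    user_id   INTEGER NOT NULL REFERENCES users(id),
    lesson_id INTEGER NOT NULL REFERENCES lessons(id),
    score     INTEGER NOT NULL CHECK (score BETWEEN 0 AND 100),
    PRIMARY KEY (user_id, lesson_id)
);
"""


def open_database():
    """Create an in-memory database with the sketch schema."""
    connection = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when asked; PostgreSQL enforces them by default.
    connection.execute("PRAGMA foreign_keys = ON")
    connection.executescript(SCHEMA)
    return connection
```

With these constraints in place, the database itself rejects a duplicate email or a result that points at a lesson that does not exist, rather than relying on application code to catch it.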
The AI model takes in images and videos of sign language and processes them to recognise the signs being shown. It then translates these signs into text. The model should be accurate, efficient and scalable to allow multiple sign languages.
JavaScript | Java | Python |
---|---|---|
Pros: Runs directly in the browser, models can run on any device, useful for real-time AI models.<br>Cons: Not suitable for high-performance or complex AI models involving heavy image/video processing. | Pros: Good for large-scale systems.<br>Cons: Limited AI/ML libraries for computer vision. | Pros: Best AI/ML support for computer vision and deep learning, offers a wide range of libraries for image/video processing, simplifies model training and testing.<br>Cons: Slower runtime. |
Final Choice: Python
Justification: Offers many libraries and frameworks for training and evaluating models. Works well with TensorFlow, OpenCV and MediaPipe for data preprocessing, training, and visualisation, therefore making it easier to build a custom CNN with high accuracy.
The frontend will be a Progressive Web App (PWA) that allows users to translate sign language to text and learn sign language. The UI should be easy to use, consistent and responsive across different devices.
Plain HTML & CSS | React | Vue.js |
---|---|---|
Pros: Full control over the app’s behaviour, lightweight and fast to load.<br>Cons: Slower development, harder to maintain and scale. | Pros: Good for consistency across devices, provides UI components, supports PWAs, works well with real-time features like camera access.<br>Cons: Code can get a bit long and complicated if not organized well. | Pros: Easy and fast setup, tools for managing app data, built-in PWA plugin.<br>Cons: Does not have as many UI libraries as React, might not be able to handle complex features or multiple users. |
Final Choice: React
Justification: Makes building a responsive and easy-to-use app across different devices straightforward. It has lots of ready-made UI components, which helps speed up development. Support for PWAs means the app can work offline and be installed easily. Additionally, it handles real-time features like camera access well, which is important for translating sign language live.
The backend will handle data processing, model inference and user management. The backend will translate sign language into text using the AI model, manage user profiles and progress, provide a sign language learning curriculum and retrieve translation and learning data. The backend needs to be reliable, scalable, and fast enough to process data in real-time or near real-time, ensuring smooth user interaction.
Node.js | Python (FastAPI) | Java (Spring Boot) |
---|---|---|
Pros: Works well with the React frontend since both use JavaScript, can handle many concurrent users and real-time interactions, efficient data handling through database libraries, can call the Python AI model for inference.<br>Cons: Not ideal for local AI tasks. | Pros: Direct integration with the Python AI model, high performance and easy API building, can handle multiple simultaneous requests.<br>Cons: Scaling to very large user bases may require additional setup. | Pros: Very scalable for large systems, strong type safety and reliability, support for structured data handling and user management.<br>Cons: Integrating the Python AI model requires separate services and increased overhead, more setup required than the other two options. |
Final Choice: Node.js
Justification: Makes development and data sharing between frontend and backend easier. Handles many users and real-time interactions smoothly, which fits the need for a fast and responsive user experience. Has a strong environment for building APIs and working with PostgreSQL. Since the AI model is built in Python, Node.js can easily integrate with it by calling the FastAPI service when model inference is needed, keeping the system modular and flexible for future updates.
The system will be hosted on a cloud platform to ensure scalability, reliability and easy access for users across different devices. Since the application is a PWA, hosting must support fast, responsive loading and allow features like offline access and real-time updates. The hosting solution should provide high availability, data security, and the ability to scale as the number of users and system complexity grows.
AWS | Google Cloud Platform | Microsoft Azure |
---|---|---|
Pros: Huge range of services, many data centers ensuring low latency and high availability, offers free tier.<br>Cons: Complex to set up and learn for beginners. | Pros: Great integration with AI/ML tools, simple interface, strong support for containerized apps.<br>Cons: Only offers 3 months 'free'. | Pros: Strong integration with Microsoft tools, wide range of services for AI, database and app hosting.<br>Cons: Difficult to learn, expensive, not as advanced as AWS or GCP. |
Final Choice: AWS
Justification: Offers a wide range of reliable and scalable services that fit the app’s needs. It supports running the Python AI model, Node.js backend, and PostgreSQL database smoothly. Has a global network of data centers, which helps keep the app fast and available to users. It offers tools for security, monitoring and scaling to make managing the system easier as it grows. Although it can be complex, its flexibility and comprehensive features make it ideal for hosting the application.