Transcription Aide Platform (TAP) - fuhui14/SWEN90017-2024-TAP GitHub Wiki

Transcription Aide Platform (TAP)

📘 Project Overview

🔹 Introduction

This project aims to develop a transcription platform that operates in a local environment using OpenAI's Whisper software. The platform is designed for a team working within a secure local area network (LAN), allowing team members to upload audio files for transcription without requiring user login.

🧩 Key Components

1. Web Interface

  • 🎧 File Upload: Simple drag-and-drop upload through a web interface.
  • User Simplicity: No account creation or login is required, streamlining the workflow for internal team members.
  • ✉️ Email Input: Users can enter an email address to receive transcription results.
  • 🕓 History View & Expiry: View past files and their expiry dates (WIP).
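
As a rough illustration of what the upload form needs to check before a file is handed to the backend, the sketch below validates a filename and an optional email address. The names `ALLOWED_EXTENSIONS` and `validate_upload` are hypothetical, the extension set only reflects the .wav and .mp3 examples mentioned on this page, and the email check is deliberately loose. None of this is taken from the project's code.

```python
import re
from pathlib import PurePath
from typing import Optional

# Assumption: only the formats this page names as examples are accepted.
ALLOWED_EXTENSIONS = {".wav", ".mp3"}

# Deliberately loose pattern: something@something.tld, no whitespace.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_upload(filename: str, email: Optional[str] = None) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    suffix = PurePath(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    # The email field is optional, so only check it when something was entered.
    if email and not _EMAIL_RE.match(email.strip()):
        problems.append("email address looks invalid")
    return problems
```

Returning a list of problems rather than raising on the first one lets the interface show every issue at once, which suits a form with no login step to fall back on.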

2. Local Machine Execution

  • ๐Ÿ—๏ธ Transcription Engine: The local machine hosts and runs OpenAI's Whisper software to transcribe uploaded audio files.
  • ๐Ÿ—ฃ๏ธ Speaker Identification: Supports diarisation to differentiate multiple speakers.
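
The two pieces above could fit together roughly as follows. `transcribe_file` uses the `load_model` and `transcribe` calls of the open-source `openai-whisper` Python package, whose result includes timestamped `segments`. The diarisation module is not described on this page, so `label_speakers` assumes it yields `(start, end, speaker)` turns; that format, the function names, and the `"base"` model choice are all assumptions made for illustration.

```python
from typing import Iterable


def transcribe_file(path: str, model_name: str = "base") -> list[dict]:
    """Run Whisper locally and return its timestamped segments."""
    import whisper  # imported lazily; requires the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["segments"]  # each segment has "start", "end" and "text"


def label_speakers(
    segments: Iterable[dict],
    turns: list[tuple[float, float, str]],
) -> list[str]:
    """Prefix each segment with the speaker whose turn overlaps it most.

    `turns` is a hypothetical diarisation output: (start, end, speaker).
    """
    lines = []
    for seg in segments:
        best, best_overlap = "UNKNOWN", 0.0
        for start, end, speaker in turns:
            # Length of the time interval shared by the segment and the turn.
            overlap = min(seg["end"], end) - max(seg["start"], start)
            if overlap > best_overlap:
                best, best_overlap = speaker, overlap
        lines.append(f"[{best}] {seg['text'].strip()}")
    return lines
```

Matching by largest time overlap is only one simple way to merge the two outputs; segments that overlap no speaker turn are labelled `UNKNOWN` rather than guessed.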

⚙️ Technical Overview

🧱 System Architecture

The platform is built with:

  • Frontend: ReactJS (for user interactions)
  • Backend: Django (REST API to handle upload and processing)
  • Speech Processing: Whisper for transcription, plus a diarisation module for speaker identification

🚀 Usage Flow

  1. User accesses the platform on the local network.
  2. User uploads an audio file (e.g., .wav or .mp3).
  3. Optionally, the user enters an email address.
  4. The backend receives the audio and sends it to Whisper for transcription.
  5. Once complete:
    • The transcription result is saved locally.
    • If an email was provided, the result is sent via email.
  6. The interface shows a history of previously uploaded and processed files.
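
Steps 4 and 5 of the flow above can be sketched as a single backend function. This is a minimal outline under stated assumptions, not the project's implementation: `process_upload` is a hypothetical name, the transcription engine and the mailer are passed in as plain callables so the example runs without Whisper or a mail server, and saving the result as a `.txt` file next to other results is an assumed storage layout.

```python
from pathlib import Path
from typing import Callable, Optional


def process_upload(
    audio_path: Path,
    results_dir: Path,
    transcribe: Callable[[Path], str],
    send_email: Callable[[str, str], None],
    email: Optional[str] = None,
) -> Path:
    """Transcribe one uploaded file, save the result, and optionally email it."""
    # Step 4: hand the audio to the transcription engine.
    text = transcribe(audio_path)

    # Step 5a: save the transcription result locally.
    results_dir.mkdir(parents=True, exist_ok=True)
    out_path = results_dir / (audio_path.stem + ".txt")
    out_path.write_text(text, encoding="utf-8")

    # Step 5b: send the result only if an email address was provided.
    if email:
        send_email(email, text)
    return out_path
```

Injecting `transcribe` and `send_email` keeps the flow testable with stubs; in a real deployment they would wrap the Whisper call and the mail backend respectively.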

🧪 Current Features (MVP Scope)

  • ✅ Audio file upload
  • ✅ Transcription using Whisper
  • ✅ Optional email notification with result
  • ✅ Basic UI (no login)
  • 🕓 History view (partially implemented)
  • ⌛ File expiry logic (to be developed)
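
Since the expiry logic is still to be developed, the following is only one possible shape for it. It assumes each history record stores an `uploaded_at` timestamp and that files expire a fixed period after upload; the 30-day `RETENTION` value and the names `expiry_date` and `split_expired` are placeholders, not project decisions.

```python
from datetime import datetime, timedelta

# Assumption: a 30-day retention period; the real value is undecided.
RETENTION = timedelta(days=30)


def expiry_date(uploaded_at: datetime, retention: timedelta = RETENTION) -> datetime:
    """Return the moment at which a file uploaded at `uploaded_at` expires."""
    return uploaded_at + retention


def split_expired(records: list[dict], now: datetime) -> tuple[list[dict], list[dict]]:
    """Partition history records into (active, expired) as of `now`."""
    active, expired = [], []
    for record in records:
        if expiry_date(record["uploaded_at"]) <= now:
            expired.append(record)
        else:
            active.append(record)
    return active, expired
```

The history view could display `expiry_date(...)` for each active record, while a periodic cleanup job deletes the files in the expired list.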

🧠 Changelog (Release Notes)

Sprint 1

  • Initialized the file structure
  • Added meeting minutes for Sprint 1
  • Created the persona, with explanatory documentation
  • Implemented user stories
  • Designed the motivation model
  • Added acceptance criteria
  • Built low-fidelity and high-fidelity prototypes, with explanatory documentation
  • Selected the technology stack

Sprint 2

  • Created system architecture diagrams
    • Class diagram
    • Use case diagram
    • Sequence diagram
    • Component diagram
    • Domain diagram
    • Deployment diagram
    • Activity diagram
    • ER diagram
  • Acknowledged speaker library
  • Built the development environment configuration
    • Front end
    • Back end
  • Added risk management document
  • Created communication plan
  • Added mood board for the high-fidelity prototype

Sprint 3

  • Developed the user stories with "Must Have" priority
  • Developed the front end and back end separately
  • Added relevant tests
  • Integrated the back end and front end

Sprint 4

Sprint 5

Sprint 6

🔗 Related Links

For technical setup and deployment instructions, please see: Setup Guide