🐍 Back-end Configuration - fuhui14/SWEN90017-2024-TAP GitHub Wiki

🐍 Back-end Configuration

📝 Overview

This document outlines the configuration of the TAP back-end, which provides API endpoints for speech transcription and speaker identification. It is built using Django + Django Rest Framework (DRF), and supports asynchronous task processing using Celery and Redis.

📑 Table of Contents

  1. Project Structure
  2. Installation
  3. How to Run
  4. API Endpoints
  5. Technologies Used
  6. Database Setup
  7. Asynchronous Tasks
  8. Testing

📁 Project Structure

transcription_project/
│
├── manage.py                    # Django management command
├── requirements.txt             # Backend dependencies
├── .env                         # Environment variables file
├── .gitignore                   # Git ignore file
│
├── config/                      # Project settings and configuration
│   ├── __init__.py
│   ├── settings.py              # Django global settings (includes database settings)
│   ├── urls.py                  # Main entry for routing API endpoints
│   ├── asgi.py                  # ASGI for asynchronous support
│   └── wsgi.py                  # WSGI for deployment
│
├── core/                        # Core application (generic logic and shared utilities)
│   ├── __init__.py
│   ├── models.py                # Generic models, if needed
│   ├── views.py                 # Core API view logic (if necessary)
│   ├── urls.py                  # Core app API routes
│   └── serializers.py           # Data serialization for generic core responses
│
├── transcription/               # Transcription app (handles transcription processes)
│   ├── __init__.py
│   ├── models.py                # Transcription-related data models
│   ├── views.py                 # Transcription API view logic
│   ├── urls.py                  # Transcription app API routes
│   ├── tasks.py                 # Asynchronous tasks (e.g., calling external speech recognition APIs)
│   ├── transcribe_service.py    # Logic for integrating external transcription services
│   └── serializers.py           # Transcription data serialization
│
├── speaker_identify/            # Speaker identification app
│   ├── __init__.py
│   ├── models.py                # Speaker identification-related data models
│   ├── views.py                 # Speaker identification API view logic
│   ├── urls.py                  # Speaker identification API routes
│   ├── identify_service.py      # Logic for identifying speakers in audio files
│   └── serializers.py           # Speaker identification data serialization
│
├── api/                         # General API (for non-specific or cross-app APIs)
│   ├── __init__.py
│   ├── urls.py                  # API global routes (including versioning if needed)
│   └── views.py                 # General API views
│
└── tests/                       # Unit and integration tests
    ├── test_core.py             # Core app tests
    ├── test_transcription.py    # Transcription app tests
    └── test_speaker_identify.py # Speaker identification app tests

🛠️ Installation

1. Clone the Repository

First, clone the project to your local machine:

git clone https://github.com/yourusername/transcription_project.git
cd transcription_project

2. Set up a Python Virtual Environment

Create and activate a Python virtual environment to isolate the project's dependencies:

python3 -m venv venv
source venv/bin/activate  # For Windows: venv\Scripts\activate

3. Install Dependencies

Use `pip` to install the project dependencies listed in `requirements.txt`:

pip install -r requirements.txt

4. Environment Variables

Create a `.env` file in the root of your project and add the following environment variables:

SECRET_KEY=your_secret_key_here
DEBUG=True
DATABASE_URL=your_database_url_here  # or use settings from 'settings.py'
CELERY_BROKER_URL=redis://localhost:6379/0  # For Redis
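These variables can then be read in `config/settings.py`. A minimal sketch using only the standard library (projects commonly use `python-dotenv` or `django-environ` instead, which also load the `.env` file automatically; the `env` helper and the default values below are hypothetical):

```python
import os

def env(name, default=None):
    # Hypothetical helper: read a setting from the environment,
    # falling back to a development-only default.
    return os.environ.get(name, default)

SECRET_KEY = env("SECRET_KEY", "dev-only-insecure-key")
DEBUG = env("DEBUG", "False").lower() in ("1", "true", "yes")
CELERY_BROKER_URL = env("CELERY_BROKER_URL", "redis://localhost:6379/0")
```

Note that plain `os.environ` does not parse `.env` files, so the variables must be exported in the shell unless a loader such as `python-dotenv` is used.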

▶️ How to Run

1. Database Setup

Before running the application, you need to configure the database and apply migrations.

PostgreSQL (Recommended)

Make sure you have PostgreSQL installed and running. Create a database named `transcription_db` or configure your database settings in `config/settings.py`.

Then, run the following commands:

python manage.py makemigrations
python manage.py migrate

2. Run Redis for Celery (Asynchronous Tasks)

For background tasks, you need to have Redis running:

redis-server

3. Run Celery Worker

In a separate terminal, run the Celery worker to handle asynchronous tasks:

celery -A config worker --loglevel=info

4. Run the Django Server

You can now run the Django development server:

python manage.py runserver

Access the server at `http://127.0.0.1:8000/`.

🔌 API Endpoints

Transcription API

  • POST `/api/transcription/`: Upload an audio file and receive transcription text.
  • GET `/api/transcription/<id>/`: Get the transcription result for a specific transcription.

Speaker Identification API

  • POST `/api/speaker-identify/`: Upload an audio file and identify the speaker.
  • GET `/api/speaker-identify/<id>/`: Get the speaker identification result.
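As a rough client-side sketch of calling these endpoints (assuming the development server at `http://127.0.0.1:8000` and integer ids; only the standard library is used, and the request is built but not sent):

```python
import urllib.request

BASE_URL = "http://127.0.0.1:8000"

def transcription_result_request(transcription_id):
    # Build a GET request for one transcription result;
    # send it with urllib.request.urlopen(req) once the server is running.
    url = f"{BASE_URL}/api/transcription/{transcription_id}/"
    return urllib.request.Request(url, method="GET")

req = transcription_result_request(7)
print(req.full_url)  # http://127.0.0.1:8000/api/transcription/7/
```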

⚙️ Technologies Used

  • Django: Web framework for building the backend API.
  • Django REST Framework (DRF): For creating RESTful APIs.
  • Celery: For asynchronous task management (e.g., speech transcription, speaker identification).
  • Redis: As a message broker for Celery.
  • PostgreSQL: For database management.
  • Google Speech-to-Text (or other services): For handling transcription (if using an external service).
  • pyAudioAnalysis: For speaker identification (optional, if integrated).

🗃️ Database Setup

Database Configuration

The project uses Django's ORM (Object-Relational Mapping) for managing the database. By default, it is configured to use PostgreSQL, but you can change this in `config/settings.py` if needed.

Example for PostgreSQL:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'transcription_db',
        'USER': 'your_db_user',
        'PASSWORD': 'your_db_password',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}

If you are using another database, such as SQLite, you can update the configuration accordingly.
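For example, a minimal SQLite configuration for local development (no database server required; `BASE_DIR` is the project-root `Path` that Django's generated `settings.py` defines near the top):

```python
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': BASE_DIR / 'db.sqlite3',  # single file next to manage.py
    }
}
```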

Running Migrations

After setting up your database connection, apply migrations to create the required tables:

python manage.py makemigrations
python manage.py migrate

🔄 Asynchronous Tasks

To handle long-running tasks such as speech transcription and speaker identification, we use Celery.

Steps to Set Up Celery:

  1. Make sure Redis is installed and running as the message broker.
  2. Configure Celery in `config/celery.py`.
  3. Start the Celery worker:
    celery -A config worker --loglevel=info
    

Celery tasks are defined in the `tasks.py` file of each app (e.g., `transcription/tasks.py`).
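The `app` object those tasks register with is created in `config/celery.py`. The sketch below follows the standard Celery-with-Django layout, adapted to this project's `config` package name; treat it as configuration to run inside the Django project rather than standalone code:

```python
import os

from celery import Celery

# Tell Celery which Django settings module to use before creating the app.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'config.settings')

app = Celery('config')

# Pull CELERY_* settings (e.g. the Redis broker URL) from Django settings.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Find tasks.py in every installed app (transcription, speaker_identify, ...).
app.autodiscover_tasks()
```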

🧪 Testing

To run the project's test cases, use the following command:

python manage.py test

Test cases are located in the `tests/` directory, with separate tests for transcription and speaker identification functionalities.
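Django's test runner discovers standard `unittest`-style test cases in those files. A minimal, purely illustrative sketch (real API tests would subclass `django.test.TestCase` for database isolation and its test client; the class and assertion below are hypothetical):

```python
import unittest

class TranscriptionUrlTests(unittest.TestCase):
    def test_result_path(self):
        # Hypothetical pure-unit check; no server or database involved.
        transcription_id = 7
        path = f"/api/transcription/{transcription_id}/"
        self.assertEqual(path, "/api/transcription/7/")
```

Placed in `tests/test_transcription.py`, a case like this runs under `python manage.py test` alongside the Django-backed tests.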