Back-end Configuration - fuhui14/SWEN90017-2024-TAP GitHub Wiki
# Back-end Configuration
## Overview
This document outlines the configuration of the TAP back-end, which provides API endpoints for speech transcription and speaker identification. It is built using Django + Django Rest Framework (DRF), and supports asynchronous task processing using Celery and Redis.
## Table of Contents
- Project Structure
- Installation
- How to Run
- API Endpoints
- Technologies Used
- Database Setup
- Asynchronous Tasks
- Testing
## Project Structure
```
transcription_project/
│
├── manage.py                     # Django management command
├── requirements.txt              # Backend dependencies
├── .env                          # Environment variables file
├── .gitignore                    # Git ignore file
│
├── config/                       # Project settings and configuration
│   ├── __init__.py
│   ├── settings.py               # Django global settings (includes database settings)
│   ├── urls.py                   # Main entry for routing API endpoints
│   ├── asgi.py                   # ASGI for asynchronous support
│   └── wsgi.py                   # WSGI for deployment
│
├── core/                         # Core application (generic logic and shared utilities)
│   ├── __init__.py
│   ├── models.py                 # Generic models, if needed
│   ├── views.py                  # Core API views logic (if necessary)
│   ├── urls.py                   # Core app API routes
│   └── serializers.py            # Data serialization for generic core responses
│
├── transcription/                # Transcription app (handles transcription processes)
│   ├── __init__.py
│   ├── models.py                 # Transcription-related data models
│   ├── views.py                  # Transcription API views logic
│   ├── urls.py                   # Transcription app API routes
│   ├── tasks.py                  # Asynchronous tasks (e.g., calling external speech recognition APIs)
│   ├── transcribe_service.py     # Logic for integrating external transcription services
│   └── serializers.py            # Transcription data serialization
│
├── speaker_identify/             # Speaker identification app
│   ├── __init__.py
│   ├── models.py                 # Speaker identification-related data models
│   ├── views.py                  # Speaker identification API views logic
│   ├── urls.py                   # Speaker identification API routes
│   ├── identify_service.py       # Logic for identifying speakers in audio files
│   └── serializers.py            # Speaker identification data serialization
│
├── api/                          # General API (for non-specific or cross-app APIs)
│   ├── __init__.py
│   ├── urls.py                   # API global routes (including versioning if needed)
│   └── views.py                  # General API views
│
└── tests/                        # Unit and integration tests
    ├── test_core.py              # Core app tests
    ├── test_transcription.py     # Transcription app tests
    └── test_speaker_identify.py  # Speaker identification app tests
```
## Installation
1. Clone the Repository
First, clone the project to your local machine:
```bash
git clone https://github.com/yourusername/transcription_project.git
cd transcription_project
```
2. Set up a Python Virtual Environment
Create a Python virtual environment to install dependencies:
```bash
python3 -m venv venv
source venv/bin/activate  # For Windows: venv\Scripts\activate
```
3. Install Dependencies
Use `pip` to install the project dependencies listed in `requirements.txt`:
```bash
pip install -r requirements.txt
```
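Judging by the technologies listed later in this document, `requirements.txt` likely contains entries along these lines (the exact package names and any version pins are assumptions — check the file in the repository):

```
Django
djangorestframework
celery
redis
psycopg2-binary
```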
4. Environment Variables
Create a `.env` file in the root of your project and add the following environment variables:
```
SECRET_KEY=your_secret_key_here
DEBUG=True
DATABASE_URL=your_database_url_here  # or use settings from 'settings.py'
CELERY_BROKER_URL=redis://localhost:6379/0  # For Redis
```
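Django does not read `.env` files on its own; projects typically pull these variables in with a package such as python-dotenv or django-environ. As an illustration of what that step does, here is a minimal stdlib sketch (`load_env` is a hypothetical helper, not part of this project):

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: copy KEY=VALUE lines into os.environ.

    Blank lines and '#' comments are skipped; variables already set in
    the real environment are not overwritten. A real loader (e.g.
    python-dotenv) also handles quoting and multiline values.
    """
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            # Drop an inline comment after the value, if any.
            value = value.split("#", 1)[0].strip()
            os.environ.setdefault(key.strip(), value)
```

`settings.py` can then read the values with `os.environ["SECRET_KEY"]` and `os.environ.get("DEBUG", "False") == "True"`.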
## How to Run
1. Database Setup
Before running the application, you need to configure the database and apply migrations.
PostgreSQL (Recommended)
Make sure you have PostgreSQL installed and running. Create a database named `transcription_db` or configure your database settings in `config/settings.py`.
Then, run the following commands:
```bash
python manage.py makemigrations
python manage.py migrate
```
2. Run Redis for Celery (Asynchronous Tasks)
For background tasks, you need to have Redis running:
```bash
redis-server
```
3. Run Celery Worker
In a separate terminal, run the Celery worker to handle asynchronous tasks:
```bash
celery -A config worker --loglevel=info
```
4. Run the Django Server
You can now run the Django development server:
```bash
python manage.py runserver
```
Access the server at `http://127.0.0.1:8000/`.
## API Endpoints
Transcription API
- POST `/api/transcription/`: Upload an audio file and receive transcription text.
- GET `/api/transcription/<id>/`: Get the transcription result for a specific transcription.
Speaker Identification API
- POST `/api/speaker-identify/`: Upload an audio file and identify the speaker.
- GET `/api/speaker-identify/<id>/`: Get the speaker identification result.
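These endpoints can be exercised with any HTTP client. As a standard-library sketch (the `<id>` path segment and the port are assumptions based on the defaults above — check `transcription/urls.py` for the actual route):

```python
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8000"  # Django dev server default

def transcription_result_request(transcription_id):
    """Build a GET request for one transcription's result.

    The '<id>' path parameter is an assumption; the real route is
    defined in transcription/urls.py.
    """
    return Request(f"{BASE_URL}/api/transcription/{transcription_id}/", method="GET")

# To actually send it, start the dev server first, then:
# from urllib.request import urlopen
# with urlopen(transcription_result_request(42)) as resp:
#     print(resp.read().decode())
```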
## Technologies Used
- Django: Web framework for building the backend API.
- Django REST Framework (DRF): For creating RESTful APIs.
- Celery: For asynchronous task management (e.g., speech transcription, speaker identification).
- Redis: As a message broker for Celery.
- PostgreSQL: For database management.
- Google Speech-to-Text (or other services): For handling transcription (if using an external service).
- pyAudioAnalysis: For speaker identification (optional, if integrated).
## Database Setup
Database Configuration
The project uses Django's ORM (Object-Relational Mapping) for managing the database. By default, it is configured to use PostgreSQL, but you can change this in `config/settings.py` if needed.
Example for PostgreSQL:
```python
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'transcription_db',
        'USER': 'your_db_user',
        'PASSWORD': 'your_db_password',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
```
If you are using another database, such as SQLite, you can update the configuration accordingly.
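Alternatively, the `DATABASE_URL` variable from `.env` can drive this dict instead of hard-coded values; packages such as dj-database-url do the translation for you. A stdlib sketch of the idea (the helper name is hypothetical):

```python
from urllib.parse import urlsplit

def database_from_url(url):
    """Translate a DATABASE_URL such as
    postgres://user:password@host:port/name
    into a Django DATABASES['default'] dict, mirroring what
    dj-database-url does for the common case."""
    parts = urlsplit(url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        "USER": parts.username or "",
        "PASSWORD": parts.password or "",
        "HOST": parts.hostname or "localhost",
        "PORT": str(parts.port or 5432),
    }
```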
Running Migrations
After setting up your database connection, apply migrations to create the required tables:
```bash
python manage.py makemigrations
python manage.py migrate
```
## Asynchronous Tasks
To handle long-running tasks such as speech transcription and speaker identification, we use Celery.
Steps to Set Up Celery:
- Make sure Redis is installed and running as the message broker.
- Configure Celery in `config/celery.py`.
- Start the Celery worker:
```bash
celery -A config worker --loglevel=info
```
Celery tasks are defined in the `tasks.py` file of each app (e.g., `transcription/tasks.py`).
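The standard Celery-with-Django wiring for the `config/celery.py` mentioned above follows the layout documented by Celery (a sketch — adjust if this project deviates):

```python
# config/celery.py
import os
from celery import Celery

# Point Celery at the Django settings module before creating the app.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings")

app = Celery("config")

# Read CELERY_* settings (e.g. CELERY_BROKER_URL) from Django settings.
app.config_from_object("django.conf:settings", namespace="CELERY")

# Discover tasks.py modules in all installed apps, e.g. transcription/tasks.py.
app.autodiscover_tasks()
```

The `autodiscover_tasks()` call is what makes each app's `tasks.py` visible to the worker started with `celery -A config worker`.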
## Testing
To run the project's test cases, use the following command:
```bash
python manage.py test
```
Test cases are located in the `tests/` directory, with separate tests for transcription and speaker identification functionalities.
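A test module discovered by this runner is a standard `unittest`-style class with methods named `test_*` (project tests would typically subclass `django.test.TestCase` for database access). As a shape-only illustration — the helper under test here is purely hypothetical, not project code:

```python
import unittest

def merge_short_segments(segments, min_len=3):
    """Hypothetical post-processing helper: merge transcript segments
    shorter than min_len words into the previous segment."""
    merged = []
    for seg in segments:
        if merged and len(seg.split()) < min_len:
            merged[-1] = merged[-1] + " " + seg
        else:
            merged.append(seg)
    return merged

class MergeShortSegmentsTests(unittest.TestCase):
    def test_short_segment_is_merged(self):
        self.assertEqual(
            merge_short_segments(["hello there everyone", "ok"]),
            ["hello there everyone ok"],
        )

    def test_long_segments_are_kept(self):
        segs = ["one two three", "four five six"]
        self.assertEqual(merge_short_segments(segs), segs)
```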