TypeScript Audio Encoder Blueprint - brookcs3/BeaatsLoops GitHub Wiki

This guide shows how to replicate the Python librosa_encoder.py functionality using TypeScript and ffmpeg.

Goals

  • Trim or loop audio to exactly 12 seconds
  • Resample to 44.1 kHz mono
  • Normalize loudness via ffmpeg's loudnorm filter
  • Export as WAV
  • Generate a simple mel-spectrogram array for frontend visuals

Implementation Outline

  1. Preprocess with FFmpeg

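    • Convert to mono, resample, trim and normalize:
    ffmpeg -y -i input.wav -ac 1 -ar 44100 -t 12 -filter:a loudnorm=I=-16:LRA=11:TP=-1 output.wav
    • As written, this command only trims: -t 12 caps the output at 12 seconds, so an input shorter than that stays short. To loop short clips up to 12 seconds, add -stream_loop -1 before -i.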
    
  2. Parse WAV

    • Read the PCM samples from the processed file. ffmpeg writes 16-bit signed little-endian PCM to .wav by default, so each sample is two bytes.
  3. Compute Spectral Features

    • Split the samples into fixed-size frames and use a naive DFT (O(N²) per frame, so keep frames small) to calculate the magnitude of each frequency bin.
    • Average the magnitudes into nMels buckets to approximate a mel‑spectrogram.
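
Steps 2 and 3 could be sketched roughly as follows. This is a minimal illustration, not the code in backend/audio_encoder.ts: the helper names (parseWavPcm16, dftMagnitudes, bucketAverage, averagedBuckets), the frame size, the hop and the default nMels are all assumptions. The buckets here are equal-width groups of linear frequency bins, which only loosely approximates a mel scale. The parser assumes the mono 16-bit PCM WAV that ffmpeg writes by default for .wav output.

```typescript
// Sketch only: every function name and default below is hypothetical.

/** Extract mono 16-bit PCM samples from a RIFF/WAVE buffer as floats in [-1, 1). */
function parseWavPcm16(bytes: Uint8Array): { sampleRate: number; samples: Float32Array } {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag = (o: number): string =>
    String.fromCharCode(view.getUint8(o), view.getUint8(o + 1), view.getUint8(o + 2), view.getUint8(o + 3));
  if (bytes.byteLength < 12 || tag(0) !== "RIFF" || tag(8) !== "WAVE") {
    throw new Error("not a RIFF/WAVE file");
  }
  let sampleRate = 0;
  let offset = 12; // the first chunk header follows the 12-byte RIFF header
  while (offset + 8 <= bytes.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;
    if (id === "fmt ") {
      const format = view.getUint16(body, true); // 1 = integer PCM
      const channels = view.getUint16(body + 2, true);
      sampleRate = view.getUint32(body + 4, true);
      const bits = view.getUint16(body + 14, true);
      if (format !== 1 || channels !== 1 || bits !== 16) throw new Error("expected mono 16-bit PCM");
    } else if (id === "data") {
      const count = Math.floor(Math.min(size, bytes.byteLength - body) / 2);
      const samples = new Float32Array(count);
      for (let i = 0; i < count; i++) samples[i] = view.getInt16(body + 2 * i, true) / 32768;
      return { sampleRate, samples };
    }
    offset = body + size + (size % 2); // skip other chunks (e.g. LIST); chunks are padded to even length
  }
  throw new Error("no data chunk found");
}

/** Naive O(N^2) DFT: magnitudes of bins 0..floor(N/2) for one frame. */
function dftMagnitudes(frame: Float32Array): number[] {
  const n = frame.length;
  const mags: number[] = [];
  for (let k = 0; k <= Math.floor(n / 2); k++) {
    let re = 0;
    let im = 0;
    frame.forEach((x, t) => {
      const angle = (-2 * Math.PI * k * t) / n;
      re += x * Math.cos(angle);
      im += x * Math.sin(angle);
    });
    mags.push(Math.hypot(re, im));
  }
  return mags;
}

/** Average adjacent bins into nMels equal-width buckets (linear spacing, not a true mel filterbank). */
function bucketAverage(mags: number[], nMels: number): number[] {
  const out: number[] = [];
  for (let b = 0; b < nMels; b++) {
    const start = Math.floor((b * mags.length) / nMels);
    const end = Math.max(start + 1, Math.floor(((b + 1) * mags.length) / nMels));
    const bins = mags.slice(start, end);
    out.push(bins.reduce((sum, m) => sum + m, 0) / bins.length);
  }
  return out;
}

/** Average the per-frame buckets over all complete frames to get one array of length nMels. */
function averagedBuckets(samples: Float32Array, frameSize = 512, hop = frameSize, nMels = 32): number[] {
  let sums = new Array<number>(nMels).fill(0);
  let frames = 0;
  for (let start = 0; start + frameSize <= samples.length; start += hop) {
    const buckets = bucketAverage(dftMagnitudes(samples.subarray(start, start + frameSize)), nMels);
    sums = sums.map((s, i) => s + (buckets[i] ?? 0));
    frames++;
  }
  return sums.map((s) => (frames > 0 ? s / frames : 0));
}
```

With these assumed defaults, a 12-second clip at 44.1 kHz needs on the order of 10^8 inner-loop iterations. That is tolerable for offline processing; an FFT would be the usual replacement if it becomes a bottleneck.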

Example Script

backend/audio_encoder.ts implements this flow. Run it with Bun:

bun backend/audio_encoder.ts input.wav processed.wav metadata.json

The script writes the normalized clip to the second path (processed.wav) and a JSON metadata file to the third (metadata.json). The metadata contains the clip duration and an averaged mel-spectrogram array.
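
For orientation, a script of this kind might drive ffmpeg and write its metadata along the following lines. This is a hedged sketch, not the contents of backend/audio_encoder.ts: buildFfmpegArgs, runFfmpeg, writeMetadata, the EncodeOptions fields and the ClipMetadata field names (duration, mel) are all hypothetical. The ffmpeg flags are the ones from the command in the outline, plus an optional -stream_loop -1 input flag for clips shorter than the target length.

```typescript
import { spawnSync } from "node:child_process";
import { writeFileSync } from "node:fs";

// Hypothetical option names; the defaults mirror the command in the outline.
interface EncodeOptions {
  seconds: number;
  sampleRate: number;
  loop: boolean; // assumption: repeat the input so short clips can reach `seconds`
}

const DEFAULT_OPTIONS: EncodeOptions = { seconds: 12, sampleRate: 44100, loop: false };

/** Build the ffmpeg argument list: mono, resampled, capped at `seconds`, loudness-normalized. */
function buildFfmpegArgs(input: string, output: string, opts: EncodeOptions = DEFAULT_OPTIONS): string[] {
  return [
    "-y",
    // -stream_loop is an input option, so it must come before -i.
    ...(opts.loop ? ["-stream_loop", "-1"] : []),
    "-i", input,
    "-ac", "1",
    "-ar", String(opts.sampleRate),
    "-t", String(opts.seconds),
    "-filter:a", "loudnorm=I=-16:LRA=11:TP=-1",
    output,
  ];
}

/** Run ffmpeg synchronously and fail loudly if it is missing or exits non-zero. */
function runFfmpeg(args: string[]): void {
  const result = spawnSync("ffmpeg", args, { stdio: "inherit" });
  if (result.error) throw result.error;
  if (result.status !== 0) throw new Error(`ffmpeg exited with status ${result.status}`);
}

// Assumed metadata shape; the real field names are not given in this guide.
interface ClipMetadata {
  duration: number; // seconds
  mel: number[]; // averaged spectrogram buckets
}

/** Write the metadata as pretty-printed JSON. */
function writeMetadata(path: string, meta: ClipMetadata): void {
  writeFileSync(path, JSON.stringify(meta, null, 2));
}
```

A caller would run runFfmpeg(buildFfmpegArgs("input.wav", "processed.wav")), compute the spectrogram array from processed.wav, and then pass the result to writeMetadata("metadata.json", ...). Passing the arguments as an array avoids shell quoting problems with file names that contain spaces.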