TypeScript Audio Encoder Blueprint - brookcs3/BeaatsLoops GitHub Wiki
This guide shows how to replicate the Python librosa_encoder.py functionality using TypeScript and ffmpeg.
Goals
- Trim or loop audio to exactly 12 seconds
- Resample to 44.1kHz mono
- Normalize loudness via ffmpeg's loudnorm filter
- Export as WAV
- Generate a simple mel-spectrogram array for frontend visuals
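The first goal mentions looping as well as trimming, while ffmpeg's -t option on its own only trims. As a minimal sketch of the "trim or loop" idea on already-decoded samples, the hypothetical helper below (fitToDuration is an illustrative name, not part of the original script) repeats a short clip from the start or cuts a long one so the result has exactly the requested length.

```typescript
// Hypothetical helper: force a mono sample buffer to an exact duration.
// Short clips are looped from the start; long clips are truncated.
function fitToDuration(
  samples: Float32Array,
  sampleRate: number,
  seconds: number,
): Float32Array {
  const target = Math.round(sampleRate * seconds);
  const out = new Float32Array(target); // zero-filled, i.e. silence
  if (samples.length === 0) return out; // nothing to loop
  for (let i = 0; i < target; i++) {
    // The modulo wraps back to the first sample once the source runs out.
    out[i] = samples[i % samples.length];
  }
  return out;
}
```

For the 12-second target at 44.1kHz this would be called as fitToDuration(samples, 44100, 12), giving 529200 samples. Looping this way can click at the seam; a crossfade would be a natural refinement but is outside what this guide describes.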
Implementation Outline
- Preprocess with FFmpeg
  - Convert to mono, resample, trim and normalize:
ffmpeg -y -i input.wav -ac 1 -ar 44100 -t 12 -filter:a loudnorm=I=-16:LRA=11:TP=-1 output.wav
- Parse WAV
  - Read the PCM samples from the processed file.
- Compute Spectral Features
  - Use a naive DFT to calculate magnitudes for each frame.
  - Average the magnitudes into nMels buckets to approximate a mel-spectrogram.
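The parsing and spectral steps of the outline can be sketched as follows. This is an illustration under stated assumptions, not the contents of backend/audio_encoder.ts: the function names are hypothetical, the WAV file is assumed to be 16-bit PCM mono (which is what ffmpeg writes by default for a .wav output with -ac 1), and the buckets are equal-width groups of DFT bins rather than a true mel filterbank.

```typescript
// Sketch: read 16-bit PCM samples from a RIFF/WAVE byte buffer (assumes mono).
function parseWavPcm16(bytes: Uint8Array): { sampleRate: number; samples: Float32Array } {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag = (o: number): string =>
    String.fromCharCode(bytes[o], bytes[o + 1], bytes[o + 2], bytes[o + 3]);
  if (tag(0) !== "RIFF" || tag(8) !== "WAVE") throw new Error("not a RIFF/WAVE file");
  let sampleRate = 0;
  let offset = 12;
  // Walk the chunk list; ffmpeg may place a LIST chunk between "fmt " and "data".
  while (offset + 8 <= bytes.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;
    if (id === "fmt ") {
      const format = view.getUint16(body, true); // 1 = integer PCM
      const bits = view.getUint16(body + 14, true);
      if (format !== 1 || bits !== 16) throw new Error("expected 16-bit PCM");
      sampleRate = view.getUint32(body + 4, true);
    } else if (id === "data") {
      const count = Math.floor(Math.min(size, bytes.byteLength - body) / 2);
      const samples = new Float32Array(count);
      for (let i = 0; i < count; i++) {
        samples[i] = view.getInt16(body + 2 * i, true) / 32768; // scale to [-1, 1)
      }
      return { sampleRate, samples };
    }
    offset = body + size + (size % 2); // chunks are padded to even sizes
  }
  throw new Error("no data chunk found");
}

// Naive O(N^2) DFT: magnitudes of bins 0 .. N/2 - 1 for one frame.
function dftMagnitudes(frame: Float32Array): Float32Array {
  const n = frame.length;
  const half = Math.floor(n / 2);
  const mags = new Float32Array(half);
  for (let k = 0; k < half; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const phase = (2 * Math.PI * k * t) / n;
      re += frame[t] * Math.cos(phase);
      im -= frame[t] * Math.sin(phase);
    }
    mags[k] = Math.hypot(re, im);
  }
  return mags;
}

// Average adjacent bins into nMels equal-width buckets.
// Assumption: linear spacing, which only approximates a mel scale.
function bucketAverage(mags: Float32Array, nMels: number): number[] {
  const out: number[] = [];
  for (let b = 0; b < nMels; b++) {
    const start = Math.floor((b * mags.length) / nMels);
    const end = Math.max(start + 1, Math.floor(((b + 1) * mags.length) / nMels));
    let sum = 0;
    for (let i = start; i < end; i++) sum += mags[i];
    out.push(sum / (end - start));
  }
  return out;
}
```

The naive DFT costs O(N^2) per frame, so small frames or a coarse hop size keep it practical for a 12-second clip; an FFT would be the usual replacement if this becomes a bottleneck.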
Example Script
backend/audio_encoder.ts implements this flow. Run it with Bun:
bun backend/audio_encoder.ts input.wav processed.wav metadata.json
It writes the normalized clip and a JSON metadata file containing duration and an averaged mel-spectrogram array.
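As a rough sketch of how such a script might drive ffmpeg and write its metadata, the code below builds the same argument list as the command in the outline and serializes a small JSON object. The function names and the ClipMetadata field names (duration, mel) are assumptions for illustration; the actual script may structure its output differently.

```typescript
import { spawnSync } from "node:child_process";
import { writeFileSync } from "node:fs";

// Same flags as the ffmpeg command in the outline, with paths as parameters.
function buildFfmpegArgs(input: string, output: string, seconds = 12): string[] {
  return [
    "-y", "-i", input,
    "-ac", "1", "-ar", "44100",
    "-t", String(seconds),
    "-filter:a", "loudnorm=I=-16:LRA=11:TP=-1",
    output,
  ];
}

// Hypothetical metadata shape; the real field names may differ.
interface ClipMetadata {
  duration: number; // seconds
  mel: number[];    // averaged spectrogram buckets
}

function buildMetadata(sampleCount: number, sampleRate: number, mel: number[]): ClipMetadata {
  return { duration: sampleCount / sampleRate, mel };
}

// Run ffmpeg synchronously and fail loudly on a non-zero exit status.
function runFfmpeg(input: string, output: string): void {
  const result = spawnSync("ffmpeg", buildFfmpegArgs(input, output), { stdio: "inherit" });
  if (result.status !== 0) {
    throw new Error(`ffmpeg exited with status ${result.status}`);
  }
}

// Write the metadata as pretty-printed JSON.
function writeMetadata(path: string, metadata: ClipMetadata): void {
  writeFileSync(path, JSON.stringify(metadata, null, 2));
}
```

Passing the arguments as an array rather than a shell string avoids quoting problems with file names that contain spaces.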