Image Upload Guide - jra3/mulm GitHub Wiki

Image Upload Guide

This guide explains the image upload system used in Mulm for submission photos. Images are stored in Cloudflare R2 (S3-compatible object storage) and processed using Sharp for optimization.

Overview

The image upload system provides:

  • Multiple size variants (original, medium, thumbnail) for optimal performance
  • Magic byte validation to prevent malicious file uploads
  • EXIF stripping for privacy and security
  • Real-time progress tracking via Server-Sent Events (SSE)
  • Automatic cleanup on deletion or transaction failure
  • Rate limiting to prevent abuse

  • Storage: Cloudflare R2 (S3-compatible)
  • Processing: Sharp (libvips-based image processing)
  • Upload limits: 10MB per file, 5 files per submission
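
The count and size limits above can be enforced before any processing begins. A minimal sketch (hypothetical helper, not the actual route code):

```typescript
const MAX_FILE_BYTES = 10 * 1024 * 1024; // 10MB per file
const MAX_FILES = 5;                     // per submission

// Returns an error message for the first violated limit, or null if all files pass.
function checkUploadLimits(fileSizes: number[]): string | null {
  if (fileSizes.length === 0) return "No files uploaded";
  if (fileSizes.length > MAX_FILES) return "Maximum files exceeded";
  const oversized = fileSizes.findIndex((size) => size > MAX_FILE_BYTES);
  if (oversized !== -1) return `File ${oversized + 1} exceeds the 10MB limit`;
  return null;
}
```

Rejecting early keeps Sharp from ever touching an oversized buffer.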

Architecture

sequenceDiagram
    participant User
    participant Browser
    participant API
    participant Sharp
    participant R2
    participant DB

    User->>Browser: Select/Drag images
    Browser->>API: POST /api/upload/image (multipart)
    API->>API: Validate auth & config

    loop For each file
        API->>Sharp: Validate magic bytes
        Sharp-->>API: Valid/Invalid
        API->>Sharp: Process (original, medium, thumb)
        Sharp-->>API: 3 size variants
        API->>R2: Upload 3 variants
        R2-->>API: Public URLs
    end

    API->>DB: Update submission.images (JSON)
    DB-->>API: Success
    API-->>Browser: Image metadata
    Browser->>Browser: Update preview grid

    Note over Browser,API: Real-time progress via SSE
    Browser->>API: GET /api/upload/progress/:id
    API-->>Browser: SSE: {stage, percent, message}

Configuration

Required Config

Images must be configured in config.json:

{
  "s3Url": "https://<account-id>.r2.cloudflarestorage.com",
  "s3AccessKeyId": "your-access-key-id",
  "s3Secret": "your-secret-access-key",
  "s3Bucket": "mulm-uploads",
  "r2PublicUrl": "https://uploads.your-domain.com"
}

Environment Variables (Optional)

Config can be overridden with environment variables:

R2_ENDPOINT=https://<account-id>.r2.cloudflarestorage.com
R2_ACCESS_KEY_ID=your-access-key-id
R2_SECRET_ACCESS_KEY=your-secret-access-key
R2_BUCKET_NAME=mulm-uploads
R2_PUBLIC_URL=https://uploads.your-domain.com

Production: Store secrets in /mnt/basny-data/app/config/config.production.json (never in code or environment variables).

Cloudflare R2 Setup

1. Create R2 bucket:

# Via Cloudflare dashboard:
# Storage & Databases > R2 > Create bucket
# Name: mulm-uploads
# Location: Automatic (recommended)

2. Generate API credentials:

# Dashboard: R2 > Manage R2 API Tokens
# Create API Token with:
# - Read & Write permissions
# - Scoped to mulm-uploads bucket

3. Configure public access (optional):

# Dashboard: R2 > mulm-uploads > Settings > Public Access
# Enable public bucket access OR
# Connect custom domain for cleaner URLs

Custom Domain Example:

  • R2 Bucket: mulm-uploads
  • Custom Domain: uploads.bap.basny.org
  • Public URL: https://uploads.bap.basny.org/

Upload Flow

Frontend (Pug Template)

File: src/views/bapForm/imageUpload.pug

Features:

  • Drag-and-drop zone
  • File input fallback
  • Real-time progress bar (SSE)
  • Image preview grid with delete buttons
  • Hidden field stores image metadata as JSON

User Workflow:

flowchart TD
    A[User opens submission form] --> B{Has existing images?}
    B -->|Yes| C[Display preview grid]
    B -->|No| D[Show upload zone]
    C --> D
    D --> E[User drags/clicks to select images]
    E --> F{File validation}
    F -->|Invalid| G[Show error message]
    F -->|Valid| H[Start upload]
    H --> I[Show progress bar]
    I --> J[SSE updates progress]
    J --> K{Upload complete?}
    K -->|Error| L[Show error details]
    K -->|Success| M[Add to preview grid]
    M --> N[Update hidden field]
    N --> O{More to upload?}
    O -->|Yes| E
    O -->|No| P[Form ready to submit]

Backend (API Routes)

File: src/routes/api/upload.ts

POST /api/upload/image

Upload and process images for submission.

Request:

  • Method: POST
  • Content-Type: multipart/form-data
  • Auth: Required (session-based)
  • Rate Limit: 5 req/sec per IP (burst 10)

Request Body:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| images | File[] | Yes | Up to 5 image files (JPEG, PNG, WebP) |
| uploadId | string | No | Unique upload ID for progress tracking |
| submissionId | number | No | Submission ID to associate images with |
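
On the client side, the request body can be assembled with FormData. A hypothetical helper matching the field names above (the actual frontend lives in imageUpload.pug):

```typescript
// Hypothetical client-side helper; field names match the request-body table above.
function buildUploadForm(files: Blob[], uploadId?: string, submissionId?: number): FormData {
  if (files.length > 5) throw new Error("Maximum 5 files per upload");
  const form = new FormData();
  for (const file of files) form.append("images", file);
  if (uploadId) form.append("uploadId", uploadId);
  if (submissionId !== undefined) form.append("submissionId", String(submissionId));
  return form;
}

// Usage: fetch("/api/upload/image", { method: "POST", body: buildUploadForm(files, id) });
```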

Response (Success):

{
  "success": true,
  "uploadId": "upload_1234567890_abc123",
  "images": [
    {
      "key": "submissions/123/456/1234567890-abc123-original.webp",
      "url": "https://uploads.bap.basny.org/submissions/123/456/1234567890-abc123-original.webp",
      "size": 245678,
      "uploadedAt": "2025-10-07T14:30:00.000Z",
      "contentType": "image/webp"
    }
  ],
  "errors": [] // Optional: partial failures
}

Response (Error):

{
  "error": "Upload failed",
  "message": "Invalid content type"
}

Processing Steps:

  1. Validation

    • Check authentication
    • Verify R2 is configured
    • Check file count (max 5)
    • Validate MIME types (preliminary)
  2. Image Processing (per file)

    • Validate magic bytes (JPEG: FF D8 FF, PNG: 89 50 4E 47, WebP: RIFF...WEBP)
    • Check dimensions (min: 400x400, max: 4000x4000)
    • Generate 3 size variants:
      • Original: Max 2048px, quality 85 (JPEG) or 80 (WebP)
      • Medium: 800px wide, quality 85/80
      • Thumbnail: 150x150 crop, quality 80/75
    • Strip EXIF data (auto-rotate, then remove metadata)
  3. Upload to R2

    • Generate unique keys: submissions/{memberId}/{submissionId}/{timestamp}-{hash}-{size}.{ext}
    • Upload all 3 variants in parallel
    • Get public URLs
  4. Database Update (if submissionId provided)

    • Read existing submissions.images (JSON array)
    • Append new image metadata
    • Update in transaction (atomic)
    • Rollback: Delete R2 uploads if database update fails
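
The dimension check in step 2 reduces to a small guard. A sketch, assuming the documented 400px/4000px bounds:

```typescript
const MIN_DIMENSION = 400;  // px, per side
const MAX_DIMENSION = 4000; // px, per side

// Mirrors the documented dimension validation; returns an error string or null.
function checkDimensions(width: number, height: number): string | null {
  if (width < MIN_DIMENSION || height < MIN_DIMENSION) return "Image too small";
  if (width > MAX_DIMENSION || height > MAX_DIMENSION) return "Image too large";
  return null;
}
```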

GET /api/upload/progress/:uploadId

Server-Sent Events endpoint for real-time upload progress.

Request:

  • Method: GET
  • Auth: Required
  • Rate Limit: 30 req/sec per IP

Response (SSE stream):

data: {"stage":"connected","percent":0,"message":"Connected"}

data: {"stage":"processing","percent":10,"message":"Processing image 1 of 3"}

data: {"stage":"uploading","percent":40,"message":"Uploading image 1 to storage..."}

data: {"stage":"complete","percent":100,"message":"Upload complete"}

Stages:

  • connected - Initial connection
  • processing - Image processing (magic byte validation, resizing)
  • uploading - Uploading to R2
  • complete - Upload finished
  • error - Upload failed
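
Each SSE event is a single data: line followed by a blank line. A sketch of how the server might serialize one (hypothetical helper; the actual route lives in upload.ts):

```typescript
type ProgressStage = "connected" | "processing" | "uploading" | "complete" | "error";

interface ProgressEvent {
  stage: ProgressStage;
  percent: number;
  message: string;
}

// One SSE frame: "data: <json>" terminated by a blank line.
function formatSSE(event: ProgressEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}
```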

Client Example:

const eventSource = new EventSource('/api/upload/progress/' + uploadId);
eventSource.onmessage = function(event) {
  const data = JSON.parse(event.data);
  updateProgressBar(data.percent);
  updateStatusMessage(data.message);

  if (data.stage === 'complete' || data.stage === 'error') {
    eventSource.close();
  }
};

DELETE /api/upload/image/:key

Delete an uploaded image.

Request:

  • Method: DELETE
  • Auth: Required (must own the image)
  • Rate Limit: 10 req/sec per IP

Path Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| key | string | R2 object key (from image metadata) |

Response (Success):

{
  "success": true
}

Response (Error):

{
  "error": "Image not found or access denied"
}

Delete Process:

  1. Verify ownership (check submission.member_id matches viewer.id)
  2. Delete from R2 (original + medium + thumbnail variants)
  3. Update database (remove from submissions.images JSON array)
  4. Transaction ensures atomicity
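
Because each upload produces three variants with a shared prefix, deleting an image means deriving the other two keys from the one stored in metadata. A sketch following the key pattern documented under Storage Structure:

```typescript
// Derive all three variant keys from any one variant's key, following the
// submissions/{memberId}/{submissionId}/{timestamp}-{hash}-{variant}.{ext} pattern.
function variantKeys(key: string): string[] {
  const match = key.match(/^(.*)-(original|medium|thumb)(\.[a-z]+)$/);
  if (!match) throw new Error(`Unrecognized image key: ${key}`);
  const [, base, , ext] = match;
  return ["original", "medium", "thumb"].map((variant) => `${base}-${variant}${ext}`);
}
```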

Image Processing

Sharp Processing Pipeline

File: src/utils/image-processor.ts

flowchart LR
    A[Upload Buffer] --> B[Validate Magic Bytes]
    B --> C[Check Dimensions]
    C --> D[Sharp Pipeline]
    D --> E[Auto-rotate via EXIF]
    E --> F[Strip EXIF]
    F --> G[Resize if needed]
    G --> H[Convert Format]
    H --> I[Compress]
    I --> J[Output Buffer]

    style B fill:#f9f,stroke:#333
    style F fill:#f9f,stroke:#333
    style H fill:#9f9,stroke:#333

Processing Parameters

| Variant | Max Width | Max Height | Fit | Quality (JPEG) | Quality (WebP) |
| --- | --- | --- | --- | --- | --- |
| Original | 2048px | 2048px | inside | 85 | 80 |
| Medium | 800px | - | inside | 85 | 80 |
| Thumbnail | 150px | 150px | cover | 80 | 75 |

Fit Modes:

  • inside - Resize to fit within dimensions, maintaining aspect ratio
  • cover - Resize and crop to fill dimensions (used for square thumbnails)
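
The parameter table maps directly onto Sharp resize options. Expressed as data (a sketch; the authoritative values live in image-processor.ts):

```typescript
// Variant specs mirroring the processing-parameters table; fit values match Sharp's resize options.
interface VariantSpec {
  width: number;
  height?: number;          // Medium constrains width only
  fit: "inside" | "cover";
  jpegQuality: number;
  webpQuality: number;
}

const VARIANTS: Record<"original" | "medium" | "thumbnail", VariantSpec> = {
  original:  { width: 2048, height: 2048, fit: "inside", jpegQuality: 85, webpQuality: 80 },
  medium:    { width: 800,                fit: "inside", jpegQuality: 85, webpQuality: 80 },
  thumbnail: { width: 150,  height: 150,  fit: "cover",  jpegQuality: 80, webpQuality: 75 },
};
```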

Security Validations

Magic Byte Validation:

// JPEG: FF D8 FF
const isJPEG = magicBytes[0] === 0xFF && magicBytes[1] === 0xD8 && magicBytes[2] === 0xFF;

// PNG: 89 50 4E 47 0D 0A 1A 0A
const PNG_SIGNATURE = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];
const isPNG = PNG_SIGNATURE.every((byte, i) => magicBytes[i] === byte);

// WebP: "RIFF" at bytes 0-3, "WEBP" at bytes 8-11
const isWebP =
  magicBytes[0] === 0x52 && magicBytes[1] === 0x49 &&
  magicBytes[2] === 0x46 && magicBytes[3] === 0x46 &&
  magicBytes[8] === 0x57 && magicBytes[9] === 0x45 &&
  magicBytes[10] === 0x42 && magicBytes[11] === 0x50;

Why Magic Bytes?

  • MIME type from HTTP header can be spoofed
  • Magic bytes are embedded in file structure
  • Prevents uploading malicious files disguised as images

EXIF Stripping:

  • EXIF data may contain GPS coordinates, camera info, timestamps
  • sharp.rotate() auto-rotates based on EXIF orientation, then strips all metadata
  • Protects user privacy

Storage Structure

R2 Object Keys

Pattern: submissions/{memberId}/{submissionId}/{timestamp}-{hash}-{variant}.{ext}

Example:

submissions/42/123/1728312000000-a1b2c3d4e5f6g7h8-original.webp
submissions/42/123/1728312000000-a1b2c3d4e5f6g7h8-medium.webp
submissions/42/123/1728312000000-a1b2c3d4e5f6g7h8-thumb.webp

Components:

  • memberId - Groups images by member (enables bulk deletion, access control)
  • submissionId - Groups images by submission
  • timestamp - Milliseconds since epoch (ensures uniqueness, chronological ordering)
  • hash - 8-byte random hex (prevents collisions, obscures upload order)
  • variant - original, medium, or thumb
  • ext - File extension (webp, jpg, png)

Benefits:

  • Hierarchical structure for organization
  • Globally unique keys (timestamp + hash)
  • Easy to find all images for a member or submission
  • Variant suffix makes size relationship clear
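
A sketch of a key generator following this pattern (hypothetical; the real generator lives in the upload route):

```typescript
import { randomBytes } from "node:crypto";

// Builds a key per the documented pattern:
// submissions/{memberId}/{submissionId}/{timestamp}-{hash}-{variant}.{ext}
function makeImageKey(
  memberId: number,
  submissionId: number,
  variant: "original" | "medium" | "thumb",
  ext: string
): string {
  const timestamp = Date.now();                // ms since epoch: uniqueness + ordering
  const hash = randomBytes(8).toString("hex"); // 8 random bytes -> 16 hex chars
  return `submissions/${memberId}/${submissionId}/${timestamp}-${hash}-${variant}.${ext}`;
}
```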

Database Storage

Table: submissions Column: images (TEXT, JSON)

Schema:

interface ImageMetadata {
  key: string;           // R2 object key
  url: string;           // Public URL
  size: number;          // File size in bytes
  uploadedAt: string;    // ISO 8601 timestamp
  contentType?: string;  // MIME type
}

// submissions.images value:
ImageMetadata[]  // Array of image metadata objects

Example:

[
  {
    "key": "submissions/42/123/1728312000000-abc123-original.webp",
    "url": "https://uploads.bap.basny.org/submissions/42/123/1728312000000-abc123-original.webp",
    "size": 245678,
    "uploadedAt": "2025-10-07T14:30:00.000Z",
    "contentType": "image/webp"
  },
  {
    "key": "submissions/42/123/1728312100000-def456-original.jpg",
    "url": "https://uploads.bap.basny.org/submissions/42/123/1728312100000-def456-original.jpg",
    "size": 312456,
    "uploadedAt": "2025-10-07T14:32:00.000Z",
    "contentType": "image/jpeg"
  }
]

Why JSON in SQLite?

  • SQLite doesn't have native array types
  • JSON is flexible for variable-length arrays
  • Easy to parse in JavaScript/TypeScript
  • Avoids additional join table
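
One consequence of storing JSON in a TEXT column is that reads should tolerate NULL and malformed values. A defensive parse sketch:

```typescript
interface ImageMetadata {
  key: string;
  url: string;
  size: number;
  uploadedAt: string;
  contentType?: string;
}

// A NULL column or corrupt JSON yields an empty array instead of throwing.
function parseImages(column: string | null): ImageMetadata[] {
  if (!column) return [];
  try {
    const parsed: unknown = JSON.parse(column);
    return Array.isArray(parsed) ? (parsed as ImageMetadata[]) : [];
  } catch {
    return [];
  }
}
```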

Error Handling

Validation Errors

| Error | Cause | HTTP Status | Resolution |
| --- | --- | --- | --- |
| Authentication required | No session cookie | 401 | User must log in |
| Image upload service not configured | R2 config missing | 503 | Admin: configure R2 credentials |
| No files uploaded | Empty multipart request | 400 | Frontend validation |
| Invalid file type | Wrong MIME type or magic bytes | 400 | User: upload JPEG/PNG/WebP only |
| File size exceeds limit | File > 10MB | 400 | User: compress or resize image |
| Image too small | Dimensions < 400x400 | 400 | User: upload higher resolution |
| Image too large | Dimensions > 4000x4000 | 400 | User: resize image |
| Maximum files exceeded | > 5 files uploaded | 400 | User: select fewer files |

Upload Errors

| Error | Cause | Recovery |
| --- | --- | --- |
| R2 upload failed | Network error, R2 service down | Retry upload, check R2 status |
| Database update failed | Transaction error, constraint violation | R2 objects deleted automatically (cleanup) |
| Processing failed | Corrupt image file, Sharp error | Skip file, show error to user |

Atomic Operations

Database Transaction Pattern:

await withTransaction(async (db) => {
  // 1. Read existing images
  const submission = await db.get('SELECT images FROM submissions WHERE id = ?', [id]);
  const existingImages = JSON.parse(submission.images || '[]');

  // 2. Append new images
  const allImages = [...existingImages, ...newImages];

  // 3. Update submission
  await db.run('UPDATE submissions SET images = ? WHERE id = ?', [
    JSON.stringify(allImages),
    id
  ]);

  // 4. If any step fails, entire transaction rolls back
});

Cleanup on Failure:

try {
  // Upload to R2
  await uploadToR2(key, buffer, contentType);

  // Update database
  await updateSubmission(submissionId, imageMetadata);
} catch (error) {
  // Database update failed - clean up R2 uploads
  logger.info('Cleaning up R2 uploads after database failure');
  await deleteImage(key);  // Remove from R2
  await deleteImage(mediumKey);
  await deleteImage(thumbnailKey);
  throw error;  // Re-throw to notify user
}

Why Cleanup?

  • Prevents orphaned objects in R2
  • Maintains consistency between database and storage
  • Avoids storage costs for unused files

Rate Limiting

Upload Endpoints

| Endpoint | Limit | Burst | Timeout | Description |
| --- | --- | --- | --- | --- |
| POST /api/upload/image | 5 req/sec | 10 | 300s | Image upload (large files need time) |
| GET /api/upload/progress/:id | 30 req/sec | 50 | - | Progress polling (frequent) |
| DELETE /api/upload/image/:key | 10 req/sec | 20 | - | Image deletion |

Why Different Limits?

  • Upload: Slow (processing + transfer) = lower limit, higher timeout
  • Progress: Fast (SSE) = higher limit
  • Delete: Medium (database + R2) = medium limit

Nginx Configuration:

# Define rate limit zones
limit_req_zone $binary_remote_addr zone=upload_limit:10m rate=5r/s;
limit_req_zone $binary_remote_addr zone=progress_limit:10m rate=30r/s;
limit_req_zone $binary_remote_addr zone=delete_limit:10m rate=10r/s;

# Apply to endpoints
location /api/upload/image {
    limit_req zone=upload_limit burst=10;
    client_max_body_size 100M;  # 10MB × 5 files + overhead
    client_body_timeout 300s;
    # ... proxy config ...
}

location /api/upload/progress/ {
    limit_req zone=progress_limit burst=50;
    # ... SSE proxy config ...
}

location ~ /api/upload/image/.+ {  # DELETE /api/upload/image/:key
    limit_req zone=delete_limit burst=20;
    # ... proxy config ...
}

Testing

Manual Testing

Development:

# Start dev server
npm run dev

# Open submission form
open http://localhost:4200/submissions/new

# Test upload:
# 1. Drag valid image (JPEG/PNG/WebP, 400-4000px, < 10MB)
# 2. Verify progress bar updates
# 3. Check preview grid shows uploaded image
# 4. Test delete button
# 5. Submit form and verify images persist

Production:

# SSH to production
ssh BAP

# Check R2 uploads
# (Requires AWS CLI configured with R2 credentials)
aws s3 ls s3://mulm-uploads/submissions/ --endpoint-url=https://<account>.r2.cloudflarestorage.com

# Check database
sqlite3 /mnt/basny-data/app/database/database.db
sqlite> SELECT id, member_id, images FROM submissions WHERE images IS NOT NULL LIMIT 5;

Automated Tests

File: src/__tests__/upload.test.ts

Test Cases:

  • ✅ Upload valid JPEG image
  • ✅ Upload valid PNG image
  • ✅ Upload valid WebP image
  • ✅ Reject invalid MIME type
  • ✅ Reject file with wrong magic bytes
  • ✅ Reject image too small (<400px)
  • ✅ Reject image too large (>4000px)
  • ✅ Reject file too large (>10MB)
  • ✅ Process multiple images in one request
  • ✅ Delete image (verify R2 + database cleanup)
  • ✅ Database transaction rollback (verify R2 cleanup)

Mock R2 Client:

// __tests__/upload.test.ts
import { overrideR2Client } from '@/utils/r2-client';

beforeEach(() => {
  // Mock S3Client for testing
  const mockClient = new S3Client({ /* mock config */ });
  const mockConfig = { /* mock R2 config */ };
  overrideR2Client(mockClient, mockConfig);
});

Security Testing

Test Invalid Files:

# Create malicious file with wrong magic bytes
echo "not an image" > test.jpg
# Upload should fail with "Invalid image format"

# Create PHP file disguised as JPEG
echo "<?php system('ls'); ?>" > malicious.jpg
# Upload should fail (wrong magic bytes)

# Add EXIF GPS data to a test image
exiftool -GPSLatitude=40.7128 -GPSLatitudeRef=N -GPSLongitude=74.0060 -GPSLongitudeRef=W test.jpg
# Upload should succeed, but EXIF should be stripped
exiftool uploaded-image.webp  # Should have no GPS data

Best Practices

Frontend Integration

DO:

  • Debounce file selection (prevent accidental double-uploads)
  • Show file size before upload (prevent large file errors)
  • Display upload progress (improve UX)
  • Handle partial failures gracefully (some files succeed, some fail)
  • Clear file input after upload (prevent re-upload on form re-submit)
  • Store image metadata in hidden field (preserve on form validation errors)

DON'T:

  • Upload on every file selection (use explicit upload button if needed)
  • Allow >5 files to be selected (enforce limit client-side)
  • Submit form during upload (disable submit until complete)
  • Rely solely on MIME type (always validate server-side)

Backend Integration

DO:

  • Validate magic bytes (never trust MIME type)
  • Strip EXIF data (protect user privacy)
  • Use transactions (ensure database/R2 consistency)
  • Clean up on errors (delete orphaned R2 objects)
  • Log upload events (debugging, auditing)
  • Rate limit aggressively (prevent abuse)

DON'T:

  • Store images in database (use object storage)
  • Allow unlimited file sizes (enforce 10MB limit)
  • Skip magic byte validation (security risk)
  • Leave EXIF data (privacy risk)
  • Forget to finalize prepared statements (memory leaks)

Configuration

DO:

  • Use environment variables for secrets (never commit to git)
  • Configure custom domain for cleaner URLs
  • Enable CORS on R2 bucket (if serving from different domain)
  • Set lifecycle rules to delete old uploads (optional cost savings)

DON'T:

  • Commit R2 credentials to git (use .gitignore)
  • Use default R2 endpoint URLs in public URLs (hard to change)
  • Allow public write access to bucket (security risk)
  • Store production config in code (use config file on server)

Troubleshooting

Upload Fails Immediately

Symptoms:

  • "Image upload service not configured" error
  • No progress bar

Diagnosis:

# Check if R2 is initialized
npm run dev 2>&1 | grep "R2"
# Should see: "R2 client initialized"

# Check config file
cat src/config.json | jq '.s3Url, .s3Bucket'
# Should have valid values

Solution:

  • Ensure config.json has R2 credentials
  • Restart application after config changes

Upload Succeeds but Images Don't Display

Symptoms:

  • Upload completes without errors
  • Preview grid is empty or shows broken images

Diagnosis:

# Check if images were stored in database (development)
sqlite3 db/database.db
sqlite> SELECT id, images FROM submissions WHERE id = 123;
# Should have JSON array with image metadata
# For production, use: ssh BAP "sqlite3 /mnt/basny-data/app/database/database.db"

# Check if R2 URLs are accessible
curl -I https://uploads.bap.basny.org/submissions/.../image.webp
# Should return 200 OK

Solution:

  • Verify R2 bucket has public access enabled
  • Check custom domain is properly configured
  • Ensure CORS allows requests from application domain

Upload Stalls During Processing

Symptoms:

  • Progress bar stuck at "Processing image X of Y"
  • No error message

Diagnosis:

# Check application logs
npm run dev 2>&1 | grep -i "image"
# Look for Sharp errors

# Check file is valid
file uploaded-image.jpg
# Should show: "JPEG image data"

Solution:

  • Ensure image is not corrupt
  • Try smaller image (may be hitting memory limits)
  • Check Sharp is properly installed: npm list sharp

R2 Cleanup Not Working

Symptoms:

  • Database update fails
  • Images remain in R2 (orphaned objects)

Diagnosis:

# Check transaction logs
npm run dev 2>&1 | grep "Cleaning up R2"
# Should see cleanup attempts

# List R2 objects
aws s3 ls s3://mulm-uploads/submissions/ --recursive --endpoint-url=...

Solution:

  • Check database logs for transaction errors
  • Manually delete orphaned objects from R2
  • Verify cleanup code is executing (may be caught silently)

Performance Optimization

Image Compression

Current compression settings balance quality and file size:

| Format | Quality | Typical Compression |
| --- | --- | --- |
| JPEG | 85 | 70-80% smaller than original |
| WebP | 80 | 75-85% smaller than original |

Benchmarks (1MB original PNG):

  • Original (2048px): ~800KB JPEG, ~600KB WebP
  • Medium (800px): ~150KB JPEG, ~100KB WebP
  • Thumbnail (150px): ~15KB JPEG, ~10KB WebP

Caching Strategy

Browser Caching:

# Nginx: cache headers for image responses (when proxying images through the app server)
location ~* \.(jpg|jpeg|png|webp)$ {
    expires 1y;
    add_header Cache-Control "public, immutable";
}

CDN Caching:

  • Cloudflare automatically caches R2 public bucket content
  • Images are cached at edge locations globally
  • First request: Slow (origin fetch)
  • Subsequent requests: Fast (edge cache hit)

Parallel Processing

Upload route processes images in parallel:

// Upload all 3 variants concurrently
await Promise.all([
  uploadToR2(originalKey, originalBuffer, contentType),
  uploadToR2(mediumKey, mediumBuffer, contentType),
  uploadToR2(thumbnailKey, thumbnailBuffer, contentType)
]);

Benefits:

  • Up to 3x faster than sequential uploads
  • Reduces user wait time
  • Better utilization of network bandwidth
