docker_deployment - poppopjmp/spiderfoot GitHub Wiki
This guide covers deploying SpiderFoot using Docker Compose with profile-based service activation. Start with just the 5 core services, then enable additional profiles as needed.
git clone https://github.com/poppopjmp/spiderfoot.git
cd spiderfoot
# Copy and configure environment
cp .env.example .env
# Edit .env — change passwords, uncomment profile sections as needed
# Core only (5 services)
docker compose -f docker-compose.yml up --build -d
# Full stack (all services except SSO)
docker compose -f docker-compose.yml --profile full up --build -d
# View logs
docker compose -f docker-compose.yml logs -f
# Stop everything
docker compose -f docker-compose.yml down
Core (no profile), served at http://localhost:3000:
| URL | Service |
|---|---|
| http://localhost:3000 | React SPA |
| http://localhost:3000/api/docs | Swagger / OpenAPI |
Full stack (--profile full) — https://localhost via Traefik:
| URL | Service |
|---|---|
| https://localhost | React SPA (via Traefik) |
| https://localhost/api/docs | Swagger / OpenAPI |
| https://localhost/api/graphql | GraphiQL IDE |
| https://localhost/flower/ | Celery Flower Monitoring |
| https://localhost/grafana/ | Grafana Dashboards |
| https://localhost/minio/ | MinIO Console |
| https://localhost/traefik/ | Traefik Dashboard |
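Once the full stack is up, the URLs in the table above can be smoke-tested from a shell. The following is a minimal sketch, not a script shipped with the project: the function name `check_endpoints` is hypothetical, and it assumes `curl` is installed and that a 2xx or 3xx response counts as reachable. Endpoints that require a login may answer with 401 and would be reported as failures by this check.

```shell
# Hypothetical smoke test (not part of the repository): request each
# documented path and print the HTTP status code next to the URL.
check_endpoints() {
  local base="${1:-https://localhost}"
  local path code failed=0
  for path in / /api/docs /api/graphql /flower/ /grafana/ /minio/ /traefik/; do
    # -k accepts a self-signed certificate; -o /dev/null discards the body;
    # -w prints only the status code. A connection error is recorded as 000.
    code="$(curl -k -s -o /dev/null -w '%{http_code}' "${base}${path}")" || code="000"
    printf '%s %s\n' "$code" "${base}${path}"
    case "$code" in
      2*|3*) ;;          # reachable
      *) failed=1 ;;     # anything else is treated as a failure
    esac
  done
  return "$failed"
}
```

Calling `check_endpoints` with no argument checks `https://localhost`; pass another base URL as the first argument to check a different host.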
Services are organized into profiles — activate only what you need:
| Profile | Services | Description |
|---|---|---|
| (core) | postgres, redis, api, celery-worker, frontend | Always starts — minimal working set |
| scan | celery-worker-active | Active recon tools (nmap, nuclei, httpx, …) |
| proxy | traefik, docker-socket-proxy | Reverse proxy + TLS termination |
| storage | minio, minio-init, qdrant, tika, pg-backup | Object storage, vector DB, document parsing |
| monitor | vector, loki, grafana, prometheus, jaeger | Full observability stack |
| ai | agents, litellm | AI analysis agents + LLM gateway |
| scheduler | celery-beat, flower | Periodic tasks + Celery monitoring |
| sso | keycloak | OIDC / SAML identity provider |
| full | all of the above except SSO | Complete deployment |
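Because each profile is enabled with its own `--profile` flag, a small wrapper can build the command line from a list of profile names. This is a sketch under stated assumptions: `profile_flags` and `compose_up_cmd` are hypothetical helper names, not part of the repository, and the wrapper only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helpers (not part of the repository).

# Turn a list of profile names into repeated --profile flags.
profile_flags() {
  local flags="" p
  for p in "$@"; do
    flags="${flags:+$flags }--profile $p"
  done
  printf '%s\n' "$flags"
}

# Print the docker compose command for the given profiles without running it.
# With no arguments, the command starts only the core services.
compose_up_cmd() {
  local flags
  flags="$(profile_flags "$@")"
  printf 'docker compose -f docker-compose.yml%s up -d\n' "${flags:+ $flags}"
}
```

For example, `compose_up_cmd proxy storage` prints `docker compose -f docker-compose.yml --profile proxy --profile storage up -d`.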
# Mix and match profiles
docker compose -f docker-compose.yml --profile proxy --profile storage up -d
# Full stack + SSO
docker compose -f docker-compose.yml --profile full --profile sso up -d
The stack uses two Docker networks (sf-frontend, sf-backend) and organizes services by profile:
| Service | Image | Port | Purpose |
|---|---|---|---|
| sf-postgres | postgres:15-alpine | 5432 | Primary relational data store |
| sf-redis | redis:7-alpine | 6379 | EventBus pub/sub, caching, Celery broker |
| sf-api | spiderfoot-micro | 8001 | FastAPI REST + GraphQL API |
| sf-celery-worker | spiderfoot-micro | — | Celery distributed task workers |
| sf-frontend-ui | spiderfoot-frontend | 3000 | React SPA served by Nginx |
| Service | Profile | Image | Port | Purpose |
|---|---|---|---|---|
| sf-celery-worker-active | scan | spiderfoot-active | — | Active scanning (33+ recon tools) |
| sf-traefik | proxy | traefik:v3 | 443 | Reverse proxy, auto-TLS, routing |
| sf-docker-proxy | proxy | tecnativa/docker-socket-proxy | — | Secure Docker API access |
| sf-minio | storage | minio/minio | 9000 | S3-compatible object storage |
| sf-minio-init | storage | minio/mc | — | One-shot bucket creation |
| sf-qdrant | storage | qdrant/qdrant | 6333 | Vector similarity search |
| sf-tika | storage | apache/tika | 9998 | Document parsing (PDF, DOCX, etc.) |
| sf-pg-backup | storage | postgres:15-alpine | — | Cron sidecar: pg_dump → MinIO |
| sf-vector | monitor | timberio/vector | 8686 | Telemetry pipeline |
| sf-loki | monitor | grafana/loki | 3100 | Log aggregation |
| sf-grafana | monitor | grafana/grafana | 3000 | Dashboards & alerting |
| sf-prometheus | monitor | prom/prometheus | 9090 | Metrics collection |
| sf-jaeger | monitor | jaegertracing/jaeger | 16686 | Distributed tracing |
| sf-agents | ai | spiderfoot-micro | 8100 | 6 AI-powered analysis agents |
| sf-litellm | ai | ghcr.io/berriai/litellm | 4000 | Unified LLM proxy |
| sf-celery-beat | scheduler | spiderfoot-micro | — | Periodic task scheduler |
| sf-flower | scheduler | spiderfoot-micro | 5555 | Celery monitoring dashboard |
| sf-keycloak | sso | keycloak | 9080 | OIDC / SAML identity provider |
- sf-frontend — Browser-facing (Traefik, Frontend, API)
- sf-backend — Internal only (PostgreSQL, Redis, Qdrant, MinIO)
| Volume | Service | Purpose |
|---|---|---|
| postgres-data | sf-postgres | Database files |
| redis-data | sf-redis | RDB/AOF persistence |
| qdrant-data | sf-qdrant | Vector index storage |
| minio-data | sf-minio | Object store files |
| vector-data | sf-vector | Log buffer / checkpoints |
| grafana-data | sf-grafana | Dashboard state |
| prometheus-data | sf-prometheus | Metrics TSDB |
| traefik-logs | sf-traefik | Access logs |
Seven buckets are auto-created by sf-minio-init on first boot:
| Bucket | Contents |
|---|---|
| sf-logs | Vector.dev log archive |
| sf-reports | Generated scan reports (HTML, PDF, JSON, CSV) |
| sf-pg-backups | PostgreSQL daily pg_dump files |
| sf-qdrant-snapshots | Qdrant vector DB snapshots |
| sf-data | General artefacts |
| sf-loki-data | Loki log chunk storage |
| sf-loki-ruler | Loki alerting rules |
Access the MinIO Console at https://localhost/minio/ (requires proxy + storage profiles; default credentials: minioadmin / minioadmin).
Qdrant runs on port 6333 and stores embeddings for semantic OSINT event search. Collections are prefixed with sf_.
The GraphQL API exposes semanticSearch and vectorCollections queries for searching and inspecting vector data.
Copy .env.example to .env and customise:
cp .env.example .env
# Uncomment profile-specific sections when activating profiles
Key variables:
| Variable | Default | Description |
|---|---|---|
| SF_DEPLOYMENT_MODE | microservices | Service mode |
| SF_DATABASE_URL | postgresql://... | PostgreSQL connection |
| SF_REDIS_URL | redis://sf-redis:6379 | Redis connection |
| SF_QDRANT_HOST | sf-qdrant | Qdrant hostname |
| SF_MINIO_ENDPOINT | sf-minio:9000 | MinIO endpoint |
| SF_MINIO_ACCESS_KEY | minioadmin | MinIO access key |
| SF_MINIO_SECRET_KEY | minioadmin | MinIO secret key |
| SF_EMBEDDING_PROVIDER | mock | Embedding backend |
| SF_LOG_FORMAT | json | Log format |
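Since the MinIO keys default to `minioadmin`, it can be useful to confirm that `.env` no longer carries those defaults before exposing the stack. The sketch below is an illustration only: `find_default_secrets` is a hypothetical function name, and it assumes plain `KEY=value` lines without quotes around the value.

```shell
# Hypothetical check (not shipped with the project): list MinIO credential
# variables in an env file that still hold the documented default value.
find_default_secrets() {
  local env_file="${1:-.env}"
  # Match only uncommented lines whose value is exactly "minioadmin",
  # then print the variable name (the part before the first "=").
  grep -E '^(SF_MINIO_ACCESS_KEY|SF_MINIO_SECRET_KEY)=minioadmin[[:space:]]*$' "$env_file" \
    | cut -d= -f1
}
```

An empty result means neither variable is set to the default in that file; commented-out lines are ignored.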
TLS requires the proxy profile (Traefik). To enable:
# Generate a self-signed cert (testing)
./generate-certificate
# Start with proxy profile
docker compose -f docker-compose.yml --profile proxy up -d
# Traefik auto-discovers TLS certificates from the mounted certs directory
All services include Docker HEALTHCHECK directives. Monitor with:
# Check all service health
docker compose -f docker-compose.yml ps
# Detailed health for one service
docker inspect --format='{{json .State.Health}}' sf-api
API health endpoint: GET /api/health
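The health status reported by `docker inspect` can also be polled from a script, for example to block until a container is ready. This is a minimal sketch, assuming the container defines a healthcheck; the function name `wait_healthy` and its default attempt count and delay are illustrative choices, not project settings.

```shell
# Hypothetical helper: poll a container's health status until it reports
# "healthy" or the attempts run out.
#   $1 = container name, $2 = max attempts (default 30), $3 = delay in seconds (default 2)
wait_healthy() {
  local container="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=0 status=""
  while [ "$i" -lt "$attempts" ]; do
    # .State.Health.Status is "starting", "healthy", or "unhealthy".
    status="$(docker inspect --format='{{.State.Health.Status}}' "$container" 2>/dev/null)" || status="unknown"
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $container (last status: ${status:-unknown})" >&2
  return 1
}
```

A typical call would be `wait_healthy sf-postgres && wait_healthy sf-api`, which returns a non-zero exit code if either container fails to become healthy in time.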
The sf-pg-backup sidecar runs pg_dump on a cron schedule and uploads backups to the sf-pg-backups MinIO bucket.
# List backups
curl http://localhost/api/storage/buckets/sf-pg-backups
# Manual backup trigger
docker exec sf-pg-backup /scripts/pg_backup_minio.sh
For horizontal scaling, increase replicas of stateless services:
# --scale takes the Compose service name (api), not the container name (sf-api)
docker compose -f docker-compose.yml up -d --scale api=3
The active scan worker requires the scan profile. Scale independently:
docker compose -f docker-compose.yml --profile scan up -d --scale celery-worker-active=3
Each instance competes for tasks from the scan queue via Celery's fair scheduling.
See Active Scan Worker Guide for full details on the 33+ tools,
resource requirements, and security considerations.
For production deployments, consider the Helm chart in helm/ for Kubernetes.
| Problem | Solution |
|---|---|
| Port 443 in use | Change Traefik port mapping in compose file |
| MinIO init fails | Check sf-minio is healthy before sf-minio-init runs |
| Qdrant OOM | Increase memory limit for sf-qdrant service |
| DB connection refused | Wait for sf-postgres healthcheck to pass |
| GraphQL not loading | Check sf-api logs: docker logs sf-api |
# All services
docker compose -f docker-compose.yml logs -f
# Specific service
docker logs -f sf-api
# Archived logs in MinIO
# Access via MinIO Console at https://localhost/minio/