overview - poppopjmp/spiderfoot GitHub Wiki

Overview

SpiderFoot is a powerful open source intelligence (OSINT) automation platform designed to help security professionals, penetration testers, and threat analysts map attack surfaces, discover external assets, and gather actionable intelligence from hundreds of public data sources.


What is SpiderFoot?

SpiderFoot automates the process of collecting OSINT by leveraging over 309 modules to gather information about IP addresses, domains, subdomains, emails, usernames, cryptocurrency addresses, social media profiles, and more. It is suitable for:

  • Threat intelligence and digital footprinting: Gain visibility into your organization's or target's online presence, uncovering exposed data and potential risks.
  • Attack surface mapping and asset discovery: Identify all internet-facing assets, including forgotten or shadow IT, to reduce your exposure to attacks.
  • Red teaming and penetration testing: Support offensive security operations by automating reconnaissance and providing deep context for targets.
  • Brand protection and fraud detection: Monitor for brand abuse, phishing, and fraudulent activity across the web and dark web.
  • Third-party risk assessment: Evaluate the security posture of vendors, partners, or acquisitions by mapping their digital footprint.

SpiderFoot is designed for both beginners and advanced users, offering a modern dark-themed React web interface as well as a powerful command-line tool for automation and scripting.

SpiderFoot Dashboard


Key Features

  • Automated OSINT Collection: Schedule and run scans to collect intelligence from a wide range of sources with minimal manual effort. SpiderFoot can run unattended, making it ideal for continuous monitoring.
  • 309+ Data Gathering Modules: Integrate with threat feeds, search engines, social media, DNS, WHOIS, breach databases, paste sites, blockchain analytics, dark web sources, and more. Modules are regularly updated and community contributions are welcome.
  • Enhanced OSINT Capabilities: Advanced modules for TikTok intelligence, blockchain analytics, advanced correlation, and performance optimization provide cutting-edge OSINT capabilities.
  • AI-Powered Analysis: Built-in AI agents and LLM-generated CTI reports using configurable models via LiteLLM (OpenAI, Anthropic, Ollama, OpenRouter).
  • Modern Web UI: Dark-themed React SPA with real-time updates, interactive graph visualization, geographic mapping, and comprehensive scan management.
  • 95 Correlation Rules: YAML-configurable correlation engine for automated finding and risk identification.
  • Web UI and CLI: Use the intuitive web interface for interactive investigations, visualization, and reporting, or the command-line interface for automation, scripting, and integration with other tools.
  • Enterprise Security Features: Comprehensive security implementation including CSRF protection, input validation, rate limiting, session management, API security, and structured security logging.
  • Workspaces and Multi-Target Support: Organize scans, targets, and results into workspaces for collaborative investigations and large-scale assessments. Workspaces allow you to manage multiple projects and share findings with your team.
  • API for Integration: Integrate SpiderFoot with SIEM, SOAR, and other security tools using the RESTful API with enterprise-grade security features. Automate scans, retrieve results, and trigger actions based on findings.
  • Correlation and Analysis: Built-in correlation engine with 95 YAML-based rules to identify relationships, patterns, and risks across collected data. Advanced correlation engine provides cross-platform identity resolution, temporal analysis, and geospatial clustering. Visualize connections between entities and uncover hidden threats.
  • Custom Module Support: Easily extend SpiderFoot with your own modules to support new data sources, custom logic, or proprietary integrations. The modular architecture makes development and maintenance straightforward.
  • Notifications and Alerts: Receive real-time notifications for critical findings, such as data breaches, exposed credentials, or new assets discovered.
  • Export and Reporting: Export scan results in multiple formats (CSV, JSON, HTML) for further analysis or reporting to stakeholders.
  • Cross-Platform: Runs on Windows, Linux, and macOS. Docker images are available for easy deployment.
  • Production-Ready Security: Battle-tested security middleware with comprehensive protection against common web vulnerabilities and enterprise-grade logging and monitoring.

Current Version and Capabilities

SpiderFoot v6.0.0 represents a significant advancement in OSINT automation with the following enhanced capabilities:

Module Count and Categories

  • 309 Total Modules: Comprehensive coverage across all major OSINT categories
  • Core Investigation Modules: DNS/Network (45+), Threat Intelligence (35+), Social Media (25+)
  • Enhanced Modules: TikTok OSINT, Blockchain Analytics, Hudson Rock, AI Analysis, Advanced Correlation
  • Performance Modules: Caching optimization, rate limiting, resource management

Microservices Architecture (23 containers)

  • REST + GraphQL API: FastAPI with Strawberry GraphQL (queries, mutations, subscriptions)
  • Vector Search: Qdrant-powered semantic search over scan events
  • Object Storage: MinIO S3-compatible storage for reports, exports, backups
  • Task Processing: Celery workers + beat scheduler with Flower monitoring
  • LLM Gateway: LiteLLM multi-provider proxy (OpenAI, Anthropic, Ollama, OpenRouter)
  • Observability: Vector.dev → Loki + Prometheus + Grafana + Jaeger
  • Reverse Proxy: Traefik v3 with auto-TLS and routing
  • Automated Backups: PG backup sidecar with MinIO archival

New Advanced Features

  • TikTok Intelligence: Comprehensive TikTok profile analysis and content investigation
  • Blockchain Analytics: Multi-cryptocurrency investigation with risk assessment and sanctions checking
  • AI-Powered Analysis: OpenAI GPT integration for automated threat intelligence summarization
  • Advanced Correlation: Cross-platform entity resolution with graph algorithms and temporal analysis
  • Performance Optimization: Intelligent caching system with 60%+ hit rates and adaptive rate limiting

Enterprise Security

  • Security Middleware: Unified protection layer with CSRF, input validation, and rate limiting
  • API Security: JWT tokens, API key management, and comprehensive authentication
  • Audit Logging: Structured security event logging with compliance support
  • Data Protection: GDPR compliance, privacy-by-design, and configurable retention policies

Scalability and Performance

  • Optimized Processing: 15-30% performance improvement with enhanced caching
  • Resource Management: Intelligent memory monitoring and garbage collection
  • API Efficiency: 70% reduction in duplicate requests through smart caching
  • Database Performance: 40% faster query execution with optimization features

Architecture Overview

SpiderFoot consists of the following main components:

  • Core Engine: Orchestrates scans, manages modules, processes results, and handles scheduling. The engine is highly extensible and supports concurrent scanning.
  • GraphQL + REST API: Code-first Strawberry GraphQL schema (13 queries, 5 mutations, 2 WebSocket subscriptions) alongside the full REST API. Mounted at /api/graphql with GraphiQL IDE.
  • Qdrant Vector Search: Semantic similarity search over scan events using custom HTTP-based Qdrant client with embedding service (all-MiniLM-L6-v2, 384 dims). 5 correlation strategies.
  • MinIO Object Storage: S3-compatible storage for reports, exports, artifacts, backups, and logs across 5 auto-provisioned buckets.
  • Security Middleware: Enterprise-grade security layer providing CSRF protection, input validation, rate limiting, session management, API security, and comprehensive logging.
  • Modules: Each module is responsible for gathering a specific type of data or integrating with a particular source. Modules can be enabled, disabled, or configured individually.
  • Web UI: Provides a user-friendly interface for configuring scans, viewing results, managing workspaces, and visualizing relationships between entities.
  • Database: PostgreSQL with automated backup sidecar.
  • Performance Optimization: Intelligent caching system with TTL support, adaptive rate limiting, and resource monitoring.
  • Advanced Analytics: Sophisticated data correlation including cross-platform identity resolution, temporal pattern analysis, geospatial clustering, and behavioral analytics.
  • Log Pipeline: Vector.dev for structured log aggregation, transformation, and routing.

Typical Use Cases

  • Mapping the external attack surface of an organization: Identify all domains, subdomains, IPs, and other assets exposed to the internet.
  • Discovering forgotten or shadow IT assets: Uncover assets that may have been overlooked or are no longer actively managed.
  • Investigating threat actors and digital footprints: Gather intelligence on individuals, organizations, or threat groups using a wide array of data sources.
  • Monitoring for data leaks, breaches, and brand abuse: Detect compromised credentials, leaked data, and unauthorized use of your brand.
  • Enriching alerts and incidents with OSINT context: Provide additional context to security alerts and incidents, improving triage and response.
  • Third-party/vendor risk assessment: Evaluate the security posture of partners, vendors, or acquisition targets by mapping their digital footprint.
  • Red team and penetration test reconnaissance: Automate the reconnaissance phase of security assessments, saving time and increasing coverage.
  • Continuous monitoring: Set up scheduled scans to keep track of changes in your attack surface and receive alerts for new findings.

Next Steps

  • Getting Started: Learn how to install and launch SpiderFoot on your platform of choice.
  • Quick Start: Run your first scan in minutes with step-by-step instructions.
  • User Guide: Explore advanced features, workflows, and best practices for using SpiderFoot effectively.
  • Configuration: Customize SpiderFoot to fit your needs, including API keys, module settings, and scan options.
  • Modules: Browse the full list of available modules and learn how to enable, configure, or develop your own.
  • API Reference: Integrate SpiderFoot with other tools and automate your workflows using the RESTful API.
  • Advanced Topics: Dive deeper into performance optimization, security, and deployment options.
  • Developer Guide: Contribute to SpiderFoot or build custom modules and integrations.
  • FAQ: Find answers to common questions and troubleshooting tips.
  • Troubleshooting: Resolve common issues and get support.

For more information, see the rest of the documentation or visit the SpiderFoot GitHub repository.


Authored by poppopjmp

⚠️ **GitHub.com Fallback** ⚠️