3. Sprint 2 Tasks - ITA-Dnipro/PyDataCenter5000 GitHub Wiki

Sprint 2 – From Monitoring to Management

Overview

In Sprint 2, the PyDataCenter system evolves from passive monitoring to active control. Agents become responsive, API security is added, and a basic UI is introduced. Each task is designed for parallel execution by team members.


Task 1: Implement Multithreaded Data Collection in Agent

Description:
Refactor agent.py to collect and send data using separate threads:

  • Thread 1: Collect system metrics periodically.
  • Thread 2: Send metrics to the controller.
  • Thread 3: Watch process status (e.g., check if sshd is running).

Goal: Increase agent reliability and real-time performance.
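
As a rough illustration of the thread split described above, the sketch below wires a collector thread and a sender thread together with a `queue.Queue`. The names `collect_metrics`, `collector`, `sender`, and `run_agent` are hypothetical and not taken from agent.py; the process-watcher thread is left out for brevity and would follow the same pattern.

```python
import queue
import threading
import time


def collect_metrics():
    # Placeholder (assumption): a real agent would read CPU, memory, disk, etc.
    return {"timestamp": time.time()}


def collector(out_q, stop, interval):
    """Thread 1: put one metrics sample on the queue every `interval` seconds."""
    while not stop.is_set():
        out_q.put(collect_metrics())
        stop.wait(interval)  # sleeps, but wakes early when stop is set
    out_q.put(None)  # sentinel: tells the sender no more samples will arrive


def sender(in_q, send):
    """Thread 2: forward each queued sample to `send` until the sentinel."""
    while True:
        sample = in_q.get()
        if sample is None:
            break
        send(sample)


def run_agent(send, duration, interval=1.0):
    """Run both threads for `duration` seconds, then shut down cleanly."""
    q = queue.Queue()
    stop = threading.Event()
    threads = [
        threading.Thread(target=collector, args=(q, stop, interval)),
        threading.Thread(target=sender, args=(q, send)),
    ]
    for t in threads:
        t.start()
    time.sleep(duration)
    stop.set()
    for t in threads:
        t.join()
```

Passing `send` in as a callable keeps the threading logic independent of the HTTP client, so `run_agent(print, duration=3)` is enough to watch it work locally.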


Task 2: Add Command Execution Support in Agent

Description:
Enable agents to fetch commands from the controller (HTTP GET), execute safe commands (e.g., uptime, ls), and return the output. Commands are initiated by the controller and must be restricted to a whitelist.

Goal: Begin remote execution support.
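
One possible shape for the whitelist check is sketched below. `ALLOWED_COMMANDS` and `run_safe_command` are illustrative names, and the returned status values simply reuse the pending/done/failed vocabulary from the command history task.

```python
import shlex
import subprocess

ALLOWED_COMMANDS = {"uptime", "ls"}  # assumption: example whitelist only


def run_safe_command(command_line, timeout=10):
    """Run `command_line` only if its program name is whitelisted."""
    try:
        args = shlex.split(command_line)
    except ValueError as exc:  # e.g. unbalanced quotes
        return {"status": "failed", "result": str(exc)}
    if not args or args[0] not in ALLOWED_COMMANDS:
        return {"status": "failed", "result": "command not allowed"}
    try:
        # A list of arguments (no shell=True) means shell metacharacters
        # such as ';' or '|' are never interpreted.
        proc = subprocess.run(
            args, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return {"status": "failed", "result": str(exc)}
    status = "done" if proc.returncode == 0 else "failed"
    return {"status": status, "result": proc.stdout or proc.stderr}
```

Note that this only checks the program name. Arguments still pass through unchanged, so a stricter agent might whitelist complete command lines instead.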


Task 3: Build Command Dispatch API in Django

Description:
Create /api/command/ endpoint that:

  • Accepts a command and target hostname.
  • Stores the command in a CommandHistory model with status “pending.”
  • Allows an agent to fetch its pending command and send back the result.

Goal: Add backend support for remote ops.
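
The lifecycle the endpoint has to support (submit, fetch pending, report result) can be sketched without Django at all. The in-memory `CommandStore` below is a hypothetical stand-in for the real view and model, meant only to make the state transitions concrete.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Command:
    hostname: str
    command: str
    status: str = "pending"
    result: str = ""
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class CommandStore:
    """In-memory stand-in (assumption) for the CommandHistory table."""

    def __init__(self):
        self._commands = []

    def submit(self, hostname, command):
        """Controller side: record a new command with status 'pending'."""
        cmd = Command(hostname, command)
        self._commands.append(cmd)
        return cmd

    def next_pending(self, hostname):
        """Agent side: oldest pending command for this host, or None."""
        for cmd in self._commands:
            if cmd.hostname == hostname and cmd.status == "pending":
                return cmd
        return None

    def complete(self, cmd, result, ok=True):
        """Agent side: store the output and close the command."""
        cmd.result = result
        cmd.status = "done" if ok else "failed"
```

In this sketch a fetched command stays "pending" until its result arrives, so an agent that polls twice would receive it twice. Whether the real API needs an intermediate state is a design question for the task owner.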


Task 4: Create Controller CLI to Trigger Commands

Description:
Build a CLI admin tool (Python 3.8) that:

  • Lists active agents.
  • Sends commands to a selected agent via Django API.
  • Displays command execution result.

Goal: Create a command center interface for the controller.
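
A minimal argument-parsing skeleton for such a tool might look like the following. The program name `pdc-admin` and the subcommand names are assumptions; the HTTP calls to the Django API are deliberately omitted.

```python
import argparse


def build_parser():
    """Build a parser with two subcommands: `list` and `send`."""
    parser = argparse.ArgumentParser(
        prog="pdc-admin",  # hypothetical tool name
        description="Admin CLI for the controller (sketch).",
    )
    sub = parser.add_subparsers(dest="action", required=True)

    sub.add_parser("list", help="list active agents")

    send = sub.add_parser("send", help="send a command to one agent")
    send.add_argument("hostname", help="target agent hostname")
    send.add_argument("command", help="command to run, e.g. uptime")
    return parser


def to_payload(args):
    """Turn parsed `send` arguments into the body for the command API."""
    return {"hostname": args.hostname, "command": args.command}
```

For example, `build_parser().parse_args(["send", "web-01", "uptime"])` yields a namespace whose `action` is `"send"`, which the tool could then pass to `to_payload` before posting to the API.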


Task 5: Store Command History in Django DB

Description:
Add a model CommandHistory with fields:

  • hostname, command, result, status (pending/done/failed), timestamp

Make it accessible from the Django Admin and/or frontend.

Goal: Enable tracking of all issued and executed commands.


Task 6: Add Token-Based API Authentication

Description:
Use Django REST Framework's TokenAuthentication or Simple JWT:

  • Each agent must send an authentication token in headers.
  • Reject unauthorized requests.

Goal: Secure the controller API against unauthorized agents.
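
To show what "send a token in headers" means on the wire, here is a framework-free sketch. It follows the `Authorization: Token <key>` header format that Django REST Framework's TokenAuthentication expects (Simple JWT uses `Bearer` by default). The helper names are hypothetical, and in a real project the framework performs the server-side check.

```python
import hmac


def auth_header(token):
    """Agent side: header dict to attach to every request."""
    return {"Authorization": f"Token {token}"}


def is_authorized(headers, expected_token):
    """Server side (illustration only): validate the Authorization header."""
    value = headers.get("Authorization", "")
    scheme, _, supplied = value.partition(" ")
    if scheme != "Token" or not supplied:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected_token.encode("utf-8")
    )
```

A request that fails this kind of check should be answered with HTTP 401 rather than processed.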


Task 7: Integrate Telegram/Discord Alert Bot (Prototype)

Description:
Create a simple bot that:

  • Sends an alert when an agent reports an error or process failure.
  • Sends command results or status messages.

The bot can be implemented via webhook or polling.

Goal: Enable real-time notification for admins.
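
For the Telegram variant, an alert is a single call to the Bot API's `sendMessage` method. The sketch below only builds the request so it can be inspected without network access; `build_telegram_alert` is a hypothetical helper, and the bot token and chat ID must come from your own bot setup.

```python
import json
import urllib.request


def build_telegram_alert(bot_token, chat_id, text):
    """Build (but do not send) a sendMessage request for the Bot API."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending is then one line, `urllib.request.urlopen(request, timeout=10)`, ideally wrapped in error handling so a failed alert never crashes the agent or controller.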


Task 8: Build Basic Real-Time Dashboard

Description:
Create /dashboard/ page using Django Templates or JS frontend:

  • Show live agent list (hostname, IP, uptime, last seen).
  • Optionally highlight down agents (no report for more than 1 minute).

Goal: Visualize system state in real-time.
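
The "down" rule above reduces to a timestamp comparison. A small sketch, assuming each agent record carries a timezone-aware `last_seen` value (the function and constant names are illustrative):

```python
from datetime import datetime, timedelta, timezone

DOWN_AFTER = timedelta(minutes=1)  # threshold from the task description


def is_down(last_seen, now=None):
    """Return True if the agent has not reported for more than one minute."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_seen > DOWN_AFTER
```

Accepting `now` as a parameter keeps the function deterministic in tests; the view or template layer can call it without arguments.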


Task 9: Detect and Log Failed Processes in Agent

Description:
Extend the agent to monitor specific OS processes (e.g., nginx, sshd):

  • If a process is not running, log the issue.
  • Send alert via POST or bot.

Goal: Add basic anomaly detection to agent.
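
A simple way to structure the check is to separate "what is running" from "what is missing". In the sketch below, `running_process_names` shells out to `ps`, which is an assumption that the agent runs on a POSIX system (on Linux, `comm` names are truncated to 15 characters), and `find_missing` is a pure function that is easy to test. Both names are hypothetical.

```python
import logging
import subprocess

logger = logging.getLogger("agent.processes")


def running_process_names():
    """Return the set of command names reported by `ps` (POSIX only)."""
    out = subprocess.run(
        ["ps", "-eo", "comm="],  # '=' suppresses the header line
        capture_output=True, text=True, check=True,
    ).stdout
    return {line.strip() for line in out.splitlines() if line.strip()}


def find_missing(required, running):
    """Return the required process names that are not running, sorted."""
    missing = sorted(set(required) - set(running))
    for name in missing:
        logger.warning("process not running: %s", name)
    return missing
```

The agent could call `find_missing(["nginx", "sshd"], running_process_names())` on each watch cycle and send an alert whenever the returned list is not empty.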


Task 10: Write Unit Tests for Agent and API

Description:
Write tests for:

  • POST /api/status/ and /api/command/
  • Agent data collection functions
  • Retry logic and command handling

Use Django’s test client and a basic Python test framework (unittest or pytest).

Goal: Improve reliability and enable continuous testing.
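
As an example of testing retry logic without a live controller, the sketch below defines a hypothetical `send_with_retry` helper and exercises it with unittest, using fake send functions in place of real HTTP calls. The real agent's retry function may differ in name and signature.

```python
import unittest


def send_with_retry(send, payload, attempts=3):
    """Call `send(payload)`, retrying on ConnectionError up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return send(payload)
        except ConnectionError as exc:
            last_error = exc
    raise last_error


class SendWithRetryTests(unittest.TestCase):
    def test_succeeds_after_transient_failures(self):
        calls = []

        def flaky(payload):
            # Fails twice, then succeeds on the third call.
            calls.append(payload)
            if len(calls) < 3:
                raise ConnectionError("controller unreachable")
            return "ok"

        self.assertEqual(send_with_retry(flaky, {"cpu": 1}), "ok")
        self.assertEqual(len(calls), 3)

    def test_raises_after_exhausting_attempts(self):
        def always_down(payload):
            raise ConnectionError("controller unreachable")

        with self.assertRaises(ConnectionError):
            send_with_retry(always_down, {}, attempts=2)
```

Saved as a test module, this runs with `python -m unittest`; pytest also collects `unittest.TestCase` classes, so the same file works with either runner.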