3. Sprint 2 Tasks - ITA-Dnipro/PyDataCenter5000 GitHub Wiki
Sprint 2 – From Monitoring to Management
Overview
In Sprint 2, the PyDataCenter system evolves from passive monitoring to active control. Agents become responsive, API security is added, and a basic UI is introduced. Each task is designed for parallel execution by team members.
Task 1: Implement Multithreaded Data Collection in Agent
Description:
Refactor agent.py to collect and send data using separate threads:
- Thread 1: Collect system metrics periodically.
- Thread 2: Send metrics to the controller.
- Thread 3: Watch process status (e.g., check if sshd is running).
Goal: Increase agent reliability and real-time performance.
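One possible shape for the collector and sender threads is sketched below, using a queue as the hand-off between them. The names collect_metrics, run_agent, and the send callback are hypothetical placeholders, not part of the existing agent.py; the process-watching thread is left out to keep the sketch short.

```python
import queue
import threading
import time


def collect_metrics():
    """Hypothetical collector; a real agent would read CPU, memory, disk, etc."""
    return {"timestamp": time.time()}


def collector(outbox, stop, interval):
    # Thread 1: sample metrics periodically and hand them to the sender.
    while not stop.is_set():
        outbox.put(collect_metrics())
        stop.wait(interval)  # sleeps, but wakes early when stop is set
    outbox.put(None)  # sentinel: tells the sender that nothing more is coming


def sender(outbox, send):
    # Thread 2: forward queued metrics; `send` stands in for the HTTP POST.
    while True:
        item = outbox.get()
        if item is None:
            break
        send(item)


def run_agent(send, duration=0.5, interval=0.1):
    """Run both threads for `duration` seconds, then shut down cleanly."""
    outbox = queue.Queue()
    stop = threading.Event()
    threads = [
        threading.Thread(target=collector, args=(outbox, stop, interval)),
        threading.Thread(target=sender, args=(outbox, send)),
    ]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop.set()
    for thread in threads:
        thread.join()
```

Because the collector only ever writes to the queue, a slow or failing controller connection delays sending without blocking metric collection.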
Task 2: Add Command Execution Support in Agent
Description:
Enable agents to fetch commands from the controller (HTTP GET), execute safe commands (e.g., uptime, ls), and return output. Commands are initiated via controller and must be limited to a whitelist.
Goal: Begin remote execution support.
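A minimal sketch of the whitelist check, assuming the agent receives each command as a single string. ALLOWED_COMMANDS and run_whitelisted are illustrative names, and the returned status values simply mirror the pending/done/failed states used for CommandHistory.

```python
import shlex
import subprocess

# Assumption: the whitelist is keyed on the program name only.
ALLOWED_COMMANDS = {"uptime", "ls"}


def run_whitelisted(command_line, timeout=10):
    """Run a command only if its program name is whitelisted."""
    try:
        args = shlex.split(command_line)
    except ValueError as exc:  # e.g. unbalanced quotes
        return {"status": "failed", "result": str(exc)}
    if not args or args[0] not in ALLOWED_COMMANDS:
        return {"status": "failed", "result": "command not allowed"}
    try:
        # No shell is involved, so ';', '|' and '&&' are never interpreted.
        proc = subprocess.run(
            args, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return {"status": "failed", "result": str(exc)}
    status = "done" if proc.returncode == 0 else "failed"
    return {"status": status, "result": proc.stdout or proc.stderr}
```

Checking only the program name still lets a caller pass arbitrary arguments (for example a path to ls), so a stricter version might whitelist complete argument lists instead.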
Task 3: Build Command Dispatch API in Django
Description:
Create /api/command/ endpoint that:
- Accepts a command and target hostname.
- Stores command in a CommandHistory model with status “pending”.
- Allows agent to fetch pending command and send back the result.
Goal: Add backend support for remote ops.
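The dispatch, fetch, and report cycle described above can be sketched without Django, as a way to agree on the state transitions before writing views. CommandRecord and CommandStore are hypothetical in-memory stand-ins for the CommandHistory model and the endpoint logic, not the actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CommandRecord:
    hostname: str
    command: str
    status: str = "pending"  # pending -> done | failed
    result: str = ""
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class CommandStore:
    """In-memory stand-in for the CommandHistory table."""

    def __init__(self):
        self._records = []

    def dispatch(self, hostname, command):
        # Controller side: accept a command and store it as pending.
        record = CommandRecord(hostname, command)
        self._records.append(record)
        return record

    def next_pending(self, hostname):
        # Agent side: fetch the oldest pending command for this host.
        for record in self._records:
            if record.hostname == hostname and record.status == "pending":
                return record
        return None

    def report(self, record, result, ok=True):
        # Agent side: send back the output and close the command.
        record.result = result
        record.status = "done" if ok else "failed"
```

In this sketch a command stays pending until the agent reports back, so an agent that crashes mid-execution would fetch the same command again; whether that is desirable is a design decision for the team.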
Task 4: Create Controller CLI to Trigger Commands
Description:
Build a CLI admin tool (Python 3.8) that:
- Lists active agents.
- Sends commands to a selected agent via Django API.
- Displays command execution result.
Goal: Create a command center interface for the controller.
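As a starting point, the CLI surface could be laid out with argparse subcommands. The program name pdc-admin, the --api option, and its default URL are assumptions made for illustration; the HTTP calls to the Django API are deliberately left out.

```python
import argparse


def build_parser():
    """Build the argument parser for a hypothetical admin CLI."""
    parser = argparse.ArgumentParser(
        prog="pdc-admin",
        description="Admin CLI for the PyDataCenter controller (sketch).",
    )
    parser.add_argument(
        "--api",
        default="http://localhost:8000",  # assumed controller address
        help="base URL of the controller API",
    )
    sub = parser.add_subparsers(dest="action", required=True)

    # `agents`: list active agents.
    sub.add_parser("agents", help="list active agents")

    # `send`: send one command to one agent and show the result.
    send = sub.add_parser("send", help="send a command to an agent")
    send.add_argument("hostname", help="target agent hostname")
    send.add_argument("command", help="command to run on the agent")
    return parser
```

With this layout, an invocation such as `pdc-admin send web-1 uptime` parses into action, hostname, and command attributes that a handler can forward to the API.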
Task 5: Store Command History in Django DB
Description:
Add a model CommandHistory with fields:
- hostname, command, result, status (pending/done/failed), timestamp
Make it accessible from the Django Admin and/or frontend.
Goal: Enable tracking of all issued and executed commands.
Task 6: Add Token-Based API Authentication
Description:
Use Django REST Framework's TokenAuthentication or Simple JWT:
- Each agent must send an authentication token in headers.
- Reject unauthorized requests.
Goal: Secure the controller API against unauthorized agents.
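On the agent side, sending the token amounts to adding an Authorization header to every request. The sketch below assumes Django REST Framework's TokenAuthentication, which expects the "Token <key>" scheme (Simple JWT would use "Bearer <token>" instead); build_status_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_status_request(base_url, token, payload):
    """Build an authenticated POST to /api/status/ without sending it."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/status/",
        data=body,
        method="POST",
    )
    request.add_header("Content-Type", "application/json")
    # DRF TokenAuthentication scheme: "Authorization: Token <key>".
    request.add_header("Authorization", "Token " + token)
    return request
```

The request can then be passed to urllib.request.urlopen; a controller that rejects the token would typically answer with HTTP 401, which urlopen raises as urllib.error.HTTPError.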
Task 7: Integrate Telegram/Discord Alert Bot (Prototype)
Description:
Create a simple bot that:
- Sends alert when agent reports an error or process failure.
- Sends command results or status messages.
Can be implemented via webhook or polling.
Goal: Enable real-time notification for admins.
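For the Telegram variant, an alert is a single call to the Bot API's sendMessage method. The sketch below uses only the standard library; the bot token and chat ID are placeholders that each team would obtain for its own bot, and the function names are illustrative.

```python
import json
import urllib.request

TELEGRAM_API = "https://api.telegram.org"


def build_alert_request(bot_token, chat_id, text):
    """Build a sendMessage request for the Telegram Bot API."""
    url = f"{TELEGRAM_API}/bot{bot_token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_alert(bot_token, chat_id, text, timeout=10):
    """Send the alert and return Telegram's decoded JSON response."""
    request = build_alert_request(bot_token, chat_id, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the message format testable without network access.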
Task 8: Build Basic Real-Time Dashboard
Description:
Create /dashboard/ page using Django Templates or JS frontend:
- Show live agent list (hostname, IP, uptime, last seen).
- Optionally highlight down agents (no report for more than 1 minute).
Goal: Visualize system state in real-time.
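The "down" rule can be isolated in a small helper that the view calls before rendering. This is a sketch that assumes each agent is available as a dict with a timezone-aware last_seen value; the one-minute threshold comes from the task description, while the names are illustrative.

```python
from datetime import datetime, timedelta, timezone

DOWN_AFTER = timedelta(minutes=1)


def is_down(last_seen, now=None):
    """Return True if the agent has not reported for more than a minute."""
    now = now or datetime.now(timezone.utc)
    return now - last_seen > DOWN_AFTER


def annotate_agents(agents, now=None):
    """Return copies of the agent dicts with a boolean `down` flag added."""
    return [
        dict(agent, down=is_down(agent["last_seen"], now)) for agent in agents
    ]
```

A template or JS frontend can then style rows based on the down flag without repeating the time arithmetic.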
Task 9: Detect and Log Failed Processes in Agent
Description:
Extend the agent to monitor specific OS processes (e.g., nginx, sshd):
- If a process is not running, log the issue.
- Send alert via POST or bot.
Goal: Add basic anomaly detection to agent.
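A minimal sketch of the check, assuming a Linux or macOS host where ps is available. The WATCHED tuple and both function names are illustrative; sending the alert is left to the caller.

```python
import logging
import subprocess

logger = logging.getLogger("agent.watchdog")

WATCHED = ("nginx", "sshd")  # processes named in the task description


def running_process_names():
    """Return the set of command names currently reported by ps."""
    output = subprocess.run(
        ["ps", "-A", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    # Some platforms print a full path, so keep only the last component.
    return {
        line.strip().rsplit("/", 1)[-1]
        for line in output.splitlines()
        if line.strip()
    }


def find_missing(watched, running):
    """Log and return every watched process that is not running."""
    missing = [name for name in watched if name not in running]
    for name in missing:
        logger.warning("process %s is not running", name)
    return missing
```

Calling find_missing(WATCHED, running_process_names()) on each monitoring cycle yields the list of processes to report via POST or the bot.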
Task 10: Write Unit Tests for Agent and API
Description:
Write tests for:
- POST /api/status/ and /api/command/
- Agent data collection functions
- Retry logic and command handling
Use Django’s test client and basic Python test framework (unittest or pytest).
Goal: Improve reliability and enable continuous testing.
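As an illustration of testing retry logic with unittest, the sketch below defines a hypothetical send_with_retry helper and two tests for it. The real agent's retry function may differ in name and behavior; the pattern of injecting a fake send callable is the point.

```python
import unittest


def send_with_retry(send, payload, attempts=3):
    """Hypothetical helper: call `send` until it succeeds or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return send(payload)
        except ConnectionError as exc:
            last_error = exc
    raise last_error


class RetryTests(unittest.TestCase):
    def test_succeeds_after_transient_failures(self):
        calls = []

        def flaky(payload):
            # Fails twice, then succeeds on the third call.
            calls.append(payload)
            if len(calls) < 3:
                raise ConnectionError("controller unreachable")
            return "ok"

        self.assertEqual(send_with_retry(flaky, {"cpu": 1}), "ok")
        self.assertEqual(len(calls), 3)

    def test_gives_up_after_last_attempt(self):
        def always_down(payload):
            raise ConnectionError("controller unreachable")

        with self.assertRaises(ConnectionError):
            send_with_retry(always_down, {}, attempts=2)
```

Saved as a test module, these run with `python -m unittest`; pytest collects unittest.TestCase classes as well.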