EMBArk internals - e-m-b-a/embark GitHub Wiki

Technical Documentation

❗ Please consider recent changes and don't entirely rely on this doc being up to date ❗ Work in progress

Modules

Currently EMBArk has only one submodule, EMBA. It is pinned to verified-working release states rather than following the main branch.

Django

The embark folder in the repository root houses the Django application.

Django-apps

The app list is defined in the settings file embark/settings/deploy.py.

1 embark

1.1 consumers.py

An extension of the AsyncWebsocketConsumer class from django-channels that acts as the consumer for websockets.

This class is responsible for establishing a websocket connection with the user's client.
It also accesses the Redis database through CHANNEL_LAYERS, declared in settings.py in the same folder, to send real-time events to the user.
The channel groups are based on the user name.
! Be aware that logreader also sends updates to the client once a connection is established through this consumer class.
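As a rough illustration of the channel-layer declaration mentioned above, a settings fragment might look like the sketch below. The backend path `channels_redis.core.RedisChannelLayer` is the Redis backend shipped by the channels_redis package; the environment variable names `REDIS_HOST` and `REDIS_PORT` are assumptions for this example and may differ from what the project's .env actually defines.

```python
import os

# Hypothetical variable names: replace with whatever the project's .env defines.
REDIS_HOST = os.environ.get("REDIS_HOST", "127.0.0.1")
REDIS_PORT = int(os.environ.get("REDIS_PORT", "6379"))

# Django Channels reads this setting to find the layer used for group messaging.
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "channels_redis.core.RedisChannelLayer",
        "CONFIG": {
            # One (host, port) tuple per Redis server.
            "hosts": [(REDIS_HOST, REDIS_PORT)],
        },
    },
}
```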

1.2 logreader.py

Our current server implementation creates a temporary empty log file and then waits in a blocking loop for changes to emba.log via the inotify wrapper.
Whenever the log file changes, the difference between emba.log and the temporary log file is computed with Python's difflib and passed to an RxPY method for further processing.
After the relevant information is extracted from emba.log, a temporary message dictionary is updated and appended to a global dictionary that contains all messages for all running processes.
This message dictionary is cached/committed to the database in the form of a JSONField and sent via the channel layer (WS, ASGI).
The inotify reader class uses a Python wrapper for the Linux inotify(7) API. In our implementation, it adds a watch on emba.log and sends events whenever the file changes. In this way we can trigger the next steps for live reading of emba.log.
More details can be found in embark/embark/logreader.py
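The diffing step described above can be sketched with the standard library alone. This is a minimal illustration, not the code from logreader.py: the function name `new_log_lines` and the sample log contents are made up for the example, and the inotify and RxPY parts are left out.

```python
import difflib
from typing import List


def new_log_lines(previous: str, current: str) -> List[str]:
    """Return the lines present in `current` but not in `previous`."""
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    # ndiff prefixes added lines with "+ "; strip the two-character marker.
    return [line[2:] for line in diff if line.startswith("+ ")]


# Simulate two successive states of a log file (contents are placeholders).
snapshot = ""
for content in ("first entry\n", "first entry\nsecond entry\n"):
    added = new_log_lines(snapshot, content)
    print(added)
    snapshot = content  # the snapshot plays the role of the temporary log copy
```

Keeping a snapshot and diffing against it means only the newly appended lines have to be parsed on each file-change event.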

1.3 routing.py

Like urls.py, this file contains the routes that map websockets to the corresponding extensions of the AsyncWebsocketConsumer class from django-channels.

1.4 runapscheduler.py

Started as a command in entrypoint.sh via python3 manage.py runapscheduler --test &, so it runs as a separate task.
It starts a logger that samples the system load at a predefined interval and saves it into the ResourceTimestamp model.
It also registers a cleanup task to prevent excessive database usage.
It relies on apscheduler for adding tasks.
Flags:
- <> (no flag): sample every hour, delete after 2 weeks
- --test: sample every second, delete after 5 minutes
For more details see the code in embark/uploader/management/commands/runapscheduler.py or the official Django documentation: https://docs.djangoproject.com/en/3.2/howto/custom-management-commands/
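The flag behaviour listed above can be illustrated with plain argparse, which Django management commands also use for their arguments. This is a sketch only: `build_parser` and `schedule_for` are hypothetical names, and the actual command wires its intervals into apscheduler rather than returning them.

```python
import argparse
from datetime import timedelta


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="runapscheduler")
    parser.add_argument(
        "--test",
        action="store_true",
        help="sample every second and delete samples after 5 minutes",
    )
    return parser


def schedule_for(test_mode: bool):
    """Return (sampling interval, retention period) for the chosen mode."""
    if test_mode:
        return timedelta(seconds=1), timedelta(minutes=5)
    return timedelta(hours=1), timedelta(weeks=2)


args = build_parser().parse_args(["--test"])
interval, retention = schedule_for(args.test)
print(interval, retention)
```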

2 uploader

This app is responsible for uploading and scanning firmware, running EMBA commands, and saving results and metadata into the SQL database. It also handles the execution of EMBA in boundedExecutor.py.

2.1 boundedexecutor

The BoundedExecutor class is an extended wrapper around Python's ThreadPoolExecutor that supports a selectable, finite upper bound on running processes. Internally it uses an additional BoundedSemaphore to track the current load: on submit() the semaphore is decremented, and when a process finishes it is incremented again.
On submit_*, the Archiver class is used to execute all higher processing tasks with the blocking subprocess.call.
Methods are documented properly in embark/uploader/boundedExecutor.py
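The semaphore pattern described above can be sketched as follows. This is an illustration under assumptions, not the project's class: the name `BoundedExecutorSketch`, the `bound` parameter, and the choice of a blocking acquire are all made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import BoundedSemaphore


class BoundedExecutorSketch:
    """ThreadPoolExecutor wrapper that caps the number of in-flight tasks."""

    def __init__(self, bound: int):
        self._executor = ThreadPoolExecutor(max_workers=bound)
        self._semaphore = BoundedSemaphore(bound)

    def submit(self, fn, *args, **kwargs):
        # Decrement the semaphore; blocks while `bound` tasks are in flight.
        self._semaphore.acquire()
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except Exception:
            # Submission failed, so give the slot back immediately.
            self._semaphore.release()
            raise
        # Increment the semaphore again once the task has finished.
        future.add_done_callback(lambda _future: self._semaphore.release())
        return future

    def shutdown(self, wait: bool = True):
        self._executor.shutdown(wait=wait)


if __name__ == "__main__":
    executor = BoundedExecutorSketch(bound=2)
    futures = [executor.submit(pow, 2, n) for n in range(5)]
    print([f.result() for f in futures])  # [1, 2, 4, 8, 16]
    executor.shutdown()
```

A non-blocking acquire that rejects the submission when the bound is reached would be an equally valid design; the source does not say which variant the project uses.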

2.2 archiver

The Archiver class is an extended wrapper around shutil that supports all common archive formats.
It supports packing, unpacking and validating of different archive types.

Methods are documented properly in embark/uploader/archiver.py
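To give an idea of what packing, unpacking and validating can look like on top of shutil, here is a small sketch. The function names are hypothetical and do not mirror the Archiver methods; validation is reduced to an extension check against the formats shutil knows how to unpack.

```python
import os
import shutil
from pathlib import Path


def supported_extensions():
    """Collect every file extension that shutil can unpack."""
    return {
        ext
        for _name, extensions, _description in shutil.get_unpack_formats()
        for ext in extensions
    }


def is_supported_archive(path) -> bool:
    name = Path(path).name.lower()
    return any(name.endswith(ext) for ext in supported_extensions())


def pack(src_dir, base_name, fmt="zip") -> str:
    """Pack src_dir into base_name.<ext> and return the archive path."""
    return shutil.make_archive(base_name, fmt, root_dir=src_dir)


def unpack(archive_path, dest_dir) -> str:
    """Unpack the archive into dest_dir, creating the directory if needed."""
    os.makedirs(dest_dir, exist_ok=True)
    shutil.unpack_archive(archive_path, dest_dir)
    return dest_dir
```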

Unit tests are in test_archiver.py.

3 users

TODO/functionality missing

3.1 changing password

4 reporter

Wraps all of EMBA's report directory into something that can be accessed. !! Should soon be replaced with EMBArk's own reporting instead of serving dynamic files.

5 dashboard

Application for all dashboard pages which are dedicated to Query->process->display

6 porter

Import/export application. ! Still in progress

Frontend

ServiceDashboard.js: This script runs on the client side and takes the messages sent via websocket to show live information about the EMBA processes running in the backend. Currently, it displays the percentage, the current module, and the current phase of each EMBA process. For each EMBA process it shows a container with this information, labeled by the firmware name.

- `Socket.onmessage`: Once the socket connection is established and the socket delivers a message, this function creates the container and binds the messages to it.
- `makeProgress()`  : Updates the progress bar with the percentage of progress made in analyzing the firmware.
- `livelog_phase()` : Appends new phase messages to the container.
- `livelog_module()`: Appends new module messages to the container.
- `cancelLog()`     : Removes the container from the UI.

alertBox.js: This script provides alert functions that display success and error alerts to the user. These functions can be used across the project whenever the user needs to be notified.

- `errorAlert()`    : Displays an alert message if something has failed.
- `successAlert()`  : Displays a success message to the user.
- `promptAlert()`   : Prompts the user for any required input.

accumulatedReport.js: This script generates reports from the data collected while analyzing the firmware. These reports are displayed in the main dashboard.

- `getRandomColors()`        : Gets random colors for the charts.
- `get_accumulated_reports()`: Gets the accumulated data of all analyzed firmware scans.
- `promptAlert()`            : Prompts the user for any required input.

fileUpload.js: This script allows the user to upload firmware files and makes POST calls to the backend to save the files locally.

- `showFiles()`: Binds the file name to a div that is displayed to the user.
- `postFiles()`: Makes an Ajax call and saves the files locally.

main.js: This script contains the functionality for the navigation menu and for deleting firmware.

- `expertModeOn()` : Toggles the expert mode option while analyzing the firmware.
- `confirmDelete()`: Shows a confirmation window asking the user to confirm deletion of a firmware file.

mainDashboard.js: This script generates the report of the system load and also validates the login.

- `check_login()`  : Validates the login.
- `get_load()`     : Gets the load as time, CPU percentage, and memory percentage.

Docker and docker-compose

Currently there are only 2 containers, MySQL and Redis, which run the backend operations.

MySQL DB

We use a MySQL database as the default database of our Django application. It is part of docker-compose.yml in the root of our repo. This database serves the following purposes:

1. It houses all the models in the apps.
2. As a consequence of point 1, it stores the locations and respective commands for firmware.
3. It also acts as the datastore for results after processing.
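For orientation, a default MySQL database is usually declared in the Django settings along the lines of the sketch below. The engine path `django.db.backends.mysql` is Django's built-in MySQL backend; every environment variable name here is an assumption for the example and may not match the project's .env.

```python
import os

# Hypothetical variable names: the project's .env may use different ones.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": os.environ.get("DATABASE_NAME", ""),
        "USER": os.environ.get("DATABASE_USER", ""),
        "PASSWORD": os.environ.get("DATABASE_PASSWORD", ""),
        "HOST": os.environ.get("DATABASE_HOST", "127.0.0.1"),
        "PORT": os.environ.get("DATABASE_PORT", "3306"),
    },
}
```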

Redis

We also use a Redis database for caching intermediate results and events from the various emba.sh runs. These events are not persisted permanently. Redis is mostly used as a queue that holds these events until they are pushed to the frontend through websockets.

Setup (refer to Build Documentation)

docker-compose takes care of all the setup. The variables inside .env are required to access the Redis container.

uwsgi

We use Apache as our WSGI server, which is configured and run with the pip package mod-wsgi-standalone.

asgi

For live data exchange between server and client, Django Channels provides websocket communication via ASGI. For server-side handling, the framework has the consumer class in consumers.py in place, whereas for client-side handling you only have to open a websocket connection to the IP and port specified in the backend. As a result, multiple clients can connect to the backend. The URL routing is declared in routing.py, which is the equivalent of urls.py for HTTP communication.

Guidelines

testing

!! Work in progress. The project uses the Django testing environment. To write your own unit test you need a Python file whose name starts with test_. Within it there needs to be a class extending TestCase. On test execution, all methods of the test classes whose names start with test are invoked and run.

Existing test cases:

test_archiver.py: For testing Archiver class in embark/uploader/archiver.py
test_boundedExecutor.py: For testing BoundedExecutor class in embark/uploader/boundedExecutor.py
test_users.py: Tests for embark/users/views.py
test_logreader.py: Tests for embark/embark/logreader.py

A pipeline checks for regressions by running the Django test environment: python manage.py test.
You are encouraged to run the tests locally beforehand.
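The shape of such a test file can be sketched with the standard library's unittest.TestCase, which Django's TestCase builds on. The helper `percentage` and the class `TestPercentage` are invented for this example; in the project you would import TestCase from django.test and test real project code.

```python
import unittest


def percentage(done: int, total: int) -> int:
    """Hypothetical helper under test: progress as an integer percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (done * 100) // total


class TestPercentage(unittest.TestCase):
    def test_half(self):
        self.assertEqual(percentage(1, 2), 50)

    def test_rejects_zero_total(self):
        with self.assertRaises(ValueError):
            percentage(1, 0)

    def helper_not_collected(self):
        # Not run by the test loader: the name does not start with "test".
        self.fail("should never be called")
```

Saved as a file whose name starts with test_ inside an app, a class like this is discovered and executed by python manage.py test.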

logging and debugging

For logging, use Django's logging environment.
Configuration can be found in embark/settings.py as LOGGING. Logs can be inspected at embark/*.log
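Django's LOGGING setting uses the dictConfig schema from Python's logging module, so its general shape can be tried out without Django. The sketch below is illustrative only: the logger name `web`, the format string, and the `build_logging_config` helper are assumptions and do not reproduce the project's actual configuration.

```python
import logging
import logging.config


def build_logging_config(log_file: str) -> dict:
    """Return a dictConfig-style dictionary that writes one logger to a file."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "simple": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
        },
        "handlers": {
            "file": {
                "class": "logging.FileHandler",
                "filename": log_file,
                "formatter": "simple",
                "level": "DEBUG",
            },
        },
        "loggers": {
            # Hypothetical logger name, used only for this example.
            "web": {"handlers": ["file"], "level": "DEBUG", "propagate": False},
        },
    }
```

In application code, a module would then obtain its logger with logging.getLogger("web") and call the usual debug, info, or error methods.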

For further reading, see the Django logging how-to.