To get started with the project, you first need to clone the repository and install the required dependencies. This guide will walk you through the steps required to set up the project on your local machine.

Installation

This section walks you through installing and setting up the project, which is configured through a pyproject.toml file.

Requirements

This project requires Python 3.12 or 3.13. Note: The current version of Microsoft GraphRAG requires Python 3.12 or lower, so we recommend Python 3.12 if you intend to work with Microsoft GraphRAG. Because Microsoft GraphRAG is an optional dependency, you can still run the code with Python 3.13 if you do not need it.
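The version constraints above can be checked before installing anything. The following is a minimal sketch and not part of the project; the function name `describe_python_support` is hypothetical, and the version bounds are taken directly from the requirement stated above.

```python
import sys


def describe_python_support(version=None):
    """Classify a (major, minor) Python version against the requirements above."""
    # Fall back to the running interpreter when no version is passed in.
    major, minor = (version or sys.version_info)[:2]
    if (major, minor) == (3, 12):
        return "supported, including the optional Microsoft GraphRAG dependency"
    if (major, minor) == (3, 13):
        return "supported, but without the optional Microsoft GraphRAG dependency"
    return "not supported by this project"


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current}: {describe_python_support()}")
```

Running the script with the interpreter you plan to use prints which of the three cases applies to it.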

Prerequisites: Install and Initialize Git LFS

Large files are stored in Git LFS. If you don't have Git LFS installed, follow the installation instructions here to install it.

After installing Git LFS, initialize it by running the following command in the terminal:

git lfs install
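If you are unsure whether Git LFS is available on your machine, a quick check could look like the sketch below. It is only an illustration: the helper name `git_lfs_available` is hypothetical, and it simply relies on the `git lfs version` command exiting successfully.

```python
import shutil
import subprocess


def git_lfs_available():
    """Return True if `git lfs version` runs successfully on this machine."""
    # Without git on the PATH there is nothing to check.
    if shutil.which("git") is None:
        return False
    try:
        result = subprocess.run(
            ["git", "lfs", "version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return False
    # git exits with a non-zero code when the lfs subcommand is unknown.
    return result.returncode == 0


if __name__ == "__main__":
    print("Git LFS available:", git_lfs_available())
```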

Cloning the project repository

With an open terminal, clone the project’s repository by executing the following command:

git clone https://gitlab.kit.edu/kit/kastel/sdq/stud/abschlussarbeiten/masterarbeiten/marco-schneider/ma-schneider-implementation.git

Note: If you are using a Windows system, you may encounter an error because the file paths are too long. In that case, you have to configure Windows to allow long paths. Open PowerShell with administrator rights and run the following commands:

reg add "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem" /v LongPathsEnabled /t REG_DWORD /d 1

git config --system core.longpaths true

Then restart your system and try the clone process again.
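To see whether the registry setting from the note above is already active, the value can be read with Python's standard `winreg` module. This is a sketch under the assumption that you only want to read the value; the function name `long_paths_enabled` is hypothetical, and the function returns `None` on non-Windows systems where the setting does not exist.

```python
import sys


def long_paths_enabled():
    """Return True/False for the LongPathsEnabled registry value, or None off Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # standard library, only available on Windows

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        # Key or value missing: treat long paths as not enabled.
        return False
    return value == 1


if __name__ == "__main__":
    print("Long paths enabled:", long_paths_enabled())
```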

Once the cloning process is complete, navigate into the project directory:

cd ma-schneider-implementation

Then, pull the large files managed by Git LFS:

git lfs pull
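If the pull did not succeed, large files remain in the working tree as small text pointers. The sketch below looks for such leftovers; it assumes the standard Git LFS pointer format, whose first line is `version https://git-lfs.github.com/spec/v1`, and the helper names `is_lfs_pointer` and `find_unpulled_files` are hypothetical.

```python
from pathlib import Path

# First line of a Git LFS pointer file in the standard pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file still looks like an unresolved Git LFS pointer."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unpulled_files(root="."):
    """List files under root (skipping .git) that are still LFS pointers."""
    unpulled = []
    for path in Path(root).rglob("*"):
        if ".git" in path.parts or not path.is_file():
            continue
        if is_lfs_pointer(path):
            unpulled.append(path)
    return unpulled


if __name__ == "__main__":
    for leftover in find_unpulled_files("."):
        print("Still an LFS pointer:", leftover)
```

Run from the project directory, the script prints nothing when all large files have been downloaded.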

(optional but recommended) Create virtual environment

Running the project in a virtual environment is optional, but strongly recommended if you have multiple Python projects on your device. This short guide shows you how to set up a virtual Python environment with Python's venv module.

Navigate your terminal into the sqa-system folder (where pyproject.toml is located).

cd sqa-system

Create the virtual environment.

python -m venv venv

A new folder called venv has now been created in the project's root directory. It contains the files needed for the virtual environment. Next, the environment needs to be activated.

Activate the virtual environment.

Windows

venv\Scripts\activate

Linux / macOS

source venv/bin/activate

Your terminal should now show (venv) next to the path, which means the virtual environment has been activated. If you encounter a terminal permission error, take a look at this post or use CMD instead of PowerShell.
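Besides the (venv) prompt, activation can also be confirmed from within Python. The following sketch uses the fact that, inside an environment created with the venv module, `sys.prefix` differs from `sys.base_prefix`; the function name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_environment())
    print("Interpreter prefix:", sys.prefix)
```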

Installing the Requirements

This project uses a pyproject.toml file to manage its dependencies, so you can install everything with a single pip command. Make sure your terminal is located in the sqa-system directory (where pyproject.toml is located), then execute one of the following commands. pip will then download and install all dependencies required to run the project.

Note: The Microsoft GraphRAG retriever currently does not work together with the codecarbon package, so you have to choose between installing the codecarbon package and installing the Microsoft GraphRAG retriever.

Run the following command if you want to run the project without the Microsoft GraphRAG retriever:

pip install ".[codecarbon]"

If you plan to use Microsoft GraphRAG (and you are on Python 3.12), run the following command instead. Alternatively, you can first install with the codecarbon package using the command above, run the experiments, uninstall the codecarbon package, and then run the following command to install with Microsoft GraphRAG. This way, emission tracking works for the experiments that do not use Microsoft GraphRAG.

pip install ".[graphrag]"

Note: You may have to install gfortran in order to install the gensim package. On Linux (Debian/Ubuntu), you can do this by running sudo apt install build-essential gfortran python3-dev python3-pip
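Because only one of the two optional extras should be installed at a time, it can help to check which one is present in the current environment. The sketch below is an illustration only: it assumes the packages are importable under the top-level names `codecarbon` and `graphrag`, and the function name `installed_optional_packages` is hypothetical.

```python
from importlib.util import find_spec

# Assumed top-level import names of the two optional packages.
OPTIONAL_PACKAGES = ("codecarbon", "graphrag")


def installed_optional_packages():
    """Map each optional package name to whether it can be imported."""
    return {name: find_spec(name) is not None for name in OPTIONAL_PACKAGES}


if __name__ == "__main__":
    status = installed_optional_packages()
    for name, present in status.items():
        print(f"{name}: {'installed' if present else 'not installed'}")
    if all(status.values()):
        print("Both extras are installed; the note above says they conflict.")
```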

Creating an Account on Weights & Biases and Logging In

To run the experiments, the SQA system uses Weights & Biases (W&B) to track them on its dashboard. You can create a free account on Weights & Biases.

After the account has been created, run the following command in the terminal to log in:

wandb login
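To confirm that the login left usable credentials behind, you could check for them with the sketch below. It rests on two assumptions that may not hold for every setup: that the API key is either provided through the `WANDB_API_KEY` environment variable or stored in your user's `.netrc` file under the host `api.wandb.ai`. The function name `wandb_credentials_present` is hypothetical.

```python
import netrc
import os


def wandb_credentials_present(host="api.wandb.ai"):
    """Return True if a W&B API key is found in the environment or in .netrc."""
    if os.environ.get("WANDB_API_KEY"):
        return True
    try:
        # Reads the default .netrc file in the user's home directory.
        entry = netrc.netrc().authenticators(host)
    except (OSError, netrc.NetrcParseError):
        # No readable or parseable .netrc file.
        return False
    return entry is not None


if __name__ == "__main__":
    print("W&B credentials found:", wandb_credentials_present())
```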

🥳 That's it! You can now use the SQA system. To replicate the experiments, read the tutorial here.

getting_started - Sidies/MasterThesis-HubLink GitHub Wiki
