Database - serlo/documentation GitHub Wiki

Almost all of Serlo's data is currently stored in a MySQL database. The database is accessed through the database-layer, which provides HTTP endpoints for predefined queries and mutations (see the article about the Serlo infrastructure).

How to download anonymized dumps of the database

Each day, an anonymized dump of the database is stored in the bucket gs://anonymous-data. To access it, you need gsutil installed (see "How to install Google Cloud CLI"), you need to be logged in with your @serlo.org account, and that account needs access to the bucket. You can then list the available dumps via gsutil ls gs://anonymous-data:

$ gsutil ls gs://anonymous-data | tail
gs://anonymous-data/dump-2023-02-25.zip
gs://anonymous-data/dump-2023-02-26.zip
gs://anonymous-data/dump-2023-02-27.zip
gs://anonymous-data/dump-2023-02-28.zip
gs://anonymous-data/dump-2023-03-01.zip
gs://anonymous-data/dump-2023-03-02.zip
gs://anonymous-data/dump-2023-03-03.zip
gs://anonymous-data/dump-2023-03-04.zip
gs://anonymous-data/dump-2023-03-05.zip
gs://anonymous-data/dump-2023-03-06.zip
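
Because the dump names in the listing carry an ISO date, sorting them as plain text also sorts them by date. The following is a minimal sketch of how the newest dump could be picked from such a listing. The function name latest_dump is hypothetical, and the sketch assumes every dump follows the dump-YYYY-MM-DD.zip pattern shown above.

```shell
# Hypothetical helper: print the newest dump URL from a `gsutil ls` listing on stdin.
# Assumes all dumps are named dump-YYYY-MM-DD.zip, as in the listing above,
# so that lexicographic order equals chronological order.
latest_dump() {
  grep -E '/dump-[0-9]{4}-[0-9]{2}-[0-9]{2}\.zip$' | sort | tail -n 1
}

# Usage (requires gsutil and access to the bucket):
#   gsutil ls gs://anonymous-data | latest_dump
```

Filtering with grep before sorting keeps any other objects that might live in the bucket from being picked by accident.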

You can download a dump to your local storage via gsutil cp:

$ gsutil cp gs://anonymous-data/dump-2023-03-06.zip /tmp
Copying gs://anonymous-data/dump-2023-03-06.zip...
/ [1 files][ 87.5 MiB/ 87.5 MiB]
Operation completed over 1 objects/87.5 MiB.

The downloaded archive contains a MySQL dump mysql.sql of the database (without the data of the user table) and an anonymized dump user.csv of the user table.
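
After unpacking the archive, it can be useful to confirm that both files are present before importing anything. The following is a minimal sketch. The function name check_dump_dir and the target directory /tmp/dump are hypothetical, and the sketch assumes that mysql.sql and user.csv sit at the top level of the archive.

```shell
# Hypothetical helper: check that an unpacked dump directory contains both
# files described above. Assumes they sit at the top level of the archive.
check_dump_dir() {
  dir="$1"
  for f in mysql.sql user.csv; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
  echo "ok: $dir"
}

# Usage (requires unzip and a previously downloaded dump):
#   unzip /tmp/dump-2023-03-06.zip -d /tmp/dump && check_dump_dir /tmp/dump
```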

How to start a local version of the database

The easiest way to start a local version of the database is to use the repository database-layer.

Requirements

  • You need to install docker and docker-compose.
  • If you want to use current anonymized dumps (see the section above), you also need gsutil installed.

Setup

  1. Clone the database-layer repository.
  2. Run yarn start to start a local version of the database via docker-compose. You will have access to the database via mysql://root:secret@localhost:3306/serlo:
User: root
Password: secret
Port: 3306
Database: serlo
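
If a MySQL command-line client is installed on the host, the connection details above can be wrapped in a small helper. The following is a minimal sketch, not part of the database-layer repository: the function name serlo_mysql and the SERLO_DB_* variables are hypothetical. It uses 127.0.0.1 rather than localhost because the mysql client on Unix-like systems treats localhost as a request for a local socket file, whereas the container is reached over TCP.

```shell
# Defaults taken from the connection details above.
# The SERLO_DB_* variable names are hypothetical; override them if your setup differs.
: "${SERLO_DB_HOST:=127.0.0.1}"
: "${SERLO_DB_PORT:=3306}"
: "${SERLO_DB_USER:=root}"
: "${SERLO_DB_PASSWORD:=secret}"
: "${SERLO_DB_NAME:=serlo}"

# Hypothetical helper: run the mysql client against the local database.
# Any extra arguments are passed through to the client unchanged.
serlo_mysql() {
  mysql --host="$SERLO_DB_HOST" --port="$SERLO_DB_PORT" \
        --user="$SERLO_DB_USER" --password="$SERLO_DB_PASSWORD" \
        "$SERLO_DB_NAME" "$@"
}

# Usage (requires the database started via yarn start):
#   serlo_mysql -e 'SHOW TABLES'
```

Passing the password on the command line is acceptable here only because it is the fixed development password of the local container.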

Useful commands

  • yarn mysql – Start a shell for the database
  • yarn mysql:import-anonymous-data – Import a current anonymized dump of the Serlo database (normally one day old)
  • yarn mysql:rollback – Roll back to the 2015 dump of the database