Database explained - serlo/documentation GitHub Wiki

THIS PAGE IS OUTDATED

If you connect to the database, you're going to see a lot of tables and maybe feel lost. We explain here only the most important tables for you. Actually, the others may not be used anymore or will be deprecated soon.

General

Instance and Language

Table instance stores data about the subdomains (like https://*de*.serlo.org or https://*es*.serlo.org), while the language table has info about language of the instances.

SELECT * FROM instance LIMIT 1;
+----+-------------+---------+-----------+
| id | language_id | name    | subdomain |
+----+-------------+---------+-----------+
|  1 |           1 | Deutsch | de        |
+----+-------------+---------+-----------+

Uuid

The concept of uuid is very important in Serlo's app.
The uuid table holds ids that are used in some other tables as id.

SELECT * FROM uuid LIMIT 6;
+----+---------+---------------+
| id | trashed | discriminator |
+----+---------+---------------+
|  1 |       0 | user          |
|  2 |       0 | user          |
|  3 |       0 | taxonomyTerm  |
|  4 |       0 | page          |
|  5 |       0 | taxonomyTerm  |
|  6 |       0 | user          |
+----+---------+---------------+

SELECT id, email, username FROM user LIMIT 3;
+----+--------------------+----------+
| id | email              | username |
+----+--------------------+----------+
|  1 | [email protected] | admin    |
|  2 | [email protected]    | yolobird |
|  6 | [email protected]   | kathi    |
+----+--------------------+----------+

SELECT id, taxonomy_id, term_id FROM term_taxonomy LIMIT 2;
+----+-------------+---------+
| id | taxonomy_id | term_id |
+----+-------------+---------+
|  3 |           1 |       1 |
|  5 |           3 |       3 |
+----+-------------+---------+

Notice the jumping ids in user and term_taxonomy. They correspond to the order in table uuid.

The discriminator column gives a hint in which table the corresponding uuid can be found.

SELECT DISTINCT discriminator FROM uuid;
+----------------+
| discriminator  |
+----------------+
| attachment     |
| blogPost       |
| comment        |
| entity         |
| entityRevision |
| page           |
| pageRevision   |
| taxonomyTerm   |
| user           |
+----------------+

Page

These are static pages in the site. Example: https://de.serlo.org/21413/geschichte.

Page Revisions

These are versions of the page.

SELECT * FROM page_revision where page_repository_id = 16256 LIMIT 2;
+-------+-----------+--------------------+----------+------------------...-+---------------------+---------+
| id    | author_id | page_repository_id | title    | content              | date                | trashed |
+-------+-----------+--------------------+----------+------------------...-+---------------------+---------+
| 16258 |        10 |              16256 | Das Team | [[{"col":24,"cont... | 2014-03-07 00:28:09 |       0 |
| 16278 |        10 |              16256 | Das Team | [[{"col":24,"cont... | 2014-03-07 09:41:40 |       0 |
+-------+-----------+--------------------+----------+----------------------+---------------------+---------+

Entity

Entity comprises the following types:

SELECT DISTINCT name AS type_name FROM type JOIN entity ON type.id = entity.type_id;
+-----------------------------------------+
| type_name                               |
+-----------------------------------------+
| applet                                  |
| article                                 |
| course                                  |
| course-page                             |
| event                                   |
| grouped-text-exercise                   |
| input-expression-equal-match-challenge  |
| input-number-exact-match-challenge      |
| input-string-normalized-match-challenge |
| math-puzzle                             |
| multiple-choice-right-answer            |
| multiple-choice-wrong-answer            |
| single-choice-right-answer              |
| single-choice-wrong-answer              |
| text-exercise                           |
| text-exercise-group                     |
| text-solution                           |
| video                                   |
+-----------------------------------------+

Entity Revision

Entities have revisions, which are like versions.
author_id points to an element in the user table and repository_id to an entity.

SELECT id, author_id, repository_id FROM entity_revision LIMIT 5;
+------+-----------+---------------+
| id   | author_id | repository_id |
+------+-----------+---------------+
| 1496 |         6 |          1495 |
| 1498 |         6 |          1497 |
| 1500 |         6 |          1499 |
| 1502 |         6 |          1501 |
| 1504 |         6 |          1503 |
+------+-----------+---------------+

The current_revision_id at the entity table points to the active ("checked out") revision.

Entity Revision Fields

The data of an entity revision are in the entity_revision_field table. For example:

SELECT * FROM entity_revision_field WHERE entity_revision_id = 1496;
+----+---------+--------------------+------------------------------------------------------------------------------------------...--+
| id | field   | entity_revision_id | value                                                                                         |    
+----+---------+--------------------+-------------------------------------------------------------------------------------------...-+
|  1 | title   |               1496 | Addition                                                                                      |
|  2 | content |               1496 | [[{"col":24,"content":"Die **Addition**, umgangssprachlich auch **Plus-Rechnen** genannt, ... |
+----+---------+--------------------+-------------------------------------------------------------------------------------------...-+

See https://de.serlo.org/mathe/1495/addition (uuid 1496 is an entity revision of uuid 1495)

Taxonomy

Taxonomy are tree structures that organize taxonomies and entities. One entity can be in many taxonomies. These are the possible taxonomy types:

SELECT DISTINCT name AS taxonomy_types FROM type JOIN taxonomy ON type.id = type_id;
+-------------------------+
| taxonomy_types          |
+-------------------------+
| blog                    |
| curriculum              |
| curriculum-topic        |
| curriculum-topic-folder |
| forum                   |
| forum-category          |
| locale                  |
| root                    |
| subject                 |
| topic                   |
| topic-folder            |
+-------------------------+

The root is the initial taxonomy that contains children and no parent.

SELECT id, parent_id FROM term_taxonomy LIMIT 3;
+-------+-----------+
| id    | parent_id |
+-------+-----------+
|     3 |           |
 ...
|     5 |         3 |
+-------+-----------+

The taxonomy 5 is child from taxonomy 3 which in its turn doesn't have a parent. See https://de.serlo.org/3 and https://de.serlo.org/5

Taxonomy Term

Term plays the role of a title of the taxonomy.

SELECT * FROM term LIMIT 5;
+------+-------------+---------------------------------------+
| id   | instance_id | name                                  |
+------+-------------+---------------------------------------+
|  148 |           1 |    zwei Gleichungen mit 2 Unbekannten |
| 2594 |           1 |   Aufgaben zur Division von Brüchen   |
| 3660 |           5 |   உவமை அணி                         |
| 2777 |           1 |   Magnetismus und Elektromagnetismus  |
| 2773 |           1 |   Optik                               |
+------+-------------+---------------------------------------+

term_taxonomy holds a description of the term, if any.
The weitght column determines its order: the higher the weight, the later is the order in relation to its siblings.

SELECT * FROM term_taxonomy LIMIT 5;
+------+-------------+---------+-----------+-------------...-+--------+
| id   | taxonomy_id | term_id | parent_id | description     | weight |
+------+-------------+---------+-----------+-------------...-+--------+
|    3 |           1 |       1 |           |                 |        |
|    5 |           3 |       3 |         3 |                 |     27 |
|    7 |           4 |     927 |      1397 | Artikel und ... |     17 |
|    8 |           5 |    1223 |         3 |                 |     15 | 
| 1288 |           4 |       6 |         5 | [[{"col":24 ... |      7 |
+------+-------------+---------+-----------+-------------...-+--------+

Entities and Taxonomies

The relation between entities and taxonomies are established in the term_taxonomy_entity table.

SELECT * FROM term_taxonomy_entity LIMIT 5;
+-----+-----------+------------------+----------+
| id  | entity_id | term_taxonomy_id | position |
+-----+-----------+------------------+----------+
| 319 |      2131 |                7 |      184 |
| 345 |      2183 |                7 |      112 |
| 346 |      2185 |                7 |      113 |
| 410 |      3089 |             1401 |        2 |
| 445 |      3473 |             1318 |        4 |
+-----+-----------+------------------+----------+

The position column determines its order: the higher the position, the later is the order in relation to its siblings.

Comment

The uuid_id tells to which repository (entity or taxonomy) it is a comment of (v.g. https://de.serlo.org/239758). If a comment has a parent_id it means it is a comment about a comment (v.g. https://de.serlo.org/239769).

SELECT id, uuid_id, parent_id, title, content FROM comment LIMIT 4;
+-------+---------+-----------+---------+--------------------------------------------+
| id    | uuid_id | parent_id | title   | content                                    |
+-------+---------+-----------+---------+--------------------------------------------+
| 15468 |       7 |           | Artikel | Hier sind die Artikel aus dem alten Serlo! |
| 15469 |         |     15468 |         | Bitte neu einsortieren :)                  |
+-------+---------+-----------+---------+--------------------------------------------+

The comment_vote table has info about upvoted comments. In term_taxonomy_comment table there are additional taxonomies to which a comment belong.

Subscription

User can subscribe to certain elements in order to receive notifications in case of updates. At subscription table the notify_mailman column determines if user are going to receive emails.

SELECT * FROM subscription ORDER BY id DESC LIMIT 5;
+------+---------+---------+----------------+---------------------+
| id   | uuid_id | user_id | notify_mailman | date                |
+------+---------+---------+----------------+---------------------+
| 9478 |   35624 |       6 |              1 | 2022-02-02 14:34:45 |
| 9477 |   35564 |       1 |              1 | 2020-01-28 19:39:44 |
| 9476 |   35551 |       1 |              1 | 2019-12-02 21:33:30 |
| 9475 |    9561 |       1 |              1 | 2019-12-02 11:46:13 |
| 9474 |   35547 |       1 |              1 | 2019-12-01 20:30:08 |
+------+---------+---------+----------------+---------------------+

Event

There are types of events:

SELECT name FROM event;
+-----------------------------+
| name                        |
+-----------------------------+
| discussion/comment/archive  |
| discussion/comment/create   |
| discussion/create           |
| discussion/restore          |
| entity/create               |
| entity/link/create          |
| entity/link/remove          |
| entity/revision/add         |
| entity/revision/checkout    |
| entity/revision/reject      |
| license/object/set          |
| taxonomy/term/associate     |
| taxonomy/term/create        |
| taxonomy/term/dissociate    |
| taxonomy/term/parent/change |
| taxonomy/term/update        |
| uuid/restore                |
| uuid/trash                  |
+-----------------------------+

Event Logs

The actual events are stored in event_log table, in which the event_id column points to the event table, actor_id to the user table.

SELECT log.id, log.actor_id, user.username AS actor_name, log.event_id, event.name AS event_name, uuid.discriminator AS uuid_type  
FROM event_log log  
LEFT JOIN user ON log.actor_id = user.id  
LEFT JOIN event ON event.id = event_id  
LEFT JOIN uuid ON uuid.id = log.uuid_id  
ORDER BY log.id LIMIT 1;
+----+----------+------------+----------+-------------------------+--------------+
| id | actor_id | actor_name | event_id | event_name              | uuid_type    |
+----+----------+------------+----------+-------------------------+--------------+
|  1 |        6 | kathi      |        1 | taxonomy/term/associate | taxonomyTerm |
+----+----------+------------+----------+-------------------------+--------------+

Event Parameters

Events can have additional information, that is stored in event_parameter table.

SELECT ep.id, log_id, name_id, name AS param_name, epu.uuid_id, uuid.discriminator AS uuid_type FROM event_parameter ep  
JOIN event_parameter_name epn ON name_id = epn.id JOIN event_parameter_uuid epu ON ep.id = epu.event_parameter_id  
JOIN uuid ON epu.uuid_id = uuid.id limit 1;
+----+--------+---------+------------+---------+-----------+
| id | log_id | name_id | param_name | uuid_id | uuid_type |
+----+--------+---------+------------+---------+-----------+
|  1 |      1 |       1 | object     |    1199 | blogPost  |
+----+--------+---------+------------+---------+-----------+

These are possible keys...

SELECT * FROM event_parameter_name;
+----+------------+
| id | name       |
+----+------------+
|  1 | object     |
|  2 | repository |
|  3 | parent     |
|  4 | on         |
|  5 | discussion |
|  6 | reason     |
|  7 | from       |
|  8 | to         |
+----+------------+

... that are linked to a) elements in the uuid table according to event_parameter_uuid table (like the query example above), and/or b) data in the event_parameter_string table (v.g. event parameters of key 'reason').