Database explained - serlo/documentation GitHub Wiki
THIS PAGE IS OUTDATED
If you connect to the database, you're going to see a lot of tables and maybe feel lost. We explain here only the most important tables for you. Actually, the others may not be used anymore or will be deprecated soon.
General
Instance and Language
Table instance
stores data about the subdomains (like https://*de*.serlo.org or https://*es*.serlo.org), while the language
table has info about language of the instances.
SELECT * FROM instance LIMIT 1;
+----+-------------+---------+-----------+
| id | language_id | name | subdomain |
+----+-------------+---------+-----------+
| 1 | 1 | Deutsch | de |
+----+-------------+---------+-----------+
Uuid
The concept of uuid is very important in Serlo's app.
The uuid table holds ids that are used in some other tables as id.
SELECT * FROM uuid LIMIT 6;
+----+---------+---------------+
| id | trashed | discriminator |
+----+---------+---------------+
| 1 | 0 | user |
| 2 | 0 | user |
| 3 | 0 | taxonomyTerm |
| 4 | 0 | page |
| 5 | 0 | taxonomyTerm |
| 6 | 0 | user |
+----+---------+---------------+
SELECT id, email, username FROM user LIMIT 3;
+----+--------------------+----------+
| id | email | username |
+----+--------------------+----------+
| 1 | [email protected] | admin |
| 2 | [email protected] | yolobird |
| 6 | [email protected] | kathi |
+----+--------------------+----------+
SELECT id, taxonomy_id, term_id FROM term_taxonomy LIMIT 2;
+----+-------------+---------+
| id | taxonomy_id | term_id |
+----+-------------+---------+
| 3 | 1 | 1 |
| 5 | 3 | 3 |
+----+-------------+---------+
Notice the jumping ids in user
and term_taxonomy
. They correspond to the order in table uuid
.
The discriminator
column gives a hint in which table the corresponding uuid can be found.
SELECT DISTINCT discriminator FROM uuid;
+----------------+
| discriminator |
+----------------+
| attachment |
| blogPost |
| comment |
| entity |
| entityRevision |
| page |
| pageRevision |
| taxonomyTerm |
| user |
+----------------+
Page
These are static pages in the site. Example: https://de.serlo.org/21413/geschichte.
Page Revisions
These are versions of the page.
SELECT * FROM page_revision where page_repository_id = 16256 LIMIT 2;
+-------+-----------+--------------------+----------+------------------...-+---------------------+---------+
| id | author_id | page_repository_id | title | content | date | trashed |
+-------+-----------+--------------------+----------+------------------...-+---------------------+---------+
| 16258 | 10 | 16256 | Das Team | [[{"col":24,"cont... | 2014-03-07 00:28:09 | 0 |
| 16278 | 10 | 16256 | Das Team | [[{"col":24,"cont... | 2014-03-07 09:41:40 | 0 |
+-------+-----------+--------------------+----------+----------------------+---------------------+---------+
Entity
Entity comprises the following types:
SELECT DISTINCT name AS type_name FROM type JOIN entity ON type.id = entity.type_id;
+-----------------------------------------+
| type_name |
+-----------------------------------------+
| applet |
| article |
| course |
| course-page |
| event |
| grouped-text-exercise |
| input-expression-equal-match-challenge |
| input-number-exact-match-challenge |
| input-string-normalized-match-challenge |
| math-puzzle |
| multiple-choice-right-answer |
| multiple-choice-wrong-answer |
| single-choice-right-answer |
| single-choice-wrong-answer |
| text-exercise |
| text-exercise-group |
| text-solution |
| video |
+-----------------------------------------+
Entity Revision
Entities have revisions, which are like versions.
author_id
points to an element in the user
table and repository_id
to an entity
.
SELECT id, author_id, repository_id FROM entity_revision LIMIT 5;
+------+-----------+---------------+
| id | author_id | repository_id |
+------+-----------+---------------+
| 1496 | 6 | 1495 |
| 1498 | 6 | 1497 |
| 1500 | 6 | 1499 |
| 1502 | 6 | 1501 |
| 1504 | 6 | 1503 |
+------+-----------+---------------+
The current_revision_id
at the entity
table points to the active ("checked out") revision.
Entity Revision Fields
The data of an entity revision are in the entity_revision_field
table. For example:
SELECT * FROM entity_revision_field WHERE entity_revision_id = 1496;
+----+---------+--------------------+------------------------------------------------------------------------------------------...--+
| id | field | entity_revision_id | value |
+----+---------+--------------------+-------------------------------------------------------------------------------------------...-+
| 1 | title | 1496 | Addition |
| 2 | content | 1496 | [[{"col":24,"content":"Die **Addition**, umgangssprachlich auch **Plus-Rechnen** genannt, ... |
+----+---------+--------------------+-------------------------------------------------------------------------------------------...-+
See https://de.serlo.org/mathe/1495/addition (uuid 1496 is an entity revision of uuid 1495)
Taxonomy
Taxonomy are tree structures that organize taxonomies and entities. One entity can be in many taxonomies. These are the possible taxonomy types:
SELECT DISTINCT name AS taxonomy_types FROM type JOIN taxonomy ON type.id = type_id;
+-------------------------+
| taxonomy_types |
+-------------------------+
| blog |
| curriculum |
| curriculum-topic |
| curriculum-topic-folder |
| forum |
| forum-category |
| locale |
| root |
| subject |
| topic |
| topic-folder |
+-------------------------+
The root
is the initial taxonomy that contains children and no parent.
SELECT id, parent_id FROM term_taxonomy LIMIT 3;
+-------+-----------+
| id | parent_id |
+-------+-----------+
| 3 | |
...
| 5 | 3 |
+-------+-----------+
The taxonomy 5 is child from taxonomy 3 which in its turn doesn't have a parent. See https://de.serlo.org/3 and https://de.serlo.org/5
Taxonomy Term
Term plays the role of a title of the taxonomy.
SELECT * FROM term LIMIT 5;
+------+-------------+---------------------------------------+
| id | instance_id | name |
+------+-------------+---------------------------------------+
| 148 | 1 | zwei Gleichungen mit 2 Unbekannten |
| 2594 | 1 | Aufgaben zur Division von Brüchen |
| 3660 | 5 | உவமை அணி |
| 2777 | 1 | Magnetismus und Elektromagnetismus |
| 2773 | 1 | Optik |
+------+-------------+---------------------------------------+
term_taxonomy
holds a description of the term, if any.
The weitght
column determines its order: the higher the weight, the later is the order in relation to its siblings.
SELECT * FROM term_taxonomy LIMIT 5;
+------+-------------+---------+-----------+-------------...-+--------+
| id | taxonomy_id | term_id | parent_id | description | weight |
+------+-------------+---------+-----------+-------------...-+--------+
| 3 | 1 | 1 | | | |
| 5 | 3 | 3 | 3 | | 27 |
| 7 | 4 | 927 | 1397 | Artikel und ... | 17 |
| 8 | 5 | 1223 | 3 | | 15 |
| 1288 | 4 | 6 | 5 | [[{"col":24 ... | 7 |
+------+-------------+---------+-----------+-------------...-+--------+
Entities and Taxonomies
The relation between entities and taxonomies are established in the term_taxonomy_entity
table.
SELECT * FROM term_taxonomy_entity LIMIT 5;
+-----+-----------+------------------+----------+
| id | entity_id | term_taxonomy_id | position |
+-----+-----------+------------------+----------+
| 319 | 2131 | 7 | 184 |
| 345 | 2183 | 7 | 112 |
| 346 | 2185 | 7 | 113 |
| 410 | 3089 | 1401 | 2 |
| 445 | 3473 | 1318 | 4 |
+-----+-----------+------------------+----------+
The position
column determines its order: the higher the position, the later is the order in relation to its siblings.
Comment
The uuid_id
tells to which repository (entity or taxonomy) it is a comment of (v.g. https://de.serlo.org/239758).
If a comment has a parent_id
it means it is a comment about a comment (v.g. https://de.serlo.org/239769).
SELECT id, uuid_id, parent_id, title, content FROM comment LIMIT 4;
+-------+---------+-----------+---------+--------------------------------------------+
| id | uuid_id | parent_id | title | content |
+-------+---------+-----------+---------+--------------------------------------------+
| 15468 | 7 | | Artikel | Hier sind die Artikel aus dem alten Serlo! |
| 15469 | | 15468 | | Bitte neu einsortieren :) |
+-------+---------+-----------+---------+--------------------------------------------+
The comment_vote
table has info about upvoted comments.
In term_taxonomy_comment
table there are additional taxonomies to which a comment belong.
Subscription
User can subscribe to certain elements in order to receive notifications in case of updates.
At subscription
table the notify_mailman
column determines if user are going to receive emails.
SELECT * FROM subscription ORDER BY id DESC LIMIT 5;
+------+---------+---------+----------------+---------------------+
| id | uuid_id | user_id | notify_mailman | date |
+------+---------+---------+----------------+---------------------+
| 9478 | 35624 | 6 | 1 | 2022-02-02 14:34:45 |
| 9477 | 35564 | 1 | 1 | 2020-01-28 19:39:44 |
| 9476 | 35551 | 1 | 1 | 2019-12-02 21:33:30 |
| 9475 | 9561 | 1 | 1 | 2019-12-02 11:46:13 |
| 9474 | 35547 | 1 | 1 | 2019-12-01 20:30:08 |
+------+---------+---------+----------------+---------------------+
Event
There are types of events:
SELECT name FROM event;
+-----------------------------+
| name |
+-----------------------------+
| discussion/comment/archive |
| discussion/comment/create |
| discussion/create |
| discussion/restore |
| entity/create |
| entity/link/create |
| entity/link/remove |
| entity/revision/add |
| entity/revision/checkout |
| entity/revision/reject |
| license/object/set |
| taxonomy/term/associate |
| taxonomy/term/create |
| taxonomy/term/dissociate |
| taxonomy/term/parent/change |
| taxonomy/term/update |
| uuid/restore |
| uuid/trash |
+-----------------------------+
Event Logs
The actual events are stored in event_log
table, in which the event_id
column points to the event
table, actor_id
to the user
table.
SELECT log.id, log.actor_id, user.username AS actor_name, log.event_id, event.name AS event_name, uuid.discriminator AS uuid_type
FROM event_log log
LEFT JOIN user ON log.actor_id = user.id
LEFT JOIN event ON event.id = event_id
LEFT JOIN uuid ON uuid.id = log.uuid_id
ORDER BY log.id LIMIT 1;
+----+----------+------------+----------+-------------------------+--------------+
| id | actor_id | actor_name | event_id | event_name | uuid_type |
+----+----------+------------+----------+-------------------------+--------------+
| 1 | 6 | kathi | 1 | taxonomy/term/associate | taxonomyTerm |
+----+----------+------------+----------+-------------------------+--------------+
Event Parameters
Events can have additional information, that is stored in event_parameter
table.
SELECT ep.id, log_id, name_id, name AS param_name, epu.uuid_id, uuid.discriminator AS uuid_type FROM event_parameter ep
JOIN event_parameter_name epn ON name_id = epn.id JOIN event_parameter_uuid epu ON ep.id = epu.event_parameter_id
JOIN uuid ON epu.uuid_id = uuid.id limit 1;
+----+--------+---------+------------+---------+-----------+
| id | log_id | name_id | param_name | uuid_id | uuid_type |
+----+--------+---------+------------+---------+-----------+
| 1 | 1 | 1 | object | 1199 | blogPost |
+----+--------+---------+------------+---------+-----------+
These are possible keys...
SELECT * FROM event_parameter_name;
+----+------------+
| id | name |
+----+------------+
| 1 | object |
| 2 | repository |
| 3 | parent |
| 4 | on |
| 5 | discussion |
| 6 | reason |
| 7 | from |
| 8 | to |
+----+------------+
... that are linked to a) elements in the uuid table according to event_parameter_uuid
table (like the query example above), and/or b) data in the event_parameter_string
table (v.g. event parameters of key 'reason').