Employee Database UseCase Doc - lambda-land/VDBMS GitHub Wiki
- Install original employees example database in https://github.com/datacharmer/test_db .
- Run query in
VDBMS/databases/employeeDB/step1_chop_employees.sql(to create view table based on schema evolution example). - Run query in
VDBMS/databases/employeeDB/step2_build_vdb.sql(to create variational database for employee use case). - Migrate data from MySQL to PostgreSQL using pgloader:
- install the the pgloader in pgloader github link
- edit content in the lisp script located at:
/VDBMS/databases/employeeDB/step3_migrate.lisp - run the lisp script to migrate: run command
pgloader step3_migrate.lispin terminal
- If you couldn't get setp 4 to work follow the steps 4-9 except step 8 for creating enron email vdb, here, and adjust it to the employee database.
Link: https://github.com/lambda-land/VDBMS/tree/VDBMSCaseStudy/src/VDBMS/UseCases/EmployeeUseCase
The schema evolution of employee database is from paper Managing and Querying Transaction-time Databases under Schema Evolution
The data set in MySql format is from https://github.com/datacharmer/test_db
- v[x]_[table name]: the table in version x
- v_[table name]: the table in variational database
- Plain database schema
- schema version 1
- v1_engineerpersonnel (empno, name, hiredate, title, deptname)
- v1_otherpersonnel (empno, name, hiredate, title, deptname)
- job (title, salary)
- schema version 2
- v2_empacct (empno, name, hiredate, title, deptname)
- job (title, salary)
- schema version 3
- v3_empacct (empno, name, hiredate, title, deptno)
- job (title, salary)
- dept (deptname, deptno, managerno)
- schema version 4
- v4_empacct (empno, hiredate, title, deptno)
- job (title, salary)
- dept (deptname, deptno, managerno)
- v4_empbio (empno, sex, birthdate, name)
- schema version 5
- v5_empacct (empno, hiredate, title, deptno, salary)
- dept (deptname, deptno, managerno)
- v5_empbio (empno, sex, birthdate, firstname, lastname)
- schema version 1
- variational database
- v_engineerpersonnel (empno, name, hiredate, title, deptname, presCond)
- v_engineerpersonnel (empno, name, hiredate, title, deptname, presCond)
- v_job (title, salary, presCond)
- v_empacct (empno, name, hiredate, title, deptname, deptno, salary, presCond)
- v_dept (deptname, deptno, managerno, presCond)
- v_empbio (empno, sex, birthdate, name,lastname,firstname, presCond)
Note that elem == elemTrue (Element has feature expression as True, we omit True here.
- v_engineerpersonnelv1(empno, name, hiredate, title, deptname)
- v_otherpersonnelv1 (empno, name, hiredate, title, deptname)
- v_empacctv2 or v3 or v4 or v5(empno, namev2 or v3, hiredate, title, deptnamev2, deptnov3 or v4 or v5, salaryv5)
- v_jobv2 or v3 or v4 (title, salary)
- v_deptv3 or v4 or v5 (deptname, deptno, managerno)
- v_empbiov4 or v5 (empno, sex, birthdate, namev4, firstnamev5, lastnamev5)
( v1 and not v2 and not v3 and not v4 and not v5) or
(not v1 and v2 and not v3 and not v4 and not v5) or
(not v1 and not v2 and v3 and not v4 and not v5) or
(not v1 and not v2 and not v3 and v4 and not v5) or
(not v1 and not v2 and not v3 and not v4 and v5)
Divided employees into 5 variants based on their hire date. We have 2 options here, but our experiments are based on the option 2.
Option 1: Chop the employees(total: 300024) into 5 parts and do not migrate data of old version to new version
Based on original number of employees.
- variant1:
hire_date < '1988-01-01'- total number: 104967
- variant2:
'1988-01-01' <= hire_date < '1991-01-01'- total number: 85440
- variant3:
'1991-01-01' <= hire_date < '1994-01-01'- total number: 60742
- variant4:
'1994-01-01' <= hire_date < '1997-01-01'- total number: 36524
- variant5:
hire_date >= '1997-01-01'- total number: 12351
- variant1: employees whose
hire_date < '1988-01-01'- total number: 104967
- variant2: employees whose
hire_date < '1991-01-01'- total number: 190407
- variant3: employees whose
hire_date < '1994-01-01'- total number: 251149
- variant4: employees whose
hire_date < '1997-01-01'- total number: 287673
- variant5: employees whose
hire_date < '2000-01-28'- total number: 300024
-
Chop employees into 5 parts (A,B,C,D,E) based on hire_date and migrate from old version to new one. Based on original number of employees whose to_date is '9999-01-01', we only experiment on the "permanent" employee.
- variant1:
hire_date < '1988-01-01'- total number: 83815 (42247 engineer + 41568 other)
- variant2:
hire_date < '1991-01-01'- total number: 152216 (83815 + 68401)
- variant3:
hire_date < '1994-01-01'- total number: 201006 (152216 + 48790)
- variant4:
hire_date < '1997-01-01'- total number: 230297 (201006 + 29291)
- variant5: `hire_date < '2000-01-28'``
- total number: 240124 (230297 + 9827)
- variant1:
-
For v-table
v_engineerpersonnelandv_otherpersonnel:- populate employees in A whose title is Enginner into enginnerpersonnel
- insert the rest of employees of A into otherpersonnel
- The presCond value for these 2 tables shoule be
v1
-
For v-table ``v_empacct`:
- Employees in B,C,D,E will be inserted into v_empacct
| - | empno | name | hiredate | title | deptname | deptno | salary | presCond |
|---|---|---|---|---|---|---|---|---|
| B | xx | xx | xx | xx | xx | NULL | NULL | v2 |
| C | xx | xx | xx | xx | NULL | xx | NULL | v3 |
| D | xx | NULL | xx | xx | NULL | xx | NULL | v4 |
| E | xx | NULL | xx | xx | NULL | xx | xx | v5 |
- For v-table
v_job, Employee in A,B,C,D,E will be inserted in it. - For v-table
v_dept, employes in C, D, E will be inserted in it as follows:
| - | deptname | deptno | managerno | presCond |
|---|---|---|---|---|
| C | xx | xx | xx | v3 |
| D | xx | xx | xx | v4 |
| E | xx | xx | xx | v5 |
- For v-table
v_empbio, employee in D, E will be inserted in it accordingly:
| - | empno | sex | birthdate | name | firstname | lastname | presCond |
|---|---|---|---|---|---|---|---|
| D | xx | xx | xx | xx | NULL | NULL | v4 |
| E | xx | xx | xx | NULL | xx | xx | v5 |
The database is originally created in mySql, and converted to SQLite3 compatible dump by use tool mysql2sqlite.
- The dump file is located in
./VDBMS/databases/employeeDB- mySql dump file:
emp_vdb.sql - sqlite3 dump file:
emp_vdb.db
- mySql dump file: