Employee Database UseCase Doc - lambda-land/VDBMS GitHub Wiki
- Install original employees example database in https://github.com/datacharmer/test_db .
- Run query in
VDBMS/databases/employeeDB/step1_chop_employees.sql
(to create view table based on schema evolution example). - Run query in
VDBMS/databases/employeeDB/step2_build_vdb.sql
(to create variational database for employee use case). - Migrate data from MySQL to PostgreSQL using pgloader:
- install the the pgloader in pgloader github link
- edit content in the lisp script located at:
/VDBMS/databases/employeeDB/step3_migrate.lisp
- run the lisp script to migrate: run command
pgloader step3_migrate.lisp
in terminal
- If you couldn't get setp 4 to work follow the steps 4-9 except step 8 for creating enron email vdb, here, and adjust it to the employee database.
Link: https://github.com/lambda-land/VDBMS/tree/VDBMSCaseStudy/src/VDBMS/UseCases/EmployeeUseCase
The schema evolution of employee database is from paper Managing and Querying Transaction-time Databases under Schema Evolution
The data set in MySql format is from https://github.com/datacharmer/test_db
- v[x]_[table name]: the table in version x
- v_[table name]: the table in variational database
- Plain database schema
- schema version 1
- v1_engineerpersonnel (empno, name, hiredate, title, deptname)
- v1_otherpersonnel (empno, name, hiredate, title, deptname)
- job (title, salary)
- schema version 2
- v2_empacct (empno, name, hiredate, title, deptname)
- job (title, salary)
- schema version 3
- v3_empacct (empno, name, hiredate, title, deptno)
- job (title, salary)
- dept (deptname, deptno, managerno)
- schema version 4
- v4_empacct (empno, hiredate, title, deptno)
- job (title, salary)
- dept (deptname, deptno, managerno)
- v4_empbio (empno, sex, birthdate, name)
- schema version 5
- v5_empacct (empno, hiredate, title, deptno, salary)
- dept (deptname, deptno, managerno)
- v5_empbio (empno, sex, birthdate, firstname, lastname)
- schema version 1
- variational database
- v_engineerpersonnel (empno, name, hiredate, title, deptname, presCond)
- v_engineerpersonnel (empno, name, hiredate, title, deptname, presCond)
- v_job (title, salary, presCond)
- v_empacct (empno, name, hiredate, title, deptname, deptno, salary, presCond)
- v_dept (deptname, deptno, managerno, presCond)
- v_empbio (empno, sex, birthdate, name,lastname,firstname, presCond)
Note that elem == elemTrue (Element has feature expression as True, we omit True here.
- v_engineerpersonnelv1(empno, name, hiredate, title, deptname)
- v_otherpersonnelv1 (empno, name, hiredate, title, deptname)
- v_empacctv2 or v3 or v4 or v5(empno, namev2 or v3, hiredate, title, deptnamev2, deptnov3 or v4 or v5, salaryv5)
- v_jobv2 or v3 or v4 (title, salary)
- v_deptv3 or v4 or v5 (deptname, deptno, managerno)
- v_empbiov4 or v5 (empno, sex, birthdate, namev4, firstnamev5, lastnamev5)
( v1 and not v2
and not v3
and not v4
and not v5
) or
(not v1
and v2 and not v3
and not v4
and not v5
) or
(not v1
and not v2
and v3 and not v4
and not v5
) or
(not v1
and not v2
and not v3
and v4 and not v5
) or
(not v1
and not v2
and not v3
and not v4
and v5)
Divided employees into 5 variants based on their hire date. We have 2 options here, but our experiments are based on the option 2.
Option 1: Chop the employees(total: 300024) into 5 parts and do not migrate data of old version to new version
Based on original number of employees.
- variant1:
hire_date < '1988-01-01'
- total number: 104967
- variant2:
'1988-01-01' <= hire_date < '1991-01-01'
- total number: 85440
- variant3:
'1991-01-01' <= hire_date < '1994-01-01'
- total number: 60742
- variant4:
'1994-01-01' <= hire_date < '1997-01-01'
- total number: 36524
- variant5:
hire_date >= '1997-01-01'
- total number: 12351
- variant1: employees whose
hire_date < '1988-01-01'
- total number: 104967
- variant2: employees whose
hire_date < '1991-01-01'
- total number: 190407
- variant3: employees whose
hire_date < '1994-01-01'
- total number: 251149
- variant4: employees whose
hire_date < '1997-01-01'
- total number: 287673
- variant5: employees whose
hire_date < '2000-01-28'
- total number: 300024
-
Chop employees into 5 parts (A,B,C,D,E) based on hire_date and migrate from old version to new one. Based on original number of employees whose to_date is '9999-01-01', we only experiment on the "permanent" employee.
- variant1:
hire_date < '1988-01-01'
- total number: 83815 (42247 engineer + 41568 other)
- variant2:
hire_date < '1991-01-01'
- total number: 152216 (83815 + 68401)
- variant3:
hire_date < '1994-01-01'
- total number: 201006 (152216 + 48790)
- variant4:
hire_date < '1997-01-01'
- total number: 230297 (201006 + 29291)
- variant5: `hire_date < '2000-01-28'``
- total number: 240124 (230297 + 9827)
- variant1:
-
For v-table
v_engineerpersonnel
andv_otherpersonnel
:- populate employees in A whose title is Enginner into enginnerpersonnel
- insert the rest of employees of A into otherpersonnel
- The presCond value for these 2 tables shoule be
v1
-
For v-table ``v_empacct`:
- Employees in B,C,D,E will be inserted into v_empacct
- | empno | name | hiredate | title | deptname | deptno | salary | presCond |
---|---|---|---|---|---|---|---|---|
B | xx | xx | xx | xx | xx | NULL | NULL | v2 |
C | xx | xx | xx | xx | NULL | xx | NULL | v3 |
D | xx | NULL | xx | xx | NULL | xx | NULL | v4 |
E | xx | NULL | xx | xx | NULL | xx | xx | v5 |
- For v-table
v_job
, Employee in A,B,C,D,E will be inserted in it. - For v-table
v_dept
, employes in C, D, E will be inserted in it as follows:
- | deptname | deptno | managerno | presCond |
---|---|---|---|---|
C | xx | xx | xx | v3 |
D | xx | xx | xx | v4 |
E | xx | xx | xx | v5 |
- For v-table
v_empbio
, employee in D, E will be inserted in it accordingly:
- | empno | sex | birthdate | name | firstname | lastname | presCond |
---|---|---|---|---|---|---|---|
D | xx | xx | xx | xx | NULL | NULL | v4 |
E | xx | xx | xx | NULL | xx | xx | v5 |
The database is originally created in mySql, and converted to SQLite3 compatible dump by use tool mysql2sqlite.
- The dump file is located in
./VDBMS/databases/employeeDB
- mySql dump file:
emp_vdb.sql
- sqlite3 dump file:
emp_vdb.db
- mySql dump file: