Work with Database
This article details how to work with a database in an AFS Server. It presents how the database definition scripts must be written, the tools that may help the developer, and how the CI integrates the generation of the required deliverables.
Starting from AFS version 2022.04, the database management is done using a "source code" paradigm. The developer is responsible for the database definition and manages it just like the other source code of the project. The database definition is stored in the OSGi bundles of the product developed to be hosted in an AFS Server. This definition is written in SQL, and the files storing it (.sql files) are versioned to ensure the management of database updates. The location and names of these files follow a very strict nomenclature to allow the automatic build process and other tools to generate the deliverable databases and update scripts.
The database definition files must be stored in a database sub-folder of the META-INF folder of any bundle which is part of the product. You can create a single SQL file for the whole database management or multiple files, for instance next to the entities.xml files they relate to; this is up to you, but you should not mix up the files, i.e. if you create a table in a bundle, manage the evolution of this table in the same bundle.
The file name must be built according to the following pattern: PPPxxx[_yyy][_dbtype].sql
Where:
- PPP is the product code. This value can be a generic code, like "PRODUCT" or "V". The code "AFS" is reserved for the tables relative to the AFS framework.
- xxx is the database version number. This version number, starting from zero, allows managing the evolution of the database and generating the upgrade scripts. The zero version corresponds to the initial creation of your product database from its original release... or the first release that uses this new process.
- yyy is an order number. This optional number allows ordering the file execution within the same version level. For instance, if you have decided to use multiple definition files, some files may need to be executed after others; this number ensures that it will be done as you defined it.
- dbtype is a keyword, equal to h2, pg or db2i, indicating that this file will be used only for the creation of the specified database type, i.e. H2 Database, PostgreSQL or DB2 for i. Please note that all files which do not specify a database type will be executed for all kinds of generation; if one of your views does not have the same representation in H2 and PostgreSQL, you will have to put its declaration into a _h2.sql file, another one into a _db2i.sql file and one into a _pg.sql file, not into an unspecified file, even if the DB2i representation is the same as the PostgreSQL one!
So, for instance, a database definition file applied to the third version of the database, that should be executed in a second phase, and specific to PostgreSQL, must be named: /META-INF/database/V3_1_pg.sql
Notes and recommendations:
- Each new modification (for a new release) must be included in a new version of the SQL file. So, for instance, if somewhere there is a version 2 of the SQL script and you create a new bundle including an SQL modification, you will have to put a version 3 of the file in this bundle. It is also recommended to put two empty script files, versions 1 and 2, in this bundle to ensure the continuity of the SQL versioning, even if these scripts are empty.
- With the same idea, if you have to create a database-specific script, you should also create a generic SQL file with the same version in this bundle.
So for instance, if you create a new bundle requiring a dedicated modification in the H2 database (not applicable to PostgreSQL), and the current version of your database is PRODUCT3.sql, you will have to create a PRODUCT4_h2.sql with your modification in it, but you should also include empty SQL scripts PRODUCT4.sql, PRODUCT3.sql, PRODUCT2.sql, PRODUCT1.sql and PRODUCT0.sql in this bundle, as illustrated below.
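As an illustration, the content of the database folder of this hypothetical bundle could then look like the following layout:

```
META-INF/database/
  PRODUCT0.sql      -- empty, kept for the continuity of the SQL versioning
  PRODUCT1.sql      -- empty
  PRODUCT2.sql      -- empty
  PRODUCT3.sql      -- empty
  PRODUCT4.sql      -- empty generic script for version 4
  PRODUCT4_h2.sql   -- the H2 specific modification
```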
To manage the database updates, the versioning information of the database structure must be included in the database itself. The AFS database management adds a dedicated table, named ARCADDBV, to the database definition:
```sql
create table ARCADDBV (
  DBV_ID integer generated by default as identity primary key,
  DBV_CODE varchar(50),
  DBV_VERSION integer,
  DBV_LABEL varchar(255),
  DBV_UPDATE timestamp default current_timestamp);
```
This table is filled during the database installation and updates, with rows which define the installed versions of the database.
The developer does not have to worry about this table, because the program which integrates the V*.sql files automatically generates the version number according to the file number.
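For illustration only, the rows stored in this table look like the following; the code and label values are hypothetical, they are generated automatically and must not be inserted by your own scripts:

```sql
-- Hypothetical content generated by the installation process, do not write this yourself:
insert into ARCADDBV (DBV_CODE, DBV_VERSION, DBV_LABEL)
  values ('PRODUCT', 3, 'Product database updated to version 3');
```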
The usage of the order numbers is up to you; they will order the execution of the files for the database generation and for the construction of the update scripts. If you choose to use a single file to store the whole declaration of your database, you will not need them. But as your product depends on AFS, and as AFS uses the same process, you may have to use them to connect your part of the database to the one declared by the AFS bundles.
The AFS bundles manage the parts of the database related to the Metadata entities they declare. These tables will be installed and updated before the tables related to the product database, so you can refer to, or alter, these definitions in your own SQL scripts:
- The table ARCADDBV, used for the database versioning, is declared with the default order (zero).
- The tables USERS, USER_TITLES, PROFILES, PROFILES_RIGHTS and the view USER_RIGHTS, related to the user management, are declared with the default order (zero).
- The tables IBMIAUTH, LOCALAUTH, LDAPAUTH and WINAUTH, related to the corresponding authentication schemes, are declared with the order level 2 (two).
Depending on the AFS features you use in your product, you may not have access to all these tables in your database.
The usage of these order numbers in AFS means that if one of your tables is related to the USERS table through a foreign key, then this table, or at least the foreign key, must be declared with an order higher than or equal to 1 (one), as in the sketch below.
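As a minimal sketch, assuming a product code V, a database version 2 and a hypothetical TASKS table whose TSK_OWNER column references the USERS table (the USR_ID primary key name is an assumption to check against the actual AFS definition), the declaration could be placed in a file named /META-INF/database/V2_1.sql:

```sql
-- V2_1.sql: order number 1, executed after the AFS scripts of order 0 which create USERS.
create table TASKS (
  TSK_ID integer generated by default as identity primary key,
  TSK_OWNER integer,
  constraint TSK_OWNER_FK foreign key (TSK_OWNER) references USERS (USR_ID));
```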
This chapter describes the conventions and the generic SQL instructions which can be used to write SQL code usable on all supported databases. As the H2 Database uses a flexible SQL syntax, producing a "generic SQL code" which runs on H2 is not enough: other databases may have a stricter parsing of the SQL code.
Naming
Due to a historical convention, all the names must be written in upper case, with the following concerns (illustrated in the sketch after this list):
- The table and view names are plural names.
- Each column name has a prefix of three letters followed by an underscore and the actual name of the column.
- The primary key column is named _ID.
- The column used for soft deletion is _DELETED.
- The column storing the last modification date is _UPDATE.
- Do not double quote names.
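As a minimal sketch, assuming a hypothetical ORDERS table using the ORD_ column prefix, these naming conventions give (the types and default values are detailed in the following sections):

```sql
-- Plural table name, three letter prefix, upper case, no double quoted identifiers.
create table ORDERS (
  ORD_ID integer,       -- primary key column
  ORD_LABEL varchar(255),
  ORD_DELETED integer,  -- soft deletion column
  ORD_UPDATE timestamp  -- last modification date
);
```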
Comments
Add comments in your SQL code, just like in any other kind of code.
To do so, use the line prefix -- (double minus) to start a line comment; this is the most generic kind of comment.
As the SQL files are assembled into a single file per version, you should add a comment header and a comment footer to your files to identify them in the global generated file, for example as shown below.
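A minimal sketch of such a header and footer, for a hypothetical PRODUCT2_1.sql file:

```sql
-- Begin of PRODUCT2_1.sql (product database version 2, order 1)

-- ... the SQL declarations of this file ...

-- End of PRODUCT2_1.sql
```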
SQL Types
To be able to map the AFS Metadata entities to the correct SQL type, it is recommended to use the following types, associated with the corresponding default values:
Java type | SQL type | Default values sample |
---|---|---|
Integer | integer | |
BigInteger | biginteger | |
String | varchar(xxx) | '' |
Float | float | |
Boolean | integer | 0 (false) and 1 (true) |
Date | timestamp | current_timestamp |
Avoid other SQL data types because they may not be supported by all DBMS, like the "boolean" type which is not supported by all versions of IBM DB2 for i.
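For instance, a boolean property can be stored in an integer column and a date in a timestamp column, with the recommended default values (the column names are illustrative):

```sql
ORD_ACTIVE integer default 0,                      -- boolean: 0 (false) and 1 (true)
ORD_COMMENT varchar(255) default '',               -- string with an empty default value
ORD_CREATION timestamp default current_timestamp,  -- date
```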
Table declarations
All tables must declare an auto-incremented primary key, with an integer type and using the name _ID. To do so, use the following column declaration:
```sql
XXX_ID integer generated by default as identity primary key,
```
If the table uses the AFS soft deletion and the recording of the last modification date, add the following columns, replacing the XXX prefix with the table prefix:
```sql
XXX_DELETED integer default 0,
XXX_UPDATE timestamp default current_timestamp,
```
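Putting these fragments together, the complete declaration of the hypothetical ORDERS table used in the previous examples could be:

```sql
create table ORDERS (
  ORD_ID integer generated by default as identity primary key,
  ORD_LABEL varchar(255) default '',
  ORD_DELETED integer default 0,
  ORD_UPDATE timestamp default current_timestamp);
```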
Indexes and constraints
Add some indexes to your tables according to the most frequently used selection conditions; indexes may drastically improve the performance of the data access on large tables.
Define the referential constraints (i.e. foreign keys) only on the primary key of the targeted table.
For "link" tables, i.e. the table which implement an AFS Matadata Link, declare the foreign keys with the sentence "on delete cascade" to ensure that the "link" will be removed as soon as one of the link data is removed.
View declarations
Views are the most difficult objects to declare in generic SQL code. Here is some advice to produce view declarations which are as compatible as possible across the supported databases:
- Identify the selected columns and use aliases to avoid name collisions.
- PostgreSQL does not support the ROWNUM() function of H2; you have to use ROW_NUMBER() OVER(ORDER BY COL_ID) instead and create two SQL scripts (one for H2 and another one for PostgreSQL).
- Inline tables must be declared using the VALUES SQL structure:

```sql
select ID, VAL from ( values (0, 'NO'), (1, 'YES') ) as X ( ID, VAL)
```

instead of the TABLE instruction, which is H2 specific:

```sql
select ID, VAL from table ( "ID" int = row (0, 1), "VAL" varchar = row ('NO', 'YES') )
```
- The H2 database may require to recompile a view if one of the tables used by the view is modified in the same JDBC session. So if an instruction uses a view (for instance a view depending on another one), and if in a previous SQL script one of the tables used by this view has been altered, then you will have to move this instruction into an H2 specific SQL script and precede it with the instruction:

```sql
alter view MYVIEW recompile;
```
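As a minimal sketch of the ROWNUM() versus ROW_NUMBER() case, a hypothetical ORDERS_LIST view has to be declared in database specific scripts (one per supported database type), for instance:

```sql
-- In the _h2.sql script (H2 specific syntax):
create view ORDERS_LIST as
  select rownum() as OLI_RANK, ORD_ID as OLI_ID, ORD_LABEL as OLI_LABEL from ORDERS;

-- In the _pg.sql script; a _db2i.sql script must also be provided, even if identical:
create view ORDERS_LIST as
  select row_number() over (order by ORD_ID) as OLI_RANK, ORD_ID as OLI_ID, ORD_LABEL as OLI_LABEL from ORDERS;
```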
Data declarations
When you insert data in the database scripts, do it without the primary key value, which will be auto-generated. If you have to insert a row with an explicit primary key value, you will have to manually update the associated sequence to a higher value.
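A minimal sketch, reusing the hypothetical ORDERS table; the syntax used to realign the identity column is an assumption to validate against each target database:

```sql
-- Preferred form: no primary key value, the identity column generates it.
insert into ORDERS (ORD_LABEL) values ('First order');

-- Insertion with an explicit primary key value...
insert into ORDERS (ORD_ID, ORD_LABEL) values (100, 'Reserved order');
-- ...then manually realign the sequence above the highest inserted value.
alter table ORDERS alter column ORD_ID restart with 101;
```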