Configuration - sookeke/DataMasker.Net5 GitHub Wiki

Configuration

After the package has been installed:

  1. Navigate to the installed location
  2. Open the folder named Schema
  3. Locate the config file app.config
  4. Edit the config file in a text editor such as Notepad or Notepad++

Compulsory App Config Keys:

NOTE: Each table in the database must have a primary key or identity column for the data masking application to run, except when the data source is a spreadsheet.

Some keys in the app.config file are mandatory to run the DC app (data classification application):

  1. DatabaseName Key: This key indicates the name of the database to be masked.
  2. DataSourceType Key: This key indicates the source type of the target database. The tool supports five data source types:
     • OracleServer – for an Oracle database server
       Requirement: Oracle client or Oracle Data Access Components (ODAC) installed on the workstation. Install ODAC (ODAC122011_x64.zip), which is available online.
     • SqlServer – for a Microsoft SQL Server database
     • PostgresServer – for a PostgreSQL server
     • MySQLServer – for a MySQL server
     • SpreadSheet – for ordinary spreadsheets (.xlsx or .xls file type)

When choosing a DataSourceType, the value must exactly match one of the names listed above.

  3. OutputFilename Key – This key sets the output file name for the spreadsheet to be generated. The output is written to a newly created folder, named after the OutputFilename value, inside the installation folder.
  4. Schema Key – This key indicates the database schema to be masked.
  5. ConnectionString Key – This is one of the most important app.config keys and must be configured carefully. It holds the connection string of the database to be masked, which may require a username and a password. Example connection strings:
     • PostgresServer - "Server=127.0.0.1;User Id=postgres; Password=****;Database=masking_sample;"
     • OracleServer - "Data Source=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=)(PORT=))(CONNECT_DATA=(SID=)));User id=;Password=***;"
     • SqlServer - "Data Source=; Initial Catalog=;Integrated Security=True;Connect Timeout=0;Encrypt=False;TrustServerCertificate=False;ApplicationIntent=ReadWrite;MultiSubnetFailover=False"
     • SpreadSheet – the location of the spreadsheet, for example "C:\Registrations.xlsx"
 <appSettings>
    <add key="DatabaseName" value="APP_PAM" /> <!--DatabaseName-->
    <add key="OutputFilename" value="APP_PAM_Classification" /> <!--DatabaseName_Classification-->
    <add key="DataSourceType" value="OracleServer" /> <!--SqlServer or OracleServer or PostgresServer or MySqlServer or SpreadSheet-->
    <add key="TargetSchema" value="" /> <!--target schema of masking destination. Leave empty if source schema is the same as destination schema-->
    <!--SqlServer or OracleServer-->
    <!--insert SqlServer-->
    <add key="ConnectionString" value="Data Source=(DESCRIPTION=(ADDRESS = (PROTOCOL = TCP)(HOST = ****)(PORT = ***))(CONNECT_DATA = (SERVICE_NAME=**)));User id=***; Password =***;" /><!--for oracle-->
   
    <add key="SendEmail" value="yes" />
    <add key="appServer" value="***" />
    <add key="fromEmail" value="***" />
    <add key="Recipients" value="***" /> <!--Add recipient email address-->
    <!--separated with a comma-->
    <add key="cCEmail" value="" />
 </appSettings>
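
For comparison, a spreadsheet data source might be configured along the lines below. This is only a sketch: the file path comes from the SpreadSheet example above, while the DatabaseName and OutputFilename values are hypothetical, and it is an assumption that DatabaseName is still expected for a spreadsheet source.

```xml
<appSettings>
    <!-- Hypothetical names, following the DatabaseName_Classification pattern -->
    <add key="DatabaseName" value="Registrations" />
    <add key="OutputFilename" value="Registrations_Classification" />
    <!-- Must match one of the listed data source type names -->
    <add key="DataSourceType" value="SpreadSheet" />
    <!-- For SpreadSheet, the connection string is the location of the file -->
    <add key="ConnectionString" value="C:\Registrations.xlsx" />
</appSettings>
```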

*.SQL Config

In the DC installation folder, a separate SQL file is generated for each DataSourceType. Edit the SQL file so that it uses the appropriate target database name. For example, for SQL Server, edit ./Tab_columnsSQL.sql at lines e1, 114 and 287 with the target database name.

The Data Masker Build installer package

This installed application takes a data classification spreadsheet and generates a JSON column map for the data classification.

After the package has been installed:

  • Navigate to the installed location
  • Open the folder named DataMaskerInstaller
  • Locate the config file DataMasker.Examples.exe.config
  • Edit the config file in a text editor such as Notepad or Notepad++
  • Edit the mandatory app config keys
  • Save the configuration as an administrator

Mandatory App Config Keys:

The mandatory app config keys required to run the utility tool are:

    • ExcelSheetPath
    • DatabaseName
    • WriteDML
    • MaskTabletoSpreadsheet
    • DataSourceType
    • APP_NAME
    • ConnectionString
    • ConnectionStringPrd
    • MaskedCopyDatabase
    • RunValidation
    • Hostname
    • TestJson
    • RunTestJson
    • EmailValidation
    • AutoUpdate
    • CurrentVersionURL
    • CurrentInstallerURL

ExcelSheetPath Key: This key holds the location of the information schema and data classification spreadsheet generated by InfoSchema.exe.

jsonMapPath Key: This key holds the file name of the JSON file to be generated. The JSON file is saved under the given name in the classification-configs folder of the installation folder (\classification-configs).

DatabaseName Key: This key indicates the name of the database to be masked.

WriteDML Key: Selects the type of masking output. Set this key to YES if you want the masked output in the form of DML SQL INSERT statements.

ConnectionString: Do not store the database password in the configuration file. For security reasons, the user is prompted to type the passwords for ConnectionString and ConnectionStringPrd at run time. The password field in the configuration should be {0}.
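
As an illustration, connection strings with the password placeholder might look like the fragment below. This sketch reuses the PostgresServer example from earlier on this page; the host, user and database values are placeholders, and the production host name prd-host is hypothetical.

```xml
<!-- {0} stands in for the password, which the user is prompted for at run time -->
<add key="ConnectionString" value="Server=127.0.0.1;User Id=postgres;Password={0};Database=masking_sample;" />
<!-- prd-host is a made-up host name for the production copy -->
<add key="ConnectionStringPrd" value="Server=prd-host;User Id=postgres;Password={0};Database=masking_sample;" />
```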

MaskTabletoSpreadsheet Key: Selects the type of masking output. Set this key to YES if you want the masked output in the form of spreadsheets. The WriteDML key must also be set to True/YES for this to take effect.
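
A minimal sketch of the two keys set together is shown below. The value spelling "Yes" follows the [Yes/No] convention used elsewhere on this page; whether the tool also accepts "True" or is case-sensitive is not confirmed here.

```xml
<!-- MaskTabletoSpreadsheet only takes effect when WriteDML is also enabled -->
<add key="WriteDML" value="Yes" />
<add key="MaskTabletoSpreadsheet" value="Yes" />
```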

DataSourceType Key: This key indicates the source type of the target database. The tool supports five data source types:

  • OracleServer – for an Oracle database server. Requirement: Oracle client or Oracle Data Access Components (ODAC) installed on the workstation. Install ODAC (ODAC122011_x64.zip), which is available online or from our network drive.
  • SqlServer – for a Microsoft SQL Server database
  • PostgresServer – for a PostgreSQL server
  • MySQLServer – for a MySQL server
  • SpreadSheet – for ordinary spreadsheets (.xlsx or .xls file type)

| Property Name | Values |
| --- | --- |
| ExcelSheetPath | Data classification spreadsheet; see the Excel format in SharePoint |
| jsonMapPath | JSON config name; format = classification-configs//APP_NAME_config.json |
| TestJson | Full path to a test JSON file to work on; ignore if you have none |
| RunTestJson | Always set this to No if you do not have a test JSON |
| DatabaseName | Name of the database to be masked |
| WriteDML | Generate DML? [Yes/No] |
| MaskedCopyDatabase | Mask database or test copy directly? [Yes/No] |
| RunValidation | Run data masking validation test? [Yes/No] |
| EmailValidation | Send validation test report as email? [Yes/No] |
| MaskTabletoSpreadsheet | Generate masked tables as spreadsheets? [Yes/No] |
| Hostname | Server or hostname |
| DataSourceType | SqlServer, PostgresServer, MySQLServer, SpreadSheet, OracleServer |
| APP_NAME | Schema name or database name. Use this APP_NAME in the exception path config |
| ConnectionString | Connection string to the masked copy. Use Password={0} to prompt the user for the password during execution |
| ConnectionStringPrd | Connection string to the PRD copy. Use Password={0} to prompt the user for the password during execution |
| fromEmail | Sender email address; must be logged in to Outlook |
| recipient email | Recipient email addresses, separated with a ";" |
| cCEmail | CC email addresses, separated with a ";" |
| attachDML | Attach generated DML as an email attachment. [Yes/No] |
| attachMaskException | Attach masked exceptions as an email attachment. [Yes/No] |
| attachSpreadsheet | Attach table spreadsheet as an email attachment. [Yes/No] |
| _exceptionpath | Format is \output\[Databasename]_exception.txt |
| _successfulCommit | Format is \output\[Databasename]_successfulCommit.txt |
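
To illustrate the path formats listed above, the fragment below fills them in for a hypothetical application. It assumes APP_NAME and DatabaseName are both APP_PAM (borrowed from the sample earlier on this page). The values simply follow the documented formats and have not been checked against the tool.

```xml
<!-- Sketch only: APP_PAM is an assumed application and database name -->
<add key="APP_NAME" value="APP_PAM" />
<add key="DatabaseName" value="APP_PAM" />
<!-- classification-configs//APP_NAME_config.json -->
<add key="jsonMapPath" value="classification-configs//APP_PAM_config.json" />
<!-- \output\[Databasename]_exception.txt -->
<add key="_exceptionpath" value="\output\APP_PAM_exception.txt" />
<!-- \output\[Databasename]_successfulCommit.txt -->
<add key="_successfulCommit" value="\output\APP_PAM_successfulCommit.txt" />
```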

Performance Tips

  • Shuffle masking has been observed to take longer as the number of records grows. To improve performance, limit the number of shuffle rules used or, preferably, use another masking type that achieves the same result.
  • Ensure the data classification is done correctly: a wrong classification leads to error handling, which may reduce system performance.
  • Keep the data clean. The data masking tool tries to clean up the data before outputting the results. For example, a date field stored as Varchar may contain non-date values such as arbitrary strings or empty strings. In this case, the tool converts all irregular data in the column to matching datetime values, e.g. an empty string is converted to a default DateTime value.
  • Large data sets are cached in memory, so masking a large data object (DataTables with more than 1 million records) requires ample memory for good performance.
