Database Security - yibinericxia/documents GitHub Wiki
Privacy and security regulations and standards, such as PCI DSS and HIPAA, do not allow sensitive information to be exposed to the outside world, or even to be used internally for development, testing and quality assurance. Such information should be stored in its own table with proper access control, and only dedicated services should connect to that data and expose it through properly secured APIs. Data obfuscation is also needed when developing and testing these dedicated services.
Data obfuscation, including redaction and masking, is normally applied to data such as customer names, telephone numbers, addresses, SSNs, and bidding prices.
Data Stripping/Nulling Out
The data appears blank or null, or is partially replaced with placeholder characters, for example ###-##-1234 for an SSN.
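A minimal sketch of the two variants described above, assuming SSNs arrive as strings formatted ddd-dd-dddd. The function names `null_out` and `redact_ssn` and their parameters are hypothetical and chosen only for illustration.

```python
import re

# Assumed input format: three digits, two digits, four digits, dash separated.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def null_out(value):
    """Strip the value entirely so the consumer sees no data at all."""
    return None


def redact_ssn(ssn: str, keep_last: int = 4, placeholder: str = "#") -> str:
    """Replace all but the last `keep_last` digits with a placeholder.

    Separators are kept so the redacted value retains the original layout.
    """
    if not SSN_PATTERN.match(ssn):
        raise ValueError("expected an SSN formatted as ddd-dd-dddd")
    to_hide = sum(ch.isdigit() for ch in ssn) - keep_last
    out = []
    for ch in ssn:
        if ch.isdigit() and to_hide > 0:
            out.append(placeholder)
            to_hide -= 1
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(redact_ssn("123-45-1234"))               # ###-##-1234
    print(redact_ssn("123-45-1234", keep_last=0))  # ###-##-####
    print(null_out("123-45-1234"))                 # None
```

Keeping the last four digits is a common compromise: the value stays recognizable to the person it belongs to while the full number is not disclosed.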
Data Obfuscation
Main techniques:
- Encryption
- Tokenization
A non-algorithmic approach that maps sensitive data to a unique token, preserving format and length so the token still passes validation; the mapping cannot be reversed by an algorithm. For example, the customer name "John Smith" and his email "[email protected]" could be mapped to "Bcd Fgh" and "[email protected]" so that data integrity is preserved.
- Masking
Use fake data to mask/anonymize the real sensitive data while preserving data integrity, so the result can be used for development, testing, training, and even public sharing. The data could be PII; for example, a masked 16-digit credit card number still needs to pass the checksum, and the first 6 digits (the BIN) need to be kept unmasked.
Custom masking demands a lot of effort.
Irreversible masking protects the data (ideally a single copy, with a data access layer applying dynamic transformation based on user roles).
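The tokenization and masking techniques above can be sketched as follows. This is an illustration under stated assumptions, not a production design: `TokenVault` is a hypothetical in-memory lookup table (a real vault would be a secured, access-controlled store), `mask_card` is a hypothetical helper, and the card checksum is assumed to be the Luhn algorithm commonly used for payment card numbers.

```python
import secrets
import string


class TokenVault:
    """Hypothetical in-memory vault: tokens are random, so they can only be
    reversed by looking them up in the vault, never by an algorithm."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        if value in self._token_by_value:       # same input, same token
            return self._token_by_value[value]
        for _ in range(100):                    # retry on rare collisions
            token = "".join(self._random_like(ch) for ch in value)
            if token != value and token not in self._value_by_token:
                self._token_by_value[value] = token
                self._value_by_token[token] = value
                return token
        raise RuntimeError("could not generate a unique token")

    def detokenize(self, token: str) -> str:
        return self._value_by_token[token]

    @staticmethod
    def _random_like(ch: str) -> str:
        # Preserve format: digit -> digit, letter -> letter of the same case,
        # anything else (spaces, '@', '.') is kept unchanged.
        if ch.isdigit():
            return secrets.choice(string.digits)
        if ch.isupper():
            return secrets.choice(string.ascii_uppercase)
        if ch.islower():
            return secrets.choice(string.ascii_lowercase)
        return ch


def luhn_check_digit(partial: str) -> str:
    """Return the Luhn check digit for a number that lacks its last digit."""
    total = 0
    # Walk right to left; every second digit, starting with the rightmost, is doubled.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    ok_format = number.isascii() and number.isdigit()
    return ok_format and luhn_check_digit(number[:-1]) == number[-1]


def mask_card(pan: str, keep_prefix: int = 6) -> str:
    """Keep the BIN, randomize the remaining digits, recompute the check digit."""
    if not (pan.isascii() and pan.isdigit() and len(pan) == 16):
        raise ValueError("expected a 16-digit card number")
    body_len = len(pan) - keep_prefix - 1
    body = "".join(secrets.choice(string.digits) for _ in range(body_len))
    partial = pan[:keep_prefix] + body
    return partial + luhn_check_digit(partial)


if __name__ == "__main__":
    vault = TokenVault()
    token = vault.tokenize("John Smith")
    print(token, "->", vault.detokenize(token))
    print(mask_card("4111111111111111"))  # well-known test number, not a real card
```

The token keeps the length, case pattern and separators of the original, so downstream format validation still passes. The masked card number is not stored anywhere, which makes the masking irreversible, while tokenization stays reversible only for services that can query the vault.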
Packages/Plugins
- Microsoft SQL Server DDM (Dynamic Data Masking)
- PostgreSQL Anonymizer
- DbDefence