Migrate OpenAM to Apache Cassandra without Single Point of Failure

Initial Data Storage Scheme

| Data Type | Storage Method | Fault Tolerance Method | Disadvantages |
|---|---|---|---|
| OpenAM configuration | OpenDJ (localhost:1389) | Multi-master replication | A configuration update on a single node affects all nodes through replication. |
| CTS - Core Token Service (session persistence) | OpenDJ (localhost:1389) | Multi-master replication | The synchronization payload is processed on all nodes. Read performance is limited by the performance of a single node. A replication failure can switch other nodes to read-only mode. |
| Accounts repository (except AD) | OpenDJ (localhost:1389) | Multi-master replication | Same as above. |

Data Storage Scheme for More Than 5 Million Credentials

| Data Type | Storage Method | Fault Tolerance Method | Advantages |
|---|---|---|---|
| OpenAM configuration | OpenDJ (localhost:1389) | Local independent storage shipped as part of the distribution (WAR file) | Updating the configuration on one node does not affect the other nodes (the nodes are completely independent). |
| CTS - Core Token Service (session persistence) | Cassandra cluster (tcp:9042) | Cluster without a single point of failure, with geo-distribution and distribution by rack | The synchronization (write) payload is processed only on the nodes required by the replication level. The read load is not limited by the performance of a single node; it is distributed according to the replication level. A node failure does not stop replication (within the limits of the replication level). |
| Accounts repository (except AD) | Cassandra cluster (tcp:9042) | Same as above | Same as above |
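The replication level referenced in the table is declared per keyspace. Below is a minimal sketch using the DataStax Java driver 4.x; the contact point, the `cts` keyspace name and the `dc01`/`dc02` data center names are placeholders borrowed from the example topology further down this page, not values required by OpenAM.

```java
import com.datastax.oss.driver.api.core.CqlSession;

import java.net.InetSocketAddress;

public class CreateCtsKeyspace {
    public static void main(String[] args) {
        // Contact point, data center names and keyspace name are placeholders
        // taken from the example topology later on this page.
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("dc01-rack01-node01", 9042))
                .withLocalDatacenter("dc01")
                .build()) {
            // NetworkTopologyStrategy keeps the requested number of copies per
            // data center; a write is handled by the replica nodes only, not
            // by every node in the cluster.
            session.execute(
                "CREATE KEYSPACE IF NOT EXISTS cts WITH replication = "
                + "{'class': 'NetworkTopologyStrategy', 'dc01': 2, 'dc02': 2}");
        }
    }
}
```

With NetworkTopologyStrategy, each write is processed by the replica nodes only, which is what keeps the synchronization payload off the rest of the cluster.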

Migration Plan

  1. Plan the cluster hardware resources
  2. Deploy the cluster according to the required level of fault tolerance
  3. Provide network access from OpenAM to the Cassandra cluster (tcp:9042); see the connectivity sketch after this list
  4. Migration stages (can be done independently):
    1. Switch "CTS - Core Token Service (sessions)"
    2. Switch "Accounts repository (except AD)" with legacy data migration
    3. Switch "OpenAM configuration"

Fault Tolerance Level Planning

Datacenter

Defines geo-distributed storage fault tolerance.  

  • Minimum number of data centers: 1
  • Recommended number of data centers:
    • at least two
    • at least as many data centers as are used for the application servers (OpenAM)
  • Allowed data center fault tolerance modes (see the sketch after this list):
    • Hot spare: used for data processing by the application servers (OpenAM)
    • Cold spare: not used for data processing by the application servers (OpenAM)
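Whether a data center acts as a hot or a cold spare is decided on the application side: OpenAM instances name a hot data center as their local one, while a cold spare data center only receives replicas through the keyspace replication settings. A minimal sketch with the DataStax Java driver, using hypothetical names:

```java
import com.datastax.oss.driver.api.core.CqlSession;

import java.net.InetSocketAddress;

public class HotSpareRouting {
    public static void main(String[] args) {
        // "dc01" is a hypothetical hot spare data center: application traffic
        // is routed there by naming it as the local data center. A cold spare
        // data center would still hold replicas (via the keyspace replication
        // settings) but would not be named as local by any application node.
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("dc01-rack01-node01", 9042))
                .withLocalDatacenter("dc01") // route application traffic to the hot data center
                .build()) {
            System.out.println("Nodes known to the driver: "
                + session.getMetadata().getNodes().size());
        }
    }
}
```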

Rack

The minimum fault tolerance unit for data distribution within a data center. Each rack has:

  • An independent disk subsystem (array)
  • An independent virtualization hypervisor (host system)

Rack count calculation (see the sketch after this list):

  • Minimum number of racks inside a hot spare data center: 1, but not less than the replication level inside that data center
  • Minimum number of racks inside a cold spare data center: 1, but not less than the replication level inside that data center
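The "not less than the replication level" rule follows from the way Cassandra's NetworkTopologyStrategy places copies: within a data center it tries to put each copy on a different rack, so with fewer racks than copies some racks carry more than one copy. A small planning check, with hypothetical numbers matching the production example below:

```java
import java.util.Map;

public class RackPlanning {
    /** True when every data center has at least as many racks as data copies. */
    static boolean enoughRacks(Map<String, Integer> racksPerDc,
                               Map<String, Integer> copiesPerDc) {
        return copiesPerDc.entrySet().stream()
                .allMatch(e -> racksPerDc.getOrDefault(e.getKey(), 0) >= e.getValue());
    }

    public static void main(String[] args) {
        // Hypothetical plan: two racks and two data copies in each data center.
        Map<String, Integer> racks  = Map.of("dc01", 2, "dc02", 2);
        Map<String, Integer> copies = Map.of("dc01", 2, "dc02", 2);
        System.out.println("Rack plan is sufficient: " + enoughRacks(racks, copies));
    }
}
```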

Node

Defines the unit of data storage and load inside a data center rack.

An Example of a Recommended Minimum Configuration

Test

| DataCenter | Type | Copies | Rack | Rack data % | Node | Node data % |
|---|---|---|---|---|---|---|
| dc01 | hot | 1 | rack01 | 100% | dc01-rack01-node01 | 100% |

Production

| DataCenter | Type | Copies | Rack | Rack data % | Node | Node data % |
|---|---|---|---|---|---|---|
| dc01 | hot | 1 | rack01 | 100% | dc01-rack01-node01 | 50% |
| | | | | | dc01-rack01-node02 | 50% |
| | | 1 | rack02 | 100% | dc01-rack02-node01 | 50% |
| | | | | | dc01-rack02-node02 | 50% |
| dc02 | hot | 1 | rack01 | 100% | dc02-rack01-node01 | 50% |
| | | | | | dc02-rack01-node02 | 50% |
| | | 1 | rack02 | 100% | dc02-rack02-node01 | 50% |
| | | | | | dc02-rack02-node02 | 50% |

Allowed:

  • Increase the number of data centers without service interruption
  • Increase the number of racks without service interruption
  • Increase the number of nodes without service interruption
  • Change the number of data copies inside a data center without service interruption (see the sketch after this list)
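Changing the number of data copies is a keyspace-level setting and therefore an online operation. A minimal sketch, reusing the hypothetical `cts` keyspace and data center names used earlier on this page; after raising the level, the new replicas still need the existing data streamed to them (for example with `nodetool repair` on the affected nodes).

```java
import com.datastax.oss.driver.api.core.CqlSession;

import java.net.InetSocketAddress;

public class ChangeReplicationLevel {
    public static void main(String[] args) {
        // Hypothetical example: raise the number of copies in dc01 from 2 to 3
        // for the "cts" keyspace. The statement itself does not interrupt the
        // service; existing data is streamed to the new replicas afterwards.
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("dc01-rack01-node01", 9042))
                .withLocalDatacenter("dc01")
                .build()) {
            session.execute(
                "ALTER KEYSPACE cts WITH replication = "
                + "{'class': 'NetworkTopologyStrategy', 'dc01': 3, 'dc02': 2}");
        }
    }
}
```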

Hardware Requirements For a Single Node

| Environment | CPU (cores) | RAM (GB) | Disk |
|---|---|---|---|
| Test | >= 2 | >= 8 | 16 GB HDD |
| Production | >= 8 | >= 32 (memory ballooning disabled, balloon=off) | 64 GB SSD, RAID |