Pega Deployment Architecture
Beginning in Pega Platform version 8.8, externalization is the preferred choice for third-party software, and embedded services are deprecated. The corresponding changes in this new deployment model are reflected in the following figure:
Hazelcast
Hazelcast is designed for environments where the power of in-memory computing can be used to accelerate applications that require low latency, high throughput, horizontal scaling, and security. Hazelcast can run embedded in every node, but also supports a client-server topology.
Low-level Pega Platform™ features, such as agents, job schedulers, and REST service rules, use Hazelcast to cache or store data.
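The choice between the two Hazelcast topologies, combined with the deprecation of embedded services from version 8.8 onward, can be expressed as a small pre-deployment check. The following is a minimal sketch only: the function names, the topology labels, and the idea of running such a check are assumptions made for illustration, not part of any Pega or Hazelcast API.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as "8.8.1" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def check_hazelcast_topology(platform_version: str, topology: str) -> str:
    """Hypothetical check: flag embedded Hazelcast on version 8.8 or later.

    topology is either "embedded" (Hazelcast runs inside every node) or
    "client-server" (nodes connect to an external Hazelcast cluster).
    """
    if topology not in ("embedded", "client-server"):
        raise ValueError(f"unknown topology: {topology!r}")
    # Embedded services are deprecated beginning in version 8.8.
    if topology == "embedded" and parse_version(platform_version) >= (8, 8):
        return "deprecated"
    return "ok"
```

For example, `check_hazelcast_topology("8.8.1", "embedded")` reports the embedded topology as deprecated, while the same version with `"client-server"` passes.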
Kafka
Apache Kafka has two use cases in Pega Platform.
Kafka as a streaming service: An external Apache Kafka service is required in every Pega Platform deployment to support the streaming functionality. All Queue Processors and Job Schedulers in Pega Platform require Kafka as a backend to stream data. Without access to an external Apache Kafka service, Pega Platform is not fully functional. Additionally, the required external Apache Kafka service acts as a backend for a Stream Data Set, which you can optionally configure for your application.
Kafka Data Sets: In deployments running Pega Customer Decision Hub or Pega Process AI™ applications, you can also use an additional external Apache Kafka service to create Kafka Data Sets in your application. This Kafka service is optional and must be different from your required external Kafka service for streaming.
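The two constraints described above (a required external Kafka service for streaming, and an optional second service for Kafka Data Sets that must be a different service) can be sketched as a simple validation step. The `KafkaService` class, the `validate_kafka_layout` function, and the use of bootstrap server addresses to tell services apart are all assumptions made for this illustration. They do not correspond to actual Pega configuration settings.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class KafkaService:
    """Hypothetical description of one external Kafka service."""
    name: str
    bootstrap_servers: FrozenSet[str]  # e.g. frozenset({"kafka-a:9092"})


def validate_kafka_layout(
    streaming: Optional[KafkaService],
    data_sets: Optional[KafkaService] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the layout looks valid."""
    problems = []
    if streaming is None:
        # The streaming service is required in every deployment.
        problems.append("An external Kafka service for streaming is required.")
    elif data_sets is not None and (
        streaming.bootstrap_servers & data_sets.bootstrap_servers
    ):
        # Assumption: shared broker addresses mean the two are the same service.
        problems.append(
            "The Kafka Data Set service must be different from the streaming service."
        )
    return problems
```

A layout with only the streaming service passes, because the Kafka Data Set service is optional. A layout that points both roles at the same brokers is reported as a problem.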
Cassandra
Apache Cassandra is the primary example of a backing technology that underpins the Decision Data Store (DDS) Data Set. The following sections provide an overview of the most important Cassandra features in terms of scalability, data distribution, consistency, and architecture.
Cassandra handles the database operations for Pega Platform™ decision management by providing fast access to the data that is essential in making Next-Best-Action decisions in both batch and real time.
Elasticsearch
Pega Cloud® uses the Elasticsearch engine for full-text searches to retrieve specific data from the system, such as rules, assignments, attachments, or data instances. Elasticsearch is a third-party search engine that quickly finds relevant information within applications by analyzing large volumes of data. An efficient indexing mechanism makes application data searchable in near real time.
Search and Reporting Service is a multi-tenant, cloud-based service that lets multiple Pega Platform™ environments (tenants) connect to the same instance of the service. The service can receive requests from multiple tenants and stores each tenant's data separately from the others.
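To make the ideas of indexing and tenant segregation concrete, the following toy sketch keeps one in-memory inverted index per tenant, so a search issued by one tenant never sees another tenant's documents. It is an illustration under simplifying assumptions (whitespace tokenization, all query terms must match) and does not reflect how Elasticsearch or Search and Reporting Service is implemented. The class and method names are hypothetical.

```python
from collections import defaultdict
from typing import Set


class TenantSearchIndex:
    """Toy multi-tenant full-text index: tenant -> term -> document ids."""

    def __init__(self) -> None:
        self._index = defaultdict(lambda: defaultdict(set))

    def add(self, tenant: str, doc_id: str, text: str) -> None:
        """Index a document under a single tenant."""
        for term in text.lower().split():
            self._index[tenant][term].add(doc_id)

    def search(self, tenant: str, query: str) -> Set[str]:
        """Return ids of the tenant's documents that contain every query term."""
        terms = query.lower().split()
        if not terms or tenant not in self._index:
            return set()
        postings = [self._index[tenant].get(term, set()) for term in terms]
        return set.intersection(*postings)
```

A short usage example, with made-up tenant names and document ids:

```python
index = TenantSearchIndex()
index.add("tenant-a", "W-1", "Loan application pending review")
index.add("tenant-b", "W-1", "Loan closed")
matches = index.search("tenant-a", "loan review")  # only tenant-a data is searched
```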