Aleph2: configuration reference - IKANOW/Aleph2 GitHub Wiki
-
Global
globals.local_root_dir
- will usually be /opt/aleph2-home; all the configuration files of the Aleph2 system are assumed to be below this directory
globals.local_cached_jar_dir
- a location where shared library bundles (eg JARs) are cached locally (default /opt/aleph2-home/cached-jars/)
globals.distributed_root_dir
- will usually be /apps/aleph2; all of the distributed/shared configuration and data files on the distributed filesystem in the cluster (typically HDFS) are assumed to be below this directory
globals.local_yarn_config_dir
- the directory where the system assumes that all Hadoop/YARN-related configuration files ("*-site.xml") are to be found; defaults to /opt/aleph2-home/yarn-config/
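As an illustration, the global settings could be written out as follows. This is a minimal sketch that sets each key to the usual or default value listed above; the key=value syntax is assumed to match the two-line example in the Document Service section.

```properties
# Sketch only: every value below is the usual/default value from this page.
globals.local_root_dir=/opt/aleph2-home
globals.local_cached_jar_dir=/opt/aleph2-home/cached-jars/
globals.distributed_root_dir=/apps/aleph2
globals.local_yarn_config_dir=/opt/aleph2-home/yarn-config/
```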
-
Core Distributed Services
service.CoreDistributedServices.interface
- should always be com.ikanow.aleph2.data_model.interfaces.shared_services.ICoreDistributedServices
service.CoreDistributedServices.service
- for production, should be com.ikanow.aleph2.management_db.services.CoreDistributedServices
- for system testing, can use com.ikanow.aleph2.management_db.services.MockCoreDistributedServices
CoreDistributedServices.zookeeper_connection
- defaults to "localhost:2181"
CoreDistributedServices.zookeeper_connection
- for Kafka, defaults to "localhost:6667"
CoreDistributedServices.application_name
- "DataImportManager", "DataAnalyticsManager", "AccessManager" etc (or null for transient nodes such as external harvest processes)
CoreDistributedServices.application_port.<application_name>
- for the applications defined using the above parameter, the server port number to listen on (eg 2252 for "DataImportManager")
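A production node running the data import manager might be configured roughly as below. This is a sketch under assumptions: it uses only the class names, defaults and example port given above, and it leaves out the Kafka setting because this page lists it under the same key name as the ZooKeeper one.

```properties
# Sketch only: production implementation of the core distributed services.
service.CoreDistributedServices.interface=com.ikanow.aleph2.data_model.interfaces.shared_services.ICoreDistributedServices
service.CoreDistributedServices.service=com.ikanow.aleph2.management_db.services.CoreDistributedServices

# ZooKeeper connection (the documented default).
CoreDistributedServices.zookeeper_connection=localhost:2181

# This node runs the data import manager on the example port from this page.
CoreDistributedServices.application_name=DataImportManager
CoreDistributedServices.application_port.DataImportManager=2252
```

For system testing, the service line would point at MockCoreDistributedServices instead.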
-
MongoDB Management DB
service.ManagementDbService.interface
- should always be com.ikanow.aleph2.data_model.interfaces.shared_services.IManagementDbService
service.ManagementDbService.service
- should always be com.ikanow.aleph2.management_db.mongodb.services.MongoDbMangementDbService
- for system testing, can use com.ikanow.aleph2.management_db.mongodb.services.MockMongoDbMangementDbService
MongoDbManagementDbService.mongodb_connection
- the connection string for the database, eg "localhost:27017"
MongoDbManagementDbService.v1_enabled
- if true, then runs the V1/V2 synchronization service
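The MongoDB management DB settings could be combined as in the sketch below. The class names are copied exactly as they are written on this page, the connection string is the example value, and turning on v1_enabled is purely illustrative.

```properties
# Sketch only: class names are reproduced exactly as written on this page.
service.ManagementDbService.interface=com.ikanow.aleph2.data_model.interfaces.shared_services.IManagementDbService
service.ManagementDbService.service=com.ikanow.aleph2.management_db.mongodb.services.MongoDbMangementDbService

# Example connection string from this page.
MongoDbManagementDbService.mongodb_connection=localhost:27017

# Illustrative choice: run the V1/V2 synchronization service.
MongoDbManagementDbService.v1_enabled=true
```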
-
Elasticsearch Search Index service
service.SearchIndexService.interface
- should always be com.ikanow.aleph2.data_model.interfaces.data_services.ISearchIndexService
service.SearchIndexService.service
- should always be com.ikanow.aleph2.search_service.elasticsearch.services.ElasticsearchIndexService
ElasticsearchCrudService.elasticsearch_connection
- the connection string for the index, eg localhost:9300
ElasticsearchCrudService.cluster_name
- (optional) the cluster name; if not present, connects to whatever cluster is running at the location pointed to by elasticsearch_connection
ElasticsearchIndexService.search_technology_override.*
- options for setting the default search index settings, see link
ElasticsearchIndexService.columnar_technology_override.*
- options for setting the default columnar settings, see link
ElasticsearchIndexService.temporal_technology_override.*
- options for setting the default temporal settings, see link
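A minimal sketch of the Elasticsearch search index settings follows. The cluster name "my-aleph2-cluster" is a hypothetical placeholder, and the *_technology_override.* options are left out because their sub-keys are not described on this page.

```properties
# Sketch only: Elasticsearch-backed search index service.
service.SearchIndexService.interface=com.ikanow.aleph2.data_model.interfaces.data_services.ISearchIndexService
service.SearchIndexService.service=com.ikanow.aleph2.search_service.elasticsearch.services.ElasticsearchIndexService

# Example connection string from this page.
ElasticsearchCrudService.elasticsearch_connection=localhost:9300

# Optional. "my-aleph2-cluster" is a hypothetical name; omit this line to
# connect to whatever cluster is running at the address above.
ElasticsearchCrudService.cluster_name=my-aleph2-cluster
```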
-
Core Management DB
service.CoreManagementDbService.interface
- should always be com.ikanow.aleph2.data_model.interfaces.shared_services.IManagementDbService
service.CoreManagementDbService.service
- should always be com.ikanow.aleph2.management_db.services.CoreManagementDbService
-
Security Service
service.SecurityService.interface
- should always be com.ikanow.aleph2.data_model.interfaces.shared_services.ISecurityService
service.SecurityService.service
- the technology service that provides user authentication and authorization, options: com.ikanow.aleph2.security.service.IkanowV1SecurityService
-
Document Service
service.DocumentService.interface
- should always be com.ikanow.aleph2.data_model.interfaces.data_services.IDocumentService
service.DocumentService.service
- the technology service that provides document-oriented storage, options: com.ikanow.aleph2.search_service.elasticsearch.services.ElasticsearchIndexService
- An additional "v1" document service that provides read-only access to documents in v1 format stored in MongoDB can be provided with the following 2 lines:
service.V1DocumentService.interface=com.ikanow.aleph2.data_model.interfaces.data_services.IDocumentService
service.V1DocumentService.service=com.ikanow.aleph2.v1.document_db.services.V1DocumentDbService
-
Enrichment services:
service.BatchEnrichmentService.interface
- should always be com.ikanow.aleph2.data_model.interfaces.data_analytics.IAnalyticsTechnologyService
service.BatchEnrichmentService.service
- the technology service that provides batch enrichment to harvesters, can be: com.ikanow.aleph2.analytics.hadoop.services.HadoopTechnologyService (can use analytic_technology_name_or_id: BatchEnrichmentService in analytic jobs)
service.StreamingEnrichmentService.interface
- should always be com.ikanow.aleph2.data_model.interfaces.data_analytics.IAnalyticsTechnologyService
service.StreamingEnrichmentService.service
- the technology service that provides streaming enrichment to harvesters, can be: com.ikanow.aleph2.analytics.storm.services.StormAnalyticTechnologyService (can use analytic_technology_name_or_id: StreamingEnrichmentService in analytic jobs)
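Put together, the two enrichment services could be declared as in the sketch below, which uses only the interface and implementation class names listed above (Hadoop for the batch service, Storm for the streaming one).

```properties
# Sketch only: Hadoop-backed enrichment service.
service.BatchEnrichmentService.interface=com.ikanow.aleph2.data_model.interfaces.data_analytics.IAnalyticsTechnologyService
service.BatchEnrichmentService.service=com.ikanow.aleph2.analytics.hadoop.services.HadoopTechnologyService

# Sketch only: Storm-backed enrichment service.
service.StreamingEnrichmentService.interface=com.ikanow.aleph2.data_model.interfaces.data_analytics.IAnalyticsTechnologyService
service.StreamingEnrichmentService.service=com.ikanow.aleph2.analytics.storm.services.StormAnalyticTechnologyService
```

Analytic jobs can then refer to these services by setting analytic_technology_name_or_id to BatchEnrichmentService or StreamingEnrichmentService.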
-
MongoDB CRUD service
MockMongoDbCrudServiceFactory.one_per_thread
- defaults to false. When true, each thread gets a separate mock MongoDB instance, which is useful for unit testing where each test might run on a different thread. To share one instance within a multi-threaded test, set to false.
-
Data Import Manager
DataImportManager.harvest_enabled
- whether this data import manager supports harvest orchestration (default true)
DataImportManager.analytics_enabled
- whether this data import manager supports analytics orchestration (default false)
DataImportManager.governance_enabled
- whether this data import manager supports data governance, eg deletion based on age (default true)
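As a sketch of how these flags combine, a node dedicated to analytics orchestration might override the defaults as below. Splitting the roles across nodes this way is an illustrative assumption, not a documented recommendation.

```properties
# Sketch only: a data import manager node that handles analytics but not harvest.
DataImportManager.harvest_enabled=false
DataImportManager.analytics_enabled=true

# Left at the documented default.
DataImportManager.governance_enabled=true
```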
-
Logging Service
service.LoggingService.interface
- should always be com.ikanow.aleph2.data_model.interfaces.shared_services.ILoggingService
service.LoggingService.service
- can be one of com.ikanow.aleph2.logging.service.LoggingService (standard, writes to a "logging bucket"), com.ikanow.aleph2.logging.service.Log4jLoggingService (writes to log4j), or com.ikanow.aleph2.logging.service.NoLoggingService (doesn't write any logging out at all)
LoggingService.default_time_field
- the field name under which logging message timestamps are output (defaults to 'date')
LoggingService.default_user_log_level
- the default log4j Level for user messages, see https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html (defaults to 'OFF')
LoggingService.default_system_log_level
- the default log4j Level for system messages, see https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html (defaults to 'OFF')
LoggingService.system_mirror_to_log4j_level
- the log4j Level at which system messages are also output to log4j; set to 'OFF' to not output any messages to log4j (this allows for external logging rather than messages going into ES) (defaults to 'OFF')
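For example, the standard logging service could be enabled with INFO-level defaults as sketched below. The INFO values are illustrative choices (any log4j 1.2 Level name should be usable); the remaining values are the documented defaults.

```properties
# Sketch only: standard logging service, which writes to a "logging bucket".
service.LoggingService.interface=com.ikanow.aleph2.data_model.interfaces.shared_services.ILoggingService
service.LoggingService.service=com.ikanow.aleph2.logging.service.LoggingService

# Documented default timestamp field name.
LoggingService.default_time_field=date

# Illustrative levels; the documented default for both is OFF.
LoggingService.default_user_log_level=INFO
LoggingService.default_system_log_level=INFO

# Documented default: do not mirror system messages to log4j.
LoggingService.system_mirror_to_log4j_level=OFF
```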