docker compose deployment - 52North/ecmwf-dataset-crawl GitHub Wiki
Using docker-compose, the application can be set up in a straightforward way on any docker-enabled host.
System requirements:

- Docker engine `v17.06` or later
- 6 GB RAM (2 GB for elasticsearch, 2 GB for the crawler, 1 GB for the rest; the first two would happily eat more, though configuration may be required)
We successfully deployed the application on a Debian 9 VM with 20 GB of RAM, running docker 18.06-ce and docker-compose 1.19.0.
Technically, a distributed deployment (with elasticsearch or the crawler on another host) could be realized, but we did not try that.
## Example Setup
```sh
target=/var/lib/docker-compose/ecmwfcrawler
git clone https://github.com/52north/ecmwf-dataset-crawl $target
cd $target
vi .env # set up WEB_DOMAIN and API keys
docker-compose build
docker-compose up -d
```
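The `.env` file edited above needs at least the public hostname; `WEB_DOMAIN` is the variable named on this page, while the API-key names below are placeholders, so check the `.env` template in the repository for the real ones:

```sh
# public domain (and optional port) under which the application is served
WEB_DOMAIN=crawler.example.com

# API keys (names here are placeholders; see the .env template in the repo)
SEARCH_API_KEY=...
```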
Kibana does not start correctly on first launch with the required elasticsearch configuration. To set it up, use this workaround:

- comment out `action.auto_create_index: false` in `elasticsearch/config/elasticsearch.yml`
- (re-)start the application and run a test crawl to populate the elasticsearch indexes
- visit `$WEB_DOMAIN/kibana` and follow the kibana setup procedure described in `README.md` while the crawl is running
- reset the elasticsearch config and restart the elasticsearch container
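The workaround above can be sketched as shell commands (the config path and service name are taken from the steps above; the `sed` invocations are an assumption, editing the file by hand works just as well):

```shell
CONF=elasticsearch/config/elasticsearch.yml

# 1. comment out the auto_create_index setting
sed -i 's/^action\.auto_create_index: false/# action.auto_create_index: false/' "$CONF"

# 2. (re-)start the application; then run a test crawl and complete the
#    kibana setup under $WEB_DOMAIN/kibana as described in README.md
docker-compose up -d --force-recreate

# 3. restore the original config and restart only elasticsearch
sed -i 's/^# action\.auto_create_index: false/action.auto_create_index: false/' "$CONF"
docker-compose restart elasticsearch
```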
## Pitfalls
- After making changes to `.env`, always restart the setup with `--force-recreate`
- Don't run docker-compose with `--abort-on-container-exit`: the frontend container will exit immediately, shutting the whole setup down
- When pulling a new version from the repo with changes to the frontend, remove the existing `frontend` volume first (otherwise the previous build will continue to be used):

  ```sh
  docker rm ecmwf-dataset-crawl_frontend_1 ecmwf-dataset-crawl_proxy_1
  docker volume rm ecmwf-dataset-crawl_frontend
  ```

- When deploying the setup behind another reverse proxy:
  - configure `WEB_DOMAIN` to be the internal hostname with which the reverse proxy refers to the service
  - if you don't specify a port, `2015` will be selected
  - you should be able to use a non-root base URL (e.g. `/crawler`)
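For the reverse-proxy case, a minimal nginx sketch (assuming nginx is the outer proxy; the internal hostname is a placeholder, while the port `2015` and base URL `/crawler` follow the notes above):

```nginx
# forward the public /crawler/ base URL to the internal compose host;
# WEB_DOMAIN in .env must match the internal hostname used here
location /crawler/ {
    proxy_pass http://crawler-internal:2015/;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
```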
TODO: RAM optimizations for the elasticsearch & crawler containers, proxy setup
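For the RAM part of this TODO, one possible sketch is a compose override file (the service name and 2 GB figure are assumptions based on the requirements above; `ES_JAVA_OPTS` is the standard elasticsearch JVM heap setting):

```yaml
# docker-compose.override.yml (sketch; adjust service names to the actual compose file)
version: '2'
services:
  elasticsearch:
    environment:
      # fixed 2 GB JVM heap for elasticsearch
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"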