Log Shipping

Get started on the main machine (sdc-test):

cd /root/
mkdir logshipping
cd logshipping/

docker pull docker.elastic.co/elasticsearch/elasticsearch:7.6.2
docker pull docker.elastic.co/beats/filebeat:7.6.2
docker network create logshipping

ElasticSearch

Set up ElasticSearch with an nginx proxy that does HTTP Basic auth.

cd /root/logshipping
mkdir logstore
cd logstore

Set up ElasticSearch:

mkdir elasticdata
chown 1000:1000 elasticdata
vi docker-compose.yml
vi nginx.conf

HTTP Basic Auth

Following: https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/

yum install httpd-tools
cd /root/logshipping/logstore/
htpasswd -c ./.htpasswd developer
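
Note that -c creates (or overwrites!) the password file, so only use it for the first user. Further users, for example a hypothetical second account, are added without -c:

htpasswd ./.htpasswd seconddeveloper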

docker-compose:

version: '3.3'

services:

  nginx:
    image: nginx
    restart: always
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - ./.htpasswd:/etc/nginx/.htpasswd
      - /root/HEALTH/healthcheck_nginx.sh:/bin/healthcheck.sh
      - /root/cert/cert.pem:/etc/ssl/certs/myhost.crt:ro
      - /root/cert/cert.key:/etc/ssl/private/myhost.key:ro
    ports:
      - "9200:443"
    networks:
      - logshipping
    depends_on:
      - elastic
    healthcheck:
      test: ["CMD", "/bin/healthcheck.sh"]
      interval: 60s
      timeout: 30s
      retries: 5

  elastic:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    restart: always
    volumes:
      - ./elasticdata:/usr/share/elasticsearch/data
    ports:
      - "9222:9200" #  we need the port open for Filebeat to send logs, but not standard port!
      #- "9300:9300"
    networks:
      - logshipping
    environment:
      - discovery.type=single-node
      #- node.name=elastic1
      #- cluster.name=es-docker-cluster
      #- discovery.seed_hosts=es01,es03
      #- cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9200/_cat/indices"]
      interval: 60s
      timeout: 30s
      retries: 5

networks:
  logshipping:
    external: true

nginx config:

upstream elastic_http {
    server elastic:9200;
}


server {
    listen       443 ssl;
    server_name  localhost;
    ssl_certificate     /etc/ssl/certs/myhost.crt;
    ssl_certificate_key /etc/ssl/private/myhost.key;

    location /logs/ {

        limit_except GET {
            deny  all;
        }

        auth_basic           "Elastic";
        auth_basic_user_file /etc/nginx/.htpasswd;
        # $1 is only defined in regex locations; in this prefix location,
        # proxy_pass with a trailing slash strips the /logs/ prefix instead
        proxy_pass http://elastic_http/;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location  ~ /logs/(.*)$ {

        limit_except GET {
            deny  all;
        }

        auth_basic           "Elastic";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass http://elastic_http/$1$is_args$args;
        proxy_set_header X-Real-IP $remote_addr;
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}
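
Bring the stack up and check that the proxy enforces the Basic auth. A minimal smoke test (assuming a self-signed certificate, hence -k):

cd /root/logshipping/logstore
docker-compose up -d

# no credentials: nginx should answer 401
curl -k https://localhost:9200/logs/

# with credentials: ElasticSearch should answer with its banner
curl -k --user developer:<password> https://localhost:9200/logs/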

Filebeat

Set up Filebeat:

cd /root/logshipping
mkdir logreader
cd logreader
vi filebeat.yml
vi docker-compose.yml

docker-compose:

version: '3.3'
services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:7.6.2
    restart: always
    volumes:
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml
      - /var/lib/docker/containers:/var/lib/docker/containers
    #ports:
    #  - "8001:443"
    networks:
      - logshipping
    #depends_on:    # elastic lives in a separate compose project (logstore), so it cannot be referenced here
    #  - elastic
    #healthcheck:
    #  test: ["CMD", "/bin/healthcheck.sh"]
    #  interval: 60s
    #  timeout: 30s
    #  retries: 5

networks:
  logshipping:
    external: true

filebeat.yml:

Make sure you replace the word jellyfish with the hostname of whichever machine you run this on, so developers can retrieve the logs of the host they are looking for!

# https://www.elastic.co/guide/en/beats/filebeat/master/filebeat-input-container.html

filebeat.inputs:
- type: container
  stream: all
  paths:
    - '/var/lib/docker/containers/*/*.log'

# for testing only -- Filebeat allows just one output at a time,
# so enable this only if you disable output.elasticsearch below:
# https://www.elastic.co/guide/en/beats/filebeat/current/console-output.html
#output.console:
#  enabled: true
#  pretty: true

# For the index naming:
setup.template.name: "devlogs_jellyfish"
setup.template.pattern: "devlogs_jellyfish-*"
setup.ilm.enabled: auto
setup.ilm.rollover_alias: "devlogs_jellyfish"
#setup.ilm.pattern: "{now/d}-000001"
setup.ilm.pattern: "{now/d}"

# for shipping
# https://www.elastic.co/guide/en/beats/filebeat/master/elasticsearch-output.html

# on this machine:
#output.elasticsearch:
#  hosts: ["elastic:9200"]
#  index: "devlogs_jellyfish-%{+yyyy.MM}"
# on the remote machine:
output.elasticsearch:
  hosts: ["sdc-test.argo.grnet.gr:9222"]
  index: "devlogs_jellyfish-%{+yyyy.MM}"

Filebeat on other machines:

docker pull docker.elastic.co/beats/filebeat:7.6.2
cd /root
mkdir logshipping
cd logshipping
vi docker-compose.yml   # add content
vi filebeat.yml         # add content

Add the docker-compose.yml and filebeat.yml as above, but on these machines the Docker network has to be called vre.

Also, the elasticsearch output needs a different URL, since Filebeat now has to reach ElasticSearch over the network (see the sketch below).
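
A sketch of the two changes (file contents otherwise as above):

# docker-compose.yml: join the existing vre network instead of logshipping,
# both in the networks section and under the filebeat service
networks:
  vre:
    external: true

# filebeat.yml: ship over the public URL of the main machine
output.elasticsearch:
  hosts: ["sdc-test.argo.grnet.gr:9222"]
  index: "devlogs_<hostname>-%{+yyyy.MM}"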

Troubleshooting

Note that the directories containing the logs have to be visible to Filebeat, which runs as uid 1000.

TODO: Docker writes the logs as root:root, how to solve this?

# create dummy log...
mkdir /var/lib/docker/containers/dummy
vi /var/lib/docker/containers/dummy/dummy-json.log
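# example content, in Docker's json-file log format:
# {"log":"hello from dummy\n","stream":"stdout","time":"2020-05-13T12:00:00.000000000Z"}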

# chown to 1000
chown -R 1000:1000 /var/lib/docker/containers/dummy/
# still does not work

# chown also parent dir
chown 1000 /var/lib/docker/containers/
# now it works for the dummy one!

Delete indices after testing

The nginx proxy only allows GET requests, so deletes have to be run from inside the ElasticSearch container:

docker exec -it logstore_elastic_1 /bin/bash
#curl -X DELETE 'http://localhost:9200/_all' # CAREFUL!!!
#curl -X DELETE 'http://localhost:9200/[indexname]'
#curl -X DELETE 'http://localhost:9200/filebeat-7.6.2'
#curl -X DELETE 'http://localhost:9200/filebeat-7.6.2-2020.05.13-000001'

View the ElasticSearch content

Which indices exist?
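
All requests go through the nginx proxy, so HTTPS and Basic auth are needed; for example:

curl -X GET --user <user>:<password> "https://sdc-test.argo.grnet.gr:9200/logs/_cat/indices?v"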

Node stats
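
For example:

curl -X GET --user <user>:<password> "https://sdc-test.argo.grnet.gr:9200/logs/_nodes/stats?pretty"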

Check out one index:
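
For example (devlogs_jellyfish-2020.05 is just a guessed index name here):

curl -X GET --user <user>:<password> "https://sdc-test.argo.grnet.gr:9200/logs/devlogs_jellyfish-2020.05?pretty"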

Search for all entries with a certain tag:

curl -X GET --user <user>:<password> \
  "https://sdc-test.argo.grnet.gr:9200/logs/_search?pretty" \
  -H 'Content-Type: application/json' \
  -d '{ "query": { "term": { "tags": { "value": "bioqc", "boost": 1.0 } } } }'