
Lab 4.1 - Configuring Logstash

Preparation

  • Start off by stopping Filebeat, Metricbeat, and Logstash with systemctl, as shown below. At this point, you may also want to add a bit more memory to your server, which can be done by simply changing the instance type.
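
Assuming the services were installed under their default unit names, stopping them looks something like this:
sudo systemctl stop filebeat
sudo systemctl stop metricbeat
sudo systemctl stop logstash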

Configuring a Basic Pipeline

First, let’s test the Logstash installation by running the most basic Logstash pipeline. A Logstash pipeline has two required elements, input and output, and one optional element, filter. The input plugins consume data from a source, the filter plugins modify the data as you specify, and the output plugins write the data to a destination.

  • To test your Logstash installation, run the most basic pipeline:
sudo -i
cd /usr/share/logstash
bin/logstash -e 'input { stdin { } } output { stdout {} }'

The -e flag enables you to specify a configuration directly from the command line. Specifying configurations at the command line lets you quickly test configurations without having to edit a file between iterations. The pipeline in the example takes input from the standard input, stdin, and moves that input to the standard output, stdout, in a structured format. Once it's working, it should say something like "Pipeline started" followed by "The stdin plugin is now waiting for input".

  • Enter "hello world" in the console and it should generate a little log there. We can exit logstash with CTRL-D. At this point, we've created and run a basic Logstash pipeline!

Basic File Parsing with Logstash

  • First we need some sample data. We can get it using these commands:
cd /home/ubuntu
wget https://download.elastic.co/demos/logstash/gettingstarted/logstash-tutorial.log.gz
gunzip logstash-tutorial.log.gz
  • Next, create a Logstash pipeline config that uses the Beats input plugin to receive events from Beats:
sudo nano /usr/share/logstash/first-pipeline.conf
# The # character at the beginning of a line indicates a comment. Use
# comments to describe your configuration.
input {
  beats {
    port => "5044"
  }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
  stdout { codec => rubydebug }
}
  • Run sudo /usr/share/logstash/bin/logstash -f /usr/share/logstash/first-pipeline.conf --config.test_and_exit to test the config after creating it. If the syntax is valid it should report something like "Configuration OK"; otherwise it will point out any issues.

Configure Filebeat to Send Test Logs to Logstash

  • Open a new terminal window and SSH into the server again. Go to /etc/filebeat and create a new file called test-filebeat.yml with the following lines (make sure "paths" points to the file we downloaded earlier, /home/ubuntu/logstash-tutorial.log):
filebeat.inputs:
- type: log
  paths:
    - /path/to/file/logstash-tutorial.log 
output.logstash:
  hosts: ["your_server_IP:5044"]

Start Logstash and Filebeat to Send and Parse the Test Log

  • Start Logstash from the Logstash terminal with
sudo /usr/share/logstash/bin/logstash -f /usr/share/logstash/first-pipeline.conf --config.reload.automatic
  • Start Filebeat from the Filebeat terminal with
sudo filebeat -e -c /etc/filebeat/test-filebeat.yml -d "publish"
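
If everything is wired up, the Logstash terminal should begin printing events in rubydebug format, with each raw log line sitting in a single message field, roughly like this (metadata fields vary by version):
{
       "message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1\" 200 203023 ...",
    "@timestamp" => 2024-01-01T00:00:00.000Z,
      "@version" => "1"
}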

Parsing Web Logs with the Grok Filter Plugin

Now you have a working test pipeline that reads log lines from Filebeat. However, you’ll notice that the format of the log messages is not ideal. You want to parse the log messages to create specific, named fields from the logs. To do this, you’ll use the grok filter plugin.

The grok filter plugin, one of several plugins available by default in Logstash, enables you to parse unstructured log data into something structured and queryable.

  • Edit the first-pipeline.conf file and add the following in the filter section:
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
    }
}
  • Save your changes. Because you’ve enabled automatic config reloading, you don’t have to restart Logstash to pick up your changes. However, you do need to force Filebeat to read the log file from scratch. To do this, go to the terminal window where Filebeat is running and press Ctrl+C to shut down Filebeat. Then delete the Filebeat registry file. For example, run sudo rm -rf /var/lib/filebeat/registry
  • Next, restart Filebeat with `sudo filebeat -e -c /etc/filebeat/test-filebeat.yml -d "publish"`. The events should now come through with the parsed fields, as sketched below.
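
With the grok filter active, each event should now carry named fields extracted by the COMBINEDAPACHELOG pattern in addition to the original message. For the first line of the sample log, that looks roughly like this (classic grok field names shown; newer Logstash versions with ECS compatibility enabled nest them differently):
{
     "clientip" => "83.149.9.216",
         "verb" => "GET",
      "request" => "/presentations/logstash-monitorama-2013/images/kibana-search.png",
     "response" => "200",
        "bytes" => "203023",
    "timestamp" => "04/Jan/2015:05:13:42 +0000",
    ...
}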

Enhancing Data with the Geoip Filter

  • Now bring down Filebeat again with Ctrl+C. Edit the /usr/share/logstash/first-pipeline.conf file again and make sure it looks like this (the geoip filter looks up the address in the clientip field and adds geographical location data to the event):
input {
    beats {
        port => "5044"
    }
}
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
    }
    geoip {
        source => "clientip"
    }
}
output {
    stdout { codec => rubydebug }
}
  • Delete the registry file and run Filebeat again; you should now see the geoip information in the events.
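
With the geoip filter in place, each event should gain a geoip object holding the results of the IP lookup, along these lines (GeoIP databases change over time, so your exact values may differ):
"geoip" => {
              "ip" => "83.149.9.216",
       "city_name" => "Moscow",
    "country_name" => "Russia",
        "location" => { ... }
}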

Indexing Our Data to Elasticsearch

  • Edit that same config file again so that the output section sends the parsed events to Elasticsearch instead of stdout. Make it look like the following:
input {
    beats {
        port => "5044"
    }
}
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
    }
    geoip {
        source => "clientip"
    }
}
output {
    elasticsearch {
        hosts => [ "your_ip:9200" ]
        index => "%{[@metadata][beat]}-%{[@metadata][version]}"
    }
}
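
The index option uses Logstash's sprintf syntax to read values from the event's @metadata, so the index name is built from the name and version of the Beat that shipped the event. For example, events shipped by Filebeat 8.5.0 (your version will differ) would land in an index named:
filebeat-8.5.0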

Testing Your Pipeline

  • To see a list of available indices, run curl your_server_ip:9200/_cat/indices?v
  • Run a test query with curl -XGET 'your_server_ip:9200/your_filebeat_index/_search?pretty&q=clientip:83.149.9.216'
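
You can also query on the fields added by the geoip filter. The official tutorial's sample data includes requests that geolocate to Buffalo, for example, so a query along these lines should return hits:
curl -XGET 'your_server_ip:9200/your_filebeat_index/_search?pretty&q=geoip.city_name:Buffalo'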