Lab 1: Setting up Elastic in AWS

The public IPv4 address changes with every new session. Current IPv4 address in use: 54.173.196.221

Private IPv4 address: 172.31.87.23

Lab Outline

Part 1:

  • Configure an Ubuntu Server in AWS that you can SSH into

Part 2:

  • Install Elasticsearch on that server
  • Install Logstash on that server
  • Configure a test pipeline from Logstash to Elasticsearch
  • Install Kibana
  • Successfully query log data using Kibana

Click “Start Lab” when going into the lab environment

The lab sessions time out every 4 hours, and each restart gives the instance a new public IP address (so keep that in mind)

When you create your key pair, AWS stores the public key and downloads the private key to your Downloads folder

When we connect to the instance from Champlain, we have to SSH to the public IPv4 address:

ssh -i .\hannelore-elk.pem ubuntu@'ipaddress'

Security groups are essentially firewalls that are attached to your instance


Part 1: Setting up Ubuntu in AWS

Step 1: Accessing the Learner Lab

https://awsacademy.instructure.com/courses/60764

Step 2: Launching an Ubuntu Server

Use the Services search bar and search for “EC2”

Click EC2 to get to the EC2 Dashboard

Click Instances → Instances in the left pane

Click the orange “Launch Instances” button on the right

On the “Launch an Instance” screen

  • Name your server “your_name ELK Server”

  • Select Quick Start Ubuntu as the OS Image

  • Select t2.medium as the Instance Type

Important: Key Pair (login)

  • Choose “Create new key pair”

  • Key pair name: yourname-elk-key

  • Type: RSA

  • Format: .pem

Click “Create key pair” and it should download to your computer

This file is like a password: you will need access to it to log into your server, so please keep track of it!

Network Settings:

  • Create security group

  • Check Allow SSH Traffic from anywhere 0.0.0.0/0

  • Check Allow HTTPS Traffic from Internet

  • Check Allow HTTP Traffic From Internet

  • Under “Configure Storage”, update the size to 30GB
  • Leave the defaults for “Advanced Details”

Launch the Instance at the bottom of the page

You should see a green bar saying your instance launched successfully

You should see your instance running!

Step 3: SSH to your server (in powershell)

ssh -i hannelore-elk-key.pem [email protected]

The public IP might change the next time I start the instance

Update Security Group (aka firewall) for Elastic

  • Final prep step: in the AWS EC2 Console, find the security group attached to your instance

  • Update the security group to allow the Elasticsearch and Kibana ports
    • Edit Inbound Rules for the Security Group assigned to your Ubuntu server

  • Add Custom TCP port 5601 with source Any IPv4 (this is Kibana)

  • Add Custom TCP port 9200 with source Any IPv4 (this is Elasticsearch)

A success message should pop up once you edit your security rules

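If you prefer the AWS CLI, the same two rules can be added with authorize-security-group-ingress (the group ID below is a hypothetical placeholder; substitute the one attached to your instance):

aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 5601 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 9200 --cidr 0.0.0.0/0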

Part 2: Building the ELK Stack

Installing Elasticsearch

From SSH on your Ubuntu server

First, you need to add Elastic’s signing key so that the downloaded package can be verified (skip this step if you’ve already installed packages from Elastic):

Note: on my local machine I had to cd to the Downloads folder first so that ssh could find the .pem key file

Connected to Ubuntu!

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

For Debian-based OSes like Ubuntu, we then need to install the apt-transport-https package:

sudo apt-get update

sudo apt-get install apt-transport-https

The next step is to add the repository definition to your system (one line command)

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

Alternatively, to install a version of Elasticsearch that contains only features licensed under Apache 2.0 (aka OSS Elasticsearch), add this repository definition instead (again, a one-line command):

echo "deb https://artifacts.elastic.co/packages/oss-7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

All that’s left to do is to update your repositories and install Elasticsearch:

sudo apt-get update

sudo apt-get install elasticsearch

Elasticsearch configurations are done using a configuration file that allows you to configure general settings (e.g. node name), as well as network settings (e.g. host and port), where data is stored, memory, log files, and more.

For our example, since we are installing Elasticsearch in AWS, it is best practice to bind Elasticsearch to the private IP (something like 172.31.x.x):

sudo apt install micro (micro is a terminal text editor that is much easier to read than the default)

sudo micro /etc/elasticsearch/elasticsearch.yml

Edit the file to uncomment the following lines in the Network section (delete the # at the start of those lines). For network.host, use the private IP of your AWS server.
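A minimal sketch of the edited lines, assuming this lab's private IP (yours will differ). The discovery.type line is my own addition, not from the original notes: a 7.x node bound to a non-loopback address has to pass bootstrap checks, and single-node discovery is one way to satisfy them.

network.host: 172.31.87.23
http.port: 9200

# assumption: single-node discovery so the bootstrap checks pass on a lone node
discovery.type: single-node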

sudo service elasticsearch start

To confirm that everything is working as expected, use curl to port 9200 on your server:

curl http://your_private_ip:9200/
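If Elasticsearch is up, it should answer with a JSON banner along these lines (the name, cluster details, and version will differ on your install):

{
  "name" : "ip-172-31-87-23",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "7.17.0",
    ...
  },
  "tagline" : "You Know, for Search"
}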

Step 2: Installing Logstash

Logstash requires Java 8 or Java 11 to run, so we will start the process of setting up Logstash with:

sudo apt-get install default-jre

Verify java is installed:

java -version

With output something like:
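Assuming Ubuntu's default-jre pulls in OpenJDK 11, the output should resemble (exact version and build strings will differ):

openjdk version "11.0.x" 20xx-xx-xx
OpenJDK Runtime Environment (build 11.0.x+...)
OpenJDK 64-Bit Server VM (build 11.0.x+..., mixed mode, sharing)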

Since we already defined the Elastic repository, all we have to do to install Logstash is run:

sudo apt-get install logstash

Before you run Logstash, you will need to configure a data pipeline.

Step 3: Create a Data Pipeline

3.1 Download Sample data

We will use some sample data to send from Logstash to Elasticsearch (aka “shipping” data). For the purpose of this lab, there is some sample data containing Apache access logs.

It can be downloaded from: https://raw.githubusercontent.com/agoldstein333/Files/main/apache-daily-access.log

Download it using wget on your Ubuntu server. wget is a command to retrieve files via HTTP when there is no GUI/browser.

Make a directory for your data

  • sudo mkdir /logstash

cd into that directory

  • cd /logstash

use wget to download the data to that directory

  • wget https://raw.githubusercontent.com/agoldstein333/Files/main/apache-daily-access.log

Change owner/group of that directory to the “logstash” user

  • sudo chown -R logstash /logstash
  • sudo chgrp -R logstash /logstash
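To confirm the change took, list the directory; the owner and group columns should now both read logstash:

ls -ld /logstash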

TROUBLESHOOTING: I kept getting an error saying that the user logstash could not be found. So I troubleshot to figure out whether the logstash user actually existed.

First I made sure the package was installed with apt list --installed | grep logstash and it was indeed installed.

I attempted to make a new user with sudo adduser logstash and the system told me there was already a user with that name.

Then I checked cat /etc/passwd to see if logstash was there

I ended up taking out the / when specifying the file path and it worked! I think I had made the directory wrong. I then removed the directory and remade it the correct way so it matched the lab instructions.

Permissions changed successfully.

3.2 Create Logstash Configuration File

Next, create a Logstash configuration file to ingest the sample Apache logs and output them to Elasticsearch.

The new file will be /etc/logstash/conf.d/apache-01.conf

sudo nano /etc/logstash/conf.d/apache-01.conf

Enter the following Logstash configuration (NOTE: this file uses Logstash's own pipeline configuration syntax, so watch the braces and indentation) AND

  • Make sure to change the path to the location of your data file (probably something like: /logstash/apache-daily-access.log)
  • Make sure to change the elasticsearch IP to the Private IP of your server
input {
  file {
    path => "your_data_file"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch {
    hosts => ["your_private_ip:9200"]
  }
  #stdout {}
}
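In plain terms: the input reads the log file from the beginning (sincedb_path => "/dev/null" tells Logstash not to remember its place, so the file is re-read on every run), the grok filter parses each line against the standard Apache combined log pattern, the date filter turns the parsed timestamp into the event's @timestamp, geoip looks up the location of the clientip field, and the output ships each event to Elasticsearch (the commented-out stdout {} can be uncommented to also print events to the console for debugging).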

3.3 Test Configuration File

Before starting Logstash - test the configuration file to make sure there are no errors/issues. You can do this by running logstash from the command line:

sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/apache-01.conf

A validation message (something like “Config Validation Result: OK. Exiting Logstash”) appears at the end of the output

3.4 Start Logstash

If all goes well, start Logstash with:

sudo service logstash start

If it works, a new Logstash index will be created in Elasticsearch

You can directly call the Elasticsearch API with curl using GET /_cat/indices to see if the index was created:

Private IP is 172.31.87.23

curl http://your_private_ip:9200/_cat/indices?v
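The ?v flag adds the header row, so the output is a table like the sketch below; look for a logstash-* index with a non-zero docs.count (the uuid and sizes here are placeholders):

health status index                      uuid  pri rep docs.count docs.deleted store.size pri.store.size
yellow open   logstash-2023.11.14-000001 ...   1   1        ...            0       ...          ...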

TROUBLESHOOTING: This command wasn't working. It kept saying that the connection timed out. Here is what I did to solve this:

  • I rebooted the instance
  • I restarted my computer
  • I checked to see if both Logstash and Elasticsearch were installed with apt list --installed | grep logstash and apt list --installed | grep elasticsearch
  • I then checked the status of Elasticsearch with systemctl status elasticsearch and it said it was dead, so here was the problem
  • I ran sudo systemctl start elasticsearch and we were back in business baby!

If the “docs.count” is zero, wait a few minutes and check again, as it can take a bit for the data to be indexed

If it stays at 0, there is likely an issue with importing the apache log file.

It could be:

  • Incorrect path in your config file
  • Incorrect permissions on the apache log sample file

Once the document count is showing up, you can again use curl and the Elasticsearch API (GET /_search) to see the data, replacing “your_index” with the name of your index (something like logstash-2023.08.13-000001)

In my case, your_index is logstash-2023.11.14-000001

Make sure to change the IP address to your private IP, which for me is 172.31.87.23

curl http://172.31.87.23:9200/logstash-2023.11.14-000001/_search?pretty=true

You should now see the log data in semi-structured/JSON form - like:

"_index" : "logstash-2023.08.13-000001",
        "_type" : "_doc",
        "_id" : "INK88IkBHSuKG7wH-wyZ",
        "_score" : 1.0,
        "_source" : {
          "response" : "200",
          "httpversion" : "1.1",
          "agent" : "\"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.46 Safari/535.11\"",
          "request" : "/category/software",
          "bytes" : "111",
          "geoip" : {
            "location" : {
              "lon" : -43.0811,
              "lat" : -22.9201
            },
            "region_code" : "RJ",
            "longitude" : -43.0811,
            "continent_code" : "SA",
            "latitude" : -22.9201,
            "country_code3" : "BR",
            "timezone" : "America/Sao_Paulo",
            "city_name" : "Rio de Janeiro",
            "region_name" : "Rio de Janeiro",
            "country_name" : "Brazil",
            "ip" : "200.222.44.110",
            "postal_code" : "24000",
            "country_code2" : "BR"
          },
          "auth" : "-",
          "host" : "ip-172-31-83-57",
          "ident" : "-",
          "@version" : "1",
          "@timestamp" : "2022-11-30T17:22:20.000Z",
          "referrer" : "\"http://www.google.com/search?ie=UTF-8&q=google&sclient=psy-ab&q=Software&oq=Software&aq=f&aqi=g-vL1&aql=&pbx=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&biw=2401&bih=503\"",
          "timestamp" : "30/Nov/2022:17:22:20 +0000",
          "verb" : "GET",
          "path" : "/logstash/sample-data",
          "message" : "200.222.44.110 - - [30/Nov/2022:17:22:20 +0000] \"GET /category/software HTTP/1.1\" 200 111 \"http://www.google.com/search?ie=UTF-8&q=google&sclient=psy-ab&q=Software&oq=Software&aq=f&a
qi=g-vL1&aql=&pbx=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&biw=2401&bih=503\" \"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.46 Safari/535.11\"",
          "clientip" : "200.222.44.110"
        }

Step 4: Kibana

Use a simple apt command to install Kibana:

sudo apt-get install kibana

Open up the Kibana configuration file (with vim, nano, or micro) at /etc/kibana/kibana.yml, and make sure you have the following configurations defined, replacing <YourPrivateIP> with the private IP of your Ubuntu system.

sudo micro /etc/kibana/kibana.yml

server.port: 5601
server.host: '<YourPrivateIP>'
elasticsearch.hosts: ["http://<YourPrivateIP>:9200"]

Start Kibana with:

sudo service kibana start

NOTE: It might take a few minutes for Kibana to actually start - even though it says the service is running

Open up Kibana in your laptop/workstation browser with: http://Public_IP_of_Ubuntu:5601. You will be presented with the Kibana home page.

http://54.173.196.221:5601

TROUBLESHOOTING: My Kibana page was not connecting. Here are the steps I took:

  • I checked to see if Kibana, Logstash, and Elasticsearch were all up and running (and they were)
  • I checked the Kibana logs to see if it actually was running with sudo tail -f /var/log/kibana/kibana.log and I got this error message:
    • {"type":"log","@timestamp":"2023-11-14T15:31:51+00:00","tags":["error","elasticsearch-service"],"pid":1460,"message":"Unable to retrieve version information from Elasticsearch nodes. connect ECONNREFUSED 127.0.0.1:9200"}

From this message I figured there was an issue in the configuration file, so I went back in with sudo micro /etc/kibana/kibana.yml

Turns out I hadn't uncommented the lines I altered with the private IP address, so Kibana could not find Elasticsearch and thus was not able to connect in the web browser.

The lines I uncommented were server.port, server.host, and elasticsearch.hosts.

You can also see the server status by going to http://YourPublicIP:5601/status

4.2 Add an Index Pattern to display the Logstash Index

In Kibana, click the 3 horizontal lines in the upper left to open the left navigation pane

Go to Stack Management → Kibana → Index Patterns and select “Create index pattern”

Enter “logstash-*” as the index pattern, and in the next step select @timestamp as your Time Filter field.

Hit Create index pattern, and you are ready to analyze the data.

4.3 Use Kibana to query data

In the navigation menu (3 lines), go to the Analytics → Discover tab

Use the time filter in the Upper Right to find a time window that includes the Apache Log data. (In testing, the file I downloaded went back to November 2022)

4.4 Query the Data

Explore the data and some of the key-value pairs that are in the documents. Notice that not all documents have all the same attributes - but there is consistency in the key name where the data is present.

Also note the GeoIP data: this was added as part of the Logstash processing. Using a GeoIP database, Logstash looked up the location of the client IP address from the original data.

Deliverable: Use the Search Bar in the upper left and make 3 different queries using the keys in the data

Searching for users with the country code AR (Argentina)
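My exact query string isn't recorded here, but one way to express this in KQL, using the geoip fields visible in the sample documents:

geoip.country_code2 : "AR"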

Link to search for country codes: https://www.nationsonline.org/oneworld/country_code_list.htm

Searching for a googlebot from the US
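A KQL sketch for this one, assuming the bot names itself in the agent field:

agent : googlebot and geoip.country_code2 : "US"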

Searching for requests that are 126 bytes
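And a KQL sketch for the byte count (bytes is one of the fields grok parses out of each Apache log line):

bytes : 126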

Screenshot: Discover window showing the Apache logs ingested through Logstash.
