# Logstash
Logstash is run from a config (`.conf`) file and consists of 3 basic sections:

- input
- filter
- output
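A minimal skeleton (just a sketch; the stdin/stdout plugins here are placeholders) shows how the three sections fit together:

```
input {
  stdin { }
}

filter {
  # filters go here
}

output {
  stdout { }
}
```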
e.g. input from a file:
```
file {
  type         => "mule-ee_log"
  path         => [ "/home/mule/mule-enterprise-standalone-3.4.0/logs/mule_ee.log*" ]
  sincedb_path => "/home/logstash"
}
```
Managing multi-line entries, such as an XML document or a stack trace, with the multiline filter:
```
multiline {
  type    => "mule-ee_log"
  pattern => "%{DATESTAMP}"
  negate  => true
  what    => "previous"
}
```
Grok filters then extract fields from each event:

```
grok {
  pattern => "%{LOGLEVEL:loglevel}"
}
grok {
  pattern => "%{DATESTAMP:logdate}"
}
grok {
  pattern => "%{JAVACLASS:javaclass}"
}
grok {
  pattern => "(?<CorrelationID>(?<=<CorrelationID>)(\d*?)(?=</CorrelationID>))"
}
grok {
  pattern => "(?<MessageID>(?<=<MessageID>)(\d*?)(?=</MessageID>))"
}
grok {
  pattern => "(?<MessageType>(?<=<MessageType>)(.*?)(?=</MessageType>))"
}
```
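Newer Logstash releases deprecate the grok `pattern` option in favour of `match`; a sketch of the same pattern-based extractions collapsed into a single grok (field names as above) could look like:

```
grok {
  match => {
    "message" => [
      "%{LOGLEVEL:loglevel}",
      "%{DATESTAMP:logdate}",
      "%{JAVACLASS:javaclass}"
    ]
  }
  # try every pattern rather than stopping at the first match
  break_on_match => false
}
```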
Note: filters can be conditional:
```
filter {
  if [message] =~ /<ROOT/ {
    grok {
      match => [ "message",
        'number="(?<number>\d+)" number2="(?<number2>\d+)"'
      ]
    }
  } else if [message] =~ /<EVENT / {
    grok {
      match => [ "message", 'name="(?<name>[^"]+)"' ]
    }
  }
}
```
Useful grok references (a quick local test harness is sketched below):

- testing grok patterns: http://grokconstructor.appspot.com/do/match
- grok filter docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
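For quick local testing, a stdin-to-stdout pipeline (a sketch; swap in whatever pattern you are debugging) prints each parsed event with the rubydebug codec:

```
input {
  stdin { }
}

filter {
  grok {
    # placeholder pattern - replace with the one under test
    match => [ "message", "%{LOGLEVEL:loglevel}" ]
  }
}

output {
  stdout { codec => rubydebug }
}
```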
A fuller example, reassembling multi-line XML documents with the multiline codec and parsing them with the xml filter:

```
input {
  file {
    path           => "/home/richard/apps/logstash-2.0.0-beta3/testXml.log"
    start_position => "beginning"
    codec          => multiline {
      pattern => "^<PhysicalAssetMessage(.*)>"
      negate  => true
      what    => "previous"
    }
  }
}

filter {
  # mutate {
  #   strip => "message"
  # }
  xml {
    source => "message"
    target => "physicalasset"
    # set to true to strip the ns schema from PhysicalAssetMessage
    remove_namespaces => false
    # remove any redundant fields left over
    #remove_field => [ "@version", "@timestamp", "tags", "host", "path", "type", "message" ]
  }
}

output {
  stdout {
  }
  elasticsearch {
    index => "physicalasset"
  }
}
```
- code adapted from a Logstash cookbook
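For reference, a hypothetical `testXml.log` entry that the multiline codec above would stitch into a single event (every line that does not start with `<PhysicalAssetMessage` is appended to the previous line):

```xml
<!-- hypothetical sample document, field names chosen to match the grok examples -->
<PhysicalAssetMessage xmlns="urn:example:physicalasset">
  <CorrelationID>12345</CorrelationID>
  <MessageID>67890</MessageID>
  <MessageType>AssetCreated</MessageType>
</PhysicalAssetMessage>
```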
Elastic seems to have several alternative shippers (the Beats) in the works:

- filebeat - log file shipper
- packetbeat - network packet logging/analyzer
- topbeat - process/CPU statistics shipper
- getting started guide
Filebeat is a Go binary that can be downloaded as per the getting started guide (curl) or fetched with `go get github.com/elastic/filebeat`.

The Filebeat command line is used to start and stop the shipper and to point it at a config file. Note: by default Filebeat picks up its config from `/etc/filebeat/filebeat.yml`.
```sh
sudo /etc/init.d/filebeat start
```
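While testing it can be handy to run Filebeat in the foreground with the standard libbeat flags `-c` (config file) and `-e` (log to stderr); the config path below is an assumption based on a package install:

```sh
# run in the foreground, logging to stderr (config path is an assumption)
filebeat -e -c /etc/filebeat/filebeat.yml
```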
Filebeat uses an indentation-sensitive YAML file for configuration (spaces only; YAML does not allow tabs). A default file to ship all Linux log files would be:
```yaml
filebeat:
  # List of prospectors to fetch data.
  prospectors:
    # Each - is a prospector. Below are the prospector specific configurations
    -
      # Paths that should be crawled and fetched. Glob based paths.
      # For each file found under this path, a harvester is started.
      paths:
        - /var/log/*.log
        # - c:\programdata\elasticsearch\logs\*

      # Type of the files. Based on this the way the file is read is decided.
      # The different types cannot be mixed in one prospector.
      #
      # Possible options are:
      # * log: Reads every line of the log file (default)
      # * stdin: Reads the standard in
      type: log

      # Optional additional fields. These fields can be freely picked
      # to add additional information to the crawled log files for filtering
      #fields:
      #  level: debug
      #  review: 1

      # Ignore files which were modified more than the defined timespan in the past.
      # Time strings like 2h (2 hours), 5m (5 minutes) can be used.
      #ignore_older:

      # Scan frequency in seconds.
      # How often these files should be checked for changes. In case it is set
      # to 0s, it is done as often as possible. Default: 10s
      #scan_frequency: 10s

      # Defines the buffer size every harvester uses when fetching the file
      #harvester_buffer_size: 16384

      # Always tail on log rotation. Disabled by default.
      # Note: this may skip entries
      #tail_on_rotate: false

      # Configure the file encoding for reading files with international characters.
      # Supported encodings:
      #   plain, utf-8, utf-16be-bom, utf-16be, utf-16le, big5, gb18030, gbk, hzgb2312,
      #   euckr, eucjp, iso2022jp, shiftjis, iso8859-6e, iso8859-6i, iso8859-8e,
      #   iso8859-8i
      #encoding: plain

output:
  elasticsearch:
    # Enable Elasticsearch as output
    enabled: true

    # The Elasticsearch cluster
    hosts: ["http://localhost:9200"]

    # Comment this option if you don't want to store the topology in
    # Elasticsearch. The default is false.
    # This option makes sense only for Packetbeat.
    save_topology: true

    # Optional index name. The default is filebeat and generates
    # filebeat-YYYY.MM.DD indexes.
    index: "filebeat"
```