Splunk - ilya-khadykin/notes-outdated GitHub Wiki


How to install

Splunk offers a free version with almost all the capabilities of the Enterprise version, which makes it useful as a learning tool.

Installation instructions can be found in the official documentation: http://docs.splunk.com/Documentation/Splunk/6.4.0/Installation/Chooseyourplatform

On Linux

  1. Download splunk_package_name.tgz
  2. Extract it to /opt with tar xvzf splunk_package_name.tgz -C /opt

Starting up Splunk

  • splunkd is the Splunk server daemon, written in C/C++. It processes and indexes both streaming (fast-moving) data and static data files. splunkd is responsible for searching and indexing, which it exposes through the Splunk API;

  • Splunkweb provides a web interface to the Splunk API; it is written in Python.
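To make the split between splunkd and its API more concrete, here is a minimal sketch of how a client might prepare a search request for splunkd's REST interface. It assumes the default management port 8089 and the `/services/search/jobs` endpoint. The helper name `build_search_request` and the sample query are hypothetical, and authentication is deliberately left out. The request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_search_request(query, host="localhost", port=8089):
    """Build (but do not send) a POST request asking splunkd to run a search.

    Assumptions: splunkd listens on its default management port (8089) and
    the caller adds credentials before actually sending the request.
    """
    url = f"https://{host}:{port}/services/search/jobs"
    body = urllib.parse.urlencode({
        "search": f"search {query}",   # generic searches start with the "search" command
        "exec_mode": "oneshot",        # run synchronously and return results directly
        "output_mode": "json",
    }).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


req = build_search_request("index=main error")
print(req.get_method(), req.full_url)
```

Sending the request would still require credentials and TLS settings appropriate for the local installation.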

The functions of Splunk

  1. Data Collection: Splunk can collect data from different sources, including streaming data.

  2. Data Indexing: Before data can be searched, it needs to be indexed. Creating an index requires two stages: parsing and indexing. Parsing, which essentially separates the data into events, itself involves several steps. More details: http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Howindexingworks

In short, in addition to breaking the data into events, parsing adds metadata (data about data): host (the device the data came from), source (where the event originated), sourcetype (the format of the data), timestamps, and other necessary information. The next stage, indexing, breaks the events into segments that can subsequently be searched. It builds the index data structure and then writes the raw data and the index files to disk. This index structure is what lets Splunk search massive data sets quickly.
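The two stages can be illustrated with a toy sketch. This is not how Splunk is implemented. It only mimics the idea under simple assumptions: one event per line, an ISO-like timestamp at the start of each line, and an in-memory inverted index instead of files on disk. The function names, the sample log lines, and the host, source, and sourcetype values are all made up for illustration.

```python
import re
from collections import defaultdict


def parse(raw, host, source, sourcetype):
    """Parsing stage (toy): split raw text into events and attach metadata."""
    events = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        # Assumption: each event starts with an ISO-like timestamp.
        match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", line)
        events.append({
            "_raw": line,
            "_time": match.group(0) if match else None,
            "host": host,
            "source": source,
            "sourcetype": sourcetype,
        })
    return events


def build_index(events):
    """Indexing stage (toy): break events into segments, map segment -> event ids."""
    index = defaultdict(set)
    for event_id, event in enumerate(events):
        for segment in re.findall(r"\w+", event["_raw"].lower()):
            index[segment].add(event_id)
    return index


def search(index, events, *terms):
    """Return events containing every term, using only index lookups."""
    if not terms:
        return []
    ids = set.intersection(*(index.get(t.lower(), set()) for t in terms))
    return [events[i] for i in sorted(ids)]


raw = """2016-05-01T10:00:00 sshd login failed for root
2016-05-01T10:00:05 sshd login accepted for alice
2016-05-01T10:01:00 cron job finished"""

events = parse(raw, host="web01", source="/var/log/auth.log", sourcetype="linux_secure")
index = build_index(events)
hits = search(index, events, "sshd", "failed")
print([e["_raw"] for e in hits])
```

The point of the sketch is that a search only intersects small sets of event ids looked up by segment, rather than scanning every raw event.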

  3. Data Searching

  4. Data Analysis

References