Elasticsearch - nimrody/knowledgebase GitHub Wiki

Running in production

Development / internals

Performance

  • Check cache performance

    curl 'localhost:9200/*/_stats?filter_path=indices.**.query_cache'
    curl 'localhost:9200/*/_stats?filter_path=indices.**.request_cache'
    
  • Consider enabling eager global ordinals. Otherwise they are built lazily when first needed, which slows down the first query that uses them.
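
    As a rough sketch of the point above: `eager_global_ordinals` is a per-field mapping parameter, so enabling it means updating that field's mapping. The helper name `eager_ordinals_mapping`, the `userId` field, its `keyword` type, and the `monitor` index in the usage comment are assumptions for illustration, and the exact mapping URL depends on the Elasticsearch version.

    ```shell
    # Print a mapping body that enables eager global ordinals on one field.
    # The "keyword" type is an assumption; use the field's real type.
    eager_ordinals_mapping() {
      local field="$1"
      printf '{"properties":{"%s":{"type":"keyword","eager_global_ordinals":true}}}\n' "$field"
    }

    # Usage (not run here; older versions also need the mapping type in the URL):
    #   curl -X PUT -H 'Content-Type: application/json' \
    #     'localhost:9200/monitor/_mapping' -d "$(eager_ordinals_mapping userId)"
    ```

    Building the ordinals eagerly moves the cost from the first query to refresh time, so it is probably only worth it for fields that are aggregated on often.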

Setup

  1. Stop the event senders
  2. Delete the index: curl -XDELETE kibana.tensera.net:9200/monitor
  3. Create the mapping by running sh monitor/config/ElasticSearchConfigs.sh
  4. Start the event senders and wait for a few events to accumulate
  5. In Kibana's Management page, delete the existing index pattern, then create a new index pattern monitor with timeInSeconds as the time field (it should appear in the drop-down list Kibana offers)

Advice

Tools

Scripts

Installation

  • RPMS: elasticsearch and kibana

  • Scaling Elasticsearch

  • Remove all indices except .kibana:

    curl -s 'http://localhost:9200/_cat/indices?h=index' | grep -iv kibana | xargs -I{} curl -XDELETE 'http://localhost:9200/{}'
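
    Before running the delete pipeline above, it may be worth previewing which indices it would remove. A minimal sketch, assuming the `h=index` column selector of the `_cat` API so that only index names are printed; `non_kibana_indices` is a hypothetical helper name.

    ```shell
    # Read index names on stdin (one per line) and print those whose name
    # does not contain "kibana" (case-insensitive), mirroring the grep above.
    non_kibana_indices() {
      grep -iv kibana || true   # grep exits 1 when nothing is left; not an error here
    }

    # Preview only, nothing is deleted (not run here):
    #   curl -s 'http://localhost:9200/_cat/indices?h=index' | non_kibana_indices
    ```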
    

Starting as an upstart service:

  • ElasticSearch upstart script (/etc/init/elasticsearch.conf)

    description     "ElasticSearch service"
    
    start on (net-device-up
              and local-filesystems
              and runlevel [2345])
    
    stop on runlevel [016]
    
    respawn
    
    respawn limit 10 30
    
    # NB: Upstart scripts do not respect
    # /etc/security/limits.conf, so the open-file limits
    # settings need to be applied here.
    limit nofile 92000 92000
    
    exec sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch  -Edefault.path.home=/usr/share/elasticsearch -Edefault.path.logs=/var/log/elasticsearch -Edefault.path.data=/var/lib/elasticsearch -Edefault.path.conf=/etc/elasticsearch
    
  • Delete documents and clean index

    curl -X POST localhost:9200/store_2019.08.22/_delete_by_query -H "Content-Type: application/json" -d '{"query": { "match_all" : {}}}'
    curl -XPOST 'localhost:9200/store_2019.08.22/_forcemerge?only_expunge_deletes=true'
    
  • Delete old documents

    curl -X POST localhost:9200/store_2019.11.11/_delete_by_query -H "Content-Type: application/json" -d ' {"query": { "range" : { "__timestamp": { "lte": 1573603200000 } } } }'
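
    The `lte` bound above appears to be an epoch timestamp in milliseconds (1573603200000 is 2019-11-13 00:00:00 UTC). A small sketch for deriving such a cutoff from a calendar date, assuming GNU `date` with its `-d` option; `cutoff_millis` is a hypothetical helper name.

    ```shell
    # Convert a calendar date (YYYY-MM-DD) to epoch milliseconds at 00:00:00 UTC.
    # Assumes GNU date; BSD/macOS date does not accept -d in this form.
    cutoff_millis() {
      local seconds
      seconds="$(date -u -d "$1 00:00:00" +%s)" || return 1
      echo "$(( seconds * 1000 ))"
    }

    # Example: cutoff_millis 2019-11-13   # prints 1573603200000
    ```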
    
  • List all indices

    curl 'kibana.tensera.net:9200/_cat/indices?v'
    
  • List all mappings in an index

    curl "$DDD/monitor/_all/_mapping?pretty"
    
  • List all aliases

    curl localhost:9200/_cat/aliases
    
  • Query specific mapping type in an index:

    curl -XPOST "$DDD/monitor/FetchStatsSummaryMetric/_search?pretty"
    
  • Search for a specific field value (note that Amit uses camelCase)

    curl  "$DDD/monitor/FetchStatsSummaryMetric/_search?q=userId:79efb2b78720987e"
    
  • Search using a full Lucene query:

    GET /monitor/FetchStatsSummaryMetric/_search
    {
      "query": {
          "bool" : {
              "must" : {
                  "query_string" : {
                      "query" : "userId:505*"
                  }
              }
          }
      }
    }
    
  • Query DSL syntax

  • Lucene queries from Kibana

  • List all records

    curl 'http://localhost:9200/foo/_search?q=*:*&size=1000'
    
  • Check index settings (for example, to see whether an index is read-only)

    curl localhost:9200/_settings
    
  • Release all indices from read-only mode:

    curl -H 'content-type:application/json' -X PUT 'localhost:9200/*/_settings' -d '{"index" : {"blocks":{"read_only_allow_delete" : "false"}}}'
    
  • Aggregations (see also: elasticsearch aggregations for analytics)

    // max over all records
    GET /monitor/FetchStatsSummaryMetric/_search
    {
      "size": 0,
    
      "query": {"match_all": {}},
    
      "aggs" : {
        "by_user" :  {
          "max" : {"field" : "numOfGcmsReceived" }
        
        }
      }
    }
    
    
    // approximate distinct count (hyperloglog, can specify precision)
    GET /monitor/FetchStatsSummaryMetric/_search
    {
      "size": 0,
    
      "query": {"match_all": {}},
    
      "aggs" : {
        "by_user" :  {
          "cardinality" : {"field" : "userId" }
        
        }
      }
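
    The precision mentioned above is set with the `precision_threshold` parameter of the `cardinality` aggregation (default 3000, maximum 40000). A sketch that builds such a request body; `cardinality_body` and the aggregation name `distinct` are hypothetical.

    ```shell
    # Print a search body with a cardinality aggregation on one field.
    # precision_threshold trades memory for accuracy: counts below the
    # threshold are expected to be close to exact.
    cardinality_body() {
      local field="$1" threshold="$2"
      printf '{"size":0,"aggs":{"distinct":{"cardinality":{"field":"%s","precision_threshold":%s}}}}\n' \
        "$field" "$threshold"
    }

    # Usage (not run here):
    #   curl -H 'content-type: application/json' \
    #     "$DDD/monitor/FetchStatsSummaryMetric/_search?pretty" -d "$(cardinality_body userId 10000)"
    ```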
    }
    

    // count documents per user
    GET /monitor/FetchStatsSummaryMetric/_search
    {
      "size": 0,
    
      "query": {"match_all": {}},
    
      "aggs" : {
        "by_user" :  {
          "terms" : { "field" : "userId" }
        }
      }
    }
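
    By default a `terms` aggregation returns only the top 10 buckets, so the per-user count above lists at most ten users. A sketch that sets the bucket count explicitly through the aggregation's `size` parameter; `terms_body` is a hypothetical helper name.

    ```shell
    # Print a search body with a terms aggregation returning up to N buckets.
    # The top-level "size":0 suppresses hits; the inner "size" is the bucket count.
    terms_body() {
      local field="$1" buckets="$2"
      printf '{"size":0,"aggs":{"by_user":{"terms":{"field":"%s","size":%s}}}}\n' \
        "$field" "$buckets"
    }

    # Usage (not run here):
    #   curl -H 'content-type: application/json' \
    #     "$DDD/monitor/FetchStatsSummaryMetric/_search?pretty" -d "$(terms_body userId 100)"
    ```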

  • Get last written document

    curl -H 'content-type: application/json' 'localhost:9200/*/_search' -d '{ "size": 1, "sort": { "__timestamp": "desc"}, "query": { "match_all": {} } }' | jq .
    
  • Get last written document for the netlight pipeline

    curl -H 'content-type: application/json' 'localhost:9200/*/_search' -d '{ "size": 1, "sort": { "__timestamp": "desc"}, "query": { "term": { "_appId" : { "value":"avc_flows" } } } }' | jq .
    
  • Turn on the slow log and set the threshold

    curl -X PUT -H 'content-type: application/json' localhost:9200/_settings -d ' {"index.search.slowlog.threshold.query.info": "1ms"} '
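
    The search slow log has a separate threshold per log level (`warn`, `info`, `debug`, `trace`), and a value of `-1` switches a threshold off. A sketch that builds the settings body for either case; `slowlog_body` is a hypothetical helper name.

    ```shell
    # Print an index settings body that sets one query slow log threshold.
    # level: warn | info | debug | trace; threshold: e.g. 500ms, 2s, or -1 to disable.
    slowlog_body() {
      local level="$1" threshold="$2"
      printf '{"index.search.slowlog.threshold.query.%s":"%s"}\n' "$level" "$threshold"
    }

    # Usage (not run here):
    #   curl -X PUT -H 'content-type: application/json' 'localhost:9200/_settings' \
    #     -d "$(slowlog_body info -1)"   # turn the info threshold off again
    ```

    A 1ms threshold as in the command above logs nearly every query, so it is probably best reverted once the investigation is done.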
    
  • Set max number of buckets

    curl -X PUT -H 'content-type: application/json' localhost:9200/_cluster/settings -d ' {"transient" :{"search.max_buckets": "10000"}} '
    
  • Disable auto index creation

    curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
    {
     "persistent": {
         "action.auto_create_index": "false"
     }
    }'
    
  • Pipeline aggregations

  • Logz Guide to ElasticSearch

Tools

Plugins development

Internals

Lucene
