Elasticsearch - nimrody/knowledgebase GitHub Wiki
- Circuit breakers and here
-
Check cache performance
curl 'localhost:9200/*/_stats?filter_path=indices.**.query_cache'
curl 'localhost:9200/*/_stats?filter_path=indices.**.request_cache'
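The raw counters are easier to judge as a hit ratio. A minimal sketch, assuming you read hit_count and miss_count from the query_cache or request_cache section of the stats response; hit_ratio is a hypothetical helper name, not part of any Elasticsearch tooling:

```shell
# Hypothetical helper: percentage of cache lookups that were hits.
# Arguments: hit_count miss_count (taken from the _stats output by hand).
hit_ratio() {
  awk -v h="$1" -v m="$2" 'BEGIN {
    total = h + m
    if (total == 0) { print "n/a"; exit }   # cache has not been used yet
    printf "%.1f\n", 100 * h / total
  }'
}

hit_ratio 75 25   # prints 75.0
```

A persistently low ratio together with a high evictions count usually suggests the cache is too small for the query mix.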
-
Consider enabling eager global ordinals. Otherwise they are generated lazily when first needed, which slows down the first query that uses them.
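Eager global ordinals are switched on per field through the mapping. A sketch only, assuming a keyword field named userId in the monitor index and the typeless mapping endpoint of Elasticsearch 7 and later (older versions with mapping types need the type name in the URL); eager_ordinals_body is a hypothetical helper:

```shell
# Hypothetical helper: prints a mapping body that enables eager global
# ordinals on one keyword field. The field is assumed to be a keyword already.
eager_ordinals_body() {
  printf '{"properties":{"%s":{"type":"keyword","eager_global_ordinals":true}}}' "$1"
}

# Example call (index and field names are assumptions):
# curl -X PUT -H 'Content-Type: application/json' \
#   'localhost:9200/monitor/_mapping' -d "$(eager_ordinals_body userId)"
```

The cost moves from the first query to refresh time, so this is mainly worth it for fields that are aggregated on frequently.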
- Stop the event senders
- Delete the index
curl -XDELETE kibana.tensera.net:9200/monitor
- Create the mapping by running
sh monitor/config/ElasticSearchConfigs.sh
- Start the event senders and wait for a few events to accumulate
- Enter Kibana management and delete the index. Create a new index monitor with timeInSeconds as the time field (it should appear in the dropdown offered by Kibana)
- Routing, aliases, handling multitenant apps
- Blackbelt elasticsearch
- Lab training
- Bulk ingestion
- Optimizations advice
- Rollups and documentation
- Ebay optimization advice
-
RPMS: elasticsearch and kibana
-
Remove all indices except .kibana:
curl 'http://localhost:9200/_cat/indices' | grep -iv kibana | cut -d' ' -f3 | xargs -n1 -I{} curl -XDELETE http://localhost:9200/{}
-
ElasticSearch upstart script (/etc/init/elasticsearch.conf)
description "ElasticSearch service"

start on (net-device-up and local-filesystems and runlevel [2345])
stop on runlevel [016]

respawn
respawn limit 10 30

# NB: Upstart scripts do not respect
# /etc/security/limits.conf, so the open-file limits
# settings need to be applied here.
limit nofile 92000 92000

exec sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -Edefault.path.home=/usr/share/elasticsearch -Edefault.path.logs=/var/log/elasticsearch -Edefault.path.data=/var/lib/elasticsearch -Edefault.path.conf=/etc/elasticsearch
-
Delete documents and clean index
curl -X POST localhost:9200/store_2019.08.22/_delete_by_query -H "Content-Type: application/json" -d '{"query": { "match_all" : {}}}'
curl -XPOST 'localhost:9200/store_2019.08.22/_forcemerge?only_expunge_deletes=true'
-
Delete old documents
curl -X POST localhost:9200/store_2019.11.11/_delete_by_query -H "Content-Type: application/json" -d ' {"query": { "range" : { "__timestamp": { "lte": 1573603200000 } } } }'
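The lte value is an epoch timestamp in milliseconds (1573603200000 is 2019-11-13 00:00:00 UTC). A sketch of deriving the cutoff from a retention period instead of hard-coding it; cutoff_ms and delete_before_body are hypothetical helpers, and the 7-day retention in the example is an assumption:

```shell
# Hypothetical helper: epoch-milliseconds cutoff for a retention period.
# Arguments: current time in epoch seconds, number of days to keep.
cutoff_ms() {
  echo $(( ($1 - $2 * 86400) * 1000 ))
}

# Hypothetical helper: request body matching documents at or before the cutoff.
delete_before_body() {
  printf '{"query":{"range":{"__timestamp":{"lte":%s}}}}' "$1"
}

# Example call (7-day retention is an assumption):
# curl -X POST -H 'Content-Type: application/json' \
#   'localhost:9200/store_2019.11.11/_delete_by_query' \
#   -d "$(delete_before_body "$(cutoff_ms "$(date +%s)" 7)")"
```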
-
List all indices
curl 'kibana.tensera.net:9200/_cat/indices?v'
-
List all mappings in an index
curl "$DDD/monitor/_all/_mapping?pretty"
-
List all aliases
curl localhost:9200/_cat/aliases
-
Query specific mapping type in an index:
curl -XPOST "$DDD/monitor/FetchStatsSummaryMetric/_search?pretty"
-
Search for specific field (note that Amit uses camelCase)
curl "$DDD/monitor/FetchStatsSummaryMetric/_search?q=userId:79efb2b78720987e"
-
Search using a full Lucene query:
GET /monitor/FetchStatsSummaryMetric/_search
{
  "query": {
    "bool": {
      "must": { "query_string": { "query": "userId:505*" } }
    }
  }
}
-
List all records
curl 'http://localhost:9200/foo/_search?q=*:*&size=1000'
-
Check index settings (read only index)
curl localhost:9200/_settings
-
Release all indexes from read-only mode:
curl -H 'content-type:application/json' -X PUT 'localhost:9200/*/_settings' -d '{"index" : {"blocks":{"read_only_allow_delete" : "false"}}}'
-
Aggregations (see also: elasticsearch aggregations for analytics)
// max over all records
GET /monitor/FetchStatsSummaryMetric/_search
{
  "size": 0,
  "query": {"match_all": {}},
  "aggs": { "by_user": { "max": {"field": "numOfGcmsReceived"} } }
}

// approximate distinct count (HyperLogLog; precision can be specified)
GET /monitor/FetchStatsSummaryMetric/_search
{
  "size": 0,
  "query": {"match_all": {}},
  "aggs": { "by_user": { "cardinality": {"field": "userId"} } }
}
Count documents per user
GET /monitor/FetchStatsSummaryMetric/_search
{
  "size": 0,
  "query": {"match_all": {}},
  "aggs": { "by_user": { "terms": {"field": "userId"} } }
}
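By default the terms aggregation returns only the top 10 buckets, so the per-user count above is truncated when there are more users. A sketch of the same query with an explicit bucket count, written for curl; terms_agg_body is a hypothetical helper and the size of 500 in the example is an arbitrary assumption:

```shell
# Hypothetical helper: terms aggregation body with an explicit bucket count.
# Arguments: field name, number of buckets to return.
terms_agg_body() {
  printf '{"size":0,"aggs":{"by_user":{"terms":{"field":"%s","size":%s}}}}' "$1" "$2"
}

# Example call (index name and size are assumptions):
# curl -H 'Content-Type: application/json' \
#   'localhost:9200/monitor/_search?pretty' -d "$(terms_agg_body userId 500)"
```

Large sizes count against the cluster's search.max_buckets limit, so raise them with care.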
-
Get last written document
curl -H 'content-type: application/json' 'localhost:9200/*/_search' -d' { "size": 1, "sort": { "__timestamp": "desc"}, "query": { "match_all": {} } } ' | jq .
-
Get last written document for the netlight pipeline
curl -H 'content-type: application/json' 'localhost:9200/*/_search' -d' { "size": 1, "sort": { "__timestamp": "desc"}, "query": { "term": { "_appId" : { "value":"avc_flows" } } } }' | jq .
-
Turn on the slow log and set the threshold
curl -X PUT -H 'content-type: application/json' localhost:9200/_settings -d ' {"index.search.slowlog.threshold.query.info": "1ms"} '
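A 1ms info threshold logs nearly every query, which is useful for a short investigation but noisy afterwards. A sketch of setting the warn and info levels together; slowlog_body is a hypothetical helper, the example values are assumptions, and to my understanding a value of -1 disables a level:

```shell
# Hypothetical helper: slow-log settings body for the query phase.
# Arguments: warn threshold, info threshold (e.g. 10s, 1s, or -1 to disable).
slowlog_body() {
  printf '{"index.search.slowlog.threshold.query.warn":"%s","index.search.slowlog.threshold.query.info":"%s"}' "$1" "$2"
}

# Example calls (threshold values are assumptions):
# curl -X PUT -H 'Content-Type: application/json' 'localhost:9200/_settings' \
#   -d "$(slowlog_body 10s 1s)"
# curl -X PUT -H 'Content-Type: application/json' 'localhost:9200/_settings' \
#   -d "$(slowlog_body -1 -1)"    # turn the query slow log back off
```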
-
Set max number of buckets
curl -X PUT -H 'content-type: application/json' localhost:9200/_cluster/settings -d ' {"transient" :{"search.max_buckets": "10000"}} '
-
Disable auto index creation
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d' { "persistent": { "action.auto_create_index": "false" } }'
- Lucene basic concepts
- Lucene indexwriter in depth
- docvalues -- Note that doc values are enabled by default in Elasticsearch for all field types except analyzed text fields.
- Docvalues
- Docvalues deep dive
- Lucene file formats
- clue - lucene index file viewer