Analysis - gnuhub/elasticsearch GitHub Wiki

The index analysis module acts as a configurable registry of analyzers. Analyzers are used both to break indexed (analyzed) fields into terms when a document is indexed and to process query strings. The module maps to the Lucene Analyzer.

An analyzer is generally composed of a single "Tokenizer" with zero or more "Token Filters" applied to its output. A set of "Char Filters" can also be associated with the analyzer to filter the text stream before it is tokenized. The analysis module allows TokenFilters, Tokenizers and Analyzers to be registered under logical names, which can then be referenced either in mapping definitions or in certain APIs. The analysis module also automatically registers the built-in analyzers, token filters, and tokenizers, unless they are explicitly defined.

Here is a sample configuration:

index :
    analysis :
        analyzer : 
            standard : 
                type : standard
                stopwords : [stop1, stop2]
            myAnalyzer1 :
                type : standard
                stopwords : [stop1, stop2, stop3]
                max_token_length : 500
            # configure a custom analyzer which is 
            # exactly like the default standard analyzer
            myAnalyzer2 :
                tokenizer : standard
                filter : [standard, lowercase, stop]
        tokenizer :
            myTokenizer1 :
                type : standard
                max_token_length : 900
            myTokenizer2 :
                type : keyword
                buffer_size : 512
        filter :
            myTokenFilter1 :
                type : stop
                stopwords : [stop1, stop2, stop3, stop4]
            myTokenFilter2 :
                type : length
                min : 0
                max : 2000
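
The analyzer structure described above (char filters, then one tokenizer, then token filters, all registered under logical names) can be sketched in a few lines of Python. This is only an illustrative model, not how Elasticsearch or Lucene implement analysis. The `Analyzer` class, the helper functions, and the `my_analyzer` name are hypothetical and exist only for this sketch.

```python
import re
from typing import Callable, Dict, List, Sequence

# Hypothetical type aliases for the three building blocks of an analyzer.
CharFilter = Callable[[str], str]
Tokenizer = Callable[[str], List[str]]
TokenFilter = Callable[[List[str]], List[str]]


class Analyzer:
    """Toy analyzer: char filters -> one tokenizer -> token filters."""

    def __init__(
        self,
        tokenizer: Tokenizer,
        token_filters: Sequence[TokenFilter] = (),
        char_filters: Sequence[CharFilter] = (),
    ) -> None:
        self.tokenizer = tokenizer
        self.token_filters = list(token_filters)
        self.char_filters = list(char_filters)

    def analyze(self, text: str) -> List[str]:
        # 1. Char filters rewrite the raw text stream.
        for char_filter in self.char_filters:
            text = char_filter(text)
        # 2. Exactly one tokenizer splits the text into tokens.
        tokens = self.tokenizer(text)
        # 3. Token filters are applied in the order they were listed.
        for token_filter in self.token_filters:
            tokens = token_filter(tokens)
        return tokens


def strip_html(text: str) -> str:
    # Simplistic char filter: replace anything that looks like a tag with a space.
    return re.sub(r"<[^>]+>", " ", text)


def word_tokenizer(text: str) -> List[str]:
    # Simplistic tokenizer: runs of word characters become tokens.
    return re.findall(r"\w+", text)


def lowercase(tokens: List[str]) -> List[str]:
    return [token.lower() for token in tokens]


def stop_filter(stopwords: Sequence[str]) -> TokenFilter:
    # Build a token filter that drops the given stopwords.
    stop = set(stopwords)

    def apply(tokens: List[str]) -> List[str]:
        return [token for token in tokens if token not in stop]

    return apply


# Registry of analyzers under logical names, analogous to the
# "analyzer" section of the sample configuration.
registry: Dict[str, Analyzer] = {
    "my_analyzer": Analyzer(
        tokenizer=word_tokenizer,
        token_filters=[lowercase, stop_filter(["the", "a"])],
        char_filters=[strip_html],
    ),
}

tokens = registry["my_analyzer"].analyze("<b>The</b> quick brown fox")
print(tokens)
```

The order of the token filters matters in this sketch: `lowercase` runs before the stop filter, so "The" is lowercased first and then removed as a stopword.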