Custom Token Filters
Token filters modify, add, or remove the tokens produced by a tokenizer. Common token filters include lowercasing, stop-word removal, and stemming.
Example: Lowercase and Stop Filter
Let’s create a custom analyzer that includes both lowercase and stop filters.
PUT /custom_filter_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  }
}
Analyzing text:
GET /custom_filter_example/_analyze
{
  "analyzer": "custom_analyzer",
  "text": "Elasticsearch is a powerful search engine"
}
Output:
{
  "tokens": [
    { "token": "elasticsearch", "start_offset": 0, "end_offset": 13, "type": "<ALPHANUM>", "position": 0 },
    { "token": "powerful", "start_offset": 19, "end_offset": 27, "type": "<ALPHANUM>", "position": 3 },
    { "token": "search", "start_offset": 28, "end_offset": 34, "type": "<ALPHANUM>", "position": 4 },
    { "token": "engine", "start_offset": 35, "end_offset": 41, "type": "<ALPHANUM>", "position": 5 }
  ]
}
In this example:
- The text is tokenized using the standard tokenizer.
- Tokens are converted to lowercase.
- Stop words ("is", "a") are removed by the stop filter, which defaults to the English stop word list. Note that the removed tokens leave gaps in the position sequence of the remaining tokens.
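If the default English stop word list does not fit your data, the stop filter accepts a custom list via its stopwords parameter. As a sketch (the index name custom_stop_example and the filter name my_stop are illustrative), you can define a named filter in the analysis settings and reference it from the analyzer:

```json
PUT /custom_stop_example
{
  "settings": {
    "analysis": {
      "filter": {
        "my_stop": {
          "type": "stop",
          "stopwords": ["is", "a", "the"]
        }
      },
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "my_stop"]
        }
      }
    }
  }
}
```

Because my_stop runs after lowercase, the stop word list only needs lowercase entries.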
Full Text Search with Analyzer and Tokenizer
Elasticsearch is renowned for its powerful full-text search capabilities. At the heart of this functionality are analyzers and tokenizers, which play a crucial role in how text is processed and indexed. This guide will help you understand how analyzers and tokenizers work in Elasticsearch, with detailed examples and outputs to make these concepts easy to grasp.