
So How Does The Elasticsearch Match Query Work

Executive Summary

The Elasticsearch match query is your go-to query when starting out on some analysis in Elasticsearch. This post attempts to explain how the match query works.

Default Match Query

Let's say we had these two documents to be searched.

POST /test/_doc/1
{"id":1, "name":"MR JB BROW\nN"}

POST /test/_doc/2
{"id":2, "name":"MR JAMIE BBROWN"}
Our simple match query on the name field would look like this. This query would return both of our documents.
GET /test/_search
{
  "query": {
    "match": {
      "name": "MR JAMES BEN BROWN"
    }
  }
}
You may assume that Elasticsearch is really smart and knew most of the words were similar. However, the reason the above query matched both of our documents is that it found MR in both of them. The match query splits the search text into individual terms and, by default, returns any document containing at least one of them. If you removed MR from the original two documents, the above query would not hit either document.
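The behaviour described above can be imitated outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch code: `analyze` and `match_or` are hypothetical helpers, and the tokenizer (lowercase, split on anything that is not a letter or digit) is only an approximation of what the default analyzer does.

```python
import re

def analyze(text):
    """Rough stand-in for the default analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def match_or(query, field_value):
    """Return the query terms that also appear in the field (OR semantics)."""
    return set(analyze(query)) & set(analyze(field_value))

docs = {1: "MR JB BROW\nN", 2: "MR JAMIE BBROWN"}
query = "MR JAMES BEN BROWN"

# Both documents hit, and only because of the shared term "mr".
hits = {doc_id: match_or(query, name) for doc_id, name in docs.items()}

# With "MR" stripped from the documents, no query term is found in either one.
without_mr = {
    doc_id: match_or(query, name.replace("MR ", ""))
    for doc_id, name in docs.items()
}
```

A non-empty set means the document would be returned, so `hits` shows two matches while `without_mr` shows none.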

Fuzziness

The simplest option to address minor misspellings is to set the fuzziness parameter. So let's take out MR from our example and try fuzziness.

POST /test/_doc/1
{"id":1, "name":"JB BROW\nN"}

POST /test/_doc/2
{"id":2, "name":"JAMIE BBROWN"}
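Now we modify our original match query to include the fuzziness parameter (which is Elastic-speak for the maximum edit distance). The available options are an edit distance of 0, 1 or 2, or AUTO. AUTO picks the distance from the length of each term: terms of one or two characters must match exactly, terms of three to five characters allow one edit, and longer terms allow two.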
GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "JAMES BEN BROWN",
        "fuzziness": "AUTO"
      }
    }
  }
}
Both documents continue to match, even without the MR. However, a search for BROWN TOLIET CLEANING would also match both documents, because BROWN on its own is within one edit of a term in each of them.
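To see why only BROWN produces a hit, here is a rough Python sketch of the AUTO rule combined with a plain Levenshtein distance. The helper names are made up for illustration, and the sketch simplifies one thing: by default Elasticsearch also counts swapping two adjacent characters as a single edit, which this plain Levenshtein version does not.

```python
def auto_fuzziness(term):
    """Edit distance allowed by fuzziness AUTO for a term of this length."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def fuzzy_matches(query_terms, doc_terms):
    """Map each query term to the document terms within its allowed distance."""
    return {
        q: [d for d in doc_terms if levenshtein(q, d) <= auto_fuzziness(q)]
        for q in query_terms
    }

query_terms = ["james", "ben", "brown"]
doc1 = fuzzy_matches(query_terms, ["jb", "brow", "n"])    # "JB BROW\nN"
doc2 = fuzzy_matches(query_terms, ["jamie", "bbrown"])    # "JAMIE BBROWN"
```

In both `doc1` and `doc2` the only non-empty entry is the one for `brown`. JAMES is two edits away from JAMIE, which is more than the single edit AUTO allows for a five-letter term.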

AND / OR OPERATOR

The operator parameter lets you specify whether all of the searched terms in a match query must be contained in a document for it to match. By default the operator is set to OR. To set the operator to AND we use the syntax below.

GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "operator": "AND"
      }
    }
  }
}
The above search will not return either of our documents, because every searched term now has to match. However, this sets the matching criteria pretty high, so you might not get many results.
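The difference between the two operators comes down to "any term" versus "every term". The sketch below illustrates this in Python with a hypothetical `matches` helper. To isolate the operator it uses exact term matching (no fuzziness) and the two documents from the first section, which still contain MR.

```python
def terms(text):
    """Lowercase and split on whitespace; a simplification of real analysis."""
    return text.lower().split()

def matches(query, field_value, operator="or"):
    """Exact-term matching with OR (any term) or AND (every term) semantics."""
    doc_terms = set(terms(field_value))
    found = [t in doc_terms for t in terms(query)]
    return all(found) if operator == "and" else any(found)

docs = {1: "MR JB BROW\nN", 2: "MR JAMIE BBROWN"}
query = "MR JAMES BEN BROWN"

or_hits = [i for i, name in docs.items() if matches(query, name, "or")]
and_hits = [i for i, name in docs.items() if matches(query, name, "and")]
```

With OR both documents are returned thanks to MR. With AND neither is, since JAMES, BEN and BROWN are missing from both.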

MINIMUM SHOULD MATCH

The minimum should match parameter allows you to specify how many of the searched terms are required to match. Below is the simplest usage of minimum should match, where we have specified that two of the searched terms are required to match.

GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": 2
      }
    }
  }
}
Elasticsearch also allows us to specify the percentage of terms that are required to match. Note that the number calculated from the percentage is rounded down, so in a three-term search anything less than a 67% minimum should match results in only one term needing to match.

GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": "67%"
      }
    }
  }
}

Elasticsearch also allows us to combine minimum should match criteria as follows:

GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": "1<2 5<60%"
      }
    }
  }
}
The expression 1<2 5<60% reads as follows: a search with only one term must match that term, a search with two to five terms must match at least two of them, and a search with more than five terms must match at least 60% of them (rounded down). The table below shows how this works for searches of different lengths.

| Terms in search | Terms a hit must contain | Example search | Example hit |
|---|---|---|---|
| 1 | 1 | JAMES | JAMES BROWN |
| 2 | 2 | JAMES BROWN | JAMES BROWN |
| 3 | 2 | JAMES B BROWN | JAMES E BROWN |
| 4 | 2 | JAMES B D BROWN | JAMES E BROWN |
| 5 | 2 | MR JAMES B D BROWN | JAMES E BROWN |
| 6 | 3 (60% of 6 is 3.6, rounded down) | MR JAMES B D VAN BROWN | MR JAMES VAN BROWN |
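The arithmetic behind these minimum should match values can be sketched in a few lines of Python. This is an illustration, not Elasticsearch's own code: `minimum_should_match` is a hypothetical helper that handles only the three forms used in this post (a positive integer, a positive percentage, and combinations written with `<`), and it assumes the documented behaviour that percentages are rounded down.

```python
def minimum_should_match(term_count, spec):
    """Number of searched terms that must match, for the forms used in this post."""
    spec = str(spec).strip()
    if "<" in spec:
        # Combination such as "1<2 5<60%": at or below the first bound every
        # term is required; above a bound, the rule after its "<" applies.
        required = term_count
        for part in spec.split():
            bound, rule = part.split("<", 1)
            if term_count <= int(bound):
                break
            required = minimum_should_match(term_count, rule)
        return required
    if spec.endswith("%"):
        # Percentage of the term count, rounded down via integer division.
        return term_count * int(spec[:-1]) // 100
    return int(spec)

# Required matches for searches of one to six terms.
combined = [minimum_should_match(n, "1<2 5<60%") for n in range(1, 7)]

# The three-term percentage example: 67% needs two terms, 66% only one.
sixty_seven = minimum_should_match(3, "67%")
sixty_six = minimum_should_match(3, "66%")
```

Under these assumptions `combined` holds one entry per search length from one to six terms, and the last entry is 3 because 60% of 6 is 3.6, which rounds down.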
