So How Does The Elasticsearch Match Query Work?
Executive Summary
The Elasticsearch match query is your go-to search query when starting out on some analysis in Elasticsearch. This post attempts to explain how the match query works.
Default Match Query
Let's say we had these two documents to be searched, and ran the match query that follows them.
```
POST /test/_doc/1
{"id":1, "name":"MR JB BROW\nN"}

POST /test/_doc/2
{"id":2, "name":"MR JAMIE BBROWN"}

GET /test/_search
{
  "query": {
    "match": {
      "name": "MR JAMES BEN BROWN"
    }
  }
}
```
This query hits both documents, but only because of the term MR in both of the documents. If you removed MR from the original two documents, the above query would not have hit either document.
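The behaviour can be sketched in a few lines of Python. This is only a rough model: it assumes a simplified analyzer that lowercases the text and splits on non-alphanumeric characters (the real `standard` analyzer does more), it ignores scoring, and the helper names `analyze` and `matched_terms` are hypothetical.

```python
import re


def analyze(text: str) -> list[str]:
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matched_terms(query: str, field_value: str) -> set[str]:
    """Query terms that also occur in the field.

    With the default OR behaviour, a non-empty result means the document is a hit.
    """
    return set(analyze(query)) & set(analyze(field_value))


docs = {1: "MR JB BROW\nN", 2: "MR JAMIE BBROWN"}
for doc_id, name in docs.items():
    # Both documents share only the term "mr" with the query.
    print(doc_id, matched_terms("MR JAMES BEN BROWN", name))
```

Under these assumptions the only overlapping term for either document is `mr`, which is why dropping MR from the documents leaves the query with no hits.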
Fuzziness
The simplest option to address minor misspellings is to set the fuzziness parameter. So let's take MR out of our example and try fuzziness.
```
POST /test/_doc/1
{"id":1, "name":"JB BROW\nN"}

POST /test/_doc/2
{"id":2, "name":"JAMIE BBROWN"}

GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "JAMES BEN BROWN",
        "fuzziness": "AUTO"
      }
    }
  }
}
```
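With fuzziness set to AUTO, the query hits both documents even without MR. However, a search for BROWN TOLIET CLEANING would also be a match, because BROWN alone is within the allowed edit distance and, by default, only one of the searched terms has to match.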
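To see why BROWN still matches, here is a hedged Python sketch of the AUTO rule as documented by Elasticsearch: terms of 1 to 2 characters must match exactly, terms of 3 to 5 characters may differ by one edit, and longer terms by two. The edit distance below is the optimal string alignment variant (insert, delete, substitute, adjacent transposition), used here as an approximation of what Lucene does internally; the function names are hypothetical.

```python
def auto_max_edits(term: str) -> int:
    """Edits allowed by "fuzziness": "AUTO" for a query term of this length."""
    if len(term) <= 2:
        return 0
    return 1 if len(term) <= 5 else 2


def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def fuzzy_match(query_term: str, doc_term: str) -> bool:
    """True if doc_term is within the AUTO edit budget of query_term."""
    return edit_distance(query_term, doc_term) <= auto_max_edits(query_term)


for query_term, doc_term in [("brown", "brow"), ("brown", "bbrown"), ("james", "jamie")]:
    print(query_term, doc_term, edit_distance(query_term, doc_term),
          fuzzy_match(query_term, doc_term))
```

In this sketch `brown` is one edit away from both `brow` and `bbrown`, so each document is hit through that single term, while `james` is two edits away from `jamie` and does not match.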
AND / OR OPERATOR
The operator parameter specifies whether all of the searched terms in a match query must be present in a document for it to be a hit. By default the operator is OR, so a single matching term is enough. To set the operator to AND, use the syntax below.
```
GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "operator": "AND"
      }
    }
  }
}
```
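The difference between the two operators can be sketched in Python as `any` versus `all` over the analyzed query terms. This is a simplified model under stated assumptions: the analyzer is a crude lowercase-and-split stand-in, fuzziness is left out so terms must match exactly, and `analyze` and `is_hit` are hypothetical helper names.

```python
import re


def analyze(text: str) -> list[str]:
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_hit(query: str, field_value: str, operator: str = "OR") -> bool:
    """OR: at least one query term is in the field. AND: every query term is."""
    doc_terms = set(analyze(field_value))
    checks = [term in doc_terms for term in analyze(query)]
    return all(checks) if operator.upper() == "AND" else any(checks)


name = "MR JAMIE BBROWN"
print(is_hit("MR JAMES BEN BROWN", name))                  # OR: "mr" is enough
print(is_hit("MR JAMES BEN BROWN", name, operator="AND"))  # AND: "james" is missing
print(is_hit("MR JAMIE", name, operator="AND"))            # AND: both terms present
```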
MINIMUM SHOULD MATCH
The minimum_should_match parameter lets you specify how many of the searched terms are required to match. Below is the simplest usage, where we have specified that two of the searched terms are required to match.
```
GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": 2
      }
    }
  }
}
```
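The same requirement can also be written as a percentage. Elasticsearch rounds the computed number down, so 67% of three terms again means two terms must match.
```
GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": "67%"
      }
    }
  }
}
```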
Elasticsearch also allows us to combine minimum should match criteria as follows:
```
GET /test/_search
{
  "query": {
    "match": {
      "name": {
        "query": "BROWN TOLIET CLEANING",
        "fuzziness": "AUTO",
        "minimum_should_match": "1<2 5<60%"
      }
    }
  }
}
```
With 1<2 5<60%, a query with a single term requires that term, a query with two to five terms requires two of them, and a query with more than five terms requires 60% of them, rounded down:

| Terms in Query | Document Must Contain | Example Query | Example Hit |
|---|---|---|---|
| 1 Word | The searched term | JAMES | JAMES BROWN |
| 2 Words | 2 searched terms | JAMES BROWN | JAMES BROWN |
| 3 Words | 2 searched terms | JAMES B BROWN | JAMES E BROWN |
| 4 Words | 2 searched terms | JAMES B D BROWN | JAMES E BROWN |
| 5 Words | 2 searched terms | MR JAMES B D BROWN | JAMES E BROWN |
| 6 Words | 3 searched terms (60% of 6, rounded down) | MR JAMES B D VAN BROWN | MR JAMES VAN BROWN |
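The way a minimum_should_match value turns into a required number of terms can be sketched in Python. This is an illustration written from the documented rules (plain integers, percentages rounded down, negative values meaning "this many may be missing", and space-separated `N<spec` conditions), not Elasticsearch's own code; the function name `resolve_minimum_should_match` is hypothetical.

```python
def resolve_minimum_should_match(term_count: int, spec) -> int:
    """Number of query terms that must match, for a given number of query terms."""
    spec = str(spec).strip()
    if "<" in spec:
        # Conditional form: each "N<inner" applies only when term_count > N.
        required = term_count  # at or below the first bound, every term is required
        for condition in spec.split():
            bound, _, inner = condition.partition("<")
            if term_count <= int(bound):
                return required
            required = resolve_minimum_should_match(term_count, inner)
        return required
    if spec.endswith("%"):
        percent = int(spec[:-1])
        share = abs(term_count * percent) // 100  # rounded down
        required = term_count - share if percent < 0 else share
    else:
        value = int(spec)
        required = term_count + value if value < 0 else value
    return max(required, 0)


for n in range(1, 7):
    print(n, resolve_minimum_should_match(n, "1<2 5<60%"))
```

Under these rules the combined specification yields 1 for a one-term query, 2 for two to five terms, and 3 for six terms, since 60% of 6 is 3.6 and the result is rounded down.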