diff --git a/_im-plugin/index-rollups/index.md b/_im-plugin/index-rollups/index.md
index 4637a95edb..26d718678a 100644
--- a/_im-plugin/index-rollups/index.md
+++ b/_im-plugin/index-rollups/index.md
@@ -9,7 +9,7 @@ has_toc: false
# Index rollups
-Time series data increases storage costs, strains cluster health, and slows down aggregations over time. Index rollup lets you periodically reduce data granularity by rolling up old data into summarized indices.
+Time series data increases storage costs, strains cluster health, and slows down aggregations over time. Index rollup lets you periodically reduce data granularity by rolling up old data into summarized indexes.
You pick the fields that interest you and use index rollup to create a new index with only those fields aggregated into coarser time buckets. You can store months or years of historical data at a fraction of the cost with the same query performance.
@@ -18,7 +18,7 @@ For example, say you collect CPU consumption data every five seconds and store i
You can use index rollup in three ways:
1. Use the index rollup API for an on-demand index rollup job that operates on an index that's not being actively ingested such as a rolled-over index. For example, you can perform an index rollup operation to reduce data collected at a five minute interval to a weekly average for trend analysis.
-2. Use the OpenSearch Dashboards UI to create an index rollup job that runs on a defined schedule. You can also set it up to roll up your indices as it’s being actively ingested. For example, you can continuously roll up Logstash indices from a five second interval to a one hour interval.
+2. Use the OpenSearch Dashboards UI to create an index rollup job that runs on a defined schedule. You can also set it up to roll up your indexes as they’re being actively ingested. For example, you can continuously roll up Logstash indexes from a five-second interval to a one-hour interval.
3. Specify the index rollup job as an ISM action for complete index management. This allows you to roll up an index after a certain event such as a rollover, index age reaching a certain point, index becoming read-only, and so on. You can also have rollover and index rollup jobs running in sequence, where the rollover first moves the current index to a warm node and then the index rollup job creates a new index with the minimized data on the hot node.
## Create an Index Rollup Job
@@ -26,7 +26,7 @@ You can use index rollup in three ways:
To get started, choose **Index Management** in OpenSearch Dashboards.
Select **Rollup Jobs** and choose **Create rollup job**.
-### Step 1: Set up indices
+### Step 1: Set up indexes
1. In the **Job name and description** section, specify a unique name and an optional description for the index rollup job.
2. In the **Indices** section, select the source and target index. The source index is the one that you want to roll up. The source index remains as is, the index rollup job creates a new index referred to as a target index. The target index is where the index rollup results are saved. For target index, you can either type in a name for a new index or you select an existing index.
@@ -48,7 +48,7 @@ The order in which you select attributes is critical. A city followed by a demog
### Step 3: Specify schedule
-Specify a schedule to roll up your indices as it’s being ingested. The index rollup job is enabled by default.
+Specify a schedule to roll up your indexes as they’re being ingested. The index rollup job is enabled by default.
1. Specify if the data is continuous or not.
3. For roll up execution frequency, select **Define by fixed interval** and specify the **Rollup interval** and the time unit or **Define by cron expression** and add in a cron expression to select the interval. To learn how to define a cron expression, see [Alerting]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/cron/).
@@ -303,7 +303,7 @@ PUT _plugins/_rollup/jobs/example
```
You can query the `example_rollup` index for the terms aggregations on the fields set up in the rollup job.
-You get back the same response that you would on the original `opensearch_dashboards_sample_data_ecommerce` source index.
+You get back the same response that you would on the original `opensearch_dashboards_sample_data_ecommerce` source index:
```json
POST example_rollup/_search
diff --git a/_observing-your-data/alerting/monitors.md b/_observing-your-data/alerting/monitors.md
index 9773610696..da0618f225 100644
--- a/_observing-your-data/alerting/monitors.md
+++ b/_observing-your-data/alerting/monitors.md
@@ -111,7 +111,7 @@ Whereas query-level monitors run your specified query and then check whether the
- Visual definition works well for monitors that you can define as "some value is above or below some threshold for some amount of time."
- - Query definition gives you flexibility in terms of what you query for (using [the OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text)) and how you evaluate the results of that query (Painless scripting).
+ - Query definition gives you flexibility in terms of what you query for (using [OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index)) and how you evaluate the results of that query (Painless scripting).
This example averages the `cpu_usage` field:
@@ -164,7 +164,7 @@ Whereas query-level monitors run your specified query and then check whether the
If you use the Security plugin, you can only choose indexes that you have permission to access. For details, see [Alerting security]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/security/).
- To use a query, choose **Extraction query editor**, add your query (using [the OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/)), and test it using the **Run** button.
+ To use a query, choose **Extraction query editor**, add your query (using [OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index)), and test it using the **Run** button.
The monitor makes this query to OpenSearch as often as the schedule dictates; check the **Query Performance** section and make sure you're comfortable with the performance implications.
diff --git a/_opensearch/query-dsl/bool.md b/_opensearch/query-dsl/compound/bool.md
similarity index 99%
rename from _opensearch/query-dsl/bool.md
rename to _opensearch/query-dsl/compound/bool.md
index ae58f3af6a..78669ea09d 100644
--- a/_opensearch/query-dsl/bool.md
+++ b/_opensearch/query-dsl/compound/bool.md
@@ -1,8 +1,9 @@
---
layout: default
title: Boolean queries
-parent: Query DSL
-nav_order: 45
+parent: Compound queries
+grand_parent: Query DSL
+nav_order: 10
---
# Boolean queries
diff --git a/_opensearch/query-dsl/compound/index.md b/_opensearch/query-dsl/compound/index.md
new file mode 100644
index 0000000000..239af81d46
--- /dev/null
+++ b/_opensearch/query-dsl/compound/index.md
@@ -0,0 +1,19 @@
+---
+layout: default
+title: Compound queries
+parent: Query DSL
+has_children: true
+nav_order: 40
+---
+
+# Compound queries
+
+Compound queries serve as wrappers for multiple leaf or compound clauses either to combine their results or to modify their behavior.
+
+OpenSearch supports the following compound query types:
+
+- **Boolean**: Combines multiple query clauses with Boolean logic. To learn more, see [Boolean queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/compound/bool/).
+- **Constant score**: Wraps a query or a filter and assigns a constant score to all matching documents. This score is equal to the `boost` value.
+- **Disjunction max**: Returns documents that match one or more query clauses. If a document matches multiple query clauses, it is assigned a higher relevance score. The relevance score is calculated using the highest score from any matching clause and, optionally, the scores from the other matching clauses multiplied by the tiebreaker value.
+- **Function score**: Recalculates the relevance score of documents that are returned by a query using a function that you define.
+- **Boosting**: Changes the relevance score of documents without removing them from the search results. Returns documents that match a `positive` query, but downgrades the relevance of documents in the results that match a `negative` query.
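+
+As a minimal sketch of the boosting query type described in the preceding list, the following request keeps all documents that match the `positive` query but lowers the score of those that also match the `negative` query. The `shakespeare` index and its `text_entry` and `speaker` fields are assumptions chosen for illustration, and the `negative_boost` value of 0.5 is arbitrary:
+
+```json
+GET shakespeare/_search
+{
+  "query": {
+    "boosting": {
+      "positive": {
+        "match": { "text_entry": "queen" }
+      },
+      "negative": {
+        "term": { "speaker": "QUEEN" }
+      },
+      "negative_boost": 0.5
+    }
+  }
+}
+```
+
+A `negative_boost` value between 0 and 1 multiplies the score of documents that match the `negative` query, so they are demoted rather than removed.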
\ No newline at end of file
diff --git a/_opensearch/query-dsl/full-text.md b/_opensearch/query-dsl/full-text/index.md
similarity index 99%
rename from _opensearch/query-dsl/full-text.md
rename to _opensearch/query-dsl/full-text/index.md
index e21efaa1b9..9960414d57 100644
--- a/_opensearch/query-dsl/full-text.md
+++ b/_opensearch/query-dsl/full-text/index.md
@@ -2,7 +2,8 @@
layout: default
title: Full-text queries
parent: Query DSL
-nav_order: 40
+has_children: true
+nav_order: 30
---
# Full-text queries
diff --git a/_opensearch/query-dsl/query-string.md b/_opensearch/query-dsl/full-text/query-string.md
similarity index 97%
rename from _opensearch/query-dsl/query-string.md
rename to _opensearch/query-dsl/full-text/query-string.md
index 6371a47db8..3688a2d239 100644
--- a/_opensearch/query-dsl/query-string.md
+++ b/_opensearch/query-dsl/full-text/query-string.md
@@ -1,8 +1,9 @@
---
layout: default
title: Query string queries
-parent: Query DSL
-nav_order: 70
+parent: Full-text queries
+grand_parent: Query DSL
+nav_order: 25
---
# Query string queries
@@ -41,7 +42,7 @@ Parameter | Data type | Description
`phrase_slop` | Integer | The maximum number of words that are allowed between the matched words. If `phrase_slop` is 2, a maximum of two words is allowed between matched words in a phrase. Transposed words have a slop of 2. Default is 0 (an exact phrase match where matched words must be next to each other).
`minimum_should_match` | Positive or negative integer, positive or negative percentage, combination | If the query string contains multiple search terms and you used the `or` operator, the number of terms that need to match for the document to be considered a match. For example, if `minimum_should_match` is 2, "wind often rising" does not match "The Wind Rises." If `minimum_should_match` is 1, it matches.
`rewrite` | String | Determines how OpenSearch rewrites and scores multi-term queries. Valid values are `constant_score`, `scoring_boolean`, `constant_score_boolean`, `top_terms_N`, `top_terms_boost_N`, and `top_terms_blended_freqs_N`. Default is `constant_score`.
-`auto_generate_synonyms_phrase_query` | Boolean | Specifies whether to create [match queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text#match) automatically for multi-term synonyms. Default is `true`.
+`auto_generate_synonyms_phrase_query` | Boolean | Specifies whether to create [match queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match) automatically for multi-term synonyms. Default is `true`.
`boost` | Floating-point | Boosts the clause by the given multiplier. Values less than 1.0 decrease relevance, and values greater than 1.0 increase relevance. Default is 1.0.
`default_operator`| String | The default Boolean operator used if no operators are specified. Valid values are:
- `OR`: The string `to be` is interpreted as `to OR be`
- `AND`: The string `to be` is interpreted as `to AND be`
Default is `OR`.
`enable_position_increments` | Boolean | When true, resulting queries are aware of position increments. This setting is useful when the removal of stop words leaves an unwanted "gap" between terms. Default is `true`.
diff --git a/_opensearch/query-dsl/function-score.md b/_opensearch/query-dsl/function-score.md
deleted file mode 100644
index e69de29bb2..0000000000
diff --git a/_opensearch/query-dsl/geo-bounding-box.md b/_opensearch/query-dsl/geo-and-xy/geo-bounding-box.md
similarity index 98%
rename from _opensearch/query-dsl/geo-bounding-box.md
rename to _opensearch/query-dsl/geo-and-xy/geo-bounding-box.md
index 8d44eb6d5f..7177334827 100644
--- a/_opensearch/query-dsl/geo-bounding-box.md
+++ b/_opensearch/query-dsl/geo-and-xy/geo-bounding-box.md
@@ -1,8 +1,9 @@
---
layout: default
title: Geo-bounding box queries
-parent: Query DSL
-nav_order: 55
+parent: Geographic and xy queries
+grand_parent: Query DSL
+nav_order: 10
---
# Geo-bounding box queries
diff --git a/_opensearch/query-dsl/geo-and-xy/index.md b/_opensearch/query-dsl/geo-and-xy/index.md
new file mode 100644
index 0000000000..ba9f2b590e
--- /dev/null
+++ b/_opensearch/query-dsl/geo-and-xy/index.md
@@ -0,0 +1,32 @@
+---
+layout: default
+title: Geographic and xy queries
+parent: Query DSL
+has_children: true
+nav_order: 50
+---
+
+# Geographic and xy queries
+
+Geographic and xy queries let you search fields that contain points and shapes on a map or coordinate plane. Geographic queries work on geospatial data, while xy queries work on two-dimensional coordinate data. Out of all geographic queries, the geoshape query is very similar to the xy query, but the former searches [geographic fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geographic), while the latter searches [Cartesian fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy).
+
+## xy queries
+
+[xy queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/xy) search for documents that contain geometries in a Cartesian coordinate system. These geometries can be specified in [`xy_point`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-point) fields, which support points, and [`xy_shape`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/xy-shape) fields, which support points, lines, circles, and polygons.
+
+xy queries return documents that contain:
+- xy shapes and xy points that have one of four spatial relations to the provided shape: `INTERSECTS`, `DISJOINT`, `WITHIN`, or `CONTAINS`.
+- xy points that intersect the provided shape.
+
+## Geographic queries
+
+Geographic queries search for documents that contain geospatial geometries. These geometries can be specified in [`geo_point`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point) fields, which support points on a map, and [`geo_shape`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-shape) fields, which support points, lines, circles, and polygons.
+
+OpenSearch provides the following geographic query types:
+
+- [**Geo-bounding box queries**]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/geo-bounding-box/): Return documents with geopoint field values that are within a bounding box.
+- **Geodistance queries**: Return documents with geopoints that are within a specified distance from the provided geopoint.
+- **Geopolygon queries**: Return documents with geopoints that are within a polygon.
+- **Geoshape queries**: Return documents that contain:
+ - geoshapes and geopoints that have one of four spatial relations to the provided shape: `INTERSECTS`, `DISJOINT`, `WITHIN`, or `CONTAINS`.
+ - geopoints that intersect the provided shape.
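+
+As a rough illustration of a geo-bounding box query, the following request filters for documents whose geopoint falls inside a box defined by its top-left and bottom-right corners. The `testindex` index, the `point` field (assumed to be mapped as `geo_point`), and the coordinates are hypothetical:
+
+```json
+GET testindex/_search
+{
+  "query": {
+    "bool": {
+      "must": {
+        "match_all": {}
+      },
+      "filter": {
+        "geo_bounding_box": {
+          "point": {
+            "top_left": { "lat": 75, "lon": 28 },
+            "bottom_right": { "lat": 73, "lon": 41 }
+          }
+        }
+      }
+    }
+  }
+}
+```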
\ No newline at end of file
diff --git a/_opensearch/query-dsl/index.md b/_opensearch/query-dsl/index.md
index 7eac368029..6f7c277b24 100644
--- a/_opensearch/query-dsl/index.md
+++ b/_opensearch/query-dsl/index.md
@@ -12,118 +12,43 @@ redirect_from:
# Query DSL
-While you can use HTTP request parameters to perform simple searches, you can also use the OpenSearch query domain-specific language (DSL), which provides a wider range of search options. The query DSL uses the HTTP request body, so you can more easily customize your queries to get the exact results that you want.
+OpenSearch provides a search language called *query domain-specific language (DSL)* that you can use to search your data. Query DSL is a flexible language with a JSON interface.
-For example, the following request performs a simple search to search for a `speaker` field that has a value of `queen`.
+With query DSL, you need to specify a query in the `query` parameter of the search. One of the simplest searches in OpenSearch uses the `match_all` query, which matches all documents in an index:
-**Sample request**
-```json
-GET _search?q=speaker:queen
-```
-
-**Sample response**
-```
-{
- "took": 87,
- "timed_out": false,
- "_shards": {
- "total": 68,
- "successful": 68,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": {
- "value": 4080,
- "relation": "eq"
- },
- "max_score": 4.4368687,
- "hits": [
- {
- "_index": "new_shakespeare",
- "_id": "28559",
- "_score": 4.4368687,
- "_source": {
- "type": "line",
- "line_id": 28560,
- "play_name": "Cymbeline",
- "speech_number": 20,
- "line_number": "1.1.81",
- "speaker": "QUEEN",
- "text_entry": "No, be assured you shall not find me, daughter,"
- }
- }
-```
-
-With query DSL, however, you can include an HTTP request body to look for results more tailored to your needs. The following example shows how to search for `speaker` and `text_entry` fields that have a value of `QUEEN`.
-
-**Sample request**
```json
+GET testindex/_search
{
"query": {
- "multi_match": {
- "query": "QUEEN",
- "fields": ["speaker", "text_entry"]
- }
+ "match_all": {
+ }
}
}
```
-**Sample Response**
-```json
-{
- "took": 39,
- "timed_out": false,
- "_shards": {
- "total": 68,
- "successful": 68,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": {
- "value": 5837,
- "relation": "eq"
- },
- "max_score": 7.8623476,
- "hits": [
- {
- "_index": "new_shakespeare",
- "_id": "100763",
- "_score": 7.8623476,
- "_source": {
- "type": "line",
- "line_id": 100764,
- "play_name": "Troilus and Cressida",
- "speech_number": 43,
- "line_number": "3.1.68",
- "speaker": "PANDARUS",
- "text_entry": "Sweet queen, sweet queen! thats a sweet queen, i faith."
- }
- },
- {
- "_index": "shakespeare",
- "_id": "28559",
- "_score": 5.8923807,
- "_source": {
- "type": "line",
- "line_id": 28560,
- "play_name": "Cymbeline",
- "speech_number": 20,
- "line_number": "1.1.81",
- "speaker": "QUEEN",
- "text_entry": "No, be assured you shall not find me, daughter,"
- }
- }
- ]
- }
-}
-```
-The OpenSearch query DSL comes in three varieties: term-level queries, full-text queries, and boolean queries. You can even perform more complicated searches by using different elements from each variety to find whatever data you need.
+A query can consist of many query clauses. You can combine query clauses to produce complex queries.
+
+Broadly, you can classify queries into two categories---*leaf queries* and *compound queries*:
+
+- **Leaf queries**: Leaf queries search for a specified value in a certain field or fields. You can use leaf queries on their own. They include the following query types:
+
+ - **Full-text queries**: Use full-text queries to search text documents. For an analyzed text field search, full-text queries split the query string into terms with the same analyzer that was used when the field was indexed. For an exact value search, full-text queries look for the specified value without applying text analysis. To learn more, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
+
+ - **Term-level queries**: Use term-level queries to search documents for an exact specified term, such as an ID or value range. Term-level queries do not analyze search terms or sort results by relevance score. To learn more, see [Term-level queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/).
+
+ - **Geographic and xy queries**: Use geographic queries to search documents that include geographic data. Use xy queries to search documents that include points and shapes in a two-dimensional coordinate system. To learn more, see [Geographic and xy queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/index).
+
+ - **Joining queries**: Use joining queries to search nested fields or return parent and child documents that match a specific query. Types of joining queries include `nested`, `has_child`, `has_parent`, and `parent_id` queries.
+
+ - **Span queries**: Use span queries to perform precise positional searches. Span queries are low-level, specific queries that provide control over the order and proximity of specified query terms. They are primarily used to search legal documents and patents. To learn more, see [Span queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/span-query/).
+
+ - **Specialized queries**: Specialized queries include all other query types (`distance_feature`, `more_like_this`, `percolate`, `rank_feature`, `script`, `script_score`, `wrapper`, and `pinned_query`).
+
+- **Compound queries**: Compound queries serve as wrappers for multiple leaf or compound clauses either to combine their results or to modify their behavior. They include the Boolean, disjunction max, constant score, function score, and boosting query types. To learn more, see [Compound queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/compound/index).
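+
+To make the distinction concrete, the following sketch wraps two leaf queries, a full-text `match` query and a term-level `term` query, in a compound Boolean query. It assumes a hypothetical `shakespeare` index with an analyzed `text_entry` field and a `speaker` field mapped as `keyword`:
+
+```json
+GET shakespeare/_search
+{
+  "query": {
+    "bool": {
+      "must": [
+        { "match": { "text_entry": "to be or not to be" } }
+      ],
+      "filter": [
+        { "term": { "speaker": "HAMLET" } }
+      ]
+    }
+  }
+}
+```
+
+In this sketch, the `must` clause contributes to the relevance score, while the `filter` clause only narrows the set of matching documents.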
## A note on Unicode special characters in text fields
-Due to word boundaries associated with Unicode special characters, the Unicode standard analyzer cannot index a [text field type](https://opensearch.org/docs/2.2/opensearch/supported-field-types/text/) value as a whole value when it includes one of these special characters. As a result, a text field value that includes a special character is parsed by the standard analyzer as multiple values separated by the special character, effectively tokenizing the different elements on either side of it. This can lead to unintentional filtering of documents and potentially compromise control over their access.
+Due to word boundaries associated with Unicode special characters, the Unicode standard analyzer cannot index a [text field type]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/text/) value as a whole value when it includes one of these special characters. As a result, a text field value that includes a special character is parsed by the standard analyzer as multiple values separated by the special character, effectively tokenizing the different elements on either side of it. This can lead to unintentional filtering of documents and potentially compromise control over their access.
The examples below illustrate values containing special characters that will be parsed improperly by the standard analyzer. In this example, the existence of the hyphen/minus sign in the value prevents the analyzer from distinguishing between the two different users for `user.id` and interprets them as one and the same:
@@ -151,7 +76,6 @@ The examples below illustrate values containing special characters that will be
}
```
-To avoid this circumstance when using either query DSL or the REST API, you can use a custom analyzer or map the field as `keyword`, which performs an exact-match search. See [Keyword field type](https://opensearch.org/docs/2.2/opensearch/supported-field-types/keyword/) for the latter option.
-
-For a list of characters that should be avoided when field type is `text`, see [Word Boundaries](https://unicode.org/reports/tr29/#Word_Boundaries).
+To avoid this circumstance when using either query DSL or the REST API, you can use a custom analyzer or map the field as `keyword`, which performs an exact-match search. See [Keyword field type]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/keyword/) for the latter option.
+For a list of characters that should be avoided for `text` field types, see [Word Boundaries](https://unicode.org/reports/tr29/#Word_Boundaries).
\ No newline at end of file
diff --git a/_opensearch/query-dsl/span-query.md b/_opensearch/query-dsl/span-query.md
new file mode 100644
index 0000000000..6ed2842991
--- /dev/null
+++ b/_opensearch/query-dsl/span-query.md
@@ -0,0 +1,22 @@
+---
+layout: default
+title: Span queries
+parent: Query DSL
+nav_order: 60
+---
+
+# Span queries
+
+You can use span queries to perform precise positional searches. Span queries are low-level, specific queries that provide control over the order and proximity of specified query terms. They are primarily used to search legal documents and patents.
+
+Span queries include the following query types:
+
+- **Span containing**: Wraps a list of span queries and only returns spans that match a second span query.
+- **Span field masking**: Combines `span_near` or `span_or` across different fields.
+- **Span first**: Matches spans close to the beginning of the field.
+- **Span multi-term**: Provides a wrapper around the following query types: `term`, `range`, `prefix`, `wildcard`, `regexp`, or `fuzzy`.
+- **Span near**: Matches spans that are near each other. Wraps multiple span queries that must match within the specified `slop` distance of each other, and optionally in the same order. The `slop` parameter represents the maximum number of intervening unmatched positions, and the `in_order` parameter indicates whether matches are required to be in order.
+- **Span not**: Provides a wrapper for another span query and excludes any documents that match the internal query.
+- **Span or**: Provides a wrapper for multiple span queries and includes any documents that match any of the specified queries.
+- **Span term**: Functions in the same way as a `term` query, but is designed to be used with other span queries.
+- **Span within**: Used with other span queries to return a single span query if its span is within the spans that are returned by a list of other span queries.
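+
+As a minimal sketch of how span queries combine, the following `span_near` query wraps two `span_term` queries and requires the terms to appear in order with at most one intervening position. The `shakespeare` index and the `text_entry` field are assumptions for illustration, and the lowercase terms assume the field was indexed with the standard analyzer:
+
+```json
+GET shakespeare/_search
+{
+  "query": {
+    "span_near": {
+      "clauses": [
+        { "span_term": { "text_entry": "sweet" } },
+        { "span_term": { "text_entry": "queen" } }
+      ],
+      "slop": 1,
+      "in_order": true
+    }
+  }
+}
+```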
\ No newline at end of file
diff --git a/_opensearch/query-dsl/term-vs-full-text.md b/_opensearch/query-dsl/term-vs-full-text.md
new file mode 100644
index 0000000000..c35fa77bd0
--- /dev/null
+++ b/_opensearch/query-dsl/term-vs-full-text.md
@@ -0,0 +1,233 @@
+---
+layout: default
+title: Term-level and full-text queries compared
+parent: Query DSL
+nav_order: 10
+---
+
+# Term-level and full-text queries compared
+
+You can use both term-level and full-text queries to search text, but while term-level queries are usually used to search structured data, full-text queries are used for full-text search. The main difference between term-level and full-text queries is that term-level queries search documents for an exact specified term, while full-text queries analyze the query string. The following table summarizes the differences between term-level and full-text queries.
+
+| | Term-level queries | Full-text queries
+:--- | :--- | :---
+*Description* | Term-level queries answer which documents match a query. | Full-text queries answer how well the documents match a query.
+*Analyzer* | The search term isn't analyzed. This means that the term query searches for your search term as it is. | The search term is analyzed by the same analyzer that was used for the specific document field at the time it was indexed. This means that your search term goes through the same analysis process as the document's field.
+*Relevance* | Term-level queries simply return documents that match without sorting them based on the relevance score. They still calculate the relevance score, but this score is the same for all the documents that are returned. | Full-text queries calculate a relevance score for each match and sort the results by decreasing order of relevance.
+*Use Case* | Use term-level queries when you want to match exact values such as numbers, dates, or tags and don't need the matches to be sorted by relevance. | Use full-text queries to match text fields and sort by relevance after taking into account factors like casing and stemming variants.
+
+OpenSearch uses the BM25 ranking algorithm to calculate relevance scores. To learn more, see [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25).
+{: .note }
+
+## Should I use a full-text or a term-level query?
+
+To clarify the difference between full-text and term-level queries, consider the following two examples that search for a specific text phrase. The complete works of Shakespeare are indexed in an OpenSearch cluster.
+
+### Example: Phrase search
+
+In this example, you'll search the complete works of Shakespeare for the phrase "To be, or not to be" in the `text_entry` field.
+
+First, use a **term-level query** for this search:
+
+```json
+GET shakespeare/_search
+{
+ "query": {
+ "term": {
+ "text_entry": "To be, or not to be"
+ }
+ }
+}
+```
+
+The response contains no matches, indicated by zero `hits`:
+
+```json
+{
+ "took" : 3,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 0,
+ "relation" : "eq"
+ },
+ "max_score" : null,
+ "hits" : [ ]
+ }
+}
+```
+
+This is because the term “To be, or not to be” is searched literally in the inverted index, where only the analyzed values of the text fields are stored. Term-level queries aren’t suited for searching analyzed text fields because they often yield unexpected results. When working with text data, use term-level queries only for fields mapped as `keyword`.
+
+Now search for the same phrase using a **full-text query**:
+
+```json
+GET shakespeare/_search
+{
+ "query": {
+ "match": {
+ "text_entry": "To be, or not to be"
+ }
+ }
+}
+```
+
+The search query “To be, or not to be” is analyzed and tokenized into an array of tokens just like the `text_entry` field of the documents. The full-text query takes an intersection of tokens between the search query and the `text_entry` fields for all the documents, and then sorts the results by relevance score:
+
+```json
+{
+ "took" : 19,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 10000,
+ "relation" : "gte"
+ },
+ "max_score" : 17.419369,
+ "hits" : [
+ {
+ "_index" : "shakespeare",
+ "_id" : "34229",
+ "_score" : 17.419369,
+ "_source" : {
+ "type" : "line",
+ "line_id" : 34230,
+ "play_name" : "Hamlet",
+ "speech_number" : 19,
+ "line_number" : "3.1.64",
+ "speaker" : "HAMLET",
+ "text_entry" : "To be, or not to be: that is the question:"
+ }
+ },
+ {
+ "_index" : "shakespeare",
+ "_id" : "109930",
+ "_score" : 14.883024,
+ "_source" : {
+ "type" : "line",
+ "line_id" : 109931,
+ "play_name" : "A Winters Tale",
+ "speech_number" : 23,
+ "line_number" : "4.4.153",
+ "speaker" : "PERDITA",
+ "text_entry" : "Not like a corse; or if, not to be buried,"
+ }
+ },
+ {
+ "_index" : "shakespeare",
+ "_id" : "103117",
+ "_score" : 14.782743,
+ "_source" : {
+ "type" : "line",
+ "line_id" : 103118,
+ "play_name" : "Twelfth Night",
+ "speech_number" : 53,
+ "line_number" : "1.3.95",
+ "speaker" : "SIR ANDREW",
+ "text_entry" : "will not be seen; or if she be, its four to one"
+ }
+ }
+ ]
+ }
+}
+...
+```
+
+For a list of all full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
+
+### Example: Exact term search
+
+If you want to search for an exact term like “HAMLET” in the `speaker` field and don't need the results to be sorted by relevance score, a term-level query is more efficient:
+
+```json
+GET shakespeare/_search
+{
+ "query": {
+ "term": {
+ "speaker": "HAMLET"
+ }
+ }
+}
+```
+
+The response contains document matches:
+
+```json
+{
+ "took" : 5,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 1582,
+ "relation" : "eq"
+ },
+ "max_score" : 4.2540946,
+ "hits" : [
+ {
+ "_index" : "shakespeare",
+ "_id" : "32700",
+ "_score" : 4.2540946,
+ "_source" : {
+ "type" : "line",
+ "line_id" : 32701,
+ "play_name" : "Hamlet",
+ "speech_number" : 9,
+ "line_number" : "1.2.66",
+ "speaker" : "HAMLET",
+ "text_entry" : "[Aside] A little more than kin, and less than kind."
+ }
+ },
+ {
+ "_index" : "shakespeare",
+ "_id" : "32702",
+ "_score" : 4.2540946,
+ "_source" : {
+ "type" : "line",
+ "line_id" : 32703,
+ "play_name" : "Hamlet",
+ "speech_number" : 11,
+ "line_number" : "1.2.68",
+ "speaker" : "HAMLET",
+ "text_entry" : "Not so, my lord; I am too much i' the sun."
+ }
+ },
+ {
+ "_index" : "shakespeare",
+ "_id" : "32709",
+ "_score" : 4.2540946,
+ "_source" : {
+ "type" : "line",
+ "line_id" : 32710,
+ "play_name" : "Hamlet",
+ "speech_number" : 13,
+ "line_number" : "1.2.75",
+ "speaker" : "HAMLET",
+ "text_entry" : "Ay, madam, it is common."
+ }
+ }
+ ]
+ }
+}
+...
+```
+
+Term-level queries provide exact matches. If you search for “Hamlet”, you don’t receive any matches because `speaker` is a keyword field, so the value “HAMLET” is stored in OpenSearch literally and not in an analyzed form.
+The search term is also matched literally, so to get a match for this field, you need to enter exactly the same characters.
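+
+For example, the following query is a sketch of the mixed-case search described above. Because the term `Hamlet` is compared literally against the stored `speaker` values, it is expected to return no hits:
+
+```json
+GET shakespeare/_search
+{
+  "query": {
+    "term": {
+      "speaker": "Hamlet"
+    }
+  }
+}
+```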
diff --git a/_opensearch/query-dsl/term.md b/_opensearch/query-dsl/term.md
index 2b5efd2a55..ffe33cd3cd 100644
--- a/_opensearch/query-dsl/term.md
+++ b/_opensearch/query-dsl/term.md
@@ -2,165 +2,143 @@
layout: default
title: Term-level queries
parent: Query DSL
-nav_order: 30
+nav_order: 20
---
# Term-level queries
-OpenSearch supports two types of queries when you search for data: term-level queries and full-text queries.
+Term-level queries search an index for documents that contain an exact search term. Documents returned by a term-level query are not sorted by their relevance scores.
-The following table describes the differences between them:
+When working with text data, use term-level queries for fields mapped as `keyword` only.
-| | Term-level queries | Full-text queries
+Term-level queries are not suited for searching analyzed text fields. To search analyzed text fields, use a [full-text query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text).
+
+## Term-level query types
+
+The following table lists all term-level query types.
+
+| Query type | Description
-:--- | :--- | :---
+:--- | :---
-*Description* | Term-level queries answer which documents match a query. | Full-text queries answer how well the documents match a query.
-*Analyzer* | The search term isn't analyzed. This means that the term query searches for your search term as it is. | The search term is analyzed by the same analyzer that was used for the specific field of the document at the time it was indexed. This means that your search term goes through the same analysis process that the document's field did.
-*Relevance* | Term-level queries simply return documents that match without sorting them based on the relevance score. They still calculate the relevance score, but this score is the same for all the documents that are returned. | Full-text queries calculate a relevance score for each match and sort the results by decreasing order of relevance.
-*Use Case* | Use term-level queries when you want to match exact values such as numbers, dates, tags, and so on, and don't need the matches to be sorted by relevance. | Use full-text queries to match text fields and sort by relevance after taking into account factors like casing and stemming variants.
+[`term`](#term) | Searches for documents with an exact term in a specific field.
+[`terms`](#terms) | Searches for documents with one or more terms in a specific field.
+[`terms_set`](#terms-set) | Searches for documents that match a minimum number of terms in a specific field.
+[`ids`](#ids) | Searches for documents by document ID.
+[`range`](#range) | Searches for documents with field values in a specific range.
+[`prefix`](#prefix) | Searches for documents with terms that begin with a specific prefix.
+[`exists`](#exists) | Searches for documents with any indexed value in a specific field.
+[`fuzzy`](#fuzzy) | Searches for documents with terms that are similar to the search term within the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). The Levenshtein distance measures the number of one-character changes needed to change one term to another term.
+[`wildcard`](#wildcard) | Searches for documents with terms that match a wildcard pattern.
+[`regexp`](#regexp) | Searches for documents with terms that match a regular expression.
-OpenSearch uses a probabilistic ranking framework called Okapi BM25 to calculate relevance scores. To learn more about Okapi BM25, see [Wikipedia](https://en.wikipedia.org/wiki/Okapi_BM25).
-{: .note }
+## Term
-Assume that you have the complete works of Shakespeare indexed in an OpenSearch cluster. We use a term-level query to search for the phrase "To be, or not to be" in the `text_entry` field:
+Use the `term` query to search for an exact term in a field.
```json
GET shakespeare/_search
{
"query": {
"term": {
- "text_entry": "To be, or not to be"
+ "line_id": {
+ "value": "61809"
+ }
}
}
}
```
+{% include copy-curl.html %}
-#### Sample response
-
-```json
-{
- "took" : 3,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 0,
- "relation" : "eq"
- },
- "max_score" : null,
- "hits" : [ ]
- }
-}
-```
-
-We don’t get back any matches (`hits`). This is because the term “To be, or not to be” is searched literally in the inverted index, where only the analyzed values of the text fields are stored. Term-level queries aren't suited for searching on analyzed text fields because they often yield unexpected results. When working with text data, use term-level queries only for fields mapped as keyword only.
+## Terms
-Using a full-text query:
+Use the `terms` query to search for multiple terms in the same field.
```json
GET shakespeare/_search
{
"query": {
- "match": {
- "text_entry": "To be, or not to be"
+ "terms": {
+ "line_id": [
+ "61809",
+ "61810"
+ ]
}
}
}
```
+{% include copy-curl.html %}
+
+You get back documents that match any of the terms.
+
+## Terms set
-The search query “To be, or not to be” is analyzed and tokenized into an array of tokens just like the `text_entry` field of the documents. The full-text query performs an intersection of tokens between our search query and the `text_entry` fields for all the documents, and then sorts the results by relevance scores:
+With a terms set query, you can search for documents that match a minimum number of exact terms in a specified field. The `terms_set` query is similar to the `terms` query, but you can specify the minimum number of matching terms that are required to return a document. You can specify this number either in a field in the index or with a script.
-#### Sample response
+As an example, consider an index that contains students with classes they have taken. When setting up the mapping for this index, you need to provide a [numeric]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/numeric) field that specifies the minimum number of matching terms that are required to return a document:
```json
+PUT students
{
- "took" : 19,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 10000,
- "relation" : "gte"
- },
- "max_score" : 17.419369,
- "hits" : [
- {
- "_index" : "shakespeare",
- "_id" : "34229",
- "_score" : 17.419369,
- "_source" : {
- "type" : "line",
- "line_id" : 34230,
- "play_name" : "Hamlet",
- "speech_number" : 19,
- "line_number" : "3.1.64",
- "speaker" : "HAMLET",
- "text_entry" : "To be, or not to be: that is the question:"
- }
+ "mappings": {
+ "properties": {
+ "name": {
+ "type": "keyword"
},
- {
- "_index" : "shakespeare",
- "_id" : "109930",
- "_score" : 14.883024,
- "_source" : {
- "type" : "line",
- "line_id" : 109931,
- "play_name" : "A Winters Tale",
- "speech_number" : 23,
- "line_number" : "4.4.153",
- "speaker" : "PERDITA",
- "text_entry" : "Not like a corse; or if, not to be buried,"
- }
+ "classes": {
+ "type": "keyword"
},
- {
- "_index" : "shakespeare",
- "_id" : "103117",
- "_score" : 14.782743,
- "_source" : {
- "type" : "line",
- "line_id" : 103118,
- "play_name" : "Twelfth Night",
- "speech_number" : 53,
- "line_number" : "1.3.95",
- "speaker" : "SIR ANDREW",
- "text_entry" : "will not be seen; or if she be, its four to one"
- }
+ "min_required": {
+ "type": "integer"
}
- ]
+ }
}
}
-...
```
+{% include copy-curl.html %}
-For a list of all full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
+Next, index two documents that correspond to students:
-If you want to query for an exact term like “HAMLET” in the speaker field and don't need the results to be sorted by relevance scores, a term-level query is more efficient:
+```json
+PUT students/_doc/1
+{
+ "name": "Mary Major",
+ "classes": [ "CS101", "CS102", "MATH101" ],
+ "min_required": 2
+}
+```
+{% include copy-curl.html %}
```json
-GET shakespeare/_search
+PUT students/_doc/2
+{
+ "name": "John Doe",
+ "classes": [ "CS101", "MATH101", "ENG101" ],
+ "min_required": 2
+}
+```
+{% include copy-curl.html %}
+
+Now search for students who have taken at least two of the following classes: `CS101`, `CS102`, `MATH101`:
+
+```json
+GET students/_search
{
"query": {
- "term": {
- "speaker": "HAMLET"
+ "terms_set": {
+ "classes": {
+ "terms": [ "CS101", "CS102", "MATH101" ],
+ "minimum_should_match_field": "min_required"
+ }
}
}
}
```
+{% include copy-curl.html %}
-#### Sample response
+The response contains both students:
```json
{
- "took" : 5,
+ "took" : 44,
"timed_out" : false,
"_shards" : {
"total" : 1,
@@ -170,100 +148,62 @@ GET shakespeare/_search
},
"hits" : {
"total" : {
- "value" : 1582,
+ "value" : 2,
"relation" : "eq"
},
- "max_score" : 4.2540946,
+ "max_score" : 1.4544616,
"hits" : [
{
- "_index" : "shakespeare",
- "_id" : "32700",
- "_score" : 4.2540946,
+ "_index" : "students",
+ "_id" : "1",
+ "_score" : 1.4544616,
"_source" : {
- "type" : "line",
- "line_id" : 32701,
- "play_name" : "Hamlet",
- "speech_number" : 9,
- "line_number" : "1.2.66",
- "speaker" : "HAMLET",
- "text_entry" : "[Aside] A little more than kin, and less than kind."
+ "name" : "Mary Major",
+ "classes" : [
+ "CS101",
+ "CS102",
+ "MATH101"
+ ],
+ "min_required" : 2
}
},
{
- "_index" : "shakespeare",
- "_id" : "32702",
- "_score" : 4.2540946,
+ "_index" : "students",
+ "_id" : "2",
+ "_score" : 0.5013843,
"_source" : {
- "type" : "line",
- "line_id" : 32703,
- "play_name" : "Hamlet",
- "speech_number" : 11,
- "line_number" : "1.2.68",
- "speaker" : "HAMLET",
- "text_entry" : "Not so, my lord; I am too much i' the sun."
- }
- },
- {
- "_index" : "shakespeare",
- "_id" : "32709",
- "_score" : 4.2540946,
- "_source" : {
- "type" : "line",
- "line_id" : 32710,
- "play_name" : "Hamlet",
- "speech_number" : 13,
- "line_number" : "1.2.75",
- "speaker" : "HAMLET",
- "text_entry" : "Ay, madam, it is common."
+ "name" : "John Doe",
+ "classes" : [
+ "CS101",
+ "MATH101",
+ "ENG101"
+ ],
+ "min_required" : 2
}
}
]
}
}
-...
```
-The term-level queries are exact matches. So, if you search for “Hamlet”, you don’t get back any matches, because “HAMLET” is a keyword field and is stored in OpenSearch literally and not in an analyzed form.
-The search query “HAMLET” is also searched literally. So, to get a match on this field, we need to enter the exact same characters.
-
----
-
-## Term
-
-Use the `term` query to search for an exact term in a field.
+To specify the minimum number of terms a document should match with a script, provide the script in the `minimum_should_match_script` field:
```json
-GET shakespeare/_search
+GET students/_search
{
"query": {
- "term": {
- "line_id": {
- "value": "61809"
+ "terms_set": {
+ "classes": {
+ "terms": [ "CS101", "CS102", "MATH101" ],
+ "minimum_should_match_script": {
+ "source": "Math.min(params.num_terms, doc['min_required'].value)"
+ }
}
}
}
}
```
-
-## Terms
-
-Use the `terms` query to search for multiple terms in the same field.
-
-```json
-GET shakespeare/_search
-{
- "query": {
- "terms": {
- "line_id": [
- "61809",
- "61810"
- ]
- }
- }
-}
-```
-
-You get back documents that match any of the terms.
+{% include copy-curl.html %}
## IDs
@@ -282,6 +222,7 @@ GET shakespeare/_search
}
}
```
+{% include copy-curl.html %}
## Range
@@ -302,6 +243,7 @@ GET shakespeare/_search
}
}
```
+{% include copy-curl.html %}
Parameter | Behavior
:--- | :---
@@ -325,6 +267,7 @@ GET products/_search
}
}
```
+{% include copy-curl.html %}
Specify relative dates by using [date math]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/date/#date-math).
@@ -342,6 +285,7 @@ GET products/_search
}
}
```
+{% include copy-curl.html %}
The first date that we specify is the anchor date or the starting point for the date math. Add two trailing pipe symbols. You could then add one day (`+1d`) or subtract two weeks (`-2w`). This math expression is relative to the anchor date that you specify.
@@ -361,6 +305,7 @@ GET products/_search
}
}
```
+{% include copy-curl.html %}
The keyword `now` refers to the current date and time.
@@ -378,6 +323,7 @@ GET shakespeare/_search
}
}
```
+{% include copy-curl.html %}
## Exists
@@ -393,8 +339,59 @@ GET shakespeare/_search
}
}
```
+{% include copy-curl.html %}
+
+## Fuzzy
+
+A fuzzy query searches for documents with terms that are similar to the search term within the maximum allowed [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). The Levenshtein distance measures the number of one-character changes needed to change one term to another term. These changes include:
+
+- Replacements: **c**at to **b**at
+- Insertions: cat to cat**s**
+- Deletions: **c**at to at
+- Transpositions: **ca**t to **ac**t
+
+A fuzzy query creates a list of all possible expansions of the search term that fall within the Levenshtein distance. You can specify the maximum number of such expansions in the `max_expansions` field. Then it searches for documents that match any of the expansions.
+
+The following example query searches for the speaker `HALET` (misspelled `HAMLET`). The maximum edit distance is not specified, so the default `AUTO` edit distance is used:
+
+```json
+GET shakespeare/_search
+{
+ "query": {
+ "fuzzy": {
+ "speaker": {
+ "value": "HALET"
+ }
+ }
+ }
+}
+```
+{% include copy-curl.html %}
+
+The response contains all documents where `HAMLET` is the speaker.
+
+The following example query searches for the speaker `HALET` with advanced parameters:
+
+```json
+GET shakespeare/_search
+{
+ "query": {
+ "fuzzy": {
+ "speaker": {
+ "value": "HALET",
+ "fuzziness": "2",
+ "max_expansions": 40,
+ "prefix_length": 0,
+ "transpositions": true,
+ "rewrite": "constant_score"
+ }
+ }
+ }
+}
+```
+{% include copy-curl.html %}
-## Wildcards
+## Wildcard
Use wildcard queries to search for terms that match a wildcard pattern.
@@ -417,12 +414,13 @@ GET shakespeare/_search
}
}
```
+{% include copy-curl.html %}
If we change `*` to `?`, we get no matches, because `?` refers to a single character.
Wildcard queries tend to be slow because they need to iterate over a lot of terms. Avoid placing wildcard characters at the beginning of a query because it could be a very expensive operation in terms of both resources and time.
-## Regex
+## Regexp
Use the `regexp` query to search for terms that match a regular expression.
@@ -438,6 +436,7 @@ GET shakespeare/_search
}
}
```
+{% include copy-curl.html %}
A few important notes:
diff --git a/_opensearch/query-dsl/text-analyzers.md b/_opensearch/query-dsl/text-analyzers.md
new file mode 100644
index 0000000000..b618ee318a
--- /dev/null
+++ b/_opensearch/query-dsl/text-analyzers.md
@@ -0,0 +1,108 @@
+---
+layout: default
+title: Text analyzers
+parent: Query DSL
+nav_order: 75
+---
+
+
+# Optimizing text for searches with text analyzers
+
+OpenSearch applies text analysis during indexing or searching for `text` fields. There is a standard analyzer that OpenSearch uses by default for text analysis. To optimize unstructured text for search, you can convert it into structured text with our text analyzers.
+
+## Text analyzers
+
+OpenSearch provides several text analyzers to convert your unstructured text into the format that works best for your searches.
+
+OpenSearch supports the following text analyzers:
+
+- **Standard analyzer** – Parses strings into terms at word boundaries according to the Unicode text segmentation algorithm. It removes most, but not all, punctuation and converts strings to lowercase. You can remove stop words if you enable that option, but it does not remove stop words by default.
+- **Simple analyzer** – Converts strings to lowercase and removes non-letter characters when it splits a string into tokens on any non-letter character.
+- **Whitespace analyzer** – Splits strings into terms at whitespace characters.
+- **Stop analyzer** – Converts strings to lowercase and removes non-letter characters by splitting strings into tokens at each non-letter character. It also removes stop words (for example, "but" or "this") from strings.
+- **Keyword analyzer** – Receives a string as input and outputs the entire string as one term.
+- **Pattern analyzer** – Splits strings into terms using regular expressions and supports converting strings to lowercase. It also supports removing stop words.
+- **Language analyzer** – Provides analyzers specific to multiple languages.
+- **Fingerprint analyzer** – Creates a fingerprint to use as a duplicate detector.
+
+The full specialized text analyzers reference is in progress and will be published soon.
+{: .note }
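+
+As a rough way to compare analyzers, you can pass sample text to the `_analyze` API and inspect the tokens that each analyzer returns. The following sketch uses the `whitespace` analyzer; the sample text is arbitrary, and you can substitute any other analyzer name from the preceding list:
+
+```json
+GET _analyze
+{
+  "analyzer": "whitespace",
+  "text": "A brief history of Time"
+}
+```
+
+Because the whitespace analyzer only splits on whitespace, the returned tokens keep their original case. Running the same request with `standard` lowercases them.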
+
+## How to use text analyzers
+
+If you want to use a text analyzer, specify the name of the analyzer for the `analyzer` field: standard, simple, whitespace, stop, keyword, pattern, fingerprint, or language.
+
+Each analyzer consists of one tokenizer and zero or more token filters. Different analyzers have different character filters, tokenizers, and token filters. To pre-process the string before the tokenizer is applied, you can use one or more character filters.
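+
+To see how these components fit together, you can name them individually in an `_analyze` request instead of naming a complete analyzer. The following is a minimal sketch, assuming the built-in `html_strip` character filter, `standard` tokenizer, and `lowercase` token filter; the sample text is made up for illustration:
+
+```json
+GET _analyze
+{
+  "char_filter": [ "html_strip" ],
+  "tokenizer": "standard",
+  "filter": [ "lowercase" ],
+  "text": "<p>A brief history of Time</p>"
+}
+```
+
+The character filter runs first on the raw string, the tokenizer then splits the result into tokens, and the token filters are applied to each token.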
+
+#### Example: Specify the standard analyzer in a simple query
+
+```json
+GET _search
+{
+  "query": {
+    "match": {
+      "title": {
+        "query": "A brief history of Time",
+        "analyzer": "standard"
+      }
+    }
+  }
+}
+```
+
+## Analyzer options
+
+Option | Valid values | Description
+:--- | :--- | :---
+`analyzer` | `standard, simple, whitespace, stop, keyword, pattern, language, fingerprint` | The analyzer you want to use for the query. Different analyzers have different character filters, tokenizers, and token filters. The `stop` analyzer, for example, removes stop words (for example, "an," "but," "this") from the query string. For a full list of acceptable language values, see [Language analyzer](#language-analyzer) on this page.
+`quote_analyzer` | String | This option lets you choose to use the standard analyzer without any options, such as `language` or other analyzers. Usage is `"quote_analyzer": "standard"`.
+
+
+
+## Language analyzer
+
+OpenSearch supports the following language values with the `analyzer` option:
+arabic, armenian, basque, bengali, brazilian, bulgarian, catalan, czech, danish, dutch, english, estonian, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, and thai.
+
+To use a language analyzer when you map an index, specify the language value for the `analyzer` parameter in the field mapping. For example, to map your index with the French language analyzer, specify the `french` value for the `analyzer` field:
+
+```json
+ "analyzer": "french"
+ ```
+
+#### Sample request
+
+The following request maps an index with the language analyzer set to `french`:
+
+```json
+PUT my-index-000001
+{
+ "mappings": {
+ "properties": {
+ "text": {
+ "type": "text",
+ "fields": {
+ "french": {
+ "type": "text",
+ "analyzer": "french"
+ }
+ }
+ }
+ }
+ }
+}
+```
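+
+After the index is created, a search can target the `text.french` subfield so that the query string is analyzed with the `french` analyzer. The following is a sketch only; the query text is an arbitrary example:
+
+```json
+GET my-index-000001/_search
+{
+  "query": {
+    "match": {
+      "text.french": "les maisons"
+    }
+  }
+}
+```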
+
+
\ No newline at end of file
diff --git a/_opensearch/reindex-data.md b/_opensearch/reindex-data.md
index d1601a379c..166eece64c 100644
--- a/_opensearch/reindex-data.md
+++ b/_opensearch/reindex-data.md
@@ -6,7 +6,7 @@ nav_order: 16
# Reindex data
-After creating an index, you might need to make an extensive change such as adding a new field to every document or combining multiple indices to form a new one. Rather than deleting your index, making the change offline, and then indexing your data all over again, you can use the `reindex` operation.
+After creating an index, you might need to make an extensive change such as adding a new field to every document or combining multiple indexes to form a new one. Rather than deleting your index, making the change offline, and then indexing your data again, you can use the `reindex` operation.
With the `reindex` operation, you can copy all or a subset of documents that you select through a query to another index. Reindex is a `POST` operation. In its most basic form, you specify a source index and a destination index.
@@ -113,13 +113,13 @@ POST _reindex
}
```
-For a list of all query operations, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
+For a list of all query operations, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
-## Combine one or more indices
+## Combine one or more indexes
-You can combine documents from one or more indices by adding the source indices as a list.
+You can combine documents from one or more indexes by adding the source indexes as a list.
-This command copies all documents from two source indices to one destination index:
+This command copies all documents from two source indexes to one destination index:
```json
POST _reindex
@@ -135,7 +135,7 @@ POST _reindex
}
}
```
-Make sure the number of shards for your source and destination indices are the same.
+Make sure the number of shards for your source and destination indexes is the same.
## Reindex only unique documents
@@ -246,7 +246,7 @@ You can specify the following options for your source index:
Option | Valid values | Description | Required
:--- | :--- | :---
-`index` | String | The name of the source index. You can provide multiple source indices as a list. | Yes
+`index` | String | The name of the source index. You can provide multiple source indexes as a list. | Yes
`max_docs` | Integer | The maximum number of documents to reindex. | No
`query` | Object | The search query to use for the reindex operation. | No
`size` | Integer | The number of documents to reindex. | No
diff --git a/_opensearch/search/autocomplete.md b/_opensearch/search/autocomplete.md
index 3dd8e9ff82..36276ba477 100644
--- a/_opensearch/search/autocomplete.md
+++ b/_opensearch/search/autocomplete.md
@@ -42,7 +42,7 @@ GET shakespeare/_search
}
```
-To make the word order and relative positions flexible, specify a `slop` value. To learn about the `slop` option, see [Other advanced options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text#other-advanced-options).
+To make the word order and relative positions flexible, specify a `slop` value. To learn about the `slop` option, see [Other advanced options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#other-advanced-options).
Prefix matching doesn’t require any special mappings. It works with your data as is.
However, it’s a fairly resource-intensive operation. A prefix of `a` could match hundreds of thousands of terms and not be useful to your user.
@@ -63,7 +63,7 @@ GET shakespeare/_search
}
```
-To learn about the `max_expansions` option, see [Other advanced options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text#other-advanced-options).
+To learn about the `max_expansions` option, see [Other advanced options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#other-advanced-options).
The ease of implementing query-time autocomplete comes at the cost of performance.
When implementing this feature on a large scale, we recommend an index-time solution. With an index-time solution, you might experience slower indexing, but it’s a price you pay only once and not for every query. The edge n-gram, search-as-you-type, and completion suggester methods are index-time solutions.
diff --git a/_opensearch/supported-field-types/date.md b/_opensearch/supported-field-types/date.md
index 4b9e7ff10f..4a867d065b 100644
--- a/_opensearch/supported-field-types/date.md
+++ b/_opensearch/supported-field-types/date.md
@@ -154,7 +154,7 @@ Format name and description | Pattern and examples
## Custom formats
-You can create custom formats for date fields. For example, the following request specifies a date in the common "MM/dd/yyyy" format.
+You can create custom formats for date fields. For example, the following request specifies a date in the common "MM/dd/yyyy" format:
```json
PUT testindex
@@ -217,7 +217,7 @@ GET testindex/_search
## Date math
-The date field type supports using date math to specify duration in queries. For example, the `gt`, `gte`, `lt`, and `lte` parameters in [range queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range-query) and the `from` and `to` parameters in [date range aggregations]({{site.url}}{{site.baseurl}}/opensearch/bucket-agg/#range-date_range-ip_range) accept date math expressions.
+The date field type supports using date math to specify durations in queries. For example, the `gt`, `gte`, `lt`, and `lte` parameters in [range queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range) and the `from` and `to` parameters in [date range aggregations]({{site.url}}{{site.baseurl}}/opensearch/bucket-agg/#range-date_range-ip_range) accept date math expressions.
A date math expression contains a fixed date, optionally followed by one or more mathematical expressions. The fixed date may be either `now` (current date and time in milliseconds since the epoch) or a string ending with `||` that specifies a date (for example, `2022-05-18||`). The date must be in the `strict_date_optional_time||epoch_millis` format.
@@ -252,7 +252,7 @@ The following example expressions illustrate using date math:
### Using date math in a range query
-The following example illustrates using date math in a [range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range-query).
+The following example illustrates using date math in a [range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term/#range).
Set up an index with `release_date` mapped as `date`:
diff --git a/_opensearch/supported-field-types/range.md b/_opensearch/supported-field-types/range.md
index ee0806c0da..31160219ed 100644
--- a/_opensearch/supported-field-types/range.md
+++ b/_opensearch/supported-field-types/range.md
@@ -64,7 +64,7 @@ You can use a [term query](#term-query) or a [range query](#range-query) to sear
A term query takes a value and matches all range fields for which the value is within the range.
-The following query will return document 1 because 3.5 is within the range [1.0, 4.0].
+The following query will return document 1 because 3.5 is within the range [1.0, 4.0]:
```json
GET testindex/_search
diff --git a/_search-plugins/sql/full-text.md b/_search-plugins/sql/full-text.md
index 459cd39105..9c60692801 100644
--- a/_search-plugins/sql/full-text.md
+++ b/_search-plugins/sql/full-text.md
@@ -9,7 +9,7 @@ nav_order: 11
Use SQL commands for full-text search. The SQL plugin supports a subset of full-text queries available in OpenSearch.
-To learn about full-text queries in OpenSearch, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
+To learn about full-text queries in OpenSearch, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
## Match
@@ -36,7 +36,7 @@ You can specify the following options in any order:
- `zero_terms_query`
- `boost`
-Please, refer to `match` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match) for parameter description and supported values.
+Refer to the `match` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match) for parameter descriptions and supported values.
### Example 1: Search the `message` field for the text "this is a test":
@@ -224,7 +224,7 @@ You can specify the following options for `QUERY_STRING` in any order:
- `tie_breaker`
- `time_zone`
-Please, refer to `query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#query-string) for parameter description and supported values.
+Refer to the `query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#query-string) for parameter descriptions and supported values.
### Example of using `query_string` in SQL and PPL queries:
@@ -281,7 +281,7 @@ The `MATCHPHRASE`/`MATCH_PHRASE` functions let you specify the following options
- `zero_terms_query`
- `boost`
-Please, refer to `match_phrase` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-phrase) for parameter description and supported values.
+Refer to the `match_phrase` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match-phrase) for parameter descriptions and supported values.
### Example of using `match_phrase` in SQL and PPL queries:
@@ -349,7 +349,7 @@ You can specify the following options for `SIMPLE_QUERY_STRING` in any order:
- `minimum_should_match`
- `quote_field_suffix`
-Please, refer to `simple_query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#simple-query-string) to check parameter meanings and available values.
+Refer to the `simple_query_string` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#simple-query-string) for parameter descriptions and supported values.
### *Example* of using `simple_query_string` in SQL and PPL queries:
@@ -400,7 +400,7 @@ The `MATCH_PHRASE_PREFIX` function lets you specify the following options in any
- `zero_terms_query`
- `boost`
-Please, refer to `match_phrase_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-phrase-prefix) for parameter description and supported values.
+Refer to the `match_phrase_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match-phrase-prefix) for parameter descriptions and supported values.
### *Example* of using `match_phrase_prefix` in SQL and PPL queries:
@@ -456,7 +456,7 @@ The `MATCH_BOOL_PREFIX` function lets you specify the following options in any o
- `analyzer`
- `operator`
-Please, refer to `match_bool_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#match-boolean-prefix) for parameter description and supported values.
+Refer to the `match_bool_prefix` query [documentation]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index#match-boolean-prefix) for parameter descriptions and supported values.
### Example of using `match_bool_prefix` in SQL and PPL queries:
diff --git a/_search-plugins/sql/sql/basic.md b/_search-plugins/sql/sql/basic.md
index 7b2ffefba0..6a9711759f 100644
--- a/_search-plugins/sql/sql/basic.md
+++ b/_search-plugins/sql/sql/basic.md
@@ -192,7 +192,7 @@ Specify a condition to filter the results.
`<=` | Less than or equal to.
`IN` | Specify multiple `OR` operators.
`BETWEEN` | Similar to a range query. For more information about range queries, see [Range query]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/term#range).
-`LIKE` | Use for full text search. For more information about full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).
+`LIKE` | Use for full-text search. For more information about full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/index).
`IS NULL` | Check if the field value is `NULL`.
`IS NOT NULL` | Check if the field value is `NOT NULL`.