[BUG] PPL query with head and sort can not properly rewrite as DSL. #494

Open
Tracked by #1872
penghuo opened this issue Mar 14, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@penghuo
Collaborator

penghuo commented Mar 14, 2022

Describe the bug
A PPL query that combines head and sort is not rewritten to DSL properly.

To Reproduce

POST /_plugins/_ppl/_explain
{
  "query": "source=test_0002 | head 1000 | sort - abletter "
}

{
  "root": {
    "name": "ProjectOperator",
    "description": {
      "fields": "[abletter, 11number]"
    },
    "children": [
      {
        "name": "OpenSearchIndexScan",
        "description": {
          "request": """OpenSearchQueryRequest(indexName=test_0002, sourceBuilder={"from":0,"size":200,"timeout":"1m","_source":{"includes":["abletter","11number"],"excludes":[]},"sort":[{"abletter":{"order":"desc","missing":"_last"}}]}, searchDone=false)"""
        },
        "children": []
      }
    ]
  }
}

Expected behavior
The size field in the DSL request should be 1000 instead of 200.
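
One way to check this from an explain response is to pull the sourceBuilder JSON out of the request string and read its size. A minimal sketch in Python, assuming the request string always has the shape shown in the output above; the helper name `pushed_down_size` is hypothetical and not part of the plugin:

```python
import json
import re


def pushed_down_size(request):
    """Return the "size" value from the sourceBuilder JSON inside an explain request string."""
    # Assumed shape: OpenSearchQueryRequest(indexName=..., sourceBuilder={...}, searchDone=...)
    match = re.search(r"sourceBuilder=(\{.*\}), searchDone=", request)
    if match is None:
        raise ValueError("no sourceBuilder found in request string")
    return json.loads(match.group(1))["size"]


# Request string copied from the explain output in this report.
request = (
    'OpenSearchQueryRequest(indexName=test_0002, sourceBuilder={"from":0,"size":200,'
    '"timeout":"1m","_source":{"includes":["abletter","11number"],"excludes":[]},'
    '"sort":[{"abletter":{"order":"desc","missing":"_last"}}]}, searchDone=false)'
)
print(pushed_down_size(request))  # 200 for this report; the expected value is 1000
```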


penghuo added the bug, Beta, and untriaged labels, then removed untriaged, on Mar 14, 2022
@ylwu-amzn
Contributor

ylwu-amzn commented Mar 14, 2022

Putting head after sort works.

POST _plugins/_ppl/_explain
{
  "query": "source=fourclass_data | sort - anomaly_type | head 10000"
}

{
  "root": {
    "name": "ProjectOperator",
    "description": {
      "fields": "[anomaly_type, A, B]"
    },
    "children": [
      {
        "name": "OpenSearchIndexScan",
        "description": {
          "request": """OpenSearchQueryRequest(indexName=fourclass_data, sourceBuilder={"from":0,"size":10000,"timeout":"1m","_source":{"includes":["A","B","anomaly_type"],"excludes":[]},"sort":[{"anomaly_type":{"order":"desc","missing":"_last"}}]}, searchDone=false)"""
        },
        "children": []
      }
    ]
  }
}

It would be good if you could also support head after the fields command.

POST _plugins/_ppl/_explain
{  
  "query": "source=nyc_taxi | fields value, timestamp | head 1000  "
}

{
  "root": {
    "name": "ProjectOperator",
    "description": {
      "fields": "[value, timestamp]"
    },
    "children": [
      {
        "name": "LimitOperator",
        "description": {
          "limit": 1000,
          "offset": 0
        },
        "children": [
          {
            "name": "ProjectOperator",
            "description": {
              "fields": "[value, timestamp]"
            },
            "children": [
              {
                "name": "OpenSearchIndexScan",
                "description": {
                  "request": """OpenSearchQueryRequest(indexName=nyc_taxi, sourceBuilder={"from":0,"size":200,"timeout":"1m","_source":{"includes":["value","timestamp"],"excludes":[]}}, searchDone=false)"""
                },
                "children": []
              }
            ]
          }
        ]
      }
    ]
  }
}
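
Judging by the explain outputs in this thread, an operator that still appears in the plan above OpenSearchIndexScan has not been folded into the DSL request. A small sketch in Python that walks an explain response and lists such operators; the function name `operator_names` is hypothetical, and the plan below is the one above with the description fields omitted for brevity:

```python
def operator_names(node):
    """Return the operator names of an explain plan in pre-order (root first)."""
    names = [node["name"]]
    for child in node.get("children", []):
        names.extend(operator_names(child))
    return names


# Plan structure from the "fields ... | head 1000" explain output above.
plan = {
    "root": {
        "name": "ProjectOperator",
        "children": [
            {
                "name": "LimitOperator",
                "children": [
                    {
                        "name": "ProjectOperator",
                        "children": [
                            {"name": "OpenSearchIndexScan", "children": []}
                        ],
                    }
                ],
            }
        ],
    }
}

names = operator_names(plan["root"])
# Operators of interest in this issue that were left outside the index scan.
not_pushed_down = [n for n in names if n in ("LimitOperator", "SortOperator")]
print(names)
print(not_pushed_down)
```

For the working "sort ... | head 10000" query, the same walk would return only ProjectOperator and OpenSearchIndexScan, so the second list would be empty.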

@Yury-Fridlyand
Collaborator

I can't reproduce it after swapping the head and sort commands in the query:

source=online | sort - all_client | head 1000 | fields all_client 

But if I move fields ahead, everything gets messed up:

source=online | fields all_client | sort - all_client | head 1000
{
    "root": {
        "name": "ProjectOperator",
        "description": {
            "fields": "[all_client]"
        },
        "children": [
            {
                "name": "LimitOperator",
                "description": {
                    "limit": 1000,
                    "offset": 0
                },
                "children": [
                    {
                        "name": "SortOperator",
                        "description": {
                            "sortList": {
                                "all_client": {
                                    "sortOrder": "DESC",
                                    "nullOrder": "NULL_LAST"
                                }
                            }
                        },
                        "children": [
                            {
                                "name": "ProjectOperator",
                                "description": {
                                    "fields": "[all_client]"
                                },
                                "children": [
                                    {
                                        "name": "OpenSearchIndexScan",
                                        "description": {
                                            "request": "OpenSearchQueryRequest(indexName=online, sourceBuilder={\"from\":0,\"size\":200,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"all_client\"],\"excludes\":[]}}, searchDone=false)"
                                        },
                                        "children": []
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
}

I think the optimizer rework announced in #1752 should fix this as well.
