Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a more extensive query warnings generator #1140

Open
knwg145 opened this issue Jul 17, 2019 · 2 comments
Open

Create a more extensive query warnings generator #1140

knwg145 opened this issue Jul 17, 2019 · 2 comments

Comments

@knwg145
Copy link

knwg145 commented Jul 17, 2019

We are currently trying to create a more extensive warning generator to warn our Presto users and give them feedback.

I have separated warnings generation into 3 possible categories:



  • Warnings generated in the connectors

  • Warnings about deprecated tables, deprecated views, deprecated udfs, deprecated session properties etc.
  • Warnings about query execution statistics approaching resource limits

Warnings for deprecated tables, deprecated views, etc or other problems related to them can be very useful to let users know about ongoing issues in production. In order to enable such warnings in real-time, we are working on injecting and refreshing configurations for such warnings through an external database (rather than storing them in a variable or a file, which requires deployment). In this case, we have an external MySQL database that stores such information and warnings will be generated dynamically based on the values in MySQL.

For warnings related to query execution statistics, it is helpful for us to alert users when their queries’ resource usage is above a certain threshold. In this case, we are generating warnings if the memory/cpu usage of a query is above a certain percentage of the maximum amount possible. This is useful for cases like scheduled queries operating on an ever increasing dataset. These warnings will allow the user to know if there might be problems with a query in the future and to tune it accordingly. Another case where these query statistics warnings are useful is for incremental query building. Users like to add on top of existing queries and generating warnings will warn them to be careful about building more on top of the existing query. In addition, this can encourage users to try and modify their queries to be more efficient (one case that comes to mind is using too many CTEs).



For the hive connector, we have a couple of warnings that we want to generate. These are:

  • Generating warnings for tables not using ORC
  • Generating warnings if too many partitions for a table is used
  • Generating warnings for unsupported column types

We want to generate these warnings in the hive connector to keep it modular and extendable.

For the connector warnings, I was thinking about having a WarningCollector for each connector within the WarningCollector. Maybe some kind of mapping of



CatalogName : ConnectorWarningCollector



This connector specific warning collector can be passed in through the function signature whenever there is an engine to connector interaction. Warnings generated in a connector can be added into this specific warning collector and the information will be available to the presto engine. This will allow the presto engine to fetch warnings already generated from any connectors at any time.



There is a similar effort going on here: #1006

@knwg145 knwg145 changed the title Create a robust query warnings generator Create a more extensive query warnings generator Jul 17, 2019
@oneonestar
Copy link
Member

Thanks for the thought. Since many people are interested in enhancing the warning system, we could turn this into a roadmap item. Let's try to work out what the warning system should look like:

Enhanced Warning System

Objective

  • Detect bad queries
  • Collect and classify warnings into categories / levels for better management
  • React to warnings by take proper action(s)

Bad queries (ie. using deprecated UDFs or consuming too much resources) should generate warning(s) to notify system admins or users to take further actions. Depends on the importance of the warning, it should be classify into different categories and levels. System should react to the warnings by taking proper action, such as kill the query, notify external systems, etc.

How warnings are being generated

From Presto Engine:

  • Planning: Before / During / After Optimization
  • During execution
  • Query completed

From connectors:

  • During planning
  • During execution

During query optimization, some information may be added / lost due to the transformations. Runtime statistics could trigger more warnings. For example, the splits information are loaded async, the total amount of data to be scanned are unknown during the planning time. After query completed, the system will have all execution statistics which could trigger even more warnings.

Although the best way is to prevent the harmful queries being executed at all, it is not possible to detect all possible issues during planning time. Detecting issues in runtime and after completion can give us a more complete picture.

What kind of warnings can be generated

  • Performance warning
    • Scanning too much partitions / data
    • MySQL JDBC connector should be used to load too much data
    • Using text format instead of ORC format
  • Deprecated warning
    • Warnings about deprecated tables, deprecated views, deprecated UDFs, deprecated session properties etc
  • Compatibility warning
    • Incompatible type casing
    • Unsupported column types
  • (Others)

IMO, warnings generated by connectors should be classified by their nature. Connector information should be provided as additional information instead of acting as a category.

Configuration management

What to config

  • Enable / Disable of each warning rules
  • Parameters for each warning rules (warning threshold...)
  • What actions to take to react to warnings

How to config

  • File based
  • From External systems such as MySQL

I'm still thinking what should be the best way to config the warning system. We need to figure out how to preserve the modularity of the connectors and provide a common configuration system to pass the config to the sub-warningCollector in each connector.

How to response to warnings

  • Add a QueryWarningEvent to event-listener interface
  • Kill the query
  • Change the query to different resource group
  • Show it in CLI / WebUI (Done)
  • Show in QueryCompletedEvent (Done)

@raghavsethi
Copy link
Member

I'd spent some time thinking about this when we originally designed warnings and diverged from this in a few ways:

  • The different classes of warnings would be: Performance, Correctness, and Deprecation. I would caution against compatibility as a type, since that should fall into one of the the three I outlined. I do want to be clear that Presto will not under any circumstances produce incorrect results, the correctness category should warn users about semantic errors in their queries that may produce unexpected results.
  • I don't think we need to think about dynamic configuration just yet, given the complexity of doing it right, and my hunch is that existing connector.properties file might be the right place to configure the first set of warnings.
  • For connector-produced warnings, my guess is that the engine could simply report them all and does not need to care about thresholds etc. The thresholds could be read and enforced by the connector if required.
  • Responding to warnings is a bit of a minefield. I would recommend delegating that to the client to respond, with the exception of failing the query for certain warnings or warning classes. I can see why query warning events would be useful, but spammy warnings could take down the external systems, we should think about what the dedup/noise reduction strategy is there.
  • Instead of warning rules, we could enable and disable reporting of warning codes or warning types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants