[Refactor] Manage Spark job parameter for different query #2192

Closed
dai-chen opened this issue Oct 3, 2023 · 1 comment
Labels
maintenance Improves code quality, but not the product v2.11.0 Issues targeting release v2.11.0

Comments

dai-chen commented Oct 3, 2023

Is your feature request related to a problem?

Different query types require different Spark job submission parameters.

What solution would you like?

  1. CREATE statement:
    a. Set execution-timeout-minutes to 0 so that EMR-S does not time out the streaming job running in the background
    b. Disable DRA (Spark dynamic resource allocation), because it only works well for long-running batch jobs
    c. [TBD] Add a Flint config that specifies which environment variables to populate into Flint metadata, such as EMR-S job info
  2. SELECT statement

Also consider whether there is a flexible way to manage these parameters in case they change in the future.
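
One possible shape for this, as a rough sketch only: keep the per-statement parameters in a single lookup table so that future changes touch one place. Everything named here (`JobParams`, `PARAMS_BY_STATEMENT`, `params_for`) is hypothetical and not taken from the codebase. The only values drawn from this issue are the timeout of 0 and disabling dynamic allocation for CREATE; `spark.dynamicAllocation.enabled` is the standard Spark property for the latter. SELECT parameters are not specified above, so the sketch leaves them at defaults.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class JobParams:
    """Hypothetical container for per-query job submission settings."""
    # None means "leave the service default in place".
    execution_timeout_minutes: Optional[int] = None
    spark_conf: Dict[str, str] = field(default_factory=dict)


def _create_params() -> JobParams:
    # CREATE starts a streaming job in the background: no execution
    # timeout, and dynamic resource allocation disabled.
    return JobParams(
        execution_timeout_minutes=0,
        spark_conf={"spark.dynamicAllocation.enabled": "false"},
    )


def _select_params() -> JobParams:
    # Assumption: the issue lists no SELECT-specific settings yet,
    # so defaults are returned unchanged.
    return JobParams()


# Single place to register or change parameters per statement type.
PARAMS_BY_STATEMENT: Dict[str, Callable[[], JobParams]] = {
    "CREATE": _create_params,
    "SELECT": _select_params,
}


def params_for(query: str) -> JobParams:
    """Pick job parameters from the first keyword of the query."""
    tokens = query.split()
    keyword = tokens[0].upper() if tokens else ""
    # Unknown or empty statements fall back to plain defaults.
    factory = PARAMS_BY_STATEMENT.get(keyword, JobParams)
    return factory()
```

Matching on the first keyword is deliberately naive; a real implementation would more likely dispatch on the parsed statement type.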

dai-chen added the enhancement, untriaged, and maintenance labels and removed the enhancement and untriaged labels on Oct 3, 2023
penghuo added the v2.11.0 label on Oct 3, 2023