[GOBBLIN-1773] Fix bugs in quota manager #3636
…crease quota twice for run-immediately flow
Codecov Report
@@             Coverage Diff              @@
##             master    #3636      +/-   ##
============================================
+ Coverage     40.16%   46.59%    +6.43%
- Complexity     3544    10673     +7129
============================================
  Files           791     2133     +1342
  Lines         33285    83568    +50283
  Branches       3699     9292     +5593
============================================
+ Hits          13368    38938    +25570
- Misses        18601    41067    +22466
- Partials       1316     3563     +2247
// This block should be reachable only for the first execution for the adhoc flows (flows that either do not have a schedule or have runImmediately=true.
if (!this.warmStandbyEnabled && !jobConfig.containsKey(ConfigurationKeys.JOB_SCHEDULE_KEY)) {
// This block should be reachable only for the execution for the adhoc flows
// For flow that has scheduler but run-immediately set to be true, we won't check teh quota as we will use a different execution id later |
Typo: "teh" should be "the".
@@ -330,8 +330,9 @@ public AddSpecResponse onAddSpec(Spec addedSpec) {
// Check quota limits against run immediately flows or adhoc flows before saving the schedule |
Let's update this comment to say adhoc flows only.
@@ -330,8 +330,9 @@ public AddSpecResponse onAddSpec(Spec addedSpec) {
// Check quota limits against run immediately flows or adhoc flows before saving the schedule
// In warm standby mode, this quota check will happen on restli API layer when we accept the flow |
AFAIK, FlowConfigV2ResourceLocalHandler and FlowCatalog do not do a quota check when adding the spec. Where do we do the check in the API layer?
I see, I did not notice the difference in warm standby mode.
@@ -67,7 +68,8 @@ public MysqlDagActionStore(Config config) throws IOException {
this.tableName = ConfigUtils.getString(config, ConfigurationKeys.STATE_STORE_DB_TABLE_KEY,
ConfigurationKeys.DEFAULT_STATE_STORE_DB_TABLE);
this.dataSource = MysqlStateStore.newDataSource(config); |
If we deem this function unsafe to use because it does not read from the shared threadpool, would it be appropriate to make MysqlStateStore.newDataSource package-private?
@@ -247,6 +253,13 @@ public void orchestrate(Spec spec) throws Exception {
+ "concurrent executions are disabled for this flow.", flowGroup, flowName);
conditionallyUpdateFlowGaugeSpecState(spec, CompiledState.SKIPPED);
Instrumented.markMeter(this.skippedFlowsMeter);
if (!((FlowSpec)spec).getConfigAsProperties().containsKey(ConfigurationKeys.JOB_SCHEDULE_KEY)) { |
This can come in a later PR, but we should have a static function in the FlowSpec class to check whether a FlowSpec is adhoc.
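A minimal sketch of what such a helper might look like, reusing the "no schedule key means adhoc" test from the diff above. The class name `AdhocFlowCheck`, the method name `isAdhoc`, and the literal key value `"job.schedule"` are assumptions for illustration; a real helper would live on FlowSpec and use ConfigurationKeys.JOB_SCHEDULE_KEY.

```java
import java.util.Properties;

final class AdhocFlowCheck {
  // Stand-in for ConfigurationKeys.JOB_SCHEDULE_KEY; the literal value is an assumption.
  static final String JOB_SCHEDULE_KEY = "job.schedule";

  private AdhocFlowCheck() {}

  /** Returns true when the flow config carries no schedule, i.e. the flow is treated as adhoc. */
  static boolean isAdhoc(Properties flowConfig) {
    return !flowConfig.containsKey(JOB_SCHEDULE_KEY);
  }

  public static void main(String[] args) {
    Properties adhoc = new Properties();
    Properties scheduled = new Properties();
    scheduled.setProperty(JOB_SCHEDULE_KEY, "0 0 * * * ?");

    System.out.println(isAdhoc(adhoc));     // prints true
    System.out.println(isAdhoc(scheduled)); // prints false
  }
}
```

Centralizing the check would let the scheduler and the orchestrator share one definition of "adhoc" instead of repeating the containsKey test.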
* upstream/master:
  [GOBBLIN-1774] Util for detecting non optional uniontypes Hive tables (apache#3632)
  [GOBBLIN-1773] Fix bugs in quota manager (apache#3636)
  [GOBBLIN-1782] Fix Merge State for Flow Pending Resume statuses (apache#3639)
  [GOBBLIN-1755] Support extended ACLs and sticky bit for file based distcp (apache#3616)
  [GOBBLIN-1780] Refactor/rename YarnServiceIT to YarnServiceTest (apache#3637)
  [GOBBLIN-1778] Add house keeping thread in DagManager to periodically sync in memory state with mysql table (apache#3635)
  Register gauge metrics for change monitors (apache#3634)
Dear Gobblin maintainers,
Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
JIRA
Description
We used to check the quota when accepting a flow, for both ad-hoc flows and run-immediately flows. However, only ad-hoc flows reuse that same execution id when they run. For a run-immediately flow, we check the quota once with a random execution id, then actually run the job with a different execution id and forget the first one. As a result, a run-immediately flow is counted against the quota twice, and one of the two counts is never released.
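The leak described above can be reproduced with a toy model. This is only a sketch: `QuotaLeakSketch`, its methods, and the per-user set of execution ids are hypothetical and are not the actual quota manager API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class QuotaLeakSketch {
  private final int limit;
  // Hypothetical bookkeeping: execution ids currently counted against each user.
  private final Map<String, Set<String>> running = new HashMap<>();

  QuotaLeakSketch(int limit) {
    this.limit = limit;
  }

  /** Reserves one slot for (user, executionId); a repeated id is not counted twice. */
  boolean checkQuota(String user, String executionId) {
    Set<String> ids = running.computeIfAbsent(user, u -> new HashSet<>());
    if (ids.contains(executionId)) {
      return true;
    }
    if (ids.size() >= limit) {
      return false;
    }
    ids.add(executionId);
    return true;
  }

  /** Releases the slot held by (user, executionId), if any. */
  void releaseQuota(String user, String executionId) {
    Set<String> ids = running.get(user);
    if (ids != null) {
      ids.remove(executionId);
    }
  }

  int used(String user) {
    Set<String> ids = running.get(user);
    return ids == null ? 0 : ids.size();
  }

  public static void main(String[] args) {
    QuotaLeakSketch quota = new QuotaLeakSketch(5);
    // Old behaviour for a run-immediately flow: one check at accept time with a
    // random id, a second check at run time with the real execution id.
    quota.checkQuota("alice", "random-id-at-accept");
    quota.checkQuota("alice", "real-id-at-run");
    // Only the real id is released when the run finishes.
    quota.releaseQuota("alice", "real-id-at-run");
    System.out.println(quota.used("alice")); // prints 1: the first slot is never released
  }
}
```

In this model, skipping the accept-time check for flows that will run under a different execution id leaves the count at zero once the run finishes.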
In this PR:
Tests
Unit tests
Commits