jobs, *: refactor planning vs exec APIs #37691
Comments
Part of the problem here is that the conn executor has special code to execute "async" schema changes before sending the reply, making them appear synchronous. This doesn't exist for jobs, so there is currently no way to a) commit a job record and then b) wait for its results -- the job can't run until its creation commits, but there is no way to block on its execution after that. Today, jobs commit, run, and block all during the statement that creates them, which is a gross misuse of the planner txn -- which won't commit until later -- to plan the job.
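To make the desired split concrete, here is a minimal Go sketch of the pattern described above: write the job record inside the user's txn, then block on execution only after that txn commits. `Registry`, `CreateJobRecord`, and `WaitForJob` are hypothetical stand-ins for illustration, not the real jobs package API.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Registry is a hypothetical stand-in for the jobs registry; it keeps a
// result channel per job so callers can block on execution that starts
// only after the creating txn commits.
type Registry struct {
	results map[int64]chan error
}

// CreateJobRecord stands in for (a): writing the job row using the caller's
// txn. The job cannot start until that txn commits.
func (r *Registry) CreateJobRecord(ctx context.Context, id int64) {
	r.results[id] = make(chan error, 1)
}

// WaitForJob stands in for (b): blocking until the committed job finishes,
// which is the piece the issue says jobs currently lack.
func (r *Registry) WaitForJob(ctx context.Context, id int64) error {
	select {
	case err := <-r.results[id]:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	ctx := context.Background()
	r := &Registry{results: map[int64]chan error{}}

	const jobID = 1
	r.CreateJobRecord(ctx, jobID) // (a) written inside the user txn
	// ... user txn commits here; only now may the job run ...
	go func() {
		time.Sleep(10 * time.Millisecond) // pretend the job executed
		r.results[jobID] <- nil
	}()
	if err := r.WaitForJob(ctx, jobID); err != nil { // (b) block before replying
		fmt.Println("job failed:", err)
		return
	}
	fmt.Println("job done; reply to client")
}
```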
It's useful to be able to create jobs in transactions. Obviously they would need to execute after the transaction commits. We add a new function ScheduleJob on the planner that allows that. The main use case is converting schema changes to jobs. We sweep under the rug the complicated semantics of handling any failures of these jobs; this will be addressed in follow-up PR(s). This PR handles the plumbing and this function is not for production use yet. Touches cockroachdb#37691, cockroachdb#42061. Release note: none.
It's useful to be able to create jobs in transactions. Obviously they would need to execute after the transaction commits. We add a new function QueueJob on the extendedEvalContext to do that. The main use case is converting schema changes to jobs. We sweep under the rug the complicated semantics of handling any failures of these jobs; this will be addressed in follow-up PR(s). This PR handles the plumbing and this function is not for production use yet. Touches cockroachdb#37691, cockroachdb#42061. Release note: none.
42689: jobs: schedule a job in a transaction to execute after commit r=spaskob a=spaskob

It's useful to be able to create jobs in transactions. Obviously they would need to execute after the transaction commits. We add a new function QueueJob on the planner that allows that. The main use case is converting schema changes to jobs. We sweep under the rug the complicated semantics of handling any failures of these jobs; this will be addressed in follow-up PR(s). This PR handles the plumbing and this function is not for production use yet. Touches #37691, #42061. Release note: none.

43136: storage: remove getQuorumIndex from computeTruncateDecision r=bdarnell a=irfansharif

Some lineage here: `getQuorumIndex` was initially `getBehindIndex`, and was changed to `getQuorumIndex` to incorporate pending snapshots (561398f). Previous to this, computeTruncateDecision started off using raft.Status.Commit. As of (b64bdc9), we refactored log truncation decision making and pulled the pending snapshot logic back out of `getQuorumIndex`. `getQuorumIndex` was then far too similar to raft.Status.Commit, but not quite. Given that we already use raft.Status.Progress in `computeTruncateDecision` to track lagging yet recently active followers, we can get rid of `getQuorumIndex` altogether to simplify things.

Co-authored-by: Spas Bojanov <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
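For illustration, here is a minimal sketch of what "queue during the txn, start after commit" could look like. `evalCtx`, `QueueJob`, and `onCommit` below are assumed, simplified stand-ins rather than the actual extendedEvalContext plumbing from the PR.

```go
package main

import (
	"context"
	"fmt"
)

// evalCtx is a hypothetical stand-in for extendedEvalContext: during the
// transaction it only collects job records; nothing runs yet.
type evalCtx struct {
	queued []string
}

// QueueJob mirrors the idea in the PR: remember the job so it can be started
// once the surrounding transaction commits (name and shape are assumptions).
func (e *evalCtx) QueueJob(desc string) {
	e.queued = append(e.queued, desc)
}

// onCommit is where the queued jobs would be handed off for execution.
func (e *evalCtx) onCommit(ctx context.Context) {
	for _, j := range e.queued {
		fmt.Println("starting job after commit:", j)
	}
}

func main() {
	e := &evalCtx{}
	// Inside the user transaction: plan the schema change and queue its job.
	e.QueueJob("ALTER TABLE t ADD COLUMN x INT")
	// ... transaction commits ...
	e.onCommit(context.Background()) // only now does the job start
}
```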
The hook plans generally get a limited version of the planner (planHookState), some parts of which use the planner's txn.
They use this planner as part of planning the job they ultimately start, e.g. resolving the table name or user privileges, etc., that go into determining if, how, and on what the job will run.
Unfortunately, the planner usually assumes that it is used for the entirety of a normal SQL statement, so its embedded txn is usually not committed until after the plan has run. This means that when the hook plans start and run a job based on calls to this planner, they are running a job -- with side effects -- based on reads made with an uncommitted txn.
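A rough sketch of that hazard: everything the hook learns during planning comes from the planner's still-uncommitted txn, yet the job it launches has external side effects. `PlanHookState` below is a hypothetical, heavily trimmed version of the real interface.

```go
package main

import (
	"context"
	"fmt"
)

// PlanHookState is a hypothetical, trimmed-down view of the planner that
// hooks see during planning; the real interface is much larger.
type PlanHookState interface {
	// ResolveTable reads catalog state via the planner's txn, so the result
	// is only as durable as that (still uncommitted) txn.
	ResolveTable(ctx context.Context, name string) (tableID int64, err error)
	// User backs privilege checks during planning.
	User() string
}

// planBackup illustrates the hazard: everything it learns comes from the
// planner txn, but the job it feeds has side effects that outlive that txn
// even if the txn later aborts.
func planBackup(ctx context.Context, p PlanHookState, table string) (int64, error) {
	id, err := p.ResolveTable(ctx, table) // read via uncommitted planner txn
	if err != nil {
		return 0, err
	}
	_ = p.User() // privilege check, also via the planner txn
	return id, nil // this feeds a job that today may run before the txn commits
}

type fakeState struct{}

func (fakeState) ResolveTable(ctx context.Context, name string) (int64, error) { return 42, nil }
func (fakeState) User() string                                                 { return "root" }

func main() {
	id, _ := planBackup(context.Background(), fakeState{}, "db.t")
	fmt.Println("planned a job against table", id)
}
```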
However, some jobs support being planned in a transaction and then run after it commits, e.g. using DETACHED. In these cases, it is vital to limit all their planning to the planner txn to avoid leaking side effects. Thus, all job-planning statements (BACKUP, IMPORT, RESTORE, etc.) should be audited to ensure they only use the plan hook (planner) txn.
Then, for those jobs which wish to execute during statement execution to return results (i.e. non-detached), they should a) assert that the planner txn is implicit and then b) commit it inline, a la 1PC INSERTs, before running the job.
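Something like the following (hypothetical names, not the actual executor code) is what "assert implicit, commit inline, then run" could look like:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// txn is a hypothetical stand-in for the planner's transaction.
type txn struct {
	implicit bool
}

func (t *txn) Commit(ctx context.Context) error { return nil }

// runJobSynchronously sketches the non-detached path proposed above:
// (a) refuse to run inside an explicit transaction, (b) commit the planner
// txn up front, a la a 1PC INSERT, and (c) only then execute the job and
// block so its results can be returned.
func runJobSynchronously(ctx context.Context, t *txn, run func(context.Context) error) error {
	if !t.implicit {
		return errors.New("job that returns results cannot run inside an explicit txn; use DETACHED")
	}
	if err := t.Commit(ctx); err != nil { // job record is now durable
		return err
	}
	return run(ctx)
}

func main() {
	err := runJobSynchronously(context.Background(), &txn{implicit: true},
		func(context.Context) error { fmt.Println("job ran"); return nil })
	fmt.Println("err:", err)
}
```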
Separately, job execution currently gets the same PlanHookState interface. However, since this closes over a planner, it also closes over the planner's transaction field, and because execution happens outside a statement, that field is nil. This has led to crashes when methods that reference that field are called, often indirectly via very non-obvious call trees.
Instead, job execution (i.e. Resume()) should be passed a different interface entirely from planning -- one that is not a planner at all and has no potentially-nil txn field.
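A sketch of that split: give Resume() its own execution-time interface that simply has no planner txn to dereference. The interface and type names below are assumptions for illustration, not the actual jobs package definitions.

```go
package main

import (
	"context"
	"fmt"
)

// JobExecContext is a hypothetical execution-time interface: unlike the
// planner-backed PlanHookState, it deliberately carries no transaction, so
// Resume cannot dereference a nil planner txn by accident.
type JobExecContext interface {
	User() string // execution identity, not txn-bound planner state
}

// Resumer mirrors the shape of job resumption: execution is handed a
// JobExecContext, never a planner.
type Resumer interface {
	Resume(ctx context.Context, execCtx JobExecContext) error
}

type backupResumer struct{}

func (backupResumer) Resume(ctx context.Context, execCtx JobExecContext) error {
	fmt.Println("resuming backup as", execCtx.User())
	return nil // a real resumer would do its work here, with no planner txn in reach
}

type execCtx struct{}

func (execCtx) User() string { return "node" }

func main() {
	var r Resumer = backupResumer{}
	_ = r.Resume(context.Background(), execCtx{})
}
```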