Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LIMIT is required on a DELETE ... ORDER BY #4581

Merged
merged 1 commit into from
Apr 15, 2019

Conversation

rmloveland
Copy link
Contributor

Fixes #4378.

Summary of changes:

  • Add new section 'Sorting the output of deletes' to both the DELETE
    and 'Ordering Query Results' pages.

  • Add example section 'Sort an return deleted rows' to the DELETE page

  • Update all examples on DELETE page to use new SQL CLI output table
    formatting

  • Add SQL code needed for DELETE examples in a comment at the bottom
    of the page

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rmloveland rmloveland requested a review from andy-kimball March 27, 2019 20:04
@rmloveland
Copy link
Contributor Author

Direct link for easier reference:

Copy link

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @rmloveland)


v19.1/delete.md, line 221 at r1 (raw file):

{% include copy-clipboard.html %}
~~~ sql
> DELETE FROM account_details ORDER BY account_id LIMIT (SELECT COUNT(*) FROM account_details) RETURNING *;

We shouldn't recommend this pattern, because it would be slow and would trigger transaction restarts in racy cases where multiple transactions were trying to delete rows. The purpose of using ORDER BY / LIMIT with DELETE is not to order output, but to delete up to N rows according to some order.

For example, say I have a table where I delete the oldest data data older than 3 months. I might use a LIMIT to throttle my deletes, so that I delete at most 100 rows per transaction. And I'd use ORDER BY to ensure I delete the oldest rows first:

DELETE FROM account_details WHERE date < now() - INTERVAL '3 months' ORDER BY date

This statement does not have any output. But even if we were to output rows using RETURNING, we still would make no guarantee about their order! The point of ORDER BY in this statement is not to order output rows, but to order input rows (i.e. to order input rows so that we can figure out which rows need to be deleted). If you want to order output, you need to use the other pattern you show.

@rmloveland rmloveland force-pushed the 20190327-delete-order-by-limit branch from 41686a0 to 61e680f Compare April 8, 2019 18:49
Copy link
Contributor Author

@rmloveland rmloveland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball)


v19.1/delete.md, line 221 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

We shouldn't recommend this pattern, because it would be slow and would trigger transaction restarts in racy cases where multiple transactions were trying to delete rows. The purpose of using ORDER BY / LIMIT with DELETE is not to order output, but to delete up to N rows according to some order.

For example, say I have a table where I delete the oldest data data older than 3 months. I might use a LIMIT to throttle my deletes, so that I delete at most 100 rows per transaction. And I'd use ORDER BY to ensure I delete the oldest rows first:

DELETE FROM account_details WHERE date < now() - INTERVAL '3 months' ORDER BY date

This statement does not have any output. But even if we were to output rows using RETURNING, we still would make no guarantee about their order! The point of ORDER BY in this statement is not to order output rows, but to order input rows (i.e. to order input rows so that we can figure out which rows need to be deleted). If you want to order output, you need to use the other pattern you show.

Thanks for explaining this distinction between sorting the input vs. output, Andy.

I've removed this second pattern from this part of the docs altogether (leaving the first).

Instead, I've updated the DELETE page's sort_clause parameter with the following information from the PR release note:

"ORDER BY clause can no longer be used with a DELETE statement when there is no LIMIT clause present"

Hopefully that covers what I understand to be the 2 informational requirements from the release note:

  • show how to order delete output
  • note that you need a LIMIT if you DELETE ... ORDER BY

PTAL.

Copy link

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@rmloveland rmloveland requested a review from lnhsingh April 9, 2019 19:30
@rmloveland
Copy link
Contributor Author

@lhirata ping

@lnhsingh
Copy link
Contributor

lnhsingh commented Apr 11, 2019 via email

@rmloveland rmloveland requested review from jseldess and removed request for lnhsingh April 11, 2019 19:23
Copy link
Contributor

@jseldess jseldess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: with a few nits.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jseldess and @rmloveland)


_includes/v19.1/misc/sorting-delete-output.md, line 5 at r2 (raw file):

{% include copy-clipboard.html %}
~~~ sql
SELECT ... FROM [DELETE ...] ORDER BY ...

nit: Add > prompt


v19.1/delete.md, line 37 at r2 (raw file):

 `AS table_alias_name` | An alias for the table name. When an alias is provided, it completely hides the actual table name.
`WHERE a_expr`| `a_expr` must be an expression that returns Boolean values using columns (e.g., `<column> = <value>`). Delete rows that return `TRUE`.<br><br/>__Without a `WHERE` clause in your statement, `DELETE` removes all rows from the table.__
 `sort_clause` | An `ORDER BY` clause. <span class="version-tag">New in v19.1</span>: The `ORDER BY` clause can no longer be used with a `DELETE` statement when there is no `LIMIT` clause present.

nit: I'd prefer <br><br> before the callout so it's on a new line.


v19.1/delete.md, line 196 at r2 (raw file):

#### Sort and return deleted rows

To sort and return deleted rows, we recommend using a statement like the following:

nit: I'd remove we recommend and just go with use a statement like the following:


v19.1/delete.md, line 229 at r2 (raw file):

[truncate]: truncate.html

<!--

Is this left for internal reference?

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Fixes #4378.

Summary of changes:

- Update `DELETE`'s `sort_clause` parameter to note that as of 19.1, a
  `LIMIT` is required on a `DELETE ... ORDER BY`

- Add new section 'Sorting the output of deletes' to both the `DELETE`
  and 'Ordering Query Results' pages.

- Add example section 'Sort an return deleted rows' to the `DELETE` page

- Update all examples on `DELETE` page to use new SQL CLI output table
  formatting

- Add SQL code needed for `DELETE` examples in a comment at the bottom
  of the page
@rmloveland rmloveland force-pushed the 20190327-delete-order-by-limit branch from 61e680f to a5232d3 Compare April 15, 2019 15:01
Copy link
Contributor Author

@rmloveland rmloveland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, Jesse!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jseldess)


_includes/v19.1/misc/sorting-delete-output.md, line 5 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

nit: Add > prompt

Fixed.


v19.1/delete.md, line 37 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

nit: I'd prefer <br><br> before the callout so it's on a new line.

Fixed.


v19.1/delete.md, line 196 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

nit: I'd remove we recommend and just go with use a statement like the following:

Fixed.


v19.1/delete.md, line 229 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Is this left for internal reference?

Yes. I put it there because when I made some updates to this page a little while back, I had to recreate the table and data to match the examples that were already here before I could add additional examples while making sure that the outputs were correct given the data manipulation that happened in previous examples.

Bit hacky but it could save some time (for example, it saved me time on this update).

Copy link
Contributor

@jseldess jseldess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rmloveland)


v19.1/delete.md, line 229 at r2 (raw file):

Previously, rmloveland (Rich Loveland) wrote…

Yes. I put it there because when I made some updates to this page a little while back, I had to recreate the table and data to match the examples that were already here before I could add additional examples while making sure that the outputs were correct given the data manipulation that happened in previous examples.

Bit hacky but it could save some time (for example, it saved me time on this update).

Make sense. Thanks.

@rmloveland rmloveland merged commit 0d37ad3 into master Apr 15, 2019
@rmloveland rmloveland deleted the 20190327-delete-order-by-limit branch April 15, 2019 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: Prune Delete operator input columns
5 participants