Support using secondary indices with write-committed transactions #13180

Closed

ltamasi (Contributor) commented Dec 3, 2024

Summary:
The patch adds initial support for secondary indices using write-committed transactions. Currently, only the `PutEntity` API is supported; other APIs like `Put` and `Delete` will be added separately. Applications can set up secondary indices using the new configuration option `TransactionDBOptions::secondary_indices`. When secondary indices are enabled, calling `PutEntity` via an (explicit or implicit) transaction performs the following steps:

  1. It retrieves the current value (if any) of the primary key using `GetEntityForUpdate`.
  2. If there is an existing primary key-value, it removes any existing secondary index entries using `SingleDelete`. (Note: as a later optimization, we can avoid removing and recreating secondary index entries when neither the secondary key nor the value changes during an update.)
  3. It invokes `UpdatePrimaryColumnValue` for all applicable `SecondaryIndex` objects, that is, those for which the primary column family matches the column family from the `PutEntity` call and for which the primary column appears in the new wide-column structure.
  4. It writes the new primary key-value. Note that the values of the indexing columns might have been changed in step 3 above.
  5. It builds the secondary key-value for each applicable secondary index using `GetSecondaryKeyPrefix` and `GetSecondaryValue`, and writes it to the appropriate secondary column family.
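
The five steps above can be sketched with a toy in-memory model. This is not the RocksDB implementation: `ToyDb`, `ToyIndex`, `PutEntityWithIndex`, and the `value|primary_key` secondary key layout are hypothetical names and assumptions chosen for illustration, and two `std::map`s stand in for the primary and secondary column families. The sketch handles a single index and ignores locking and error handling.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-ins only; none of these types are RocksDB classes.
using Entity = std::map<std::string, std::string>;  // column name -> value

struct ToyIndex {
  std::string primary_column;  // the column this index is built on

  // Loosely mirrors the role of UpdatePrimaryColumnValue: the index may
  // rewrite the indexed value. Lower-casing ASCII is purely illustrative.
  void UpdatePrimaryColumnValue(std::string& value) const {
    for (char& c : value) {
      if (c >= 'A' && c <= 'Z') c = static_cast<char>(c - 'A' + 'a');
    }
  }

  // Assumed layout: secondary key = column value + '|' + primary key.
  std::string SecondaryKey(const std::string& column_value,
                           const std::string& primary_key) const {
    return column_value + '|' + primary_key;
  }
};

struct ToyDb {
  std::map<std::string, Entity> primary;              // primary column family
  std::map<std::string, std::string> secondary;       // secondary column family
};

void PutEntityWithIndex(ToyDb& db, const ToyIndex& index,
                        const std::string& key, Entity columns) {
  assert(!index.primary_column.empty());

  // Step 1: look up the current primary value, if any.
  auto existing = db.primary.find(key);

  // Step 2: remove the secondary entry derived from the old value.
  if (existing != db.primary.end()) {
    auto old_col = existing->second.find(index.primary_column);
    if (old_col != existing->second.end()) {
      db.secondary.erase(index.SecondaryKey(old_col->second, key));
    }
  }

  // Step 3: the index applies only if its column is in the new entity;
  // if so, let it adjust the value that is about to be written.
  auto new_col = columns.find(index.primary_column);
  const bool applicable = new_col != columns.end();
  if (applicable) index.UpdatePrimaryColumnValue(new_col->second);

  // Step 4: write the (possibly adjusted) primary key-value.
  db.primary[key] = columns;

  // Step 5: write the new secondary entry (empty value in this toy).
  if (applicable) {
    db.secondary[index.SecondaryKey(new_col->second, key)] = "";
  }
}
```

Updating an existing key with a different indexed value leaves exactly one secondary entry behind, because step 2 removes the stale one before step 5 writes the new one.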

All the above operations are performed as part of the same transaction. The logic uses `SavePoint`s to roll back any earlier operations related to a primary key if a subsequent step fails.
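
The rollback behavior described above can be illustrated with a minimal sketch. `ToyTxn` is a hypothetical stand-in, not the RocksDB `Transaction` class; it only borrows the save point method names, and `ApplyAllOrNothing` with its `fail_at` parameter is an invented helper for injecting a failure at a given step.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy transaction: a list of pending writes plus a stack of save points.
class ToyTxn {
 public:
  void Write(const std::string& op) { ops_.push_back(op); }

  // Remember how many writes were pending at this point.
  void SetSavePoint() { save_points_.push_back(ops_.size()); }

  // Discard every write made since the most recent save point.
  void RollbackToSavePoint() {
    assert(!save_points_.empty());
    ops_.resize(save_points_.back());
    save_points_.pop_back();
  }

  // Forget the most recent save point but keep the writes.
  void PopSavePoint() {
    assert(!save_points_.empty());
    save_points_.pop_back();
  }

  const std::vector<std::string>& ops() const { return ops_; }

 private:
  std::vector<std::string> ops_;
  std::vector<std::size_t> save_points_;
};

// Applies all steps for one primary key, or none of them. A failure is
// simulated at index `fail_at`; pass steps.size() for no failure.
bool ApplyAllOrNothing(ToyTxn& txn, const std::vector<std::string>& steps,
                       std::size_t fail_at) {
  txn.SetSavePoint();
  for (std::size_t i = 0; i < steps.size(); ++i) {
    if (i == fail_at) {
      txn.RollbackToSavePoint();  // undo the steps already applied
      return false;
    }
    txn.Write(steps[i]);
  }
  txn.PopSavePoint();  // success: keep the writes, drop the marker
  return true;
}
```

Writes made before the save point survive a rollback, so a failed index update for one primary key does not disturb earlier work in the same transaction.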

Implementation-wise, the code uses a mixin template `SecondaryIndexMixin` that can inherit from any kind of transaction and use the write APIs and concurrency control mechanisms of the base class to implement the index maintenance logic. The mixin will enable us to later extend secondary indices to optimistic or write-prepared/write-unprepared pessimistic transactions as well.
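
As a rough illustration of the mixin pattern described above (a class template that derives from its own template parameter), consider the following sketch. `ToyBaseTxn`, `ToyIndexMixin`, and the `idx/value/key` entry format are hypothetical; the real `SecondaryIndexMixin` is considerably more involved.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base "transaction" that simply records its writes.
class ToyBaseTxn {
 public:
  virtual ~ToyBaseTxn() = default;

  virtual void Put(const std::string& key, const std::string& value) {
    writes_.emplace_back(key, value);
  }

  const std::vector<std::pair<std::string, std::string>>& writes() const {
    return writes_;
  }

 private:
  std::vector<std::pair<std::string, std::string>> writes_;
};

// The mixin derives from whatever transaction type it is instantiated with
// and layers index maintenance on top of that type's own write path.
template <typename Txn>
class ToyIndexMixin : public Txn {
 public:
  using Txn::Txn;  // reuse the base class constructors

  void Put(const std::string& key, const std::string& value) override {
    assert(!key.empty());
    Txn::Put(key, value);                      // primary write via the base
    Txn::Put("idx/" + value + "/" + key, "");  // index entry, same transaction
  }
};
```

Because the index logic only calls `Txn::Put`, instantiating the template with a different base type would reuse that type's write path unchanged, which is the property the paragraph above relies on for future transaction kinds.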

Differential Revision: D66672931

facebook-github-bot (Contributor) commented

This pull request was exported from Phabricator. Differential Revision: D66672931

ltamasi added a commit to ltamasi/rocksdb that referenced this pull request Dec 3, 2024
…cebook#13180)

ltamasi added a commit to ltamasi/rocksdb that referenced this pull request Dec 4, 2024
…cebook#13180)

ltamasi added a commit to ltamasi/rocksdb that referenced this pull request Dec 4, 2024
…cebook#13180)

ltamasi (Contributor, Author) commented Dec 5, 2024

Thanks so much for the detailed review, @jowlyzhang!

jowlyzhang (Contributor) left a comment

LGTM!

facebook-github-bot (Contributor) commented

This pull request has been merged in 1f96e65.

pdillinger (Contributor) commented

Is there a plan for the requested optimization of WAL size (only include in WAL writes that aren't implied by others)? @ltamasi

ltamasi (Contributor, Author) commented Dec 6, 2024

> Is there a plan for the requested optimization of WAL size (only include in WAL writes that aren't implied by others)?

I wasn't aware of this request but it sounds like a Good Idea (TM)

ltamasi added a commit to ltamasi/rocksdb that referenced this pull request Jan 10, 2025
Summary: The patch adds support for `Put` / `PutUntracked` to the secondary indexing logic. Similarly to `PutEntity` (see facebook#13180), calling these APIs automatically adds or removes secondary index entries as needed in an atomic and transparent fashion.

Differential Revision: D68035089
facebook-github-bot pushed a commit that referenced this pull request Jan 14, 2025
Summary:
Pull Request resolved: #13289

The patch adds support for `Put` / `PutUntracked` to the secondary indexing logic. Similarly to `PutEntity` (see #13180), calling these APIs automatically adds or removes secondary index entries as needed in an atomic and transparent fashion.

Reviewed By: jaykorean

Differential Revision: D68035089

fbshipit-source-id: db37bce62151ae1909b46b1020592c8348156653