-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guarantee that bootstraps get applied if they receive the final bootstrap message #5331
Conversation
src/realm/sync/client.cpp
Outdated
VersionInfo version_info; | ||
ClientReplication& repl = access_realm(); // Throws | ||
if (m_wrapper.m_flx_pending_bootstrap_store) { | ||
while (m_wrapper.m_flx_pending_bootstrap_store->has_pending()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will we have the same problem if we don't get to apply all changeset batches here? I thought the idea was to apply all or nothing (i.e., concatenate all batches and have one single call to integrate_changesets)
59d7b4c
to
986e35d
Compare
a41ac48
to
765f8ac
Compare
@danieltabacaru and @ironage , do you think you'll have time to review this soon? |
// state. | ||
PendingBatch peek_pending(size_t limit_in_bytes); | ||
|
||
// Removes the first set of changesets from the current pending bootstrap batch. The transaction must be int the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int -> in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
// state. | ||
PendingBatch peek_pending(size_t limit_in_bytes); | ||
|
||
// Removes the first set of changesets from the current pending bootstrap batch. The transaction must be int the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int -> in
// writing state. | ||
void pop_front_pending(const TransactionRef& tr, size_t count); | ||
|
||
// Adds a set of changesets to the store and returns the local version produced by the transaction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does not return anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
// writing state. | ||
void pop_front_pending(const TransactionRef& tr, size_t count); | ||
|
||
// Adds a set of changesets to the store and returns the local version produced by the transaction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does not return anything
// If there are any incomplete bootstraps in the table, we need to clear them and then update our has_pending | ||
// flag accordingly. | ||
if (auto bootstrap_table = tr->get_table(m_table); !bootstrap_table->is_empty()) { | ||
m_has_pending = (bootstrap_table->is_empty() == false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't the check above conclude that bootstrap_table is not empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. also removed the comment above since that gets handled elsewhere now.
// If there are any incomplete bootstraps in the table, we need to clear them and then update our has_pending | ||
// flag accordingly. | ||
if (auto bootstrap_table = tr->get_table(m_table); !bootstrap_table->is_empty()) { | ||
m_has_pending = (bootstrap_table->is_empty() == false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't the check above conclude that bootstrap_table is not empty?
ret.changeset_data.push_back(util::AppendBuffer<char>()); | ||
auto& uncompressed_buffer = ret.changeset_data.back(); | ||
|
||
auto raw_changeset_data = cur_changeset.get<BinaryData>(m_changeset_data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you could name this compressed_changeset_data
ret.changeset_data.push_back(util::AppendBuffer<char>()); | ||
auto& uncompressed_buffer = ret.changeset_data.back(); | ||
|
||
auto raw_changeset_data = cur_changeset.get<BinaryData>(m_changeset_data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you could name this compressed_changeset_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bootstrap_obj.get<int64_t>(m_query_version), changeset_list.size()); | ||
} | ||
|
||
m_has_pending = (bootstrap_table->is_empty() == false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not m_has_pending = !bootstrap_table->is_empty()
?
bootstrap_obj.get<int64_t>(m_query_version), changeset_list.size()); | ||
} | ||
|
||
m_has_pending = (bootstrap_table->is_empty() == false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not m_has_pending = !bootstrap_table->is_empty()
?
src/realm/sync/client.hpp
Outdated
std::function<void()> on_download_message_received_hook; | ||
std::function<void(const sync::SyncProgress&, int64_t, sync::DownloadBatchState)> | ||
on_download_message_received_hook; | ||
/// Will be called after each download message is integrated. For use in testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is a bit misleading since we do not guarantee this is called for each download messages received (i.e., we "construct" our own download messages through peek_pending)
Edit: This actually seems to be called when a download message is added to the store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've copied the comment in src/realm/sync/config.hpp for this parameter here since I think that one is actually accurate.
src/realm/sync/client.hpp
Outdated
std::function<void()> on_download_message_received_hook; | ||
std::function<void(const sync::SyncProgress&, int64_t, sync::DownloadBatchState)> | ||
on_download_message_received_hook; | ||
/// Will be called after each download message is integrated. For use in testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is a bit misleading since we do not guarantee this is called for each download messages received (i.e., we "construct" our own download messages through peek_pending)
|
||
if (batch_state == DownloadBatchState::LastInBatch) { | ||
update_progress(progress); // Throws | ||
if (process_flx_bootstrap_message(progress, batch_state, query_version, received_changesets)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't be pass downloadable_bytes
and save it in the metadata table too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
downloadable_bytes
doesn't really have meaning in FLX sync and this should always be zero.
|
||
if (batch_state == DownloadBatchState::LastInBatch) { | ||
update_progress(progress); // Throws | ||
if (process_flx_bootstrap_message(progress, batch_state, query_version, received_changesets)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't be pass downloadable_bytes
and save it in the metadata table too?
src/realm/sync/config.hpp
Outdated
std::function<void(std::weak_ptr<SyncSession>)> on_download_message_received_hook; | ||
std::function<void(std::weak_ptr<SyncSession>, const sync::SyncProgress&, int64_t, sync::DownloadBatchState)> | ||
on_download_message_received_hook; | ||
// Will be called after each bootstrap message is added to the pending bootstrap store, but before_frozen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment is a bit more accurate, but I'm not sure what before_frozen
has to do with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my new auto-completer in neovim is auto-completing some nonsense.
src/realm/sync/config.hpp
Outdated
std::function<void(std::weak_ptr<SyncSession>)> on_download_message_received_hook; | ||
std::function<void(std::weak_ptr<SyncSession>, const sync::SyncProgress&, int64_t, sync::DownloadBatchState)> | ||
on_download_message_received_hook; | ||
// Will be called after each bootstrap message is added to the pending bootstrap store, but before_frozen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment is a bit more accurate, but I'm not sure what before_frozen
has to do with it.
bool process_flx_bootstrap_message(const SyncProgress& progress, DownloadBatchState batch_state, | ||
int64_t query_version, const ReceivedChangesets& received_changesets); | ||
|
||
// Processes any pending FLX bootstraps, if one excists. Otherwise this is a noop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
bool process_flx_bootstrap_message(const SyncProgress& progress, DownloadBatchState batch_state, | ||
int64_t query_version, const ReceivedChangesets& received_changesets); | ||
|
||
// Processes any pending FLX bootstraps, if one excists. Otherwise this is a noop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exists
@@ -34,6 +34,8 @@ using TransactionRef = std::shared_ptr<Transaction>; | |||
namespace realm::sync { | |||
namespace internal_schema_groups { | |||
constexpr static std::string_view c_flx_subscription_store("flx_subscription_store"); | |||
constexpr static std::string_view c_pending_bootstraps("PendingBootstraps"); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -34,6 +34,8 @@ using TransactionRef = std::shared_ptr<Transaction>; | |||
namespace realm::sync { | |||
namespace internal_schema_groups { | |||
constexpr static std::string_view c_flx_subscription_store("flx_subscription_store"); | |||
constexpr static std::string_view c_pending_bootstraps("PendingBootstraps"); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
@@ -19,6 +19,7 @@ | |||
#ifndef REALM_OS_SYNC_SESSION_HPP | |||
#define REALM_OS_SYNC_SESSION_HPP | |||
|
|||
#include "realm/sync/client_base.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. I think this got accidentally added by clangd
|
||
auto tr = m_db->start_write(); | ||
auto bootstrap_table = tr->get_table(m_table); | ||
auto incomplete_bootstraps = Query(bootstrap_table).less(m_query_version, query_version).find_all(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one bootstrap can be active at a time right? Can we change less
to not_equal
? I haven't implemented this yet, but I think the query version may decrease in a client reset with recovery. Or actually I may need to just clear out the entire PendingBootstrap store upon reset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmmmm... I think we need to guarantee that the query version doesn't decrease? or at least we may need to coordinate that with the server. I think the server currently assumes that the query version will never decrease over the life of a file ident.
but yes, we can change this to not_equal.
} | ||
|
||
void PendingBootstrapStore::pop_front_pending(const TransactionRef& tr, size_t count) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert that tr is a write transaction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -34,6 +34,8 @@ using TransactionRef = std::shared_ptr<Transaction>; | |||
namespace realm::sync { | |||
namespace internal_schema_groups { | |||
constexpr static std::string_view c_flx_subscription_store("flx_subscription_store"); | |||
constexpr static std::string_view c_pending_bootstraps("PendingBootstraps"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we doing snake_case or CamelCaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
104ff16
to
4cefac0
Compare
0d2740b
to
1266fc3
Compare
What, How & Why?
When FLX sync is enabled we need to defer applying download messages that contain partial bootstrap changesets until we have the entire set of changesets available. This is because if the bootstrap is interrupted, when we retry the bootstrap later on, the server may be using a different snapshot with different results than the first bootstrap attempt. By using a different snapshot, the server may have a different set of objects that are in-view between bootstrap attempts, and if we were to apply partial bootstrap download messages, those objects that were in-view in the first attempt but out-of-view in the second attempt will be orphaned on the client.
This change introduces a new PendingBootstrapStore that will store the contents of bootstrap download messages in a side-table until a download message with last_in_batch=true has been received. If the bootstrap is interrupted before a last_in_batch=true message is received, none of the prior messages shall be applied and the bootstrap will restart. If the bootstrap is interrupted after the last_in_batch=true message is received, but before the bootstrap has been fully applied, the sync client shall apply all the remaining bootstrap messages before sending the IDENT message an resuming the session.
☑️ ToDos