RFC: Sending semi-sync ACKs from Rdonly type tablets #13606
Comments
One benefit we see in our production with the current behaviour is reducing the volume/overhead of semi-sync ACKs on the In our configuration (more than just 1 x Another thought: in my experience
@timvaillancourt We had a discussion and we realized that sending semi-sync ACKs from
For this reason, we propose not changing the current durability policies, but introducing 2 more. This gives users the flexibility to choose whichever configuration they want to run, based on their own use case and load distribution. The new policies can be named We can then rework the example to use What do you think about this?
My thoughts: we should allow semi-sync acks from Discussion: if the
The overhead for enabling semi-sync is IMHO negligible. It's two more bits to each downstream event, plus a small ack upstream. The impact to the IO thread should be unnoticeable. And there is no impact to the SQL thread - so enabling semi-sync does not cause a replica to lag more, irrespective of whether the replica is overloaded or not.
Agreed and that is an important advantage on small shards.
With this suggestion there is no impact to existing users; it's configurable and opt-in, and I'm in favor.
Description
Up until now, we have always configured RDONLY tablets to not send semi-sync ACKs to the primary tablets. Both of the built-in durability policies, semi_sync and cross_cell, configure the cluster in this way.
It looks like the reason we used to do this was that ERS didn't previously have a 2-step promotion process. If a RDONLY tablet was the one sending semi-sync ACKs and the primary went down, that RDONLY tablet could have been the only tablet holding a committed write for which the user had received a Success. Because ERS doesn't allow promotion of RDONLY tablets, that write could not be carried over to the new primary.
This limitation no longer applies, since ERS now has a 2-step promotion process. Even if a RDONLY tablet is the most advanced, we now just get a REPLICA tablet to replicate from it.
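To illustrate the 2-step idea, here is a minimal Python sketch. It is not the actual ERS code: the Tablet class, the integer position (a stand-in for a real replication position) and the emergency_reparent function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Tablet:
    alias: str        # hypothetical tablet alias
    tablet_type: str  # "REPLICA" or "RDONLY"
    position: int     # simplified stand-in for a replication position


def emergency_reparent(tablets):
    """Return (intermediate_source, new_primary) for the surviving tablets."""
    # Step 1: find the most advanced surviving tablet, whatever its type.
    intermediate = max(tablets, key=lambda t: t.position)

    # Step 2: choose a promotable tablet. RDONLY tablets are never promoted.
    candidates = [t for t in tablets if t.tablet_type == "REPLICA"]
    if not candidates:
        raise RuntimeError("no REPLICA tablet available for promotion")
    new_primary = max(candidates, key=lambda t: t.position)

    # The candidate replicates from the intermediate source until it has
    # caught up, so a write held only by a RDONLY tablet is not lost.
    if new_primary.position < intermediate.position:
        new_primary.position = intermediate.position
    return intermediate, new_primary
```

In this sketch, a RDONLY tablet that is ahead of every REPLICA is used only as a replication source, and the promoted REPLICA ends up at the same position.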
A secondary reason was that this was the only way to get cross-cell durability with semi-sync. Now that we have a cross-cell durability policy, we no longer need to rely on "no ACKs from RDONLY" to get that.
Advantages
Allowing RDONLY tablets to send semi-sync ACKs will let users run semi-sync in a cluster that has only a PRIMARY, a REPLICA and a RDONLY tablet (like in our examples) and still tolerate a PRIMARY failure.
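The 3-tablet case can be worked through with a small Python sketch. It assumes the primary needs one semi-sync ACK per write. The function names and the rdonly_acks flag are hypothetical and only model the counting argument, not any Vitess API.

```python
def semi_sync_ackers(replica_types, rdonly_acks):
    """Count the tablets that are allowed to send semi-sync ACKs."""
    allowed = {"REPLICA", "RDONLY"} if rdonly_acks else {"REPLICA"}
    return sum(1 for t in replica_types if t in allowed)


def can_commit(replica_types, rdonly_acks, required_acks=1):
    """True if the primary can collect enough ACKs to complete a write."""
    return semi_sync_ackers(replica_types, rdonly_acks) >= required_acks


# Before the failure the primary has one REPLICA and one RDONLY attached.
before_failure = ["REPLICA", "RDONLY"]
# After the failure the REPLICA is promoted, leaving only the RDONLY attached.
after_failover = ["RDONLY"]

print(can_commit(before_failure, rdonly_acks=False))
print(can_commit(after_failover, rdonly_acks=False))
print(can_commit(after_failover, rdonly_acks=True))
```

Under these assumptions, the newly promoted primary has no tablet left to ACK its writes unless the RDONLY tablet is allowed to send ACKs.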