-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
epoch_interval setting in the journal file needs to be honored in all cases #341
Comments
nars1
added a commit
to nars1/YottaDB
that referenced
this issue
Aug 14, 2018
…honored in all cases Code fixes for GTM-8708 in GT.M V6.3-002 wanted to prevent the update helper writer process from unnecessarily making calls to grab_crit_immediate() in case the instance had no updates. It did this by setting the next epoch time (jbp->next_epoch_time) to a huge value (MAXUINT4) the moment a wcs_flu() call was done to write a regular epoch (JRE gvstat) where it noticed an idle epoch (JRI gvstat) had already been written (because of inactivity since the last update). This caused the jbp->next_epoch_time vs jgbl.gbl_jrec_time check in updhelper_writer() to automatically skip invoking the grab_crit_immediate() during periods of idleness. But this introduced the YottaDB#341 regression. It is now possible for a regular epoch to not be written in a timely fashion. This is because once jbp->next_epoch_time gets set to MAXUINT4 as part of an update, if the next update on the same journal file happens say N seconds later (where N could be as high as 5 seconds, the hardcoded inactivity time before an idle epoch kicks in), the call to t_end/tp_tend for the next update finds jbp->next_epoch_time set to MAXUINT4 and resets it to the current time (which is the time of the next update, not the time of the prior update) plus the epoch_interval. This means the next_epoch_time is set to a value that is incorrect by at most N seconds. That is, it is possible for two updates to be separated by as much as N + epoch_interval seconds without having an intervening EPOCH in the journal file. This violates the definition of the epoch_interval setting. The fix for this is to undo all the changes done as part of GTM-8708. So jbp->next_epoch_time is never set to MAXUINT4. Instead, updhelper_writer.c is enhanced to check if an EPOCH is the most recent record in the journal buffer (jbp->post_epoch_freeaddr == jbp->rsrv_freeaddr) and if so we skip checking jbp->next_epoch_time (and invoking grab_crit_immediate()) altogether since we know there is nothing left to be flushed in the journal file. This achieves the objective of GTM-8708. And for YottaDB#341, jbp->next_epoch_time is updated in wcs_flu() at the time when we notice an epoch is the most recent journal record in the journal file. This ensures we honor the epoch_interval setting.
nars1
added a commit
to nars1/YottaDBtest
that referenced
this issue
Aug 14, 2018
nars1
added a commit
that referenced
this issue
Aug 14, 2018
… in all cases (#342) Code fixes for GTM-8708 in GT.M V6.3-002 wanted to prevent the update helper writer process from unnecessarily making calls to grab_crit_immediate() in case the instance had no updates. It did this by setting the next epoch time (jbp->next_epoch_time) to a huge value (MAXUINT4) the moment a wcs_flu() call was done to write a regular epoch (JRE gvstat) where it noticed an idle epoch (JRI gvstat) had already been written (because of inactivity since the last update). This caused the jbp->next_epoch_time vs jgbl.gbl_jrec_time check in updhelper_writer() to automatically skip invoking the grab_crit_immediate() during periods of idleness. But this introduced the #341 regression. It is now possible for a regular epoch to not be written in a timely fashion. This is because once jbp->next_epoch_time gets set to MAXUINT4 as part of an update, if the next update on the same journal file happens say N seconds later (where N could be as high as 5 seconds, the hardcoded inactivity time before an idle epoch kicks in), the call to t_end/tp_tend for the next update finds jbp->next_epoch_time set to MAXUINT4 and resets it to the current time (which is the time of the next update, not the time of the prior update) plus the epoch_interval. This means the next_epoch_time is set to a value that is incorrect by at most N seconds. That is, it is possible for two updates to be separated by as much as N + epoch_interval seconds without having an intervening EPOCH in the journal file. This violates the definition of the epoch_interval setting. The fix for this is to undo all the changes done as part of GTM-8708. So jbp->next_epoch_time is never set to MAXUINT4. Instead, updhelper_writer.c is enhanced to check if an EPOCH is the most recent record in the journal buffer (jbp->post_epoch_freeaddr == jbp->rsrv_freeaddr) and if so we skip checking jbp->next_epoch_time (and invoking grab_crit_immediate()) altogether since we know there is nothing left to be flushed in the journal file. This achieves the objective of GTM-8708. And for #341, jbp->next_epoch_time is updated in wcs_flu() at the time when we notice an epoch is the most recent journal record in the journal file. This ensures we honor the epoch_interval setting.
nars1
added a commit
to YottaDB/YDBTest
that referenced
this issue
Aug 14, 2018
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Final Release Note
Description
Below is a test case that demonstrates the issue.
Below is the output of the above test
Notice that the epoch-interval is set to 2 seconds which means we expect at least one EPOCH record to be written between any updates that are separated by >= 2 seconds. But in the above case, we have the SET of ^x=2 done at 64874,50940 and the SET of ^x=4 done at 64874,50946 which are separated by 6 seconds (a lot more than the epoch-interval of 2 seconds) but still no EPOCH is written.
This is a regression introduced in GT.M V6.3-002 (most likely as part of GTM-8708).
Draft Release Note
The epoch_interval setting in a journal file is honored in all cases. That is, if two updates to the journal file are separated by N seconds where N is the epoch_interval setting, then the two updates are guaranteed to be separated by at least one EPOCH records in the journal file. Previously (starting GT.M V6.3-002), this setting was not honored in some cases (e.g. when an idle/free epoch got written) resulting in delaying the EPOCH by as much as 6 seconds.
The text was updated successfully, but these errors were encountered: