HDFS-16513. [SBN read] Observer Namenode should not trigger the edits rolling of active Namenode #4087
base: trunk
💔 -1 overall
This message was automatically generated.
To avoid frequent edits rolling, we should disable OBN from triggering the edits rolling of active Namenode. Hi @sunchao, please have a look at this. Thank you very much.
I think this is pretty similar to https://issues.apache.org/jira/browse/HDFS-14378. If I remember correctly, the better approach would be to have ANN roll its own edit logs.
Even though we address the observer issue here, in a real scenario there could still be multiple SNNs.
Thank you very much for your review, @sunchao. The Active Namenode does roll its logs automatically on a periodic basis. Simply preventing all SNNs from triggering edit log rolls on the active might be risky (see HDFS-2737). However, preventing the OBN from triggering edit log rolls on the active has no side effects. What do you think of this?
Hi @xkrogen, could you please also take a look? Thanks.
Hi @tomscut, sorry for the delay in my response.
I am inclined to agree with @sunchao that the approach laid out in HDFS-14378 is a better long-term solution.
> It might be risky(we can look at here HDFS-2737) by simply disabling all SNN to trigger active roll edits log.

Can you clarify what from HDFS-2737 makes you feel that it is risky? I skimmed the discussion and didn't notice anything alarming. You may also want to see this comment on HDFS-14378 where this same point was discussed.
That all being said, I think this PR may be a good step in the interim, since HDFS-14378 is a more substantial change. I would appreciate some other opinions, though.
cc @simbadzina @aajisaka @shvachko
@@ -1938,6 +1938,14 @@ public boolean isInStandbyState() {
        HAServiceState.OBSERVER == haContext.getState().getServiceState();
  }

  public boolean isInObserverState() {
    if (haContext == null || haContext.getState() == null) {
      return haEnabled;
Seems like this was probably copied from `isInStandbyState()`? But I don't think it's right. If we can't find a state, we assume `STANDBY` state. If we assume `STANDBY` state because a valid state could not be found, then `isInObserverState()` should be false. So I think we should just `return false` here.
I agree with you. Thanks.
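For reference, the fallback behaviour agreed on in this thread could look roughly like the sketch below. It is a simplified, self-contained model and not the actual `FSNamesystem` code: `ServiceState` and `HaStateSketch` are hypothetical names, a `null` state stands in for "no valid state could be found", and treating `OBSERVER` as standby-like in `isInStandbyState()` is an assumption based on the tail of the hunk above.

```java
/** Hypothetical stand-in for Hadoop's HAServiceState, limited to the states discussed here. */
enum ServiceState { ACTIVE, STANDBY, OBSERVER }

/** Simplified, hypothetical model of the two state checks; not the real FSNamesystem. */
final class HaStateSketch {
  private final boolean haEnabled;
  // null models the case where no valid HA state could be found.
  private final ServiceState state;

  HaStateSketch(boolean haEnabled, ServiceState state) {
    this.haEnabled = haEnabled;
    this.state = state;
  }

  /** With no state available, fall back to "standby if HA is enabled". */
  boolean isInStandbyState() {
    if (state == null) {
      return haEnabled;
    }
    // Assumption: OBSERVER counts as standby-like, as the tail of the hunk suggests.
    return state == ServiceState.STANDBY || state == ServiceState.OBSERVER;
  }

  /** With no state available we assumed STANDBY above, so this must be false. */
  boolean isInObserverState() {
    if (state == null) {
      return false;
    }
    return state == ServiceState.OBSERVER;
  }

  public static void main(String[] args) {
    HaStateSketch unknownState = new HaStateSketch(true, null);
    System.out.println(unknownState.isInStandbyState());  // prints true
    System.out.println(unknownState.isInObserverState()); // prints false
  }
}
```

With `return haEnabled` in the observer check instead, a node with no known state would report itself as both standby and observer whenever HA is enabled, which is the inconsistency pointed out in the review.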
Thank you very much for your comments, @xkrogen.
The pendingDatanodeMessage issue mentioned here strikes me as a bit risky. However, after supporting I would also appreciate some other opinions.
I'm not following. The issue described in HDFS-2737 says that "if the active NN is not rolling its logs periodically ... many datanode messages [will] be queued up in the PendingDatanodeMessage structure". Certainly it is bad if we don't have a way to ensure that the logs are rolled regularly. But HDFS-14378 just proposes making the ANN roll its own edit logs, instead of relying on the SbNN to roll them. I don't see the risk -- we are still ensuring that the logs are rolled periodically, just triggered by the ANN itself instead of the SbNN.
Thank you @xkrogen for your detailed explanation. I left out some information. You are right. I thought the ANN's automatic edit-rolling feature came first, and that letting the SNN trigger the ANN to roll edits was discussed afterwards; I got the order of the two wrong. I also thought that "if the active NN is not rolling its logs periodically" meant that the configured roll interval is very large, or that the EditLogTailerThread exits because of some unknown exception, so that the ANN cannot roll its logs normally. Under that reading, letting the SNN trigger the ANN to roll edits would just add another layer of assurance. I made a mistake here.
JIRA: HDFS-16513.
To avoid frequent edits rolling, we should disable OBN from triggering the edits rolling of active Namenode.
It is sufficient to retain only the `triggering of SNN` and the `auto rolling of ANN`.
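As a rough illustration of the intended behaviour, the sketch below models which node may initiate an edit log roll after this change. Everything in it is hypothetical: `NodeRole`, `RollTriggerSketch`, and both method names are made up for this example and are not Hadoop APIs, and the time-based and transaction-count-based conditions are assumed placeholders for whatever the real tailer and the ANN's auto-roll logic check.

```java
/** Hypothetical role enum for this sketch; not Hadoop's HAServiceState. */
enum NodeRole { ACTIVE, STANDBY, OBSERVER }

/** Hypothetical policy object showing who may initiate an edit log roll. */
final class RollTriggerSketch {
  private final long rollPeriodMs;      // assumed period after which a standby asks for a roll
  private final long autoRollThreshold; // assumed transaction count at which the active rolls itself

  RollTriggerSketch(long rollPeriodMs, long autoRollThreshold) {
    if (rollPeriodMs <= 0 || autoRollThreshold <= 0) {
      throw new IllegalArgumentException("period and threshold must be positive");
    }
    this.rollPeriodMs = rollPeriodMs;
    this.autoRollThreshold = autoRollThreshold;
  }

  /** Only a true standby may ask the active to roll; an observer never does. */
  boolean shouldTriggerActiveRoll(NodeRole role, long msSinceLastRoll) {
    if (role != NodeRole.STANDBY) {
      return false;
    }
    return msSinceLastRoll >= rollPeriodMs;
  }

  /** The active keeps rolling its own log, independent of any remote trigger. */
  boolean shouldAutoRoll(NodeRole role, long txnsSinceLastRoll) {
    return role == NodeRole.ACTIVE && txnsSinceLastRoll >= autoRollThreshold;
  }
}
```

Under these assumptions, adding more observers does not increase the roll frequency on the active, while the standby trigger and the active's own auto roll both remain in place.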