-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update reconnect protocol to deal with half-open connections #276
Conversation
This will allow _this_ client to reconnect to the same source, allowing it to recover from the _half open_ condition.
@@ -38,6 +38,9 @@ | |||
public static final String RESPONSE_ADDRESS_HEADER = "X-Reply-Address"; | |||
public static final String PROPERTY_MESSAGE_HEADERS = "messageHeaders"; | |||
|
|||
// Set in parameters to allow reconnect after network half-open situation | |||
public static String RECONNECT_SECRET = "reconnectSecret"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[PMD Java] Do not use non-final non-private static fields (view)
Rule | Ruleset | Priority |
---|---|---|
MutableStaticState |
Design | 3 |
References:
You can close this issue if no need to fix it. Learn more.
@@ -29,6 +29,7 @@ | |||
public static String testSourceName = null; | |||
public static String testTypeName = null; | |||
public static String testRuleName = null; | |||
public static Boolean testRepeatedConnectsEnabled = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[PMD Java] Do not use non-final non-private static fields (view)
Rule | Ruleset | Priority |
---|---|---|
MutableStaticState |
Design | 3 |
References:
You can close this issue if no need to fix it. Learn more.
@@ -38,6 +38,9 @@ | |||
public static final String RESPONSE_ADDRESS_HEADER = "X-Reply-Address"; | |||
public static final String PROPERTY_MESSAGE_HEADERS = "messageHeaders"; | |||
|
|||
// Set in parameters to allow reconnect after network half-open situation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[PMD Java] Too many fields (view)
Rule | Ruleset | Priority |
---|---|---|
TooManyFields |
Design | 3 |
References:
You can close this issue if no need to fix it. Learn more.
* Unique identifier for this loader's class. This is set on load and then left. This | ||
* shared secret allows instances of this client to perform reconnects to the server. | ||
*/ | ||
protected final static String clientReconnectSecret = UUID.randomUUID().toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Checkstyle] 'static' modifier out of order with the JLS suggestions. (view)
Rule | Severity |
---|---|
ModifierOrderCheck |
info |
References:
You can close this issue if no need to fix it. Learn more.
@@ -38,6 +38,9 @@ | |||
public static final String RESPONSE_ADDRESS_HEADER = "X-Reply-Address"; | |||
public static final String PROPERTY_MESSAGE_HEADERS = "messageHeaders"; | |||
|
|||
// Set in parameters to allow reconnect after network half-open situation | |||
public static String RECONNECT_SECRET = "reconnectSecret"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Checkstyle] Name 'RECONNECT_SECRET' must match pattern '^[a-z][a-zA-Z0-9]*$'. (view)
Rule | Severity |
---|---|
StaticVariableNameCheck |
info |
References:
You can close this issue if no need to fix it. Learn more.
|
||
if (rootProject.hasProperty("TestRepeatedConnects")) { | ||
systemProperty "TestRepeatedConnects", rootProject.findProperty("TestRepeatedConnects") ?: "empty" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Temporary until we're working with 1.32 systems
* shared secret allows instances of this client to perform reconnects to the server. | ||
*/ | ||
protected final static String clientReconnectSecret = UUID.randomUUID().toString(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the actual secret. It is generated at class load/initialization time and allows this connector instance to reconnect at will. It will be the same for all sources to which this connector connects, but that's irrelevant -- we just want to prove ownership.
@@ -40,6 +41,8 @@ public static void getProps() { | |||
testSourceName = System.getProperty("EntConTestSourceName", "testSourceName"); | |||
testTypeName = System.getProperty("EntConTestTypeName", "testTypeName"); | |||
testRuleName = System.getProperty("EntConTestRuleName", "testRuleName"); | |||
testRepeatedConnectsEnabled = | |||
Boolean.valueOf(System.getProperty("TestRepeatedConnects", "false")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Temporary -- we can remove when we're testing mostly against 1.32+ systems
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably add an issue reminding us to remove this once 1.32 is out.
assertTrue("Failed to open connection", client.isOpen()); | ||
assertTrue("Failed to authenticate", client.isAuthed()); | ||
assertTrue("Failed to connect to source (may fail if run against Vantiq version < 1.32)", | ||
client.isConnected()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but just in case, we'll alert...
# This allows for manual testing for connector disconnection recovery. | ||
|
||
for ((i = 0; i < 100; i++)) do | ||
echo Attempt $i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[ShellCheck] Double quote to prevent globbing and word splitting. (view)
Rule | Severity | Code |
---|---|---|
SC2086 |
info |
2086 |
References:
You can close this issue if no need to fix it. Learn more.
# Shell script for Mac OS that will turn off then on wifi to monkey with network. | ||
# This allows for manual testing for connector disconnection recovery. | ||
|
||
for ((i = 0; i < 100; i++)) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[ShellCheck] In POSIX sh, arithmetic for loops are undefined. (view)
Rule | Severity | Code |
---|---|---|
SC3005 |
warning |
3005 |
References:
You can close this issue if no need to fix it. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be ignored as per spellcheck directive atop file
# Shell script for Mac OS that will turn off then on wifi to monkey with network. | ||
# This allows for manual testing for connector disconnection recovery. | ||
|
||
for ((i = 0; i < 100; i++)) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[ShellCheck] In POSIX sh, ++ is undefined. (view)
Rule | Severity | Code |
---|---|---|
SC3018 |
warning |
3018 |
References:
You can close this issue if no need to fix it. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be ignored as per spellcheck directive atop file
Related Vantiq PR: Vantiq/ag2rs#8032 |
OPCUA failures are linux/amz vs Mac OS. |
@rblumer4, this may be related to the OPC errors. Looks like it all stems from some bug(s) in OpenJDK... bcgit/bc-java#941 says it's fixed in their 301 release. It's not important to fix this now, just noting for the record. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Two minor comments, one about adding a new issue and one about a potential name change. Feel free to ignore.
@@ -38,6 +39,9 @@ | |||
public static final String RESPONSE_ADDRESS_HEADER = "X-Reply-Address"; | |||
public static final String PROPERTY_MESSAGE_HEADERS = "messageHeaders"; | |||
|
|||
// Set in parameters to allow reconnect after network half-open situation | |||
public static final String RECONNECT_SECRET = "reconnectSecret"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be the "reconnect secret" or "connect secret"?
I think it's fine either way, but the secret would be used for both the first connection as well as any other reconnection afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to leave it as is. CONNECT_SECRET sounds like something that you need to connect (and, generally, you don't). It's also invisible to users of the SDK.
I'll think if there's a comment I can add that clarifies things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me!
@@ -40,6 +41,8 @@ public static void getProps() { | |||
testSourceName = System.getProperty("EntConTestSourceName", "testSourceName"); | |||
testTypeName = System.getProperty("EntConTestTypeName", "testTypeName"); | |||
testRuleName = System.getProperty("EntConTestRuleName", "testRuleName"); | |||
testRepeatedConnectsEnabled = | |||
Boolean.valueOf(System.getProperty("TestRepeatedConnects", "false")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably add an issue reminding us to remove this once 1.32 is out.
Issue to remove stuff is #277 |
Jenkins re-run after all the changes... (no changes) |
// This is sent on any CONNECT_EXTENSION messages. It is used when a connector | ||
// "re-connects" to the server to verify that it's the same connector connecting in | ||
// (otherwise, the reconnect is rejected). So, while it's send on all CONNECT... calls, | ||
// it's use is in only in the reconnect case (that is, usurping an existing connection). | ||
// The name's a bit confusing, but CONNECT_SECRET sounds too much like something you need | ||
// to connect (which, under normal circumstances, you don't). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This is sent on any CONNECT_EXTENSION messages. It is used when a connector | |
// "re-connects" to the server to verify that it's the same connector connecting in | |
// (otherwise, the reconnect is rejected). So, while it's send on all CONNECT... calls, | |
// it's use is in only in the reconnect case (that is, usurping an existing connection). | |
// The name's a bit confusing, but CONNECT_SECRET sounds too much like something you need | |
// to connect (which, under normal circumstances, you don't). | |
// This is sent on any CONNECT_EXTENSION messages. It is used when a connector | |
// "re-connects" to the server to verify that it's the same connector connecting in | |
// (otherwise, the reconnect is rejected). So, while it's sent on all CONNECT... calls, | |
// its use is only in the reconnect case (that is, usurping an existing connection). | |
// The name's a bit confusing, but CONNECT_SECRET sounds too much like something you need | |
// to connect (which, under normal circumstances, you don't). |
(wrong link) |
Note: For this to fully work, this requires an update to the Vantiq server. This is anticipated in release 1.32. |
Fixes #275 -- will require fix to Vantiq/vantiq-Issues#295 to work completely.
Adds reconnect secret to the connection requests. This secret allows a connector to usurp its old connection to Vantiq. This is required when the server fails to notice that the connection between the connector and the server has been unavailable. This is not uncommon in unstable network connection cases. When this happens, the connector's attempts to reconnect are rejected by the server because it (thinks it) already has a connection for the source in question.
This fix, in a parallel with the server side fix, allows the connector, using a reconnect secret to assert that it is the connection the server thinks it knows about. The server will, then, honor the connection & close the old, now defunct, session.
The fix uses parts of the connect extension message not previous used, so old servers will ignore it. And it is optional, so old clients talking to new servers will be fine.