Describe the bug
Related to #3881
When a node running a higher version is added to an OpenSearch cluster and a shard is manually rerouted onto it, replication fails a compatibility check because the new node uses a higher-version codec than the source node.
For example, node1, node2, and node3 use codec Lucene94, and we add a node "new-1" that runs a newer version using codec Lucene95.
We then reroute a shard from node1 to new-1.
Replication then fails with the following root cause:
Caused by: org.opensearch.common.util.CancellableThreads$ExecutionCancelledException: ParameterizedMessage[messagePattern=Requested unsupported codec version {}, stringArgs=[Lucene95], throwable=null]
To Reproduce
Steps to reproduce the behavior:
1. Create a cluster with nodes that use different Lucene codec versions.
2. Manually reroute a shard from a node on the lower codec to a node on the higher codec.
3. A ReplicationFailedException is thrown.
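For reference, the manual reroute in the steps above could be issued roughly as follows. This is only a sketch: the class and method names are hypothetical, the `http://localhost:9200` address is an assumption, and the index, shard, and node names are taken from the example and stack trace in this report. It builds a single `move` command for the `_cluster/reroute` API and constructs the request without sending it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of OpenSearch.
public class RerouteSketch {

    // Builds the JSON body for one "move" command of the cluster reroute API.
    public static String moveCommandBody(String index, int shard, String fromNode, String toNode) {
        return String.format(
            "{\"commands\":[{\"move\":{\"index\":\"%s\",\"shard\":%d,\"from_node\":\"%s\",\"to_node\":\"%s\"}}]}",
            index, shard, fromNode, toNode);
    }

    // Wraps the body in a POST request to <baseUrl>/_cluster/reroute.
    // The request is only constructed here; sending it is left to the caller.
    public static HttpRequest rerouteRequest(String baseUrl, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_cluster/reroute"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        // Names from this report; the base URL is an assumed local endpoint.
        String body = moveCommandBody("my-index1", 1, "node1", "new-1");
        HttpRequest request = rerouteRequest("http://localhost:9200", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```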
Expected behavior
This behavior is incorrect: shards should always be able to move onto nodes running a higher version, most commonly during an upgrade. The check should allow the shard movement to proceed when the target node is on a higher version.
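To make the expected behavior concrete, here is a minimal sketch of a relaxed check. It is not the OpenSearch implementation (per the stack trace, the real check is in `OngoingSegmentReplications.prepareForReplication`); the class and method names are hypothetical, and comparing the numeric suffix of the codec name is an assumption that only fits simple names such as `Lucene94` and `Lucene95`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not OpenSearch code.
public class CodecCompatibilitySketch {

    private static final Pattern LUCENE_CODEC = Pattern.compile("Lucene(\\d+)");

    // Extracts the numeric suffix of a codec name, e.g. "Lucene95" -> 95.
    // Assumption: names follow the simple "Lucene<NN>" form used in this report.
    public static int codecNumber(String codecName) {
        Matcher m = LUCENE_CODEC.matcher(codecName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized codec name: " + codecName);
        }
        return Integer.parseInt(m.group(1));
    }

    // Relaxed check: allow replication when the target's codec is the same as
    // or higher than the source's, and reject only a move to a lower codec.
    public static boolean allowsReplication(String sourceCodec, String targetCodec) {
        return codecNumber(targetCodec) >= codecNumber(sourceCodec);
    }

    public static void main(String[] args) {
        // Scenario from this report: node1 (Lucene94) -> new-1 (Lucene95).
        System.out.println(allowsReplication("Lucene94", "Lucene95"));
        // Reverse direction, which the relaxed check still rejects.
        System.out.println(allowsReplication("Lucene95", "Lucene94"));
    }
}
```

A real fix would more likely compare node versions than codec-name suffixes, since the suffix ordering is not guaranteed to track release order.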
Additional context
Full stack trace:
Caused by: org.opensearch.transport.RemoteTransportException: [node-2][172.31.6.116:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.opensearch.transport.RemoteTransportException: [new-2][172.31.4.178:9300][internal:index/shard/replication/segments_sync]
Caused by: org.opensearch.indices.replication.common.ReplicationFailedException: [my-index1][1]: Replication failed on
at org.opensearch.indices.replication.SegmentReplicationTargetService$3.onFailure(SegmentReplicationTargetService.java:362) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.ActionListener$1.onFailure(ActionListener.java:88) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.ActionRunnable.onFailure(ActionRunnable.java:103) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:54) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.common.util.concurrent.OpenSearchExecutors$DirectExecutorService.execute(OpenSearchExecutors.java:341) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.common.util.concurrent.ListenableFuture.notifyListener(ListenableFuture.java:120) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.common.util.concurrent.ListenableFuture.lambda$done$0(ListenableFuture.java:112) ~[opensearch-3.0.0.jar:3.0.0]
at java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?]
at org.opensearch.common.util.concurrent.ListenableFuture.done(ListenableFuture.java:112) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.common.util.concurrent.BaseFuture.setException(BaseFuture.java:178) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.common.util.concurrent.ListenableFuture.onFailure(ListenableFuture.java:149) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.StepListener.innerOnFailure(StepListener.java:82) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.NotifyOnceListener.onFailure(NotifyOnceListener.java:62) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.ActionListener$4.onFailure(ActionListener.java:190) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.ActionListener$6.onFailure(ActionListener.java:309) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.support.RetryableAction$RetryingListener.onFinalFailure(RetryableAction.java:218) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.support.RetryableAction$RetryingListener.onFailure(RetryableAction.java:210) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:74) ~[opensearch-3.0.0.jar:3.0.0]
... 6 more
Caused by: org.opensearch.common.util.CancellableThreads$ExecutionCancelledException: ParameterizedMessage[messagePattern=Requested unsupported codec version {}, stringArgs=[Lucene95], throwable=null]
at org.opensearch.indices.replication.OngoingSegmentReplications.prepareForReplication(OngoingSegmentReplications.java:154) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.indices.replication.SegmentReplicationSourceService$CheckpointInfoRequestHandler.messageReceived(SegmentReplicationSourceService.java:138) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.indices.replication.SegmentReplicationSourceService$CheckpointInfoRequestHandler.messageReceived(SegmentReplicationSourceService.java:119) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:106) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:453) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[opensearch-3.0.0.jar:3.0.0]
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-3.0.0.jar:3.0.0]
... 3 more