Validate analyzer path when starting/resuming replication #64

Merged
merged 1 commit into from
Jul 28, 2021

Conversation

soosinha
Member

@soosinha soosinha commented Jul 21, 2021

Signed-off-by: Sooraj Sinha [email protected]

Description

If the leader index uses a custom analyzer, replication fails silently when the same analyzer is not present on the follower cluster.
This change validates that the analyzer is present on the follower cluster before starting or resuming replication. If the analyzer is not present, replication fails upfront.
In addition, the settings supplied by the user in the replication request are now set on the restore request, so that the follower index is created with those settings.

Testing

Existing integration tests are passing.
Manual testing was done as follows:

  • Start replication for an index that uses an analyzer on the leader when the analyzer file is not present on the follower
{{LOCAL_FOLLOWER}}/_plugins/_replication/{{FOLLOWER_INDEX}}/_start?pretty
{
    "remote_cluster":"remote-cluster-1",
    "remote_index": "{{LEADER_INDEX}}"
}

{
    "error": {
        "root_cause": [
            {
                "type": "resource_not_found_exception",
                "reason": "IOException while reading index.analysis.filter.my_filter.synonyms_path: /Volumes/workplace/replication_oss/cross-cluster-replication/build/testclusters/followCluster-0/config/analysis/synonyms.txt"
            }
        ],
        "type": "resource_not_found_exception",
        "reason": "IOException while reading index.analysis.filter.my_filter.synonyms_path: /Volumes/workplace/replication_oss/cross-cluster-replication/build/testclusters/followCluster-0/config/analysis/synonyms.txt"
    },
    "status": 404
}
  • Copied the analyzer file to the follower cluster and started the replication again.
{
    "acknowledged": true
}
  • Resume the replication for an index where the analyzer has been added at the leader
{{LOCAL_FOLLOWER}}/_plugins/_replication/{{FOLLOWER_INDEX}}/_resume?pretty
{
    "error": {
        "root_cause": [
            {
                "type": "resource_not_found_exception",
                "reason": "IOException while reading index.analysis.filter.my_filter.synonyms_path: /Volumes/workplace/replication_oss/cross-cluster-replication/build/testclusters/followCluster-0/config/analysis/synonyms.txt"
            }
        ],
        "type": "resource_not_found_exception",
        "reason": "IOException while reading index.analysis.filter.my_filter.synonyms_path: /Volumes/workplace/replication_oss/cross-cluster-replication/build/testclusters/followCluster-0/config/analysis/synonyms.txt"
    },
    "status": 404
}
  • Resume the replication after copying the analyzer to the follower
{
    "acknowledged": true
}
  • Exception occurs in autofollow if the analyzer is not found
[2021-07-21T12:30:27,015][WARN ][c.a.e.r.t.a.AutoFollowTask] [followCluster-0][remote-cluster-1] Failed to start replication for remote-cluster-1:af2 -> af2.
org.elasticsearch.ResourceNotFoundException: IOException while reading index.analysis.filter.my_filter.synonyms_path: /Volumes/workplace/replication_oss/cross-cluster-replication/build/testclusters/followCluster-0/config/analysis/synonyms1.txt
        at com.amazon.elasticsearch.replication.action.index.TransportReplicateIndexAction.validateAnalyzerSettings(TransportReplicateIndexAction.kt:108) ~[?:?]
        at com.amazon.elasticsearch.replication.action.index.TransportReplicateIndexAction.access$validateAnalyzerSettings(TransportReplicateIndexAction.kt:44) ~[?:?]
        at com.amazon.elasticsearch.replication.action.index.TransportReplicateIndexAction$doExecute$1.invokeSuspend(TransportReplicateIndexAction.kt:72) ~[?:?]
  • For user override settings, created an index with synonyms.txt on the leader, then placed synonyms1.txt on the follower. While starting replication, overrode the setting with synonyms1.txt, and replication succeeded
{
    "remote_cluster":"remote-cluster-1",
    "remote_index": "{{LEADER_INDEX}}",
    "settings": {
        "index":{
            "analysis": {
                "filter":{
                    "my_filter":{
                        "synonyms_path":"synonyms1.txt"
                    }
                }
            }
        }
    }
}
{
    "acknowledged": true
}

Issues Resolved

#63

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Collaborator

@gbbafna gbbafna left a comment

Looks good overall.

Can we please add integration tests covering your manual tests as well?

Comment on lines 177 to 189
private fun validateAnalyzerSettings(settings: Settings) {
    val analyserSettings = settings.filter{ k: String? -> k!!.matches(Regex("index.analysis.*path"))}
    for (analyserSetting in analyserSettings.keySet()) {
        val settingValue = analyserSettings.get(analyserSetting)
        val path: Path = environment.configFile().resolve(settingValue)
        if (!Files.exists(path)) {
            val message = "IOException while reading ${analyserSetting}: ${path.toString()}"
            log.error(message)
            throw ResourceNotFoundException(message)
        }
    }
}

Collaborator

Can we remove the duplication of this method ?
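
One way to avoid duplicating the check between the start and resume paths is to move it into a single shared helper. The following is a minimal sketch under stated assumptions, not the plugin's actual code: the `AnalyzerPathValidator` object, its `missingFiles` function, and the plain `Map<String, String>` stand-in for `Settings` are all hypothetical. The dots in the pattern are escaped here so that they match literally.

```kotlin
import java.nio.file.Files
import java.nio.file.Path

// Hypothetical shared helper. The name and the Map-based signature are
// assumptions made for this sketch, not the plugin's actual API.
object AnalyzerPathValidator {
    // Selects analysis settings whose key ends in "path", e.g.
    // index.analysis.filter.my_filter.synonyms_path
    private val pathSetting = Regex("""index\.analysis\..*path""")

    /** Returns the analysis path settings whose files are missing under [configDir]. */
    fun missingFiles(configDir: Path, settings: Map<String, String>): Map<String, Path> =
        settings
            .filterKeys { pathSetting.matches(it) }
            .mapValues { (_, value) -> configDir.resolve(value) }
            .filterValues { !Files.exists(it) }
}

fun main() {
    // Temporary directory standing in for the follower's config directory.
    val configDir = Files.createTempDirectory("follower-config")
    Files.createFile(configDir.resolve("synonyms.txt"))

    val settings = mapOf(
        "index.analysis.filter.my_filter.synonyms_path" to "synonyms.txt",
        "index.analysis.filter.other_filter.synonyms_path" to "synonyms1.txt",
        "index.number_of_shards" to "1"
    )

    // Only other_filter is reported, because synonyms1.txt does not exist.
    val missing = AnalyzerPathValidator.missingFiles(configDir, settings)
    println(missing.keys)
}
```

Both the start and resume actions could then call the helper and throw `ResourceNotFoundException` when the returned map is non-empty, which keeps the error message format in one place.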

@soosinha soosinha force-pushed the validate_analyser branch from 3b071c7 to 2ef3106 Compare July 21, 2021 19:41
gbbafna previously approved these changes Jul 22, 2021
Collaborator

@gbbafna gbbafna left a comment

Thanks for making the changes. LGTM.

}
}

fun `test that replication starts successfully when custom analyser is present in follower`() {
Collaborator

Can we include one where mapping uses it as well?

Member Author

The validation only looks at the analyzer settings. Whether replication starts depends only on the presence of the analyzer file; whether a mapping uses the analyzer makes no difference.
Why do you think there should be a test for the case where the mapping uses the analyzer?

@soosinha
Member Author

Made some changes to the approach. Along with the validation of the analyzer settings, I am now passing the settings supplied by the user in the replication request to the restore request, so that the follower index is bootstrapped with the overridden settings.

@krishna-ggk krishna-ggk requested a review from gbbafna July 23, 2021 12:07
Comment on lines +702 to +703
val replMetadata = replicationMetadataManager.getIndexReplicationMetadata(this.followerIndexName)
restoreRequest.indexSettings(replMetadata.settings)
Collaborator

Where are we overriding the setting provided from Start API ?

Member Author

Yes, this is the line where it is done.
replMetadata.settings provides the settings supplied by the user in the request.
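
To illustrate the intended effect of the override, here is a small sketch that models index settings as plain maps. It does not call the restore API: the function name `effectiveFollowerSettings` is hypothetical, and the assumption that user-supplied settings take precedence over the leader's settings is inferred from the manual test with synonyms1.txt described in this PR.

```kotlin
// Plain-map model of index settings. The function name is hypothetical and
// this sketch does not call the actual restore API.
fun effectiveFollowerSettings(
    leaderSettings: Map<String, String>,
    userOverrides: Map<String, String>
): Map<String, String> =
    // For Map + Map, entries from the right-hand map replace entries
    // with the same key in the left-hand map.
    leaderSettings + userOverrides

fun main() {
    val leader = mapOf(
        "index.analysis.filter.my_filter.type" to "synonym",
        "index.analysis.filter.my_filter.synonyms_path" to "synonyms.txt"
    )
    // Override as it would be supplied in the start request body.
    val overrides = mapOf(
        "index.analysis.filter.my_filter.synonyms_path" to "synonyms1.txt"
    )

    val effective = effectiveFollowerSettings(leader, overrides)
    println(effective["index.analysis.filter.my_filter.synonyms_path"]) // prints "synonyms1.txt"
}
```

Under this model, the follower needs only the overridden file (synonyms1.txt) to be present, which matches the behaviour observed in the manual test.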

@soosinha soosinha merged commit 90c0ef1 into opensearch-project:main Jul 28, 2021