Enable Azure snapshot plugin to support taking snapshot into multiple storage accounts. #22709

JeffreyZZ · 2017-01-20T03:27:09Z

The default Elasticsearch Azure snapshot plugin write the whole snapshot data into a single Azure storage account. For big Elasticsearch cluster with multiple TB data in size, the snapshot could fail because of the storage account throttling limit. Here is an example of the snapshot failure error:

    {
      "node_id": "cR_OjhxaRvW0ZmR14w8c6g",
      "index": "rawevents_v3.2016_06_29",
      "reason": "IndexShardSnapshotFailedException[[rawevents_v3.2016_06_29][1] Failed to perform snapshot (index files)]; nested: IOException; nested: StorageException[The server encountered an unknown failure: ]; nested: IOException[Error writing to server]; ",
      "shard_id": 1,
      "status": "INTERNAL_SERVER_ERROR"
    }

To address this problem, we extend Elasticsearch Azure plugin by adding the feature to support taking snapshot into and restore from multiple storage accounts to avoid overload a single storage account. That said, Elasticsearch can write their snapshot data into multiple storage accounts evenly and in parallel.

Here are the configuration as well as the commands to take snapshot and restore snapshot:

Elasticsearch.yml
cloud.azure.storage.my_account1.account: storageaccount1
cloud.azure.storage.my_account1.key: key1
cloud.azure.storage.my_account1.default: true
cloud.azure.storage.my_account2.account: storageaccount2
cloud.azure.storage.my_account2.key: key2
cloud.azure.storage.my_account2.default: true
cloud.azure.storage.my_account3.account: storageaccount3
cloud.azure.storage.my_account3.key: key3
cloud.azure.storage.my_account3.default: true

Commands
#1: define repository
PUT _snapshot/plugintest160921
{
"type": "azure",
"settings": {
"account": "my_account1,my_account2,my_account3",
"container": "plugintest160921"
}
}

#2: take snapshot
PUT _snapshot/plugintest160921/backup0921?wait_for_completion=true
{
}

#3: restore
POST _snapshot/plugintest160921/backup0921/_restore?wait_for_completion=true
{
"ignore_unavailable": "true",
"include_global_state": false
}

#4: define repository for secondary
PUT _snapshot/plugintest160921
{
"type": "azure",
"settings": {
"account": "my_account1,my_account2,my_account3",
"container": "plugintest160921",
"location_mode": "secondary_only"
}
}

…age accounts

clintongormley · 2017-01-20T17:39:09Z

Thanks for the PR. Please could I ask you to sign the CLA before we review it?
http://www.elasticsearch.org/contributor-agreement/

rjernst · 2017-01-20T21:36:33Z

Perhaps we need to be better about detecting throttling events and backing off instead of failing, and you also may need to expand your storage to increase the IOPS capacity, but I don't think we should make the (already complicated) repository settings here even more complicated. Introducing multiple accounts for a single repository raises all kinds of questions that I don't think we should be worrying about.

abeyad · 2017-01-20T21:47:46Z

I'm not sure this will solve the problem either. What if all configured storage accounts become full - then what? We would have to rehash the blobs to different buckets once more storage accounts are added. I think it will become too difficult to maintain and will require a lot of logic to solve a problem that most don't have. The simple solution here would be to create a new repository located at a different storage account. It will add a bit of extra complexity on the user to look into multiple repositories to find the snapshot they may be looking for, but in this case, I believe its the right tradeoff.

abeyad · 2017-01-20T21:57:36Z

Sorry I misunderstood regarding a throttling limit on the Azure storage accounts vs. a size limit. In any case, I think the complexity argument still holds (adding or removing a storage account would require rehashing all blobs to different accounts).

Perhaps we need to be better about detecting throttling events and backing off instead of failing

++

rjernst · 2017-01-20T23:34:46Z

@JeffreyZZ Thank you for the PR, but we are going decline for now. I suggest looking into increasing your IOPS as I mentioned earlier, and I opened an issue to make our behavior on throttling better (so as not to fail a snapshot when throttled): #22728. And you are of course welcome to work on a PR for that issue!

JeffreyZZ · 2017-01-22T06:13:28Z

Thanks for the quick response. I think that the feature to enable Elasticsearch to write and read snapshot to/from multiple Azure storage accounts is very important for running big production clusters (with 50+ data nodes) on public cloud, such as Azure. Here I’d like to provide more investigation details that we did before I started to add this feature to the plugin for our production cluster running on Azure cloud.

Azure Storage accounts of higher IOPs.
When we ran into the throttling issues consistently with our production clusters on Azure while writing the snapshot data into a single storage account, we engaged with Azure Storage team to confirm that the issue was because that we wrote too much data into the Azure storage accounts over a period time, which exceeded the account bandwidth limit and triggered the throttling. To solve this problem easily in an easy way, we first consulted Azure Storage production team to ask for the storage accounts of higher IOPs or the storage accounts in the dedicated clusters, where there were NO noisy neighbors. However, the answer that we got from Azure Storage team is that there were NO accounts of higher IOPS, and the only alternative is to shard across multiple accounts. As Elasticsearch is a distributed system, when taking snapshot, each data node writes the data to the storage account independent of each other. So it’s a perfect scenario to allow it to write the potential big snapshot data into different storage accounts to achieve higher IOPS by distributing the data load across multiple accounts. I’m NOT familiar with AWS, but I think that it should also apply to the clusters running on AWS cloud.
Exception handling
The current exception handling and its retry logical is already able to handle the exceptions thrown when writing the data into Azure Storage account, it includes the throttling exception. What’s more, we also introduced ExponentialRetry to enhance the retry logic to make sure that it has enough retries when the exceptions happened, see the code line below. But based on our investigation, this doesn’t help on the throttling scenarios, because when the snapshot data size is too big, such as 9TB in our case, it’s beyond the capability of exception retries and throttling last over a long period of time. Since the root cause is the bandwidth limit of a single storage account, I think that enabling the sharding of snapshot across multiple storage accounts is the right way to go for it.
client.getDefaultRequestOptions().setRetryPolicyFactory(new RetryExponentialRetry(1000 * 30, 7));
Backward combability
I agree that supporting more multiple storage accounts incurs more complicated repository settings. But I think this complexity is reasonable and acceptable considering the scenario that it could address. Furthermore, it’s back-ward compatible. That is to say, if you’re using one storage account, there is NO change to you with this feature added.

clintongormley · 2017-01-30T12:06:16Z

Hi @JeffreyZZ

We've had a long internal discussion about this PR and the problem in general. We believe that this PR is not the right way to solve the problem because of the notion of bucketing blobs based on the number of accounts; if you change the number of accounts, you break everything (you can’t restore anymore), and your future snapshots will be sending blobs to different accounts. The design is fundamentally flawed.

Long term, we would like to rewrite snapshot-restore to use Lucene's recovery process instead of the BlobStore that we use today. With that rewrite in place, we could treat multiple repos as separate disks, and put one index on each "disk" (the same way we treat multiple local mount-points today). This would solve your issue in a much cleaner way.

Obviously, that is a major rewrite and will not be happening anytime soon.

In the meantime (and given the limitations of Azure) I'd suggest breaking your snapshots down by index (which could be sent to different accounts) so that you do not run into these issues with throttling.

JeffreyZZ · 2017-02-03T16:42:38Z

Hi, @clintongormley ,

Thanks for the team's efforts to evaluate this PR and sharing your thoughts on. I think this might be the right way to solve the throttling problem in the long run. By the way, this is another change in PR about improving the retry with exponential retry, see the code line below, for your reference. This should help to improve the performance of retry.

client.getDefaultRequestOptions().setRetryPolicyFactory(new RetryExponentialRetry(1000 * 30, 7));

Thanks, Jeffrey

dadoonet · 2017-02-03T17:46:18Z

@JeffreyZZ I agree that it could be a nice separated PR to send. Wanna do it? I believe it should be available as a setting though.

Jeffrey Zhou added 2 commits January 7, 2017 23:31

Update blobstore, service and settings files to support multiple stor…

3fb4a84

…age accounts

Enable exponential retry for Azure storage access.

0464784

dadoonet self-assigned this Jan 20, 2017

dadoonet added the :Plugin Repository Azure label Jan 20, 2017

clintongormley added >enhancement review Awaiting CLA labels Jan 20, 2017

rjernst mentioned this pull request Jan 20, 2017

Backoff and retry instead of failing on throttling from azure #22728

Closed

rjernst closed this Jan 20, 2017

clintongormley added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs and removed :Plugin Repository Azure labels Feb 14, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable Azure snapshot plugin to support taking snapshot into multiple storage accounts. #22709

Enable Azure snapshot plugin to support taking snapshot into multiple storage accounts. #22709

JeffreyZZ commented Jan 20, 2017

clintongormley commented Jan 20, 2017

rjernst commented Jan 20, 2017

abeyad commented Jan 20, 2017

abeyad commented Jan 20, 2017

rjernst commented Jan 20, 2017

JeffreyZZ commented Jan 22, 2017

clintongormley commented Jan 30, 2017

JeffreyZZ commented Feb 3, 2017

dadoonet commented Feb 3, 2017

Enable Azure snapshot plugin to support taking snapshot into multiple storage accounts. #22709

Enable Azure snapshot plugin to support taking snapshot into multiple storage accounts. #22709

Conversation

JeffreyZZ commented Jan 20, 2017

clintongormley commented Jan 20, 2017

rjernst commented Jan 20, 2017

abeyad commented Jan 20, 2017

abeyad commented Jan 20, 2017

rjernst commented Jan 20, 2017

JeffreyZZ commented Jan 22, 2017

clintongormley commented Jan 30, 2017

JeffreyZZ commented Feb 3, 2017

dadoonet commented Feb 3, 2017