Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM REST methods fail on 201 response #205

Closed
axw opened this issue Sep 10, 2015 · 20 comments
Closed

ARM REST methods fail on 201 response #205

axw opened this issue Sep 10, 2015 · 20 comments
Labels
bug This issue requires a change to an existing behavior in the product in order to be resolved.

Comments

@axw
Copy link

axw commented Sep 10, 2015

I am testing out the ARM APIs, and have observed frequent (but not 100%) failures from methods due to "201 Created" being returned (e.g. by ResourceGroupsClient.CreateOrUpdate).

Looking at the code, I think the second-last "autorest.Respond" in each method that expects status codes other than 200 is invalid: in each, it uses WithErrorUnlessOK; I think it should be using WithErrorUnlessStatusCode, with the same values passed into DoErrorUnlessStatusCode in the SendWithSender call above.

@axw axw changed the title ARM RET methods fail on 201 response ARM REST methods fail on 201 response Sep 10, 2015
@ahmetb
Copy link
Contributor

ahmetb commented Sep 10, 2015

@axw are the respective REST calls actually returning 200/201 depending on the situation? If an endpoint returns both status codes we should not be failing. Or are we checking for the wrong status code?

@axw
Copy link
Author

axw commented Sep 10, 2015

@ahmetalpbalkan The resource-group creation is responding with 201.

If you look here:
https://github.com/Azure/azure-sdk-for-go/blob/master/arm/resources/resourcegroups.go#L202
the initial check on response status will not error if it's 201. But then it continues on to here:
https://github.com/Azure/azure-sdk-for-go/blob/master/arm/resources/resourcegroups.go#L211
and will return an error because the status code is not 200/OK.

@brendandixon
Copy link
Contributor

@axw Good catch! Your suggestion is correct. The code to support multiple "Ok" return codes came late and, obviously, missed a statement. We'll make changes to the needed changes to the generator. Thank you for finding this!

@axw
Copy link
Author

axw commented Sep 10, 2015

@brendandixon No worries, thank you.

@ahmetb ahmetb added bug This issue requires a change to an existing behavior in the product in order to be resolved. area/azureresourcemanager labels Sep 10, 2015
@vgarg
Copy link

vgarg commented Sep 17, 2015

@brendandixon just FYI, noticing a similar behavior when deleting subnets. Current SDK throws the following error upon deletion call:

Error: autorest:WithErrorUnlessStatusCode GET https://management.azure.com/subscriptions/***/providers/Microsoft.Network/locations/westus/operationResults/eddb29d0-f856-403a-bba1-233cafaaf706?api-version=2015-05-01-preview failed with 204 No Content

Note that the subnet actually gets deleted.

@brendandixon
Copy link
Contributor

@vgarg What method did you use? SubnetsClient#Delete? If so, then this is the same issue. As the code is generated, it's either all correct or all wrong. :)

@vgarg
Copy link

vgarg commented Sep 17, 2015

Yep, I used SubnetsClient#Delete.

@brendandixon
Copy link
Contributor

@vgarg Ok. Should be the same issue then. The sender has the correct set of return codes (e.g., 204 - No Content), but the responder does not.

@vgarg
Copy link

vgarg commented Sep 17, 2015

👍 Yeah, applying a similar fix to my copy. Frankly can't wait to get the official fixes for these as wrangling with godeps unhappiness is not fun :)

@brendandixon
Copy link
Contributor

@vgarg @axw This issue is addressed in the latest changes. Why not take a look and let us know. Thanks!

@brendandixon
Copy link
Contributor

Changes merged from PR #221

@vgarg
Copy link

vgarg commented Oct 5, 2015

Thanks @brendandixon . Looking forward to trying this new release.

@axw
Copy link
Author

axw commented Oct 28, 2015

@brendandixon It seems there's still some issues. I just got a similar error, deleting a resource group. It works most of the time.

Error:

autorest:DoErrorIfStatusCode GET https://management.azure.com/subscriptions/6f0e3b87-f601-40fd-9727-76d05563fedd/operationresults/eyJqb2JJZCI6IlJFU09VUkNFR1JPVVBERUxFVElPTkpPQi1KVUpVOjJERU5WSVJPTk1FTlQ6MkQ1QTMzNUU5RjoyRDQ3QUE6MkQ0OUE0OjJEOHxFQ0U0NTEyNEEyRkVFMDI0LVNPVVRIRUFTVEFTSUEiLCJqb2JMb2NhdGlvbiI6InNvdXRoZWFzdGFzaWEifQ?api-version=2014-04-01-preview failed with 202 Accepted

@brendandixon
Copy link
Contributor

@axw Exactly which API did you use? The standard Delete method for a ResourceGroup (https://github.com/Azure/azure-sdk-for-go/blob/master/arm/resources/groups.go@216) appears to properly handle 202 Accepted.

@axw
Copy link
Author

axw commented Oct 28, 2015

@brendandixon Sorry, should have been clearer. Yes, GroupsClient.Delete is the one. I did see that in the code, so I'm not sure why it would have happened. My local clone is up-to-date.

I'm digging into the code now to understand how the operationresults query is made.

@axw
Copy link
Author

axw commented Oct 28, 2015

@brendandixon It's possible that the deletion attempt took > 15mins, the default polling duration. If it happens again, I'll report back.

@brendandixon
Copy link
Contributor

@axw All good. I think I understand why you saw the error: 15 minutes is the default (pulled out of thin air) polling time (that is, the maximum amount of time the request polls for success). If the request failed to complete within that window, then the code returns whatever the last response code was (in this case, 202 Accepted).

Treating a 202 as success does not feel exactly right, since success implies you can begin using the resource (or, in your case, assume it's gone). It is possible that, after a 202, an error could still occur as well. However, a 202 is not truly an error such as, say, a 500 would be. It's almost as if we want a response that distinguishes success from state (that is, the request was successfully made, it's completion state is unknown). Such a distinction does not naturally fit within either the response or error object. Hmmmm...

@axw
Copy link
Author

axw commented Oct 28, 2015

@brendandixon Yes, understood. I think it might be most useful to return an error indicating that the operation has taken too long, and the sender is giving up waiting. The last response is probably not that interesting, since it's always going to be 202 (I think?)

I don't think it's OK to assume success on acceptance unless it's guaranteed that that implies it will succeed. After previous email correspondences, it was concluded that this is at least not true for resource group deletion.

@brendandixon
Copy link
Contributor

@axw I'm reviewing our open issues and took another look at this. It appears that, in all cases, the http.Response is returned. If the code directly invokes the generated xyzSender, the http.Response is returned directly. If the code invokes the helper-wrapper, the http.Response is either returned directly or as an embedded member of the returned "result." So, in all cases, I believe the calling code has sufficient information to make a good decision. I don't see that there's more to do here...Ok?

@axw
Copy link
Author

axw commented Nov 11, 2015

@brendandixon OK. I think it's going to be a pain to wrap each and every call to check for 202 and convert to a timeout error, but perhaps others will want to react differently. I'll close this, thanks.

@axw axw closed this as completed Nov 11, 2015
richardpark-msft pushed a commit to richardpark-msft/azure-sdk-for-go that referenced this issue Aug 5, 2021
* Consolidate auth auto-refresh

Moved the enabling of auth auto-refresh to when calling
negotiateClaim().  This ensures that all code paths that require
authentication will enable auto-refresh.
Ensure that there's only one goroutine that performs the refresh.
Changed auth refresh interval to 15 minutes per guideance from SB.
Exit auth auto-refresh goroutine if it fails; this should help trigger
recovering of clients due to failed authentication.

* cancel auth when closing RPC client

* reuse existing timer

* reset the the auto-refresh guard on exit

The Recover() case will rebuild the AMQP client which means a new
refresh goroutine should be started.
Use a channel to synchronize with the refresh goroutine exiting.

* improved refresh goroutine synchronization

* don't exit refresh goroutine on refresh failure

* rename vars

* simplify timer
benbp pushed a commit to benbp/azure-sdk-for-go that referenced this issue Sep 15, 2021
* Consolidate auth auto-refresh

Moved the enabling of auth auto-refresh to when calling
negotiateClaim().  This ensures that all code paths that require
authentication will enable auto-refresh.
Ensure that there's only one goroutine that performs the refresh.
Changed auth refresh interval to 15 minutes per guideance from SB.
Exit auth auto-refresh goroutine if it fails; this should help trigger
recovering of clients due to failed authentication.

* cancel auth when closing RPC client

* reuse existing timer

* reset the the auto-refresh guard on exit

The Recover() case will rebuild the AMQP client which means a new
refresh goroutine should be started.
Use a channel to synchronize with the refresh goroutine exiting.

* improved refresh goroutine synchronization

* don't exit refresh goroutine on refresh failure

* rename vars

* simplify timer
@github-actions github-actions bot locked and limited conversation to collaborators Apr 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug This issue requires a change to an existing behavior in the product in order to be resolved.
Projects
None yet
Development

No branches or pull requests

4 participants