Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardization of DBSServer error code/message #10962

Closed
amaltaro opened this issue Jan 27, 2022 · 12 comments · Fixed by #11173
Closed

Standardization of DBSServer error code/message #10962

amaltaro opened this issue Jan 27, 2022 · 12 comments · Fixed by #11173

Comments

@amaltaro
Copy link
Contributor

Impact of the new feature
WMCore (and possibly other DMWM projects)

Is your feature request related to a problem? Please describe.
This issue still needs further discussion, and it all started from this comment and a couple comments below it:
#10961 (review)

in short, we are considering to standardize the response sent back from the DBSServer, such that our clients no longer need to match for a specific message in the HTTP response object (body attribute). Instead, we would only check for the response code status and that would fully identify which error our request hit.

Describe the solution you'd like
It's a placeholder and the final solution needs to be yet discussed and agreed upon.

In short, a full list of error status code and message should be defined in DBSServer, such that when clients are verifying why a DBS HTTP request fail, we can solely use a code or status response attribute to identify that. In addition to DBSServer, we also expect this to be integrated within dbs3-client/dbs3-pycurl.

With that, we need to update DBS error handling within WMCore (and other projects), including but not limited to DBS3Reader/DBS3Writer, DBS3Upload component and possible the MicroService Common module.

Describe alternatives you've considered
Keep the usual schema, which searches for a given sentence in the error message.

Additional context
None

@vkuznet
Copy link
Contributor

vkuznet commented Jan 27, 2022

Here is placeholder for work in dbs2go: dmwm/dbs2go#1

@vkuznet
Copy link
Contributor

vkuznet commented Feb 4, 2022

The new DBS error code structure is deployed to testbed and all DBS unit/integration tests are passed. The changes are fully compatible with existing DBS server errors, i.e. I preserved message, type, exception. Full description of new DBS error codes can be found in dmwm/dbs2go#1

@vkuznet
Copy link
Contributor

vkuznet commented Mar 22, 2022

I'm not sure what else should be done here since new DBS error structures and codes are in place on production servers. The documentation about new DBS error codes are in place too and can be found here. It provides concrete example of error structure when error is issued. I don't think anything specifically needs to be done here and we can safely close this issue and marked as resolved.

If there are specific suggestions to improve the error structures and code they should be put explicitly in this ticket.

@vkuznet vkuznet self-assigned this Mar 22, 2022
@amaltaro
Copy link
Contributor Author

Haven't we standardized these errors in DBS such that clients no longer need to parse generic error codes to see what the error reason was? If that's the case, then I think we now need to update the WMCore DBS-related code, whenever it parses errors from DBS, such that we only check the error code.

BTW, your previous message points to an example that has inconsistent information, this:

    "error": {
      "reason": "DBSError Code:114 Description:DBS validation error when wrong pattern is provided Function:dbs.validator.Check Message:unable to match 'data_tier_name' value '1' Error: invalid parameter(s)",
      "message": "not str type",
      "function": "dbs.Validate",
      "code": 113
    },

the code attribute does not match the code reported in the reason attr.

@vkuznet
Copy link
Contributor

vkuznet commented Mar 22, 2022

Alan, I'm not sure what do you mean by saying "Haven't we standardized these errors in DBS". I made error codes, I added appropriate structures, etc. If this is what you mean by standardized procedure, the answer is yes. If you need to parse DBS error you'll follow provided documentation. I made changes such that they are backward compatible with python code implementation. Finally, I already explained to you that new DBS error provided nested errors and example which you refer is correct. I think I already provided you explanation in different ticket. But here it is again:

  • the code and message represent outermost error
  • the reason goes deep on why it happens and it shows nested error
    In this case:
  • the outer error code is 113, and it was done in dbs.Validate function and it is because not str type error, but
  • this error actually happen in a deeper codebase. The reason says (I only added newlines for you to read it easily):
"DBSError Code:114 
 Description:DBS validation error when wrong pattern is provided 
 Function:dbs.validator.Check 
 Message:unable to match 'data_tier_name' value '1' 
Error: invalid parameter(s)"

So, the actual error has code 114, which happened in function dbs.validator.Check which through error invalid parameter(s) and caused by unable to match data_tier_name value 1`

SO, as a client you can easily understand what and how the error happens.

@amaltaro
Copy link
Contributor Author

Valentin, what I had understood is that we would be able to replace a code like this:
https://github.com/dmwm/WMCore/blob/master/src/python/WMComponent/DBS3Buffer/DBSUploadPoller.py#L87-L109

by something similar to

if exceptionObject.code == 113:
  then error is blah
if exceptionObject.code == 114:
  then blah blah

without actually parsing the error message as a string search.

If I'm reading the DBSClient code correctly, here:
https://github.com/dmwm/DBSClient/blob/main/src/python/dbs/apis/dbsClient.py#L452

should be passing back to the client the code attribute from the error dictionary (not from the http one). If that's correct, then indeed we could have a map of such error codes to its actual error message - similar to what we have in WMExceptions - and stop relying on the written out error message, only relying on the code.

@vkuznet
Copy link
Contributor

vkuznet commented Mar 23, 2022

Yes, you now have a possibility to avoid parsing HTTP messages and only rely on error codes which are defined in new DBS server.

Regarding DBSClient codebase. It has so many (in my view unnecessary) wrappers around provided error that it will require a separation discussion. But since new DBS server provides standardize error we can converge and clean-up DBSClient code to return it to upstream caller.

@vkuznet
Copy link
Contributor

vkuznet commented Apr 1, 2022

Alan, I don't know how you'd like to proceed with this issue. But here is my proposal:

  • I added to DBS servers the /errors end-point which anyone can query to obtain all available DBS error codes along with their explanation. Please check it out on any Go based DBS server, e.g. https://cmsweb.cern.ch/dbs/prod/global/DBSReader/errors and it provides long list of errors:
[
  {
    "reason": "Generic DBS error",
    "message": "DBS error definition defined at run-time",
    "function": "DBS error origin defined at run-time",
    "code": 100
  },
  {
    "reason": "DBS DB error",
    "message": "DBS error definition defined at run-time",
    "function": "DBS error origin defined at run-time",
    "code": 101
  },
...
]
  • I suggest that you adapt in WMCore a request to /errors end-point to obtain full JSON list and extract from it error codes which later you can use in WMCore code base.
  • feel free to put a proposal how to adjust WMCore code to adapt new error codes and we can start providing PRs, e.g. you can create DBSErrors class which will cache error codes and reasons, then you may use this class across WMCore code to handle any DBS error. For its logic please review provided DBS error structure and decide how to handle it. As you can see each record has four attributes:
    • reason defines reason or definition of the error (the later is used by /errors DBS API)
    • message defines actual message thrown at run-time
    • function provides location of the function in Go based code used as module.Function pairs
    • code defines the error code

@vkuznet
Copy link
Contributor

vkuznet commented Jun 6, 2022

@amaltaro please review initial implementation in #11173

@vkuznet
Copy link
Contributor

vkuznet commented Nov 6, 2022

@amaltaro , your review of #11173 is pending since Jun 6th. Please either review or decide to close this ticket. And, if we need to proceed please move it to Q4.

@amaltaro
Copy link
Contributor Author

amaltaro commented Nov 6, 2022

There is no review requested through GH, hence it does not show up when I filter for pending reviews.

@amaltaro
Copy link
Contributor Author

amaltaro commented Nov 16, 2022

Given that we need to synchronize the proposed changes with builds made against cmsdist (dependency on the T0 team), I am moving this issue to the "Waiting" column.

UPDATE: actually it wasn't planned for Q4... I just added it to the Q4/2022 project board.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants