Hi,
CMSSW releases generate invalid popularity JSON records. These records are sent to our UDP popularity service, which in turn forwards them to CMS popularity in the CERN MONIT infrastructure.
I finally got time to debug our UDP proxy server, which many CMS GRID jobs use to send their popularity info (as JSON). Here is a problematic JSON record:
To make the problem easier to spot, the issue is this part: "read_single_sigma":-nan, which is INVALID from the JSON data-format point of view, i.e. -nan is not a valid JSON value.
The code in question is here, and, in particular, here is how it assigns this value:
As far as I can tell, there is no check for single_op_count being zero, and a zero count would presumably lead to a division by zero that produces the -nan value. I truly hope that someone can take care of this bug and fix it in ALL CMSSW releases.
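To illustrate the kind of guard I have in mind, here is a minimal sketch. It assumes the sigma is computed from a running sum and a running sum of squares divided by the operation count; the function name `safeSigma` and its parameters are hypothetical and do not come from the CMSSW code.

```cpp
#include <cmath>

// Hypothetical helper (not the actual CMSSW code): standard deviation from
// running sums, guarded so that an empty sample never yields NaN.
double safeSigma(double sum, double sumSquares, unsigned long count) {
  // With count == 0 the divisions below would be 0/0, i.e. NaN.
  if (count == 0) {
    return 0.0;
  }
  const double n = static_cast<double>(count);
  const double mean = sum / n;
  const double variance = sumSquares / n - mean * mean;
  // Rounding can push the variance slightly below zero, and a NaN input
  // would propagate; in both cases fall back to 0 instead of sqrt(negative).
  if (!(variance > 0.0)) {
    return 0.0;
  }
  return std::sqrt(variance);
}
```

With such a guard, a job that performed no single reads would report a sigma of 0 rather than -nan. Whether 0 or omitting the field altogether is the better convention is a separate decision for the maintainers.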
Because of this we lose around 7K records per day in CMS popularity metrics. I can't judge whether that is significant or negligible, but it seems to me sufficient to justify a fix. So far our server ignores these invalid JSON records, but it would be nice to finally patch CMSSW, get rid of this problem, and get better data for monitoring purposes.
Best,
Valentin.
I also suggest inspecting the entire codebase to verify that everything written to the os stream is a valid numerical or string value. I propose adding proper JSON validation to the code.
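As one possible form of such validation, numeric values could go through a small helper before they reach the stream. The sketch below is only an illustration: `writeJsonNumber` and `jsonField` are hypothetical names, and emitting `null` for non-finite values is my assumption about what the consumers would accept.

```cpp
#include <cmath>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: write a double as a JSON value. JSON has no literal
// for NaN or infinity, so non-finite values are emitted as null.
void writeJsonNumber(std::ostream& os, double value) {
  if (std::isfinite(value)) {
    os << value;
  } else {
    os << "null";
  }
}

// Hypothetical helper: build a single "key":value fragment.
// The key is assumed to need no escaping.
std::string jsonField(const std::string& key, double value) {
  std::ostringstream os;
  os << '"' << key << "\":";
  writeJsonNumber(os, value);
  return os.str();
}
```

For example, `jsonField("read_single_sigma", std::nan(""))` gives `"read_single_sigma":null`, which any JSON parser accepts, instead of the `-nan` that currently breaks the record.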