Misrepresentation of data when using the Optimized PostProcessor #111

Open
rjwills28 opened this issue Nov 18, 2020 · 1 comment

@rjwills28
Contributor

We have seen an issue with the AA Optimized post processor when it returns values for bins that contain no samples. When a bin contains samples there is no problem: the Optimized post processor returns the mean, standard deviation, min, max and number of samples. However, when a bin contains no samples, the current implementation inherits the values from the previous bin that did contain samples. This misrepresents the data, which we have observed when plotting the results. For example, say a bin that contains samples has a mean of 20, and the last sample in that bin shows the PV going to 5. The next 10 bins have no samples (i.e. no change in the PV), so they all inherit the mean of 20. This does not reflect the fact that the PV actually had the value 5 for this period of time.

We have a proposed and tested fix for this issue. It involves creating a new post processor named 'OptimizedWithLastSample'. This post processor always keeps track of the last sample added to each bin. If a bin contains no samples, the mean, min and max are all set to the last sample added to the most recent bin that did contain samples, and the number of samples is set to 0.

For example:

  • Bin with samples:
    Bin 40000: nSamples = 10, mean = 30, min = 0, max = 50, last value = 20

  • Next bin has no samples:
    Bin 40001: nSamples = 0, mean = 20, min = 20, max = 20, last value = 20

  • Next bin has no samples:
    Bin 40002: nSamples = 0, mean = 20, min = 20, max = 20, last value = 20

This means that in periods where the PV does not change, the last value it reported is the one that is returned.
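The behaviour described above can be sketched in a few lines of Java. This is only an illustration of the idea, not the code in the proposed class: `LastSampleBinningSketch`, `BinSummary` and `summarize` are hypothetical names, standard deviation is left out for brevity, and the ten sample values are made up so that they match the statistics of bin 40000 in the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LastSampleBinningSketch {

    /** Summary for one bin; the field names are illustrative only. */
    static final class BinSummary {
        final long bin;
        final int nSamples;
        final double mean, min, max, lastValue;

        BinSummary(long bin, int nSamples, double mean, double min, double max, double lastValue) {
            this.bin = bin;
            this.nSamples = nSamples;
            this.mean = mean;
            this.min = min;
            this.max = max;
            this.lastValue = lastValue;
        }

        @Override
        public String toString() {
            return "Bin " + bin + ": nSamples = " + nSamples + ", mean = " + mean
                    + ", min = " + min + ", max = " + max + ", last value = " + lastValue;
        }
    }

    /** Summarizes every bin in [firstBin, lastBin], filling empty bins from the last sample seen. */
    static List<BinSummary> summarize(long firstBin, long lastBin, Map<Long, double[]> samplesByBin) {
        List<BinSummary> out = new ArrayList<>();
        Double last = null; // last sample seen so far, carried across bins
        for (long bin = firstBin; bin <= lastBin; bin++) {
            double[] samples = samplesByBin.get(bin);
            if (samples != null && samples.length > 0) {
                double sum = 0;
                double min = Double.POSITIVE_INFINITY;
                double max = Double.NEGATIVE_INFINITY;
                for (double v : samples) {
                    sum += v;
                    min = Math.min(min, v);
                    max = Math.max(max, v);
                }
                last = samples[samples.length - 1];
                out.add(new BinSummary(bin, samples.length, sum / samples.length, min, max, last));
            } else if (last != null) {
                // Empty bin: report the last known value, not the previous bin's statistics.
                out.add(new BinSummary(bin, 0, last, last, last, last));
            }
            // Empty bins before the first sample are skipped: there is nothing to carry forward.
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Long, double[]> samples = new TreeMap<>();
        // Ten made-up samples with mean 30, min 0, max 50 and last value 20.
        samples.put(40000L, new double[] {0, 50, 30, 40, 30, 30, 40, 30, 30, 20});
        for (BinSummary s : summarize(40000L, 40002L, samples)) {
            System.out.println(s);
        }
    }
}
```

Running `main` prints one line per bin for bins 40000 to 40002, with the two empty bins reporting zero samples and the last value of 20 for mean, min and max.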

The changes this would require to the AA are:

  • The addition of a new PostProcessor class: archiveappliance/retrieval/postprocessors/OptimizedWithLastSample.java
  • Modification to the archiveappliance/retrieval/postprocessors/PostProcessor.java class to register the new PostProcessor.
@willrogers
Contributor

We think this would solve some of the user problems we've seen when using the AA and CS-Studio at DLS, and adding another PostProcessor shouldn't be a problem I guess.

What do you think @slacmshankar ?
