v1 branch: flash loader bugs #528

Open
rettigl opened this issue Nov 30, 2024 · 7 comments

@rettigl
Member

rettigl commented Nov 30, 2024

The bam correction and normalization histograms do not work correctly in the v1 feature branch. See Tutorial 10. I suspect the issue to be in the new flash loader somewhere.

@rettigl
Member Author

rettigl commented Nov 30, 2024

The issue seems to be mostly with the normalization histogram, which is very uneven for these scans. Also, the histograms from timestamps and from timed dataframes look very different.

@rettigl
Member Author

rettigl commented Nov 30, 2024

There are several fundamental flaws with the way we handle this at the moment:

  • Old vs. new flash loader: the old flash loader only included pulses containing electrons in the timed dataframe. Thus, we essentially normalized to the total number of electrons in the whole spectrum. This gives smooth curves, but might introduce artefacts if the total count rate depends significantly on pump-probe delay.
  • The new flash loader puts all configured pulses into the timed dataframe, i.e. also pulses where no X-ray pulses are actually present. This again leads to a wrong normalization (e.g. if the BAM value is systematically different for "empty" pulses).
  • The histogram from timestamps also gives rather wrong results if pulses do not arrive continuously. It essentially gives the pulses at the end of a train far too much weight (we take the derivative of the timestamps).
  • In addition, applying the corrections (e.g. BAM) with preserve_mean=True also leads to inconsistent results between the normal dataframe and the timed_dataframe, because the means of the two dataframes differ, and thus a different correction is applied in each case.
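
To illustrate the timestamp point, here is a minimal sketch with made-up numbers (train period, pulse spacing and array names are assumptions, not the loader's actual code). If the time assigned to each pulse is the difference to the next timestamp, the last pulse of every train inherits the whole gap until the next train:

```python
import numpy as np

# Assumed, illustrative timing: 5 trains at 10 Hz, 4 pulses per train, 1 us apart.
n_trains, n_pulses = 5, 4
train_period = 0.1     # seconds between trains (assumption)
pulse_spacing = 1e-6   # seconds between pulses within a train (assumption)

train_id, pulse_id = np.meshgrid(
    np.arange(n_trains), np.arange(n_pulses), indexing="ij"
)
timestamps = (train_id * train_period + pulse_id * pulse_spacing).ravel()

# "Derivative" of the timestamps: time until the next pulse.
# The very last pulse has no successor, so one pulse spacing is appended for it.
dt = np.diff(timestamps, append=timestamps[-1] + pulse_spacing)

# Total weight collected by each pulse position within the train.
weight_per_pulse = dt.reshape(n_trains, n_pulses).sum(axis=0)
fraction = weight_per_pulse / weight_per_pulse.sum()
print(fraction)  # the last pulse position carries almost all of the weight
```

With these numbers, nearly the entire acquisition time is attributed to the last pulse position, so whatever value that pulse has on the histogram axis dominates the normalization.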

I am not sure what the best way to correct this is. What is your opinion, @zain-sohail?

@rettigl
Member Author

rettigl commented Dec 1, 2024

Here is some further output.
Histograms for uncorrected data:
main: [image]
v1: [image]
Histograms for BAM correction w/ preserve_mean=True:
main: [image]
v1: [image]
Histograms for BAM correction w/ preserve_mean=False:
main: [image]
v1: [image]
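
For reference, a minimal sketch of why a mean-preserving correction can differ between the two dataframes. The helper `apply_offset`, the BAM values and the electron counts are all hypothetical; the sketch only assumes that preserve_mean subtracts the mean of the offset column before applying it:

```python
import numpy as np

def apply_offset(values, offset, preserve_mean):
    """Hypothetical helper: subtract an offset, optionally made mean-free first."""
    if preserve_mean:
        offset = offset - offset.mean()
    return values - offset

# Made-up data: 4 pulses, only the first two contain electrons.
bam_per_pulse = np.array([0.0, 1.0, 2.0, 3.0])
electrons_per_pulse = np.array([5, 1, 0, 0])

# Timed dataframe: one row per pulse. Electron dataframe: one row per electron.
bam_timed = bam_per_pulse
bam_electron = np.repeat(bam_per_pulse, electrons_per_pulse)

delay_timed = np.zeros_like(bam_timed)
delay_electron = np.zeros_like(bam_electron)

# Row 0 of both frames belongs to the same pulse, yet it is corrected differently.
shift_true = (
    apply_offset(delay_electron, bam_electron, preserve_mean=True)[0]
    - apply_offset(delay_timed, bam_timed, preserve_mean=True)[0]
)
shift_false = (
    apply_offset(delay_electron, bam_electron, preserve_mean=False)[0]
    - apply_offset(delay_timed, bam_timed, preserve_mean=False)[0]
)
print(shift_true, shift_false)
```

In this toy case the mismatch with preserve_mean=True equals the difference of the two BAM means, and it vanishes with preserve_mean=False.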

@zain-sohail
Member

> The bam correction and normalization histograms do not work correctly in the v1 feature branch. See Tutorial 10. I suspect the issue to be in the new flash loader somewhere.

If I look at the GMD trace of run 44498, which is used in Tutorial 10:
[image]
Looking at the events before the jump:
[image]

This is likely the culprit for the problems with normalization. Did you notice this in other runs too?

> I am not sure what the best way to correct this is. What is your opinion, @zain-sohail?

So I see three options for fixing this:

  • Not using configured pulses but only pulses with x-rays, meaning a GMD-based filtering. Though I don't know how valid this is.
  • Only including pulses with electrons, like before, and saving the configured-pulse parquet files separately, to be used in diagnostic tools but not in any SedProcessor workflow.
  • Adding a flag in the corrections workflow that selects only pulses with electrons. This could easily be done with the multiindex scheme, but I am not sure how feasible it is with dask.
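
As a rough sketch of the multiindex idea in plain pandas (not dask), assuming the electron dataframe is indexed by (trainId, pulseId, electronId) and the timed dataframe by (trainId, pulseId). The index names, column names and values here are illustrative only:

```python
import pandas as pd

# Illustrative electron dataframe: one row per detected electron.
electron_index = pd.MultiIndex.from_tuples(
    [(1, 0, 0), (1, 0, 1), (1, 2, 0), (2, 1, 0)],
    names=["trainId", "pulseId", "electronId"],
)
df_electron = pd.DataFrame({"dldPosX": [10, 12, 9, 11]}, index=electron_index)

# Illustrative timed dataframe: one row per configured pulse.
pulse_index = pd.MultiIndex.from_product(
    [[1, 2], [0, 1, 2]], names=["trainId", "pulseId"]
)
df_timed = pd.DataFrame(
    {"bam": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]}, index=pulse_index
)

# Unique (trainId, pulseId) pairs that contain at least one electron.
pulses_with_electrons = df_electron.index.droplevel("electronId").unique()

# Keep only the timed rows whose pulse appears in the electron dataframe.
df_timed_filtered = df_timed[df_timed.index.isin(pulses_with_electrons)]
print(df_timed_filtered)
```

With dask, the same selection would presumably have to run per partition, which only works if the matching electron and pulse rows live in the same partition; that is the open question above.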

@rettigl
Member Author

rettigl commented Dec 2, 2024

I'm not sure I understand what this GMD signal means. Regarding how best to fix this, I think it would be good to discuss it at another meeting. How could we implement/test the first option you propose? Regarding the second option, this used to produce much smoother curves, however at the cost of normalizing away any pump-induced changes to the total photoelectron yield. These might be very small anyway at the larger photon energies typically used at FLASH.

@zain-sohail
Member

> I'm not sure I understand what this GMD signal means.

I'll check with Dima about this as well.

> Regarding how best to fix this, I think it would be good to discuss it at another meeting.

Yes, we were thinking of hosting one before the end of the year. Could you send around a mail, depending on your availability?

> How could we implement/test the first option you propose?

This should be simple. The filtering can happen per file. The GMD curves you saw were time-averaged per minute, but looking per pulse, we should see 0s when there is no x-ray pulse. So we would just discard the rows with GMD=0. Testing should be simple as well, but the validity needs to be ascertained.
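
A minimal sketch of such a per-file filter, assuming each file is available as a pandas dataframe with a per-pulse GMD column. The column name `gmd`, the helper name and the threshold are placeholders, and whether empty pulses really show up as exactly 0 is precisely what needs to be checked:

```python
import numpy as np
import pandas as pd

def drop_pulses_without_xrays(df, gmd_column="gmd", threshold=0.0):
    """Hypothetical helper: keep only rows whose GMD reading exceeds the threshold.

    Rows with a missing (NaN) GMD value are dropped as well, because the
    comparison evaluates to False for NaN.
    """
    return df[df[gmd_column] > threshold]

# Made-up per-pulse data for one file.
df_file = pd.DataFrame(
    {
        "pulseId": [0, 1, 2, 3],
        "gmd": [0.0, 1.2, np.nan, 0.8],
    }
)

df_xray = drop_pulses_without_xrays(df_file)
print(df_xray)
```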

> Regarding the second option, this used to produce much smoother curves, however at the cost of normalizing away any pump-induced changes to the total photoelectron yield. These might be very small anyway at the larger photon energies typically used at FLASH.

Let me also check this with Dima.

@rettigl
Member Author

rettigl commented Dec 9, 2024

> This should be simple. The filtering can happen per file. The GMD curves you saw were time-averaged per minute, but looking per pulse, we should see 0s when there is no x-ray pulse. So we would just discard the rows with GMD=0. Testing should be simple as well, but the validity needs to be ascertained.

This does not appear to be the case:
[image]
Pulse IDs with electrons only go up to ~500 or so (a few with >4000 also exist, no idea where they come from...)
