v1 branch: flash loader bugs #528

rettigl · 2024-11-30T20:38:02Z

The BAM correction and normalization histograms do not work correctly in the v1 feature branch; see Tutorial 10. I suspect the issue is somewhere in the new flash loader.

rettigl · 2024-11-30T21:47:01Z

The issue seems to be mostly with the normalization histogram, which is very uneven for these scans. Also, the histograms from timestamps and from timed dataframes look very different.

rettigl · 2024-11-30T23:32:57Z

There are several fundamental flaws in the way we handle this at the moment:

Old vs. new flash loader: the old flash loader only put pulses containing electrons into the timed dataframe. Thus, we essentially normalized to the total number of electrons in the whole spectrum. This gives smooth curves, but might introduce artefacts if the total count rate depends significantly on pump-probe delay.
The new flash loader puts all configured pulses into the timed dataframe, i.e. also pulses where no X-ray pulse is actually present. This again leads to a wrong normalization (e.g. if the BAM is systematically different for "empty" pulses).
The histogram from timestamps also gives rather wrong results if pulses do not arrive continuously. It essentially gives the pulses at the end of a train far too large a weight (we take the derivative of the timestamps).
In addition, applying the corrections (e.g. BAM) with preserve_mean=True also leads to inconsistent results between the normal dataframe and the timed_dataframe, because the means of the two dataframes are different, and thus a different correction is applied in each case.
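
To illustrate the point about the timestamp histogram, here is a minimal sketch of a timestamp-difference weighting. It assumes 3 trains of 5 pulses each, 1 µs pulse spacing and 100 ms between trains; the numbers and the helper `timestamp_weights` are made up for illustration and are not the actual sed implementation.

```python
import numpy as np

def timestamp_weights(timestamps):
    """Weight each event by the time until the next event (last event gets 0)."""
    return np.diff(timestamps, append=timestamps[-1])

n_trains, n_pulses = 3, 5
train_period, pulse_spacing = 0.1, 1e-6  # seconds; illustrative values only

pulse_id = np.tile(np.arange(n_pulses), n_trains)
train_id = np.repeat(np.arange(n_trains), n_pulses)
timestamps = train_id * train_period + pulse_id * pulse_spacing

weights = timestamp_weights(timestamps)
# Sum the weights per pulse position within the train.
weight_per_pulse = np.bincount(pulse_id, weights=weights, minlength=n_pulses)
share_last = weight_per_pulse[-1] / weight_per_pulse.sum()

print(weight_per_pulse)
print(f"share of the last pulse in the train: {share_last:.4f}")
```

In this toy case nearly all of the weight lands on the last pulse of each train, because its time difference spans the gap to the next train.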

I am not sure what the best way to correct this is. What is your opinion, @zain-sohail?

rettigl · 2024-12-01T17:01:26Z

Here is some further output:
Histograms for uncorrected data:
main: [image]
v1: [image]

Histograms for BAM correction w/ preserve_mean=True:
main: [image]
v1: [image]

Histograms for BAM correction w/ preserve_mean=False:
main: [image]
v1: [image]
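
A toy example of why preserve_mean=True can behave differently for the two dataframes. It assumes the correction subtracts a per-pulse BAM value and, with preserve_mean=True, adds back the mean of that column within the same dataframe. The helper `apply_offset`, the column names and the values are hypothetical and only meant as a sketch, not as the sed API.

```python
import pandas as pd

def apply_offset(df, delay_col, offset_col, preserve_mean):
    """Subtract offset_col from delay_col; optionally add back the column mean."""
    corrected = df[delay_col] - df[offset_col]
    if preserve_mean:
        corrected = corrected + df[offset_col].mean()
    return corrected

# Timed dataframe: one row per configured pulse (invented values).
timed = pd.DataFrame({"delay": [0.0, 0.0, 0.0, 0.0], "bam": [1.0, 1.0, 3.0, 3.0]})
# Electron dataframe: one row per electron; here only pulses with bam=1 contain electrons.
electrons = pd.DataFrame({"delay": [0.0, 0.0, 0.0], "bam": [1.0, 1.0, 1.0]})

timed_true = apply_offset(timed, "delay", "bam", preserve_mean=True)
elec_true = apply_offset(electrons, "delay", "bam", preserve_mean=True)
timed_false = apply_offset(timed, "delay", "bam", preserve_mean=False)
elec_false = apply_offset(electrons, "delay", "bam", preserve_mean=False)

# The same pulse (bam=1) ends up at different delays when the mean is preserved ...
print(timed_true.iloc[0], elec_true.iloc[0])
# ... but at the same delay when it is not.
print(timed_false.iloc[0], elec_false.iloc[0])
```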

zain-sohail · 2024-12-02T12:09:53Z

> The bam correction and normalization histograms do not work correctly in the v1 feature branch. See Tutorial 10. I suspect the issue to be in the new flash loader somewhere.

If I look at the GMD trace of run 44498, which is used in Tutorial 10: [image]

Looking at the events before the jump: [image]

This is likely the culprit for problems with normalization. Did you notice this in other runs too?

> I am not sure what is the best way how to correct this. What is your opinion @zain-sohail ?

So I see three options for fixing this:

1. Not using configured pulses but only pulses with X-rays, meaning a GMD-based filtering. Though I don't know how valid this is.
2. Only including pulses with electrons, like before, and saving the configured-pulse parquet files separately, to be used in diagnostic tools but not in any SedProcessor workflow.
3. Adding a flag in the corrections workflow that keeps only pulses with electrons. This could be done easily with the multiindex scheme, but I'm not sure how feasible it is with dask.
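
A rough sketch of what the multiindex-based filtering could look like in plain pandas. The index level names (trainId, pulseId, electronId), the column names and the helper `keep_pulses_with_electrons` are assumptions for illustration; how this would carry over to dask is left open.

```python
import pandas as pd

def keep_pulses_with_electrons(timed, electrons):
    """Keep rows of `timed` whose (trainId, pulseId) also occurs in `electrons`."""
    # Assumes `electrons` is indexed by (trainId, pulseId, electronId).
    occupied = electrons.index.droplevel("electronId").unique()
    return timed[timed.index.isin(occupied)]

# One row per configured pulse (invented values).
timed = pd.DataFrame(
    {"bam": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]},
    index=pd.MultiIndex.from_product(
        [[1, 2], [0, 1, 2]], names=["trainId", "pulseId"]
    ),
)
# One row per detected electron (invented values).
electrons = pd.DataFrame(
    {"energy": [10.0, 11.0, 12.0, 13.0]},
    index=pd.MultiIndex.from_tuples(
        [(1, 0, 0), (1, 0, 1), (1, 2, 0), (2, 1, 0)],
        names=["trainId", "pulseId", "electronId"],
    ),
)

filtered = keep_pulses_with_electrons(timed, electrons)
print(filtered)
```

Since dask dataframes do not support a MultiIndex, something like this would presumably have to run per file on the pandas side, before the data are handed to dask.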

rettigl · 2024-12-02T23:27:32Z

I'm not sure I understand what this GMD signal means. Regarding how best to fix this, I think it would be good to discuss it at another meeting. How could we implement/test the first option you propose? Regarding the second option, this used to produce much smoother curves, however at the cost of normalizing away any pump-induced changes to the total photoelectron yield. These might be very small anyway at the larger photon energies typically used at FLASH.
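
To make the concern about the second option concrete, here is a small arithmetic sketch. It assumes a low count rate with at most one electron per pulse, so that the number of pulses containing electrons equals the number of electrons; all numbers are invented.

```python
import numpy as np

# Three delay bins with the same number of X-ray pulses each.
pulses_per_delay = np.array([1000.0, 1000.0, 1000.0])
# Electrons per pulse; the pump raises the total yield in the middle bin.
yield_per_pulse = np.array([0.10, 0.12, 0.10])
electrons = pulses_per_delay * yield_per_pulse

# Assumption: at most one electron per pulse, so occupied pulses == electrons.
occupied_pulses = electrons.copy()

norm_all_pulses = electrons / pulses_per_delay  # keeps the 20 % change
norm_occupied = electrons / occupied_pulses     # flat: the change is normalized away

print(norm_all_pulses)
print(norm_occupied)
```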

zain-sohail · 2024-12-03T11:30:34Z

> I'm not sure I understand what this GMD signal means.

I'll check with Dima about this as well.

> Regarding how to fix this best, I think it would be good to discuss this at another meeting.

Yes, we were thinking of hosting one before the end of the year. Could you send around a mail, depending on your availability?

> How could we implement/test the first option you propose?

This should be simple. The filtering can happen per file. The GMD curves you saw were time-averaged per minute, but looking per pulse, we should see zeros when there is no X-ray pulse. So we would just discard the rows with GMD=0. Testing should be simple as well, but the validity needs to be ascertained.
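
A minimal sketch of such a per-file filter, assuming that empty pulses really do read as zero in a per-pulse GMD column. The column name "gmd", the threshold parameter and the helper `drop_empty_pulses` are placeholders, not names from the loader.

```python
import numpy as np
import pandas as pd

def drop_empty_pulses(df, gmd_col="gmd", threshold=0.0):
    """Keep rows whose GMD reading exceeds `threshold`; NaN readings are dropped too."""
    return df[df[gmd_col] > threshold]

# One row per pulse of a single file (invented values).
pulses = pd.DataFrame(
    {
        "pulseId": [0, 1, 2, 3, 4],
        "gmd": [0.0, 1.2, 0.0, 0.8, np.nan],
    }
)

with_xrays = drop_empty_pulses(pulses)
print(with_xrays)
```

A threshold is used instead of an exact comparison with zero so that it can be raised if empty pulses turn out to carry a small offset.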

> Regarding the second option, this used to produce much smoother curves, however at the cost of normalizing away any pump-induced changes to the total photoelectron yield. They might be anyways very small at the larger photon energies typically used at Flash.

Let me also check this with Dima.

rettigl · 2024-12-09T22:44:16Z

> This should be simple. The filtering can happen per file. The GMD curves you saw were time averaged per minute but looking at per pulse, we should see 0s when there is no x-ray pulse. So just discarding the rows with GMD=0. Testing should be simple as well but the validity needs to be ascertained.

This does not appear to be the case: [image]

Pulse IDs with electrons only go up to ~500 or so (a few with >4000 also exist, no idea where from...).

rettigl mentioned this issue on Nov 30, 2024 in: Upgrade to V1 #437 (Open, 12 tasks)
