
Allow SnowblindStep to run on _rate and _rateints #1

Merged (7 commits) on Dec 13, 2023

Conversation

gbrammer
Contributor

Add SnowblindStep handling for ImageModel and CubeModel inputs, i.e., _rate and _rateints products. The step decides what to do based on whether the datamodel has a groupdq attribute or only a dq attribute.
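
In sketch form (simplified, not the exact diff; the variable names here are illustrative), the check is along these lines:

    # Simplified sketch of the dispatch described above
    if hasattr(datamodel, "groupdq"):
        # RampModel input: flag snowballs per group in GROUPDQ, as before
        dq = datamodel.groupdq
    else:
        # ImageModel (_rate) or CubeModel (_rateints): work on the 2D/3D DQ array
        dq = datamodel.dq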

The update also includes a new parameter

new_jump_flag = integer(default={JUMP_DET}) # DQ flag to set for dilated jumps

to allow the user to set a custom integer flag on the dilated jumps.
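
A minimal usage sketch, assuming the standard stpipe Step.call() interface (the filename below is just an example):

    from snowblind import SnowblindStep

    # Flag dilated snowball jumps in a _rate product with a custom DQ bit (1024)
    # instead of the default JUMP_DET.
    result = SnowblindStep.call(
        "jw01234001001_01101_00001_nrcalong_rate.fits",  # example filename
        new_jump_flag=1024,
        save_results=True,
        suffix="snowblind",
    )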

Basic tests also included, based on the original tests for the RampModel inputs.

@jdavies-st
Collaborator

Thanks for the PR @gbrammer!

Is the idea here to run it after Detector1Pipeline? One major disadvantage of that is that it throws away a lot of data: in virtually all cases (except snowballs that hit in the first group), a rate can still be computed for the affected pixels if the snowball flagging is done within Detector1Pipeline.

Is running snowball flagging on the _rate file what is currently done in grizli?

@gbrammer
Contributor Author

Yes, grizli just flags the big snowball blobs in the Detector1Pipeline rate products. It doesn't make any effort to fix them; it just builds a conservative mask around them. I'll have a look at the PEP 8 problems.

@gbrammer
Contributor Author

OK, CI tests pass locally now.

jdavies-st changed the title from "Implement logical paths for running the SnowblindStep on other file types" to "Allow SnowblindStep to run on _rate and _rateints" on Nov 29, 2023
@gbrammer
Contributor Author

The last fix was for the option to do a contraction with a negative scale_factor. My idea for this was to flag saturated cores in rateints with two passes: the first with a positive scale_factor for integration i, and the second with something like scale_factor = -0.5 to flag just the cores in integrations i+1 : NINTS.
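
In pseudocode, the two-pass idea would look something like this (hypothetical sketch, not tested; the filename and the positive scale_factor value are placeholders):

    from snowblind import SnowblindStep

    # Pass 1: positive scale_factor dilates and flags the full snowball footprint
    # in the integration where it hit.
    pass1 = SnowblindStep.call("jw01234_rateints.fits", scale_factor=2.0)

    # Pass 2: a negative scale_factor contracts the mask to roughly the saturated
    # core, intended for flagging integrations i+1 : NINTS.
    pass2 = SnowblindStep.call(pass1, scale_factor=-0.5)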

The main reason I'm working on this is that rerunning Detector1Pipeline on exposures with, like

  'nframes': 5,
  'ngroups': 20,
  'nints': 4,

takes forever and uses a huge amount of memory. Doing the flagging directly on the MAST rateints seems to be pretty comparable in terms of data quality and is much faster and cheaper.

@gbrammer
Contributor Author

gbrammer commented Dec 6, 2023

Is there anything to consider before merging this and making a new pip release? It's been working very well for some updates to grizli, and it would be helpful to include a released version of snowblind as a requirement there.

@jdavies-st
Collaborator

Hi @gbrammer, sorry for the radio silence. Euclid is keeping me pretty busy! Yeah, let's get this merged.

I think it's almost there. One question: why have a custom new_jump_flag parameter? Certainly the default for flagging jumps at the group level is to set JUMP_DET (GROUPDQ = 4), but that doesn't work when flagging the _rate file, which needs both JUMP_DET and DO_NOT_USE set (so 5) so that drizzle later knows not to use those pixels. Is this the reasoning?
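
For reference, the bit arithmetic (the import path may differ between jwst versions):

    from jwst.datamodels import dqflags
    # in newer releases: from stdatamodels.jwst.datamodels import dqflags

    JUMP_DET = dqflags.pixel["JUMP_DET"]      # 4
    DO_NOT_USE = dqflags.pixel["DO_NOT_USE"]  # 1
    rate_flag = JUMP_DET | DO_NOT_USE         # 5, so drizzle skips these pixels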

@jdavies-st
Collaborator

jdavies-st commented Dec 11, 2023

Also, I'd like to try to convince you that you should be running SnowblindStep within Detector1Pipeline instead of afterwards. If you run it afterwards, you're actually throwing out lots of good data. For a typical deep field with 10 groups, 1 integration, and say 4 dithers per pointing, if you run snowblind on the _rate file, then for the pixels affected by a snowball your drizzled stack will have only 3/4 of the depth, since you masked all of those pixels in a single dither's _rate file.

But if you flag at the group level, you will only be flagging 1 or 2 groups out of the 10 that are eventually used to compute the _rate. In the most typical case where NFRAMES > 1, that's 2 groups flagged, so your final drizzled depth at those pixels is not 3/4 but 38/40 = 19/20 of the original depth. And for long exposures with lots of snowballs, this can have a significant effect on the final S/N for faint sources.
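
As a back-of-the-envelope check of those numbers:

    # 4 dithers x 10 groups per pixel; assume a snowball costs 2 groups in one dither
    groups_total = 4 * 10
    depth_rate_masking = (4 - 1) * 10 / groups_total         # 0.75 -> whole dither lost
    depth_group_masking = (groups_total - 2) / groups_total  # 0.95 -> only 2 groups lost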

Of course, at the center of the snowball, which is saturated (and in the ~2 pixel ring around it where you get charge migration), you will still lose all the signal in the ramp, but it's likely that there are some good groups beforehand and those will be useful. And the number of saturated pixels relative to the size of the expanded snowball mask is tiny.

The animated GIF below is an example illustrating this with some NIRCam LONG data in one of the CEERS fields: 10 groups, 1 integration, 4 dithers. The final mosaics were drizzled after SnowblindStep and 1/f noise mitigation were applied to both. The only difference is running snowblind on the _rate after Detector1Pipeline vs. running it on the _ramp within Detector1Pipeline. You can see that for the expanded masks where snowballs hit, masking at the group level retains S/N.

[Animated GIF: snowblind_rate_vs_ramp — final mosaic with snowblind run on _rate vs. on _ramp]

@gbrammer
Contributor Author

> Hi @gbrammer, sorry for the radio silence. Euclid is keeping me pretty busy! Yeah, let's get this merged.
>
> I think it's almost there. One question: why have a custom new_jump_flag parameter? Certainly the default for flagging jumps at the group level is to set JUMP_DET (GROUPDQ = 4), but that doesn't work when flagging the _rate file, which needs both JUMP_DET and DO_NOT_USE set (so 5) so that drizzle later knows not to use those pixels. Is this the reasoning?

I added this just to make the flagging more flexible. It defaults to the behavior you describe here, but when flagging the rate products in grizli, I want to set the snowball flags to something other than JUMP_DET, because the bulk of the JUMP_DET pixels are generally fine to use when I pass the images to the low-level drizzle function.

I use the 1024 bit as my own DO_NOT_USE flag for optional additional bad-pixel masks, along with the 4096 bit for pixels identified when I run groups of exposures through the legacy AstroDrizzle CR rejection steps. That is, I drizzle exposures into mosaics excluding pixels where (DQ & (1 + 1024 + 4096)) > 0.
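
As a sketch (the function name here is just illustrative):

    import numpy as np

    DO_NOT_USE_BITS = 1 | 1024 | 4096  # DO_NOT_USE + custom bad-pixel bit + AstroDrizzle CR bit

    def drizzle_exclude_mask(dq):
        """Boolean mask of pixels left out of the drizzled mosaic."""
        return (dq & DO_NOT_USE_BITS) > 0

    # A pixel flagged only with JUMP_DET (4) is still used:
    print(drizzle_exclude_mask(np.array([0, 4, 1024, 4100])))  # [False False  True  True]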

@gbrammer
Contributor Author

> Also, I'd like to try to convince you that you should be running SnowblindStep within Detector1Pipeline instead of afterwards. If you run it afterwards, you're actually throwing out lots of good data. For a typical deep field with 10 groups, 1 integration, and say 4 dithers per pointing, if you run snowblind on the _rate file, then for the pixels affected by a snowball your drizzled stack will have only 3/4 of the depth, since you masked all of those pixels in a single dither's _rate file.

I certainly accept your point, but the CPU/memory resources required by Detector1Pipeline are still a bit too expensive and unpredictable for me to use for general processing of tons of archival data.

@jdavies-st
Collaborator

Thanks for this, Gabe. Glad this expands the functionality. I note only the one problem below, of SATURATED not being propagated to the _rate files properly, which might make it less effective at recognizing snowballs.

else:
    self._has_groups = False
    bool_jump = (jump.dq & JUMP_DET) == JUMP_DET
    bool_sat = (jump.dq & SATURATED) == SATURATED
@jdavies-st
Collaborator

In the case of _rate files, currently only pixels that are saturated in all groups get the SATURATED flag. This is not good in my view. What it means in practice is that when a snowball saturates some pixels mid-exposure, the _rate file still has a slope computed for those pixels from the first part of the ramp, and the DQ only shows JUMP_DET in the area around the core. This may make it more difficult to detect snowballs.

Btw, there's an open issue to have this changed. See spacetelescope/jwst#8124

If the snowball lands in the 1st group, then the core may be saturated for all groups (it might not be if NFRAMES is large), and so one would get NaNs in the core of the _rate file. Not sure of the best way to handle this.

jdavies-st merged commit 4443b05 into mpi-astronomy:main on Dec 13, 2023
3 checks passed