Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make maude_decom.get_aca_images robust agains missing/repeated VCDU frames #159

Merged
merged 9 commits into from
Dec 19, 2023

Conversation

javierggt
Copy link
Contributor

@javierggt javierggt commented Oct 24, 2023

Description

This PR fixes two issues that were found while reading monitoring data:

  • maude_decom.get_aca_images crashes if stop is after the last frame available in MAUDE. In this case some maude queries return no data, and the downstream code should not crash.
  • maude_decom.get_aca_images crashes if there are missing/repeated frames.

A test was added for the case of repeated/missing frames.

Interface impacts

Testing

Unit tests

  • No unit tests
  • Mac
  • Linux
  • Windows

Independent check of unit tests by Jean

  • Mac

Functional tests

No functional testing.

@javierggt javierggt changed the title WIP: Fix errors found when reading monitor windows in IR zone Make maude_decom.get_aca_images robust agains missing/repeated frames Nov 9, 2023
- VCDU counters come in packets of four, with each group starting with a multiple of 4.
"""
current_packet = [] # to group in fours
last_frame = 0 # for monotonicity check
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last_frame is never updated so the check below last_frame > vcdu_counter is not doing anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a fix for this and included a corresponding test case.

@@ -601,6 +601,29 @@ def _group_packets(packets, discard=True):
yield res


def filter_vcdu_jumps(vcdu_counters):
"""
Return a boolean mas to filter VCDU counters that are not continuous.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is the annoying complication of the VCDU counter rolling over. While it is rare, this function should handle that. I think the rollover is 2**18 but that needs to be checked. Basically if you see a negative delta (from one VCDU to the next) of more than (2**18 - 3600 * 4) then all the counters after that point should be incremented by 2**18.

The 3600*4 is meant to reflect the largest number of VCDU's that could be dropped during a real-time support. You need to protect against a rollover happening right in the middle of a loss of comm during telemetry which would otherwise mask out every VCDU frame after the loss of comm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a fix for this and included a corresponding test case.

The VCDU frame rolls over at 2^24. I used the maximum number of missing frames as 3600*4 as you said here, but why is this the right number?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the VCDU rollover, I see there a new test with what I would call "synthetic" VCDUs... from this work can you point me to a handy real rollover or should I go find one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 3600 * 4 is a bit arbitrary. That corresponds to a one-hour comm loss. In the end it doesn't matter all that much since the worst case scenario (which is incredibly unlikely) is dropping one image.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeanconn

ska_tdb.msids.find("vcdu", "counter")
dat = fetch.Msid("CVCDUCTR", "2023:001", "2023:120", stat="5min")
%matplotlib
dat.plot()

This is a bit interesting because it shows some resets in the safe modes.
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I don't know how far we need to get to the solution in this PR, but it looks like your changes helped the aca_view test a bit, in that it looks like the aca_view playback was no longer stuck at 2023:296:22:20:35 . But in playback it also didn't show any new ACA data changes for the rest of the interval (time and att updating but all the ACA data stuck).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the behavior I see. I ran this:

python -m aca_view.scripts.aca_viewer --start 2023:296:22:20:26 --stop 2023:296:22:20:50

and I see the data updating after the rollover

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, I was running

python -m aca_view.scripts.aca_viewer --start 2023:296:22:20:00.000 --stop 2023:296:22:24:00.000 --maude

And it looks like it stops working at about 2023:296:22:20:40.301

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I can confirm the issue in aca_view. One can check it by starting aca_view, opening the ipython widget, and typing:

from cxotime import CxoTime

CxoTime(timeline.table['TIME'][-1]).date
Out[3]: '2023:296:22:21:59.837'

CxoTime(timeline.slots[0].table['TIME'][-1]).date
Out[4]: '2023:296:22:20:39.887'

I think this is an aca_view issue, because running chandra_aca.maude_decom.get_aca_images returns correct data:

In [1]: from chandra_aca import maude_decom
   ...: from cxotime import CxoTime
   ...: start = '2023:296:22:20:26'
   ...: stop = '2023:296:22:22:00'
   ...: images = maude_decom.get_aca_images(start, stop)
   ...: print(CxoTime(images[images['IMGNUM'] == 0]['TIME'][-1]).date)
   ...: np.asarray(images[images['IMGNUM'] == 0]['IMG'][-1])

2023:296:22:21:55.914
Out[1]: 
array([[  31.  ,  112.  ,   91.75,   71.5 ,  139.  ,   98.5 ,   44.5 ,   17.5 ],
       [ 132.25,   44.5 ,  105.25,  267.25,  442.75,  226.75,   64.75,   31.  ],
       [  71.5 ,  105.25,  267.25,  706.  ,  753.25,  469.75,  166.  ,  132.25],
       [  91.75,  321.25, 1286.5 , 5788.75, 5289.25, 1192.  ,  280.75,  226.75],
       [  71.5 ,  280.75, 1246.  , 6855.25, 6841.75, 1711.75,  361.75,   98.5 ],
       [  24.25,   78.25,  287.5 , 1192.  , 1151.5 ,  510.25,  247.  ,   64.75],
       [  17.5 ,   64.75,  152.5 ,  152.5 ,  247.  ,  233.5 ,  145.75,   71.5 ],
       [  24.25,   24.25,   51.25,   71.5 ,  118.75,   37.75,   64.75,   44.5 ]])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I think the integration issue is not part of this PR.

)
assert len(packets) == 32

# the following checks the underlying function used to filter out repeated/missing frames
Copy link
Member

@taldcroft taldcroft Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write these tests like below. This will make it much easier for me to review the tests.

filter_vcdu_test_cases = [
    {
        "inp": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        "exp": [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    },
    {
        "inp": [0, 1, 3, 2, 4, 5, 6, 7, 8, 9],
        "exp": [0, 0, 0, 0, 1, 1, 1, 1, 0, 0],
    },
    {
        "inp": [0, 1, 2, 4, 5, 6, 7, 8, 9],
        "exp": [0, 0, 0, 1, 1, 1, 1, 0, 0],
    },
    ...,
]


@pytest.mark.parametrize("case", filter_vcdu_test_cases)
def test_filter_vcdu_jumps(case):
    ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did something different than I asked. Your version does not demonstrate how the code behaves for repeated VCDU's.

        {
            "inp": np.array([1, 2, 3, 4, 5, 6, 6, 7, 8, 9]),
            "exp": np.array([4, 5, 6, 7]),  # Which vcdu=6 was selected?
        },

Versus

        {
            "inp": [1, 2, 3, 4, 5, 6, 6, 7, 8, 9],
            "exp": [0, 0, 0, 1, 1, 1, 0, 1, 0, 0],
        },

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I just changed that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I give up. What you did still looks nothing like what I asked for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry there is some miscommunication. Perhaps you could invest just a couple of minutes (I doubt it would take you more than two minutes) to convince yourself that this checks what you requested.

The only difference is that what you wrote above gives an integer index for the resulting groups, whereas I gave a boolean mask that tells you which frame was selected.

For example, in this case:

            "inp": np.array([1, 2, 3, 4, 5, 6, 6, 7, 8, 9]),
            "exp": np.array(
                [False, False, False, True, True, True, False, True, False, False]
            ),
        },

frame 6 is repeated, and the one selected is the first one.

Copy link
Contributor Author

@javierggt javierggt Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and by the way: I understood your request above as '1' and '0' being boolean, and you wanted to check which instance of a repeated frame was chosen, so what is in the test is equivalent to (with bool arrays as int)

            "inp": np.array([1, 2, 3, 4, 5, 6, 6, 7, 8, 9]),
            "exp": np.array(
                [0, 0, 0, 1, 1, 1, 0, 1, 0, 0]
            ),

which is verbatim what you wrote above.

Copy link
Member

@taldcroft taldcroft Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I didn't explain. The entire point of what I wrote was so that the mask values would visually line up with the vcdu numbers. That would make it easy for me to review and make sure the output is what I would expect.

The way you wrote it is just too much work for me as a reviewer to mentally line things up, so I didn't bother and just 🤞 it's OK.

Also the np.array wrappers are not in what I wrote and they can cause the numbers to not line up visually again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I totally did not get that what you meant was to line up visually. I understood that what you meant was to actually check which frame was being selected (the previous version of the test did not check that, and that is the change I made).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also did not pull the test cases out and use pytest.mark.parametrize as I requested. I think you understand why that is important for testing.

@jeanconn
Copy link
Contributor

It looks like @taldcroft's comments are substantive, so feel free to ping me for a new review when those issues are resolved.

@jeanconn jeanconn removed their request for review November 14, 2023 17:53
Copy link
Contributor

@jeanconn jeanconn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with these changes and their testing.

@taldcroft
Copy link
Member

I'm delegating to @jeanconn's review and bowing out.

@taldcroft taldcroft removed their request for review December 19, 2023 16:10
@javierggt javierggt merged commit bf34134 into master Dec 19, 2023
2 checks passed
@javierggt javierggt mentioned this pull request Feb 6, 2024
@javierggt javierggt deleted the ir-monitor branch August 29, 2024 16:51
@javierggt javierggt changed the title Make maude_decom.get_aca_images robust agains missing/repeated frames Make maude_decom.get_aca_images robust agains missing/repeated VCDU frames Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants