Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto yaml #83

Merged
merged 13 commits into from
Feb 18, 2021
Merged

Auto yaml #83

merged 13 commits into from
Feb 18, 2021

Conversation

javierggt
Copy link
Contributor

@javierggt javierggt commented Feb 2, 2021

Description

This PR adds a functionality to create an obs-status file during the weekly run, which is then used to update the supplement.

It also adds required functionality for the entire cycle to work:

  • weekly run -> check report
    • stars with not-ok observations show up red in the report
  • edit obs status file
    • the error encountered should be there in the comments section
  • run the update again, using the same arguments and --obs-status-file option. After this:
    • mags should be updated in the supplement,
    • the report is updated
    • stars with expected not-ok observations should show up yellow in the report. These are the ones in the obs table in the supplement.
    • all suspect observations should show up as ok/not-ok in the report, depending on the resolution

This PR includes the following changes:

  • refactor get_agasc_stats and get_obs_stats methods so all observations are kept in the final table, even if they fail. This fixes the cases where some observations were missing in the report table.
  • Changed the logic by which observations were skipped if they are already in the supplement. Now they are not skipped if the agasc ID is set explicitly. This is needed so the reports are properly updated when obs-status is modified in the supplement.
  • Added functionality to automatically save a YAML file with observations that failed or are suspicious.
  • Fixes in report
    • to make it more resilient: do not fail if get_telemetry raises an exception.
    • Redefined the meaning of 'n_obs_bad_new' to mean the number of failed observations, which include actual failures and suspect obs.
    • Redefined the 'warning' flag for agasc_stats so it includes stars with no observation at all (hence no "bad" observations).

Testing

  • Passes unit tests on MacOS, linux, Windows (at least one required)
  • Functional testing. The following starts from a supplement with no mag-stats. Then inspected the resulting supplement
    # set it to import supplement from here
    export AGASC_DIR=`pwd`
    cp $SKA/data/agasc/* .;
    cp ~/SAO/ska/data/agasc/miniagasc* .;
    
    # create a list of interesting stars to test
    echo 331751984 > agasc_id # an OK new star
    echo 991559968 >> agasc_id # star fails (anyway there should be a single-star page for this star)
    echo 505545160 >> agasc_id # only one suspect obs (anyway there should be a singla-star page for this star)
    echo 42864776 >> agasc_id # an OK older star that gets updated
    echo 1200760024 >> agasc_id  # one suspect OBS (the suspect OBS should show up in the single-star page)
    echo 314189096 >> agasc_id
    echo 553396016 >> agasc_id
    
    # start with a set of OK stars
    agasc-supplement-update --log-level debug --start 2011:151 --stop 2011:152 --report
    agasc-supplement-update --log-level debug --start 2019:280 --stop 2019:281 --report
    
    # add the interesting stars, check the report and check the resulting obs_status.yml
    agasc-supplement-update --agasc-id-file agasc_id --log-level debug --report
    open supplement_reports/weekly/latest/index.html
    
    # One could update the supplement 'obs' table with this...
    # agasc-supplement-obs-status --obs-status-file obs_status.yml # check supplement
    
    # but that would not update the existing report. We do the following instead
    # the important thing is that the report directory must be the same, otherwise it does not work
    # the report directory is given by the stop time (this still needs a better approach)
    agasc-supplement-update --obs-status-file obs_status.yml --log-level debug --report # all failures should be cleared, pages still there
    

Fixes #57

@javierggt javierggt force-pushed the auto-yaml branch 3 times, most recently from 231ad05 to f3a3c6d Compare February 4, 2021 19:33
@javierggt javierggt force-pushed the auto-yaml branch 2 times, most recently from 1f921cd to fb91cfb Compare February 10, 2021 18:59
@javierggt javierggt changed the title WIP: Auto yaml Auto yaml Feb 10, 2021
… even if they fail. This includes setting default values where needed.
…ady in the supplement. Now they are not skipped if the agasc ID is set explicitly (with --agasc-id-file or --obs-status-file)
…_mag_supplement script (used when updating obs-status at the same time as the mag estimates)
- to make it more resilient: do not fail if get_telemetry raises an 
exception.
- Redefined the meaning of 'n_obs_bad_new' to mean the number of failed 
observations, which include actual failures and suspect obs.
- Redefined the 'warning' flag for agasc_stats so it includes stars with 
no observation at all (hence no "bad" observations).
- some other trivial fixes.
@javierggt javierggt requested a review from taldcroft February 16, 2021 21:39
@javierggt
Copy link
Contributor Author

@taldcroft , I just noticed this PR has been sitting here neglected. I thought of just merging now, but figured I should at least give you the chance to object.

@taldcroft
Copy link
Member

Not neglected, I've been (slowly) working through them in order. 😄

@taldcroft
Copy link
Member

taldcroft commented Feb 17, 2021

@javierggt - The detailed testing procedure is great and very helpful to let me try this out and get a sampling of the features.

I am trying something slightly different and minimal with setup to avoid copying Gb's of files on HEAD. Is this OK?

  526  export AGASC_DIR=$PWD
  527  ln -s /proj/sot/ska/data/agasc/miniagasc.h5 ./
  530  cp ~/ska/data/agasc/agasc_supplement.h5 ./
  531  ln -s /proj/sot/ska/data/agasc/ra_dec.npy 

Then I'm running with the following so I don't have to install this version into an environment (which is I think necessary to use the console scripts like you've shown. Do I have that right?):

python -m agasc.scripts.update_mag_supplement --log-level debug --start 2011:151 --stop 2011:152 --report

One thing I noticed in the logs is a warning from the agasc module:

2021-02-17 08:50:12,911 Getting stats for AGASC ID 42864776...
/home/aldcroft/git/agasc/agasc/supplement/utils.py:62: UserWarning: No dataset 'mags' in /home/aldcroft/tmp/agasc-yaml/agasc_supplement.h5, returning empty table
  warnings.warn(f"No dataset '{name}' in {supplement_file},"

This happens because the agasc_supplement.h5 I am starting from is the flight version without mags. But this warning surprises me a little since I would have thought that when the supplement processing code is asking for star mags, it wants the catalog mags, not observed. I might have that wrong, but maybe you need a use_supplement=False flag? Again, I didn't look at the code, so I could be misinterpreting.

Finally, this code needs to be updated/rebased to use the new API for saving the version:

Traceback (most recent call last):
  File "/proj/sot/ska3/flight/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/proj/sot/ska3/flight/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/aldcroft/git/agasc/agasc/scripts/update_mag_supplement.py", line 127, in <module>
    main()
  File "/home/aldcroft/git/agasc/agasc/scripts/update_mag_supplement.py", line 111, in main
    update_mag_supplement.do(
  File "/home/aldcroft/git/agasc/agasc/supplement/magnitudes/update_mag_supplement.py", line 486, in do
    new_stars, updated_stars = update_supplement(agasc_stats, filename=filename)
  File "/home/aldcroft/git/agasc/agasc/supplement/magnitudes/update_mag_supplement.py", line 292, in update_supplement
    save_version(filename, mags=agasc.__version__)
TypeError: save_version() got an unexpected keyword argument 'mags'

@jeanconn
Copy link
Contributor

Somewhat unrelated process nit... I think the ra_dec.npy file was for an older version of agasc.

@javierggt
Copy link
Contributor Author

That should work.

Actually, what I did was to install using:

pip install -e .

that is why the scripts work.

I just fixed the call to save_version which gave you the error.

@javierggt
Copy link
Contributor Author

ah, and in relation to the warning... I don't know. I have not touched that part of the code, so it should not serve the observed magnitudes.

Before saving the mags table, it also reads it, in order to update it and save it again. That must be when the warning is triggered.

Before updating the magnitudes, at the very beginning, the code explicitly fetches the mags table to see which stars have been observed since the recorded time of last observation, so that is another point where the warning could appear (but I don't think so)

@taldcroft
Copy link
Member

Somewhat unrelated process nit... I think the ra_dec.npy file was for an older version of agasc.

Ah, I forgot and wasn't thinking too hard. Maybe we just delete that file?

@jeanconn
Copy link
Contributor

With regard to the old npy file, I think it is more work to try to figure out if any old code (ska2 or whatnot) is still using the shared-ska-data-file than to just leave it.

@taldcroft
Copy link
Member

in relation to the warning... I don't know. I have not touched that part of the code, so it should not serve the observed magnitudes.

I believe we decided to make the agasc provide observed mags by default, so one needs to opt-out with the use_supplement=False.

…pected failures in n_obsids_fail. Set n_obsids earlier so it is set even if all obs are bad. fixed logging line.
Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the functional testing and it all works as expected now. Some minor things have been pushed out to new issues.

This looks good to me for merging! 🎉

failures = []
all_telem = []
stats = []
last_obs_time = 0
for i, o in enumerate(star_obs):
oi, ai = o['obsid', 'agasc_id']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been trying to avoid bikeshedding with variable names, but the pattern of using 1-character names like o makes reading the code very difficult. No need to change now, but going forward please use longer more descriptive names in general. Even oi and ai would make reading the code quicker for me (and for you in 3 years) as agasc_id and obsid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto-generate YAML obs-status update file for suspect stars and use as basis for mission processing
3 participants