Quality control and database information when updating metadata #156

Open
rdstern opened this issue Jan 25, 2019 · 3 comments

Comments

@rdstern

rdstern commented Jan 25, 2019

This is not ready for a quick solution yet, but is an important issue.

I have put two related points together to make my case. Both came from the Rwanda workshops in September.

  1. When we make corrections to the climatic data, there needs to be a record of the details of each correction. This was discussed and agreed, and @smachua had clear ideas of what to do to keep a record of these changes.
  2. When we looked at the maps of the station locations (i.e. some of the metadata) we found some where the lat and long were clearly wrong: they were outside Rwanda. @smachua explained to the workshop that corrections to the metadata are equally important, and also what is currently being done in Kenya.
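A check of the kind that would catch the second point can be sketched as a simple bounding-box test. This is only an illustration: the station IDs and coordinates are invented, and the bounding box is an approximate, outward-rounded assumption for Rwanda, not an authoritative boundary.

```python
from typing import NamedTuple


class Station(NamedTuple):
    station_id: str
    latitude: float   # decimal degrees, south is negative
    longitude: float  # decimal degrees, east is positive


# Approximate bounding box for Rwanda (assumed values, rounded outward).
# A real check would use an authoritative boundary for the country.
RWANDA_BBOX = {"lat_min": -2.9, "lat_max": -1.0, "lon_min": 28.8, "lon_max": 31.0}


def outside_bbox(station, bbox=RWANDA_BBOX):
    """Return True if the station's coordinates fall outside the box."""
    inside = (bbox["lat_min"] <= station.latitude <= bbox["lat_max"]
              and bbox["lon_min"] <= station.longitude <= bbox["lon_max"])
    return not inside


# Hypothetical stations, for illustration only.
stations = [
    Station("ST001", -1.95, 30.06),  # plausible location
    Station("ST002", 1.95, 30.06),   # latitude has the wrong sign
    Station("ST003", -1.95, 3.006),  # longitude has a misplaced decimal point
]

suspect = [s.station_id for s in stations if outside_bbox(s)]
print(suspect)
```

A box test only flags gross errors such as a wrong sign or a misplaced decimal point; a station placed in the wrong part of the country would still pass.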

So my point now is that when we make changes to the metadata (i.e. corrections, updates), we should logically keep a record of those changes as well. If keeping a record of changes is important in CLIMSOFT, then this "record keeping" should apply to any piece of information, whether it is "data", e.g. a rainfall value changing, or metadata, e.g. latitude or type of instrument.

What I don't know is how best to do this. Might it be a single new table? A table to record changes in the data is (I think) already agreed, so we would just generalise it to apply to the metadata as well. Or is another table needed? And might this additional table add (or change) any field in the existing data and metadata tables? I hope not.
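To make the "single generalised table" option concrete, here is a minimal sketch using SQLite from Python. Everything in it is an assumption for illustration: the `change_log` table, its columns, and the record keys are hypothetical and are not taken from the CLIMSOFT schema. The point it shows is that one table can record a change to either a data value or a metadata field without adding or altering any field in the existing tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, generic change-log table: one row per changed field.
conn.execute("""
    CREATE TABLE change_log (
        id          INTEGER PRIMARY KEY,
        changed_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        changed_by  TEXT NOT NULL,
        table_name  TEXT NOT NULL,   -- which table the change was made in
        record_key  TEXT NOT NULL,   -- identifies the changed row
        field_name  TEXT NOT NULL,
        old_value   TEXT,
        new_value   TEXT,
        reason      TEXT
    )
""")


def log_change(conn, changed_by, table_name, record_key, field_name,
               old_value, new_value, reason=None):
    """Record one change; values are stored as text so any field type fits."""
    conn.execute(
        "INSERT INTO change_log (changed_by, table_name, record_key, "
        "field_name, old_value, new_value, reason) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (changed_by, table_name, record_key, field_name,
         str(old_value), str(new_value), reason),
    )


# A data correction and a metadata correction go into the same table.
log_change(conn, "qc_user", "observation", "ST001|rain|2018-09-03",
           "value", 250.0, 25.0, "keying error")
log_change(conn, "metadata_manager", "station", "ST002",
           "latitude", 1.95, -1.95, "sign error")

rows = conn.execute(
    "SELECT table_name, field_name, old_value, new_value "
    "FROM change_log ORDER BY id"
).fetchall()
print(rows)
```

Storing old and new values as text loses type information, which is the usual trade-off of a generic log table compared with one history table per source table.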

@dannyparsons

I think this is a duplicate of a discussion started here climsoft#439 by Maxwell.

@smachua

smachua commented Jan 28, 2019

@rdstern my idea was to introduce a log file that lists changes made to data and metadata. Entries in the log file will be delimited, so they can be easily analysed. This way there is no need to create an additional table. Any other approach is welcome.
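A delimited log of this kind might look like the following sketch, which appends one CSV row per change and reads the file back for analysis. The column names, the file name, and the example entries are all assumptions made for illustration, not an agreed format.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone

# Hypothetical column layout for the change log.
FIELDS = ["changed_at", "changed_by", "kind", "record_key",
          "field", "old_value", "new_value"]


def append_change(path, changed_by, kind, record_key, field,
                  old_value, new_value):
    """Append one change to the log, writing the header if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "changed_at": datetime.now(timezone.utc).isoformat(),
            "changed_by": changed_by,
            "kind": kind,  # "data" or "metadata"
            "record_key": record_key,
            "field": field,
            "old_value": old_value,
            "new_value": new_value,
        })


def read_changes(path):
    """Load the whole log as a list of dictionaries."""
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


log_path = os.path.join(tempfile.mkdtemp(), "changes.csv")
append_change(log_path, "qc_user", "data", "ST001|rain|2018-09-03",
              "value", 250.0, 25.0)
append_change(log_path, "metadata_manager", "metadata", "ST002",
              "latitude", 1.95, -1.95)

metadata_changes = [r for r in read_changes(log_path) if r["kind"] == "metadata"]
print(metadata_changes)
```

Using the `csv` module rather than joining strings by hand means values that themselves contain the delimiter are quoted correctly.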

@Steve-Palmer

I think there are several issues here. First of all, metadata needs to be handled differently from observation data.

  1. If there is an error in the metadata, then correction is needed, but there is no overriding need to record the history of the error. The example here is (as Roger says) stations with incorrect lat/long locations. An earlier example comes from the Met Office: when we did data cleaning we found numerous stations with lat/long values that put them in the sea, because the primary location information was taken from map grids. Other errors could be station start dates (e.g. when earlier data becomes available) or closure dates (e.g. when a station stops working properly and the decision is taken to terminate it). These would be professional decisions by the metadata manager - I think we have to apply some trust in professionalism, but it might make sense to log such changes for a limited period in case the change itself is in error and needs to be rolled back, or for senior managers to monitor the work of the metadata manager.
  2. If the change is due to e.g. a genuine change in location or a change in the environment or equipment of a station, then that needs to be recorded with effective_from and effective_to dates, or perhaps as a completely new station. The WMO Guide to Climatological Practices gives guidance - see also https://oscar.wmo.int/surface//index.html#/ . Note that WIGOS and OSCAR are considering how to link multiple stations which can be used to produce a single long-period record. Again, this is a professional decision by the metadata manager (though I recognise that in many met services there is a training need and poor awareness of the WMO Guide). Again, a log file held for a limited period would enable any changes to be monitored and rolled back if necessary, but I don't see the need to hold the log for more than a month or two. The changes should be fully available from the metadata itself.
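The two ideas above, effective-dated metadata and a short-lived log, can be sketched as follows. This is an illustration under stated assumptions: the `StationLocation` record, the helper functions, the inclusive treatment of `effective_to`, and the 60-day retention are all hypothetical choices, not CLIMSOFT or WMO definitions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class StationLocation:
    station_id: str
    latitude: float
    longitude: float
    effective_from: date
    effective_to: Optional[date] = None  # None means "still current"


def relocate(history: List[StationLocation], station_id, latitude,
             longitude, moved_on: date):
    """Close the current period the day before the move and open a new one."""
    for period in history:
        if period.station_id == station_id and period.effective_to is None:
            period.effective_to = moved_on - timedelta(days=1)
    history.append(StationLocation(station_id, latitude, longitude, moved_on))


def location_on(history, station_id, day: date):
    """Return the location record in force on a given day, or None."""
    for period in history:
        if (period.station_id == station_id
                and period.effective_from <= day
                and (period.effective_to is None or day <= period.effective_to)):
            return period
    return None


def prune_log(entries, today: date, keep_days: int = 60):
    """Keep only (logged_on, text) entries from the last keep_days days."""
    cutoff = today - timedelta(days=keep_days)
    return [entry for entry in entries if entry[0] >= cutoff]


# Hypothetical station that genuinely moved on 1 June 2015.
history = [StationLocation("ST001", -1.95, 30.06, date(1990, 1, 1))]
relocate(history, "ST001", -1.97, 30.10, date(2015, 6, 1))

before = location_on(history, "ST001", date(2015, 5, 31))
after = location_on(history, "ST001", date(2015, 6, 1))
print(before, after)
```

With this layout a genuine relocation is never overwritten: both periods stay in the metadata, so the history is available from the metadata itself and the log can be pruned after a month or two.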
