Additional entry of length or number of octets in GRIB2_Template_ csv-files #219

SibylleK · 2023-09-18T14:40:57Z

Details

In GRIB2 Template definition files the octet number (from - to) for each entry in the GRIB section is given.
Most of the GRIB processing software packages need the length or number of octets for each entry, which has to be calculated from the specification of "OctetNo".

However, with variables and repetition in some templates, an automated calculation is not always easy (e.g. 37 + (ND-1)*4 + (NF-1)*4 - 40 + (ND-1)*4 + (NF-1)*4).

Therefore, this is a proposal to add a column with the length of each entry in the GRIB2_Template files.
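
To illustrate why the calculation is awkward, here is a minimal sketch of how a consumer might derive the length from a variable OctetNo range once ND and NF are known. The function names (`split_range`, `octet_count`) are hypothetical, and the sketch assumes the only hyphen outside parentheses is the from-to separator, which may not hold for every template.

```python
import ast
import operator

# Only the arithmetic that appears in OctetNo expressions is allowed.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _evaluate(node, names):
    """Evaluate a small arithmetic expression tree with named variables."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, names)
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.Name):
        return names[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, names)
        right = _evaluate(node.right, names)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def split_range(octet_no):
    """Split 'from-to' at the first hyphen that is outside parentheses."""
    depth = 0
    for i, ch in enumerate(octet_no):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "-" and depth == 0:
            return octet_no[:i], octet_no[i + 1:]
    return octet_no, octet_no  # single octet: from == to


def octet_count(octet_no, **names):
    """Return the number of octets covered by an OctetNo specification."""
    start, end = split_range(octet_no)
    first = _evaluate(ast.parse(start.strip(), mode="eval"), names)
    last = _evaluate(ast.parse(end.strip(), mode="eval"), names)
    return last - first + 1


spec = "37 + (ND-1)*4 + (NF-1)*4 -40 +(ND-1)*4 + (NF-1)*4"
print(octet_count(spec, ND=3, NF=2))  # 4
```

For this example the variable terms cancel, so the length is 4 whatever ND and NF are. That is exactly the kind of value a dedicated column could state directly.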

Requestor

Sibylle Krebber, @SibylleK

shahramn · 2023-09-18T14:46:38Z

A very good suggestion

amilan17 · 2023-09-22T09:10:29Z

https://github.com/wmo-im/CCT/wiki/20.to.22.September.2023 notes:

Title_en, OctetNo, octetCount, Contents_en, Note_en, noteIDs, codeTable, flagTable, Status
There could be an impact on software that ingests the machine-readable codes.
The script that generates the TXT and CSV files will need to be updated.
A template with a sample can be provided so that the software can be updated for testing.
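
As a rough illustration of the ingest concern, software that reads the CSV by column name rather than by position should tolerate the extra column. The sketch below is an assumption about how a consumer could be written, not a description of any existing tool. `read_entries` is a hypothetical name, and the case-insensitive lookup is only there because the thread spells the column both `octetCount` and `OctetCount`.

```python
import csv
import io


def read_entries(csv_text):
    """Read template rows by header name so a new column does not shift fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    entries = []
    for row in reader:
        # Normalise header case; skip the None key that csv uses for extra cells.
        lowered = {k.lower(): v for k, v in row.items() if k is not None}
        entries.append({
            "octet_no": lowered.get("octetno", ""),
            "contents": lowered.get("contents_en", ""),
            # Empty string when the file predates the new column.
            "octet_count": lowered.get("octetcount", ""),
        })
    return entries
```

A reader written this way returns the same `octet_no` and `contents` for files with and without the new column. Positional parsers would need to be checked separately.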

antoinemerle · 2023-09-22T09:37:26Z

For roadmap and planning purposes on the EUM side:

We need to check in the vocabulary manager what the impact would be: whether we take the data from the XML or not, and whether the way we parse the XML would break our interfaces.
We also need to estimate the cost of the impact on the EUM side and of implementing this new column.

sebvi · 2023-11-20T09:58:39Z

Are we implementing this? I don't see a branch created for it

marijanacrepulja · 2023-11-20T10:15:13Z

I believe we agreed to address this after finalising FT2024-1.

amilan17 · 2024-10-15T13:30:06Z

https://github.com/wmo-im/tt-tdcf/wiki/2024.10.15.tt.tdcf notes:
@amilan17 will update master after merging in FT2024-2 branch

amilan17 · 2024-11-11T13:12:52Z

Do the machine-readable files also need to include OctetCount?

sebvi · 2024-11-11T13:51:09Z

Octet counts are not needed on our side

EDIT: actually it is a useful validation if the count is present

amilan17 · 2024-11-12T13:16:12Z

https://github.com/wmo-im/tt-tdcf/wiki/2024.11.12.tt.tdcf notes:

Empty columns have been created for all templates; this needs checking.
@antoinemerle can help with scripts to ensure that the TXT and XML output files do NOT include the OctetCount column.
Sibylle: it would be even better (eventually) if the octet number could be calculated from the OctetCount, but this is not necessary today.
@antoinemerle: new script to retroactively populate all octetCount columns.
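
Sibylle's point about calculating the octet number from the count could look roughly like the sketch below. It assumes fixed-length entries and a known first octet. `octet_ranges` is a hypothetical helper and the sample values are illustrative, not taken from a real template.

```python
def octet_ranges(first_octet, counts):
    """Turn a list of entry lengths into 'from-to' OctetNo strings."""
    ranges = []
    start = first_octet
    for count in counts:
        if count < 1:
            raise ValueError("each entry must cover at least one octet")
        end = start + count - 1
        # A one-octet entry is written as a single number, as in the templates.
        ranges.append(str(start) if count == 1 else f"{start}-{end}")
        start = end + 1
    return ranges


# Illustrative values only: entries of 1, 2 and 4 octets starting at octet 24.
print(octet_ranges(24, [1, 2, 4]))  # ['24', '25-26', '27-30']
```

Entries whose length depends on variables such as ND or NF would still need their expressions carried through, which this sketch does not attempt.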


antoinemerle · 2025-01-14T10:51:51Z

Hi @amilan17,

I've committed the changes (you can see them here: 5646f0d).

I am now going to review the output and verify that it works as expected.

At the moment this is the correct behavior; I will use the same script to populate all existing columns.

antoinemerle · 2025-01-14T12:52:26Z

Hi @SibylleK ,

I made the changes.

The only thing I don't know how to handle is the following:

What should the length be when OctetNo is open-ended, e.g. 73-nn? At the moment I put the same value for the length.

amilan17 · 2025-01-14T13:33:25Z

https://github.com/wmo-im/et-data/wiki/2025.01.14.et.data notes:

Sibylle: entries like "73-nn" are probably in the wrong column.

amilan17 · 2025-01-15T12:55:53Z

@SibylleK @antoinemerle
This is an example of an entry from template 3.40. Does every octetCount cell need to be populated? Is it OK to leave it as is?


antoinemerle · 2025-01-29T11:30:13Z

Dear @amilan17 and @SibylleK

I applied the changes as discussed in b88e9a1.

I also ran a script over all the existing template files.

Here is the script, for tracing and consistency.

#!/usr/bin/env python3

import os
import csv
import re

def parse_octet_count(octet_str):
    """
    Return a string with the numeric length if OctetNo is simple, else "".
    """
    if any(sym in octet_str for sym in ['(', ')', '+', '*', 'ND', 'NF', 'n']):
        return ""

    total_length = 0
    parts = [p.strip() for p in octet_str.split(',')]
    for p in parts:
        range_match = re.match(r'^(\d+)-(\d+)$', p)
        single_match = re.match(r'^(\d+)$', p)
        if range_match:
            start = int(range_match.group(1))
            end = int(range_match.group(2))
            total_length += (end - start + 1)
        elif single_match:
            total_length += 1
        else:
            return ""
    return str(total_length)

def update_octet_count_inplace():
    """
    For each GRIB2_Template*.csv file in the current directory,
    ensure it has an 'OctetCount' column (inserted right after 'OctetNo'),
    and fill that column (if empty) based on 'OctetNo'.
    """
    all_files = [f for f in os.listdir('.')
                 if f.startswith("GRIB2_Template") and f.endswith(".csv")]

    for fname in all_files:
        print(f"Processing: {fname}")
        tmp_name = fname + ".tmp"

        with open(fname, mode="r", encoding="utf-8", newline='') as inf, \
             open(tmp_name, mode="w", encoding="utf-8", newline='') as outf:

            reader = csv.DictReader(inf)
            original_fieldnames = reader.fieldnames[:]  # make a copy

            # If "OctetCount" not in columns, insert it right after "OctetNo"
            if "OctetCount" not in original_fieldnames:
                if "OctetNo" in original_fieldnames:
                    idx = original_fieldnames.index("OctetNo")
                    original_fieldnames.insert(idx+1, "OctetCount")
                else:
                    # fallback: if no "OctetNo" either, just add it at the end
                    original_fieldnames.append("OctetCount")

            writer = csv.DictWriter(
                outf,
                fieldnames=original_fieldnames,
                delimiter=',',
                quotechar='"'
            )
            writer.writeheader()

            # For each row, fill 'OctetCount' if missing or empty
            for row in reader:
                for fn in original_fieldnames:
                    if fn not in row:
                        row[fn] = ""

                if not row["OctetCount"]:  # empty or missing
                    octet_no = row.get("OctetNo", "")
                    row["OctetCount"] = parse_octet_count(octet_no)

                writer.writerow(row)

        # Replace the old file with the updated one
        os.replace(tmp_name, fname)

if __name__ == "__main__":
    update_octet_count_inplace()
    print("Done updating OctetCount in GRIB2_Template*.csv files.")

antoinemerle · 2025-01-30T10:09:12Z

Dear @SibylleK and @marijanacrepulja,

Could you tell me whether this is what we wanted to achieve?

Current implementation in the current branch

All GRIB2 CSV templates now have the OctetCount filled in (when possible).
After each commit made by anyone, this column will also be added and filled in ./txt/template.txt (not in the XML).

Examples of current files:

Example of ./txt/template.txt

Title_en,OctetNo,Contents_en,Note_en,noteIDs,codeTable,flagTable,OctetCount,Status
Identification template 1.0 - calendar definition,24,Type of calendar,(see Code table 1.6),,1.6,,1,Operational
Identification template 1.1 - paleontological offset,24-25,Number of tens of thousands of years of offset,,,,,2,Operational
Identification template 1.2 - calendar definition and paleontological offset,24,Type of calendar,(see Code table 1.6),,1.6,,1,Operational

Example of the csv

Title_en,OctetNo,Contents_en,Note_en,noteIDs,codeTable,flagTable,OctetCount,Status
Identification template 1.0 - calendar definition,24,Type of calendar,(see Code table 1.6),,1.6,,1,Operational
Identification template 1.1 - paleontological offset,24-25,Number of tens of thousands of years of offset,,,,,2,Operational
Identification template 1.2 - calendar definition and paleontological offset,24,Type of calendar,(see Code table 1.6),,1.6,,1,Operational

Question:

The pending question on my side is:

@SibylleK and @marijanacrepulja: do we need or want to populate OctetCount in the txt file that is generated after each commit?

PS: I want to be sure I am not impacting any other operational software by editing the txt file.

Thanks a lot

amilan17 · 2025-01-30T13:09:03Z

@antoinemerle I think we should be consistent across the .txt and the .xml files -- and neither should include the octetCount at this time.

antoinemerle · 2025-01-31T07:53:31Z

Hi @amilan17, I have now updated the branch and scripts accordingly.

Here is a summary of the changes:

The CI/CD script has been updated so that it does not fail when generating the XML and TXT (given that we are not populating OctetCount there).
All the GRIB2_Template* files have been updated: the OctetCount column has been filled in where needed.

The new behavior to be adopted by the team should be:

When committing a new template, contributors should manually enter the right value in the template.
OctetCount is not going to be populated in the ./txt/ and ./xml/ templates.
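
For readers wondering how a column can be kept in the CSV but left out of a generated file, here is a minimal sketch using the standard `csv.DictWriter` option `extrasaction="ignore"`. It is not the code in create_master_lists.py, which is not shown in this thread. `strip_octet_count` is a hypothetical name.

```python
import csv
import io


def strip_octet_count(csv_text):
    """Return the CSV text with the OctetCount column left out."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = [f for f in (reader.fieldnames or []) if f != "OctetCount"]
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        extrasaction="ignore",  # silently skip keys not listed in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

Without `extrasaction="ignore"`, `DictWriter` raises a ValueError when a row carries a key that is not in `fieldnames`, which matches the kind of failure mentioned in the commit notes.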

In the future, we may want to run a batch script that verifies the OctetCount values pushed by any team member.
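
Such a verification step might be sketched as follows. This is only an assumption about what the check could do: it compares the declared OctetCount with the length implied by a plain OctetNo ("24" or "24-25") and skips anything it cannot interpret, such as variable or open-ended ranges. `expected_count` and `find_mismatches` are hypothetical names.

```python
import csv
import io
import re

# Matches a single octet ("24") or a plain numeric range ("24-25").
_SIMPLE = re.compile(r"^(\d+)(?:-(\d+))?$")


def expected_count(octet_no):
    """Length implied by a plain OctetNo, or None if it is not plain."""
    match = _SIMPLE.match(octet_no.strip())
    if not match:
        return None
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return end - start + 1


def find_mismatches(csv_text):
    """List (line, OctetNo, declared, expected) where the two disagree."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        expected = expected_count(row.get("OctetNo") or "")
        declared = (row.get("OctetCount") or "").strip()
        if expected is None or not declared:
            continue  # nothing to compare: variable range or empty count
        if declared != str(expected):
            problems.append((reader.line_num, row["OctetNo"], declared, expected))
    return problems
```

A CI job could run this over each GRIB2_Template*.csv file and fail when the returned list is not empty. Comma-separated OctetNo values, which the population script above does handle, are skipped here for brevity.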

Thanks again @amilan17 for your quick answers to my questions.

* Update create_master_lists.py as requested in : - #219 >In GRIB2 Template definition files the octet number (from - to) for each entry in the GRIB section is given. Most of the GRIB processing software packages need the length or number of octets for each entry, which has to be calculated from the specification of "OctetNo". But with variables and repitition within some templates an automated calculation is sometimes not easy. (e.g. 37 + (ND-1)*4 + (NF-1)*4 -40 +(ND-1)*4 + (NF-1)*4) Therefore, this is a proposal to add a column with the length of each entry in the GRIB2_Template files.
* Update create_master_lists.py remove any keys not in 'fieldnames' to avoid ValueError
* xml,txt files
* Update create_master_lists.py make the count to be computed when the limit is not fixed but variable
* xml,txt files
* run the script update_octetcount.py to update the OctetCount number for all templates. the script will be added in the issue #219
* Replace Length per octetCount and remove it from the xml
* xml,txt files
* remove the octetCount in the txt and from the CI/CD
* update the create_master to avoid any issue while populating fields in the xml and txt
* update the create_master to avoid any issue while populating fields in the xml and txt
* Apply suggestions from code review
* xml,txt files

---------

Co-authored-by: antoineMerleEUM <[email protected]>
Co-authored-by: Enrico Fucile <[email protected]>
Co-authored-by: antoinemerle <[email protected]>

amilan17 added this to GRIB2 Amendments Aug 22, 2024

amilan17 moved this to In discussion in GRIB2 Amendments Aug 22, 2024

amilan17 added this to the noTargetMilestone milestone Aug 22, 2024

amilan17 assigned amilan17, SibylleK and antoinemerle Nov 12, 2024

amilan17 mentioned this issue Nov 28, 2024

219 additional entry of length or number of octets in grib2 template csv files #290

Merged

amilan17 linked a pull request Nov 28, 2024 that will close this issue

219 additional entry of length or number of octets in grib2 template csv files #290

Merged

amilan17 closed this as completed in #290 Nov 28, 2024

github-project-automation bot moved this from In progress to Ready for FT approval procedure in GRIB2 Amendments Nov 28, 2024

amilan17 reopened this Nov 28, 2024

github-project-automation bot moved this from Ready for FT approval procedure to In progress in GRIB2 Amendments Nov 28, 2024

amilan17 modified the milestones: noTargetMilestone, FT2025-1 Jan 15, 2025

antoinemerle added a commit that referenced this issue Jan 29, 2025

run the script update_octetcount.py to update the OctetCount number f…

b88e9a1

…or all templates. the script will be added in the issue #219

amilan17 mentioned this issue Jan 31, 2025

219 new #307

Merged

amilan17 linked a pull request Jan 31, 2025 that will close this issue

219 new #307

Merged

amilan17 closed this as completed in #307 Jan 31, 2025

github-project-automation bot moved this from In progress to Ready for FT approval procedure in GRIB2 Amendments Jan 31, 2025
