Additional entry of length or number of octets in GRIB2_Template_ csv-files #219
Comments
A very good suggestion
https://github.com/wmo-im/CCT/wiki/20.to.22.September.2023 notes:
@antoinemerle :
Are we implementing this? I don't see a branch created for it.
I believe we agreed to address this after finalising FT2024-1.
https://github.com/wmo-im/tt-tdcf/wiki/2024.10.15.tt.tdcf notes:
Do the machine-readable files also need to include OctetCount?
Octet counts are not needed on our side. EDIT: actually, it is a useful validation if the count is present.
https://github.com/wmo-im/tt-tdcf/wiki/2024.11.12.tt.tdcf notes:
as requested in #219:
> In GRIB2 Template definition files the octet number (from - to) for each entry in the GRIB section is given. Most of the GRIB processing software packages need the length or number of octets for each entry, which has to be calculated from the specification of "OctetNo". But with variables and repetition within some templates an automated calculation is sometimes not easy (e.g. 37 + (ND-1)*4 + (NF-1)*4 - 40 + (ND-1)*4 + (NF-1)*4). Therefore, this is a proposal to add a column with the length of each entry in the GRIB2_Template files.
Hi @SibylleK, I made the changes. The only thing I don't know how to handle is the following: what should the length be when:
https://github.com/wmo-im/et-data/wiki/2025.01.14.et.data notes:
@SibylleK @antoinemerle
…or all templates. the script will be added in the issue #219
I applied the changes as discussed in b88e9a1. I also ran a script to go through all the existing template files. Here is the script, for tracing and consistency:
#!/usr/bin/env python3
import os
import csv
import re


def parse_octet_count(octet_str):
    """
    Return a string with the numeric length if OctetNo is simple, else "".
    """
    if any(sym in octet_str for sym in ['(', ')', '+', '*', 'ND', 'NF', 'n']):
        return ""
    total_length = 0
    parts = [p.strip() for p in octet_str.split(',')]
    for p in parts:
        range_match = re.match(r'^(\d+)-(\d+)$', p)
        single_match = re.match(r'^(\d+)$', p)
        if range_match:
            start = int(range_match.group(1))
            end = int(range_match.group(2))
            total_length += (end - start + 1)
        elif single_match:
            total_length += 1
        else:
            return ""
    return str(total_length)


def update_octet_count_inplace():
    """
    For each GRIB2_Template*.csv file in the current directory,
    ensure it has an 'OctetCount' column (inserted right after 'OctetNo'),
    and fill that column (if empty) based on 'OctetNo'.
    """
    all_files = [f for f in os.listdir('.')
                 if f.startswith("GRIB2_Template") and f.endswith(".csv")]
    for fname in all_files:
        print(f"Processing: {fname}")
        tmp_name = fname + ".tmp"
        with open(fname, mode="r", encoding="utf-8", newline='') as inf, \
             open(tmp_name, mode="w", encoding="utf-8", newline='') as outf:
            reader = csv.DictReader(inf)
            original_fieldnames = reader.fieldnames[:]  # make a copy
            # If "OctetCount" not in columns, insert it right after "OctetNo"
            if "OctetCount" not in original_fieldnames:
                if "OctetNo" in original_fieldnames:
                    idx = original_fieldnames.index("OctetNo")
                    original_fieldnames.insert(idx+1, "OctetCount")
                else:
                    # fallback: if no "OctetNo" either, just add it at the end
                    original_fieldnames.append("OctetCount")
            writer = csv.DictWriter(
                outf,
                fieldnames=original_fieldnames,
                delimiter=',',
                quotechar='"'
            )
            writer.writeheader()
            # For each row, fill 'OctetCount' if missing or empty
            for row in reader:
                for fn in original_fieldnames:
                    if fn not in row:
                        row[fn] = ""
                if not row["OctetCount"]:  # empty or missing
                    octet_no = row.get("OctetNo", "")
                    row["OctetCount"] = parse_octet_count(octet_no)
                writer.writerow(row)
        # Replace the old file with the updated one
        os.replace(tmp_name, fname)


if __name__ == "__main__":
    update_octet_count_inplace()
    print("Done updating OctetCount in GRIB2_Template*.csv files.")
Dear @SibylleK and @marijanacrepulja, may I ask you to tell me whether this is what we wanted to achieve?
Current implementation in the current branch
Example of current files:
Example of
Title_en,OctetNo,Contents_en,Note_en,noteIDs,codeTable,flagTable,OctetCount,Status
Identification template 1.0 - calendar definition,24,Type of calendar,(see Code table 1.6),,1.6,,1,Operational
Identification template 1.1 - paleontological offset,24-25,Number of tens of thousands of years of offset,,,,,2,Operational
Identification template 1.2 - calendar definition and paleontological offset,24,Type of calendar,(see Code table 1.6),,1.6,,1,Operational
Example of the
Question: the pending question on my side, @SibylleK and @marijanacrepulja, is: do we need/want to populate this
Thanks a lot
@antoinemerle I think we should be consistent across the .txt and the .xml files -- and neither should include the octetCount at this time.
Hi @amilan17, I have finally updated the branch and scripts accordingly. Here is a summary of the changes:
The new behavior to be adopted by the team should be:
In the future, we may want to run a batch script that verifies the OctetCount values pushed by any team member. Thanks again, @amilan17, for your quick answers to my questions.
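A minimal sketch of what such a verification step could look like, assuming the CSV files keep the `OctetNo` and `OctetCount` column names shown earlier in this thread. The function names `expected_count` and `find_mismatches` and the `SAMPLE` rows are hypothetical and only illustrate the idea; this is not the script used in the repository. Entries whose `OctetNo` is not a single octet or a fixed range are skipped rather than checked.

```python
import csv
import io
import re

# Matches a single octet ("24") or a fixed range ("24-25").
_SIMPLE = re.compile(r"^(\d+)(?:-(\d+))?$")


def expected_count(octet_no):
    """Return the octet count for a simple OctetNo, or None if it is not simple."""
    match = _SIMPLE.match(octet_no.strip())
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return end - start + 1


def find_mismatches(csv_text):
    """Yield (line_number, OctetNo, OctetCount) for rows whose count disagrees."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        octet_no = row.get("OctetNo") or ""
        octet_count = (row.get("OctetCount") or "").strip()
        expected = expected_count(octet_no)
        if expected is None:
            continue  # variable-length entry: cannot be checked this way
        if octet_count != str(expected):
            yield reader.line_num, octet_no, octet_count


# Hypothetical sample: the second data row carries a deliberately wrong count.
SAMPLE = """Title_en,OctetNo,Contents_en,OctetCount,Status
Identification template 1.0 - calendar definition,24,Type of calendar,1,Operational
Identification template 1.1 - paleontological offset,24-25,Number of tens of thousands of years of offset,3,Operational
"""

for mismatch in find_mismatches(SAMPLE):
    print(mismatch)
```

A check like this could run in CI and fail the build when any mismatch is reported, leaving variable-length entries to manual review.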
* Update create_master_lists.py as requested in #219:
  > In GRIB2 Template definition files the octet number (from - to) for each entry in the GRIB section is given. Most of the GRIB processing software packages need the length or number of octets for each entry, which has to be calculated from the specification of "OctetNo". But with variables and repetition within some templates an automated calculation is sometimes not easy (e.g. 37 + (ND-1)*4 + (NF-1)*4 - 40 + (ND-1)*4 + (NF-1)*4). Therefore, this is a proposal to add a column with the length of each entry in the GRIB2_Template files.
* Update create_master_lists.py: remove any keys not in 'fieldnames' to avoid ValueError
* xml,txt files
* Update create_master_lists.py: make the count to be computed when the limit is not fixed but variable
* xml,txt files
* run the script update_octetcount.py to update the OctetCount number for all templates. the script will be added in the issue #219
* Replace Length per octetCount and remove it from the xml
* xml,txt files
* remove the octetCount in the txt and from the CI/CD
* update the create_master to avoid any issue while populating fields in the xml and txt
* update the create_master to avoid any issue while populating fields in the xml and txt
* Apply suggestions from code review
* xml,txt files
---------
Co-authored-by: antoineMerleEUM <[email protected]>
Co-authored-by: Enrico Fucile <[email protected]>
Co-authored-by: antoinemerle <[email protected]>
Details
In GRIB2 Template definition files the octet number (from - to) for each entry in the GRIB section is given.
Most of the GRIB processing software packages need the length or number of octets for each entry, which has to be calculated from the specification of "OctetNo".
But with variables and repetition within some templates an automated calculation is sometimes not easy (e.g. 37 + (ND-1)*4 + (NF-1)*4 - 40 + (ND-1)*4 + (NF-1)*4).
Therefore, this is a proposal to add a column with the length of each entry in the GRIB2_Template files.
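To illustrate why the variable case is awkward, here is a rough sketch that reads the example above as a range (start expression, a top-level `-`, end expression) and checks numerically whether `end - start + 1` is the same for several trial values of `ND` and `NF`. The helper names (`split_range`, `octet_length`), the trial values, and the assumption that the start expression contains no top-level minus are all illustrative; this is not how the WMO tooling computes the count.

```python
import ast
import operator

# Only +, - and * are accepted; anything else is rejected.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _evaluate(node, variables):
    """Evaluate a small arithmetic AST (integers, known variable names, + - *)."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, variables)
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.Name) and node.id in variables:
        return variables[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, variables)
        right = _evaluate(node.right, variables)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def split_range(octet_no):
    """Split 'start - end' at the first '-' that is outside parentheses."""
    depth = 0
    for i, ch in enumerate(octet_no):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "-" and depth == 0 and i > 0:
            return octet_no[:i].strip(), octet_no[i + 1:].strip()
    single = octet_no.strip()
    return single, single  # a single octet: start and end coincide


# Trial values are arbitrary; they only probe whether the length is constant.
_TRIALS = ({"ND": 1, "NF": 1}, {"ND": 3, "NF": 5}, {"ND": 7, "NF": 2})


def octet_length(octet_no):
    """Return the entry length if it is the same for all trials, else None."""
    start_expr, end_expr = split_range(octet_no)
    lengths = set()
    for variables in _TRIALS:
        start = _evaluate(ast.parse(start_expr, mode="eval"), variables)
        end = _evaluate(ast.parse(end_expr, mode="eval"), variables)
        lengths.add(end - start + 1)
    return lengths.pop() if len(lengths) == 1 else None


print(octet_length("37 + (ND-1)*4 + (NF-1)*4 -40 +(ND-1)*4 + (NF-1)*4"))
```

Under this reading the variable terms cancel, so the entry has a fixed length even though its position depends on `ND` and `NF`. An explicit `OctetCount` column spares every consumer from reimplementing this kind of parsing.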
Requestor
Sibylle Krebber, @SibylleK