read the subsets and the name of fields after reading FlatJsonRenderer().render(bufr_message) #18

Open
denissanga opened this issue Jul 6, 2021 · 5 comments

Comments

@denissanga

Hi all, I'm trying to extract some data from a BUFR file, but I don't understand how to read the field names and the different subsets in order to extract data such as U-COMPONENT, V-COMPONENT, PRESSURE, etc. after reading a bufr_message. I attach an example file. Thank you very much in advance.
L-000-MSG4__-MPEF________-AMV______-000001___-202106171330-__.zip

@ywangd
Owner

ywangd commented Jul 6, 2021

Have you tried the query sub-command? You can read the docs here. Let me know if this solves your problem.

@denissanga
Author

thank you very much.

So far I have been using the following code to start reading the BUFR file:

from pybufrkit.decoder import Decoder
from pybufrkit.renderer import FlatJsonRenderer
from pybufrkit.mdquery import MetadataExprParser, MetadataQuerent
from pybufrkit.dataquery import NodePathParser, DataQuerent
from pybufrkit.decoder import generate_bufr_message

SOME_BUFR_FILE = "L-000-MSG4__-MPEF________-AMV______-000001___-202106171330-__"
decoder = Decoder()
dataLength = []

with open(SOME_BUFR_FILE, 'rb') as ins:
    for bufr_message in generate_bufr_message(decoder, ins.read()):
        n_subsets = MetadataQuerent(MetadataExprParser()).query(bufr_message, '%n_subsets')
        query_result = DataQuerent(NodePathParser()).query(bufr_message, '001002')
        json_data = FlatJsonRenderer().render(bufr_message)
        dataLength.append([len(json_data), len(json_data[3][2]), [len(i) for i in json_data]])
        pass  # do something with the decoded message object

In json_data I obtain a list of all the values for each message, but I would like to have a list of the names of the variables and to extract data from the subsets. I tried looking at the link, but I don't understand how to use it correctly.

Thank you again
Best regards

@hautecoeur

This is a piece of code to explain how you can read and decode the main data.
Olivier

import pandas as pd
from pybufrkit.decoder import Decoder
from pybufrkit.decoder import generate_bufr_message
from pybufrkit.renderer import FlatJsonRenderer

FILENAME = "L-000-MSG4__-MPEF________-AMV______-000001___-202106171330-__"
decoder = Decoder()

df = pd.DataFrame()

# this file is a multiple-message BUFR file
with open(FILENAME, "rb") as ins:
    for bufr_message in generate_bufr_message(decoder, ins.read()):
        json_data = FlatJsonRenderer().render(bufr_message)
        df = pd.concat([df, pd.DataFrame(json_data[3][2])])
        # df contains all the wind records as a matrix

# extracting (some of) the most important fields
amv = pd.DataFrame({'latitude':df[17], 'longitude':df[18], 'pressure':df[27], 'direction':df[28], 'speed':df[29], 'u':df[30], 'v':df[31], 'channel':df[54], 'qix':df[170]})
print(amv) # there are 50346 wind records

# extracting the records for the SEVIRI channel 9 (IR 10.8)
seviri9 = amv.loc[(amv['channel']==9)]
print(seviri9) # only 11067 extracted from thermal infrared data

# filtering the 'good' winds
goodwinds = seviri9.loc[(seviri9['speed']>2.5) & (seviri9['qix']>=80)]
print(goodwinds) # 6302 passed the selection criteria
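
For readers without pandas, the same selection can be sketched in plain Python. This is only an illustration under stated assumptions: the column positions are copied from the snippet above and apply to this AMV template only, `make_row` is a hypothetical helper that builds synthetic rows standing in for `json_data[3][2]`, and missing values are assumed to be `None`.

```python
# Column positions are copied from the pandas snippet above. They apply to this
# particular AMV template only and are not general BUFR constants.
COLUMNS = {
    "latitude": 17, "longitude": 18, "pressure": 27,
    "direction": 28, "speed": 29, "u": 30, "v": 31,
    "channel": 54, "qix": 170,
}


def to_records(subsets, columns=COLUMNS):
    """Convert positional subset rows into dicts keyed by field name."""
    return [{name: row[idx] for name, idx in columns.items()} for row in subsets]


def good_winds(records, channel=9, min_speed=2.5, min_qix=80):
    """Keep records from one channel that pass the speed and quality thresholds."""
    selected = []
    for rec in records:
        # Assumption: missing values appear as None, so skip them explicitly.
        if rec["speed"] is None or rec["qix"] is None:
            continue
        if rec["channel"] == channel and rec["speed"] > min_speed and rec["qix"] >= min_qix:
            selected.append(rec)
    return selected


def make_row(**fields):
    """Hypothetical helper: build a synthetic 171-value subset for the demo."""
    row = [None] * 171
    for name, value in fields.items():
        row[COLUMNS[name]] = value
    return row


# Synthetic stand-in for json_data[3][2]; real rows come from the decoder.
subsets = [
    make_row(latitude=10.0, longitude=20.0, channel=9, speed=12.0, qix=90),
    make_row(latitude=11.0, longitude=21.0, channel=9, speed=1.0, qix=95),
    make_row(latitude=12.0, longitude=22.0, channel=5, speed=15.0, qix=85),
    make_row(latitude=13.0, longitude=23.0, channel=9, speed=None, qix=85),
]

records = to_records(subsets)
for rec in good_winds(records):
    print(rec["latitude"], rec["longitude"], rec["speed"], rec["qix"])
```

Keeping the positions in one named dictionary makes it easier to adjust them if another template places the fields elsewhere.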

@denissanga
Author

Thank you very much for your help. It works.

@steph-ben

steph-ben commented Jul 7, 2022

Looks really good and helpful! However, how did you get the column association, e.g. latitude = df[17]?
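
One possible way to work out such a mapping, sketched under assumptions rather than taken from the thread: if a list of descriptor IDs is available for one subset, in the same order as the decoded values (for example read off a text dump of the message), the column positions can be looked up by descriptor ID. The IDs below are standard WMO Table B entries, but whether this file uses exactly these descriptors is an assumption, and the descriptor list here is synthetic. `first_positions` is a hypothetical helper, not part of pybufrkit.

```python
# WMO Table B descriptor IDs for the fields of interest (illustrative choice).
WANTED = {
    "005001": "latitude",   # LATITUDE (HIGH ACCURACY)
    "006001": "longitude",  # LONGITUDE (HIGH ACCURACY)
    "007004": "pressure",   # PRESSURE
    "011001": "direction",  # WIND DIRECTION
    "011002": "speed",      # WIND SPEED
    "011003": "u",          # U-COMPONENT
    "011004": "v",          # V-COMPONENT
}


def first_positions(descriptor_ids, wanted=WANTED):
    """Map each wanted field name to the index of its first matching descriptor."""
    positions = {}
    for idx, descriptor in enumerate(descriptor_ids):
        name = wanted.get(descriptor)
        # A descriptor can occur several times in a template; keep the first hit.
        if name is not None and name not in positions:
            positions[name] = idx
    return positions


# Synthetic descriptor listing for one subset; a real one would be much longer.
descriptor_ids = [
    "001007", "005001", "006001", "007004",
    "011001", "011002", "011003", "011004", "007004",
]

positions = first_positions(descriptor_ids)
print(positions)
```

Because descriptors such as pressure may repeat within one subset, the first occurrence is not necessarily the one you want; check the result against the known values before relying on it.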
