v03 heads up. #9
Comments
Thank you! Is |
In theory we can release this such that |
I actually think I have most of this put together already. One question: what is the I have a listener running on |
If you connect to the broker amqps://anonymous:[email protected] you can get a sample v03.post feed. You can use it to confirm that v03 works, but we won't be posting anywhere else for a while (until v03 has fully gelled). Note that the feed includes embedding, which is a significant change from before. This feed is extremely experimental and may change at any time. It is being used to work with colleagues in the WMO to develop next-gen WMO data exchange protocols. |
Oh... the content type? In the C version we explicitly set text/plain; in the Python one there is no explicit setting. I'm guessing text/plain is the default, so basically we aren't using it. Do you think we should use a more specific content-type? |
note: https://stackoverflow.com/questions/477816/what-is-the-correct-json-content-type so if we went that way, it would be application/json ... |
Another question... this protocol is fairly modern, so UTF-8 is assumed in the spec. JSON is often UTF-8, and UTF-8 is a natural default charset these days (it is already the default in HTML5), so I don't think it is necessary to specify it. It should be the default, and anyone who wants something else should be the one to set charset. I'm thinking about this in the context that I send millions of messages per day, so adding charset adds megabytes of traffic per day (12 bytes per message) for no real benefit. But on the question of content-type... what would be the benefit of application/json? |
Thanks for the test feed—I'll point to it and try to capture a message! Regarding content type: if the message body is JSON, I'd say it's a best practice to set That said, if you're not using the field, I'll update our code to ignore it. Noting the charset in the content type is by no means required and if you're concerned about bandwidth per message, omitting it is reasonable (especially for |
I haven't seen any messages pushed yet, so I can't verify, but in any case we have an experimental |
oops... the feed was down. it is back up now. |
OK, changed the content_type to application/json in master. It will take a few weeks to get into a release, and perhaps a few months to get to production. At some point the messages will just start showing up with the right content_type. |
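For consumers that want to survive the switch from text/plain to application/json, here is a minimal sketch of tolerant content-type handling. It is not taken from sarracenia or any client; the function names and the set of accepted types are hypothetical, and it assumes the body is JSON in both cases and that a missing charset means UTF-8, as discussed above.

```python
import json
from email.message import Message

# Assumption: during the transition a consumer accepts both the old and the new type.
ACCEPTED_TYPES = {"application/json", "text/plain"}


def split_content_type(header):
    """Return (media_type, charset) for a content-type value.

    A missing header is treated as text/plain, and a missing charset
    parameter is treated as utf-8 (both are assumptions of this sketch).
    """
    msg = Message()
    msg["Content-Type"] = header if header else "text/plain"
    return msg.get_content_type(), msg.get_content_charset("utf-8")


def decode_body(body, content_type=None):
    """Decode a message body as JSON, honouring an explicit charset if present."""
    media_type, charset = split_content_type(content_type)
    if media_type not in ACCEPTED_TYPES:
        raise ValueError(f"unexpected content type: {media_type}")
    return json.loads(body.decode(charset))
```

With this shape, messages published before and after the change decode the same way, and a publisher that omits charset costs the consumer nothing.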
Thank you! My branch seems to work now. The only issue I ran into was that the time format changed subtly (the addition of a |
great! |
what would be the protocol at this point, should I close the issue? |
I still need to finish up and merge in my prototype, but I'll go ahead and close the issue once I've done so. Would you mind opening a new issue once v03 is live (if ever)? |
There will certainly be an announcement on the datamart mailing list, and a period of parallel access (both versions available for a month or two), so you will certainly hear about it. The idea of the heads up is to minimize the length of the parallel period. |
Update... why the heck didn't we release this three years ago? I spent a few years working with colleagues at the World Meteorological Organization, hoping to be able to merge the v03 format with what they hoped to produce for pub/sub. It kept sounding like "yes, but..." with tweaks needed here and there, but in the end, last spring, they rejected it wholesale, preferring something more web-service oriented, which has some conflicts with the file transfer that is the focus of sarracenia.
After the split, the format has continued to evolve over the past nine months in the following way: for the high-performance mirroring use case, we need to transport things other than files: file removal events, directory creation, symbolic links, renames... Those were formerly encoded in a conceptual overload of the checksum field, but in versions of v03 since fall 2022 they are represented using a "fileOp" field. The format is shown here: https://metpx.github.io/sarracenia/Reference/sr_post.7.html
There is also a likely removal of optional fields coming: from_cluster and to_clusters will likely be elided, as they have not proven useful in deployments so far. The format will be the default for sr3, a version which has been gradually working towards a stable release for the past year and is looking close. |
Just to let you know: for the past year or so, we have been working on a new message payload format, as a result of limitations in the current one and feedback from some international consultations. The current protocol version is identified by a topic tree that starts with v02.post. Over the next year or two, we may migrate to v03.post. Differences in messages:
AMQP headers are no longer used to store key-value pairs. Instead, the message body is a JSON object. As a result, the formerly anonymous fields in the body of a v02 message are now key-value pairs in that object: pubTime, baseUrl, and relPath.
For fields that are encoded, such as checksums, the encoding is changed to base64 (a more compact representation).
https://github.com/MetPX/sarracenia/blob/master/doc/sr_postv3.7.rst
The format might still evolve slightly (new fields?), but we have done some important deployments of v03, and it is looking solid. No fire... we haven't looked at any migration strategy yet, but nothing will be sprung on consumers suddenly. Figured you would want to know far ahead of time.
I can supply some alternate data streams if you want a sample.
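To make the differences listed above concrete, here is a rough sketch of how a consumer might accept both versions during a period of parallel access by dispatching on the topic prefix. It is not the sarracenia implementation. The function names are hypothetical, the assumption that the three v02 body fields are whitespace-separated is mine, and all sample values are placeholders rather than real feed data or real field formats.

```python
import base64
import json


def parse_post(topic, body):
    """Return a dict with pubTime, baseUrl and relPath for either version."""
    text = body.decode("utf-8")
    if topic.startswith("v03.post"):
        # v03: the body itself is JSON with named fields.
        return json.loads(text)
    if topic.startswith("v02.post"):
        # v02: anonymous positional fields in the body.
        # Assumption: they are separated by whitespace, in this order.
        pub_time, base_url, rel_path = text.split(maxsplit=2)
        return {"pubTime": pub_time, "baseUrl": base_url, "relPath": rel_path}
    raise ValueError(f"unrecognised topic prefix: {topic}")


def decode_checksum(value_b64):
    """Decode a base64-encoded checksum value into raw bytes."""
    return base64.b64decode(value_b64)


# Placeholder messages, for illustration only.
v03_body = json.dumps(
    {"pubTime": "PUBTIME", "baseUrl": "https://example.com/", "relPath": "a/b.txt"}
).encode("utf-8")
v02_body = b"PUBTIME https://example.com/ a/b.txt"

same = parse_post("v03.post.a.b", v03_body) == parse_post("v02.post.a.b", v02_body)
```

Normalising both versions to the same dictionary keeps the rest of a consumer unchanged while the two topic trees coexist.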