YAML.safe_load fails when a string contains a non-existent date #262
Comments
I guess this is because Psych leans on the Date class to determine what is a valid date or not. I'm not really keen on writing my own date validation logic. Do you have a suggestion for how to fix this? |
I took a brief look at the code but I don't understand yet why a valid date will not raise an exception. So it's doing validation somewhere anyway. A workaround is to simply allow the Date class. Another option would be to use a modified tokeniser for safe_load so it only recognises allowed classes and turns the rest into strings. |
I don't follow. A valid date will raise an exception with
Both of these are major changes. I wouldn't be willing to do that in a bugfix release. I think returning strings would be fine, but I'm definitely not in favor of allowing Dates by default. |
I mean this:
So some date parsing is apparently already happening somewhere, I just haven't been able to find it in the code, perhaps we can fix it there. I agree that allowing Date as an extra class in a bug fix release would not be appropriate, but this is a good workaround for people who have a problem and want a quick fix:
|
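The snippet that accompanied this workaround was not preserved in the thread. A minimal sketch of what permitting the Date class can look like follows; it assumes Psych 3.1 or later (where `safe_load` accepts the `permitted_classes:` keyword), and `load_allowing_dates` is a hypothetical helper name, not part of Psych.

```ruby
require 'psych'
require 'date'

# Hypothetical helper (not part of Psych): a safe_load that also permits Date.
# Assumes Psych >= 3.1, where safe_load takes the permitted_classes: keyword.
def load_allowing_dates(yaml_text)
  Psych.safe_load(yaml_text, permitted_classes: [Date])
end

loaded = load_allowing_dates("birthday: 1976-01-02")
# The unquoted scalar is converted, so the value is a Date rather than a String.
puts loaded["birthday"].class
```

Note that this changes what the caller receives: values that look like dates arrive as Date objects, which is the behaviour the maintainers did not want to enable by default.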
When you do
Note the single quotes around the date string in the resulting YAML. Since
The single quotes tell the parser "this is absolutely a string, do not check for other values". The second value isn't ambiguous when dumping the YAML, but is ambiguous when parsing. Maybe that helps? |
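The difference described above can be seen by dumping a String that looks like a date next to a real Date object. This is only an illustrative sketch; the variable names are made up for the example.

```ruby
require 'psych'
require 'date'

# A String that merely looks like a date, and a real Date object.
string_yaml = Psych.dump(['1976-01-02'])
date_yaml   = Psych.dump([Date.new(1976, 1, 2)])

puts string_yaml  # the scalar is written in single quotes
puts date_yaml    # the scalar is written plain, without quotes

# The quoted form is unambiguously a String, so safe_load accepts it
# without Date being permitted.
round_tripped = Psych.safe_load(string_yaml)
puts round_tripped.inspect
```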
Yes, I just realised that. So another avenue would be to fix .to_yaml to always quote strings that look like a date, that's probably much easier to do. Perhaps we should have a .to_safe_yaml that can be used on user supplied input and can be relied upon to only create scalars. Sigh. But it's probably better to move to JSON (which I was hoping to avoid). |
Any news on this? |
What is the problem here? What I'm really after is stop Psych from parsing the date and then dumping back to YAML in a different format. I want to read the YAML, change some values and dump back to YAML. The problem is that once the date is parsed, then dumping back to YAML results in
This is why I tried whether |
I just got hit by this after the hashie library updated and they now call The difference is that at the end of the chain I can work with sanitized data. If I really want a Date object I can do the conversion myself. Or I can use it as a string, but if reading in a user-supplied configuration file leads to a backtrace, then that's it. Game over. Epic fail. Given that a library like hashie is calling this means I have no control over the call to My program doesn't need to have values automagically converted, but asking every user of my library for all eternity to never ever put an unquoted date in a YAML file is impossible. |
We are running into this issue as well. A user provided String that is attempted to be serialized and put into the database is
require 'psych'
original_valid_content = "---\n- >-\n test\n- >-\n 1976-01-02"
original_invalid_content = "---\n- >-\n test\n- >-\n 0000-00-00"
# Works
deserialized_content = Psych.safe_load(original_valid_content)
new_serialized_content = Psych.dump(deserialized_content)
# => "---\n- test\n- '1976-01-02'\n"
new_deserialized_content = Psych.safe_load(new_serialized_content)
# Breaks:
deserialized_content = Psych.safe_load(original_invalid_content)
new_serialized_content = Psych.dump(deserialized_content)
# => "---\n- test\n- 0000-00-00\n"
new_deserialized_content = Psych.safe_load(new_serialized_content)
# => Psych::DisallowedClass: Tried to load unspecified class: Date |
Looking at tenderlove's comment here maybe it might make most sense to be more aggressive with quoting on Ensure that a string that is not a valid date like |
Honestly, I can't think of any drawback. I'd merge a commit that does that. On a different subject, maybe in the future we should add a |
Is there any way around the issue? A date notation like |
If you want to make sure a YAML value is always treated like a string, you can cast it explicitly by prepending
|
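The exact prefix named in the comment above was lost from the thread. As an assumption, the sketch below uses the standard YAML `!!str` tag, which Psych resolves to a plain String; quoting the value has the same effect.

```ruby
require 'psych'

# Assumption: `!!str` is the standard YAML string tag; the thread's own
# example of the prefix was not preserved.
tagged = Psych.safe_load("released: !!str 1976-01-02")
quoted = Psych.safe_load("released: '1976-01-02'")

# Both forms yield a String, so safe_load needs no permitted classes.
puts tagged["released"].class
puts quoted["released"].class
```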
Can anyone explain what is the problem with @najamelan's suggestion?
Users can put quotes around dates themselves, but in many cases that may prevent interoperability between Ruby/Psych-based apps (in my case gollum) and other applications that parse YAML, when the latter do support the parsing of dates. If Psych just returned a string for anything that looks like a date, the user wouldn't need to worry about how to specify their data. |
ping |
Fully agree with @dometto. The YAML spec does support "date" natively: https://yaml.org/spec/1.2.2/
It is clear that quoting or escaping the date is not the correct solution (according to the spec). Is there a plan to resolve this issue? @najamelan's suggestion of "just returns strings for anything that is deemed unsafe to be parsed." seems reasonable as it allows for the user to handle any "unsafe" cases. Thanks! |
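Until Psych itself behaves this way, an application can approximate "return strings for anything that looks like a date" on its own side. The following is a rough sketch, not Psych's behaviour: it permits Date while loading and then converts every Date back to an ISO 8601 string. Both helper names are hypothetical, and Psych 3.1+ and Ruby 2.6+ are assumed.

```ruby
require 'psych'
require 'date'

# Hypothetical helper: walk the loaded structure and turn Dates into Strings.
def stringify_dates(value)
  case value
  when Hash  then value.to_h { |k, v| [stringify_dates(k), stringify_dates(v)] }
  when Array then value.map { |v| stringify_dates(v) }
  when Date  then value.iso8601
  else value
  end
end

# Hypothetical wrapper: a safe_load whose result contains no Date objects.
# Assumes Psych >= 3.1 (permitted_classes: keyword).
def safe_load_dates_as_strings(yaml_text)
  stringify_dates(Psych.safe_load(yaml_text, permitted_classes: [Date]))
end

puts safe_load_dates_as_strings("- test\n- 1976-01-02").inspect
```

The trade-off is that the string is regenerated from the parsed Date, so a value written in a non-canonical form may come back normalised. Full timestamps are not covered, because Time is still not permitted.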
Just a comment about the 1.2.2 spec and the timestamp example: None of the recommended schemas in YAML 1.2 provide a timestamp tag anymore, and the example 2.22 is a leftover that should probably have been removed. |
Thanks @perlpunk for clarifying the spec -- is the YAML way for handling a date/time now a string? |
@ronaldtse yes, in YAML 1.2 it would be loaded as a string, and the app can turn it into a date. |
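If the loader hands back strings, the application-side conversion mentioned above can be small. A sketch, assuming the input is a String in `YYYY-MM-DD` form; `parse_date_or_nil` is a hypothetical helper name.

```ruby
require 'date'

# Hypothetical helper: returns a Date, or nil when the text does not
# describe a real calendar date in YYYY-MM-DD form.
def parse_date_or_nil(text)
  Date.strptime(text, '%Y-%m-%d')
rescue ArgumentError
  # Invalid dates raise ArgumentError (Date::Error on newer Rubies is a subclass of it).
  nil
end

puts parse_date_or_nil('1976-01-02').inspect
puts parse_date_or_nil('0000-00-00').inspect
```

With this split the application, not the YAML loader, decides what to do with an impossible date such as `0000-00-00`.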
This updates both the Dockerfile and .ruby-version to Ruby 3. The only breaking change for us is an issue with Middleman, where dates in the frontmatter cause the following error: `YAML Exception parsing api-catalogue/source/Borders/index.html.md: Tried to load unspecified class: Date` This is an issue with an underlying library: ruby/psych#262 It's fixed on Middleman's main branch, but there is no release for it yet and it's not clear if there will be one. We can work around this and use strings for dates. --- updated-dependencies: - dependency-name: ruby dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]>
Was dealing with a similar date parsing issue in js-yaml: nodeca/js-yaml#477, and I'm sorry to disagree with @ronaldtse, but the spec is definitely impossibly ambiguous about those timestamp examples and we can't simply use RFC 3339 (or even a faithfully minimal extension of it) to support those examples. To quote myself:
I think that a well specified standard would require not just that the date-time match the ABNF, but that it be a valid date. If we deferred to RFC3339, it says:
Which pretty clearly implies Personally, I think the horses have left the barn and the spec should suggest a reasonable middle ground. Most YAML parsers have date parsing, it's acting differently everywhere, both inbound and outbound. We create standards to avoid this (well, in theory ;-). |
It works for me http://sundivenetworks.com/archive/2021/tried-to-load-unspecified-class-time-psych-disallowedclass.html |
YAML.safe_load will raise an exception when you try to load text that happens to contain a sequence of numbers that looks like a date but is not:
Using YAML.load instead of safe_load works fine and text that contains a correct date works fine too. But this can be used to raise an exception on any application that uses YAML.safe_load on user provided text (accidentally or otherwise)
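Whatever fix is chosen, an application that feeds user-provided text to `safe_load` can at least avoid crashing by rescuing Psych's errors. A defensive sketch follows; `load_user_yaml` is a hypothetical wrapper, not part of Psych.

```ruby
require 'psych'

# Hypothetical wrapper: returns [data, nil] on success, [nil, message] on failure.
def load_user_yaml(text)
  [Psych.safe_load(text), nil]
rescue Psych::Exception => e
  # Psych::DisallowedClass and Psych::SyntaxError both descend from Psych::Exception.
  [nil, "#{e.class}: #{e.message}"]
end

data, error = load_user_yaml("when: 1976-01-02")
puts error if error
```

This only contains the failure. It does not make the date-like value loadable, so the user still sees an error for input that looks harmless.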