Stack Overflow (72997432, 73012922) #107
Comments
Thanks! I think it should be noted that only the Loader API is affected, because it uses recursion. The Event API seems to be ok. Personally I'm not yet familiar enough with C and stack overflows, so I'm not sure if this happens simply because the recursion is too deep, or the recursive functions accidentally use too much memory and could be improved.
Thanks for looking at this issue! There's a discussion of a similar issue in a JSON parser, which covers some possible solutions (and their drawbacks) for recursion-based parsers: nlohmann/json#832. As discussed there, a limit on recursion depth that library users can configure (ideally with a sane default, and with an option to disable the limit for those who prefer the old behavior) seems to strike a nice balance between security and functionality.
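As a rough illustration of the configurable limit suggested above, here is a minimal sketch in C. It is not libyaml code: `check_nesting`, `struct nest_parser` and the `NEST_*` codes are hypothetical names, and the toy grammar only understands `[` and `]`. A `max_depth` of 0 is assumed to mean "no limit", which keeps the old unbounded behavior available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch (not part of libyaml). */
enum nest_status { NEST_OK = 0, NEST_TOO_DEEP, NEST_UNBALANCED };

/* Hypothetical parser state; max_depth == 0 disables the limit. */
struct nest_parser {
    const char *pos;
    size_t max_depth;
};

/* Recursively consumes one bracketed group; p->pos must point at '['. */
static enum nest_status parse_group(struct nest_parser *p, size_t depth)
{
    assert(*p->pos == '[');
    /* Refuse to recurse further once the configured limit is exceeded. */
    if (p->max_depth != 0 && depth > p->max_depth)
        return NEST_TOO_DEEP;
    p->pos++; /* consume '[' */
    while (*p->pos != '\0' && *p->pos != ']') {
        if (*p->pos == '[') {
            enum nest_status st = parse_group(p, depth + 1);
            if (st != NEST_OK)
                return st;
        } else {
            p->pos++; /* skip any non-bracket character */
        }
    }
    if (*p->pos != ']')
        return NEST_UNBALANCED; /* input ended inside a group */
    p->pos++; /* consume ']' */
    return NEST_OK;
}

/* Checks the whole input; returns the first problem found, if any. */
enum nest_status check_nesting(const char *input, size_t max_depth)
{
    struct nest_parser p = { input, max_depth };
    while (*p.pos != '\0') {
        if (*p.pos == '[') {
            enum nest_status st = parse_group(&p, 1);
            if (st != NEST_OK)
                return st;
        } else if (*p.pos == ']') {
            return NEST_UNBALANCED; /* closing bracket with no opener */
        } else {
            p.pos++;
        }
    }
    return NEST_OK;
}
```

With this shape, a caller handling untrusted input could pass a small limit, while existing callers could pass 0 and see no change. The recursion itself is still there, so the limit only bounds stack use rather than removing it.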
The document loading API (yaml_parser_load) was susceptible to a stack overflow issue when given input data which opened many mappings and/or sequences without closing them. This was due to the use of recursion in the implementation.
With this change, we avoid recursion, and maintain our own loader context stack on the heap. The loader context contains a stack of document node indexes. Each time a sequence or mapping start event is encountered, the node index corresponding to the event is pushed to the stack. Each time a sequence or mapping end event is encountered, the corresponding node's index is popped from the stack.
The yaml_parser_load_nodes() function sits on the event stream, issuing events to the appropriate handlers by type. When an event handler function constructs a node, it needs to connect the new node to its parent (unless it's the root node). This is where the loader context stack is used to find the parent node. The way that the new node is added to the tree depends on whether the parent node is a mapping (with a yaml_node_pair_t to fill), or a sequence (with a yaml_node_item_t).
Fixes: yaml#107
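The push/pop scheme described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the actual patch: `sketch_event`, `index_stack` and `load_nodes` are hypothetical names, events are reduced to a bare enum, and instead of filling a yaml_node_pair_t or yaml_node_item_t the sketch only records each node's parent index (0 for the root).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified event kinds (the real events carry more data). */
enum sketch_event { EV_SCALAR, EV_SEQ_START, EV_SEQ_END, EV_MAP_START, EV_MAP_END };

/* Growable stack of node indexes, kept on the heap. */
struct index_stack {
    int *items;
    size_t len, cap;
};

/* Pushes an index, growing the buffer as needed; returns 0 on failure. */
static int stack_push(struct index_stack *s, int index)
{
    if (s->len == s->cap) {
        size_t cap = s->cap ? s->cap * 2 : 16;
        int *items = realloc(s->items, cap * sizeof *items);
        if (items == NULL)
            return 0;
        s->items = items;
        s->cap = cap;
    }
    s->items[s->len++] = index;
    return 1;
}

/*
 * Walks the events in a loop (no recursion). Nodes get 1-based indexes in
 * creation order; parents[i - 1] receives the parent index of node i, or 0
 * for the root. parents must have room for `count` entries.
 * Returns the number of nodes, or -1 on allocation failure or when start
 * and end events do not balance.
 */
int load_nodes(const enum sketch_event *events, size_t count, int *parents)
{
    struct index_stack ctx = { NULL, 0, 0 };
    int next_index = 1;
    int ok = 1;

    assert(events != NULL && parents != NULL);
    for (size_t i = 0; ok && i < count; i++) {
        switch (events[i]) {
        case EV_SEQ_END:
        case EV_MAP_END:
            /* Leaving a collection: pop its index from the context stack. */
            if (ctx.len == 0)
                ok = 0;
            else
                ctx.len--;
            break;
        default: {
            /* New node: its parent is whatever is on top of the stack. */
            int index = next_index++;
            parents[index - 1] = ctx.len ? ctx.items[ctx.len - 1] : 0;
            if (events[i] != EV_SCALAR)
                ok = stack_push(&ctx, index); /* entering a collection */
            break;
        }
        }
    }
    if (ctx.len != 0)
        ok = 0; /* some collection was never closed */
    free(ctx.items);
    return ok ? next_index - 1 : -1;
}
```

Because nesting depth now only grows a heap buffer, very deep input either succeeds or fails cleanly on allocation, instead of exhausting the call stack.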
Thanks! This was fixed by #127 and released in 0.2.3
This patch updates libyaml to a version that contains the stack overflow fixes [1]. The original patch is the upstream PR "Avoid recursion in the document loader" [2].
1. yaml/libyaml#107
2. yaml/libyaml#127
NO_DOC=internal NO_TEST=internal
This patch updates libyaml to a version that contains the stack overflow fixes [1]. The original patch is the upstream PR "Avoid recursion in the document loader" [2].
1. yaml/libyaml#107
2. yaml/libyaml#127
NO_DOC=internal NO_TEST=internal
(cherry picked from commit 6aa30d0)
Hello YAML team,
As part of our fuzzing efforts at Google, we have identified an issue affecting
YAML (tested with master at revision 01f3a87).
To reproduce, we are attaching Dockerfiles which compile the project with
LLVM, taking advantage of the sanitizers that it offers. More information about
how to use the attached Dockerfile can be found here:
https://docs.docker.com/engine/reference/builder/
TL;DR instructions:
mkdir project
cp Dockerfile.YAML /path/to/project/Dockerfile
docker build --no-cache /path/to/project
docker run -it image_id_from_docker_build
From another terminal, outside the container:
docker cp /path/to/attached/reproducer running_container_hostname:/fuzzing/reproducer
(reference: https://docs.docker.com/engine/reference/commandline/cp/)
And, back inside the container:
/fuzzing/repro.sh /fuzzing/reproducer
Alternatively, and depending on the bug, you could use gcc, valgrind or other
instrumentation tools to aid in the investigation. The sanitizer error that we
encountered is here:
And:
Without some kind of depth limit (even if the default behavior is "unbounded"),
library users parsing untrusted YAML files need workarounds such as
preprocessors that reject invalid payloads or separate processes that
contain the impact of a crash, or they must avoid parsing untrusted
payloads altogether.
We will gladly work with you so you can successfully confirm and reproduce this
issue. Do let us know if you have any feedback surrounding the documentation.
Once you have reproduced the issue, we would appreciate learning your expected
timeline for an update to be released. With any fix, please attribute the report
to "Google Autofuzz project".
We are also pleased to inform you that your project is eligible for inclusion in
the OSS-Fuzz project, which can provide additional continuous fuzzing, and we
encourage you to investigate integration options.
Don't hesitate to let us know if you have any questions!
Google AutoFuzz Team
artifacts_72997432.zip
artifacts_73012922.zip