Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML security #22

Open
manning-ncsa opened this issue Feb 18, 2021 · 5 comments
Open

XML security #22

manning-ncsa opened this issue Feb 18, 2021 · 5 comments

Comments

@manning-ncsa
Copy link

First of all, thanks for creating and sharing this Python package. While exploring the code and beginning to write our own handler function, I noticed that according to the Python docs, there are XML vulnerabilities that should probably be secured using something like defusedxml. Have you considered using this package to parse the incoming packets?

@lpsinger
Copy link
Member

We use lxml, not the Python standard library's parser. But I do see in the defusedxml README that it has some vulnerabilities. We do pass the lxml.etree document root to the user's callback function, so switching to a different parser would be an API change.

Which of the mitigations in defusedxml can we apply without switching parsers? Which is lxml permanently vulnerable to?

@manning-ncsa
Copy link
Author

Unfortunately I do not have experience nor expertise in XML parsing and vulnerability mitigation. I do see that the defusedxml.lxml module is deprecated and is intended as an example of how you can mitigate certain vulnerabilities:

DEPRECATED The module is deprecated and will be removed in a future release.

The module acts as an example how you could protect code that uses lxml.etree. It implements a custom Element class that filters out Entity instances, a custom parser factory and a thread local storage for parser instances. It also has a check_docinfo() function which inspects a tree for internal or external DTDs and entity declarations. In order to check for entities lxml > 3.0 is required.

parse(), fromstring() RestrictedElement, GlobalParserTLS, getDefaultParser(), check_docinfo()

The relevant code seems to revolve around this function: check_docinfo()

I assume the actual risk is low, considering that the organizations generating the VOEvent notices are unlikely to send malicious XML documents, but perhaps someone that has the time and energy to tackle this will see this issue and make it more secure.

@lpsinger
Copy link
Member

I assume the actual risk is low, considering that the organizations generating the VOEvent notices are unlikely to send malicious XML documents, but perhaps someone that has the time and energy to tackle this will see this issue and make it more secure.

The problem is that GCN has no protection against MITM attacks, because it does not have any message authentication. The network traffic is sent in the clear and without signatures of any kind. An attacker could inject maliciously crafted VOEvent packets.

@lpsinger
Copy link
Member

@manning-ncsa, would you like to submit a PR to implement the mitigation described in that check_docinfo() function?

@manning-ncsa
Copy link
Author

The problem is that GCN has no protection against MITM attacks, because it does not have any message authentication. The network traffic is sent in the clear and without signatures of any kind. An attacker could inject maliciously crafted VOEvent packets.

😲

@manning-ncsa, would you like to submit a PR to implement the mitigation described in that check_docinfo() function?

I suppose in light of your previous comment I have more of a self-interest in doing so. If I do implement anything like that, I will definitely submit a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants