[CLI] Refactor all CLI entrypoints to use e.g. Clap #4045
Clap is chock-full of unsafe code, and popularity is not code correctness.
Not urgent, but as I've mentioned before, I urge core devs to document the guidelines for what sort of dependencies are appropriate to include in the Stacks blockchain, the criteria by which any new dependencies shall be judged, etc. Right now, for the most part, it feels like people make suggestions in good faith and they are unilaterally shut down by Jude. Jude, your disdain for unsafe code is well-known. And sure, popularity might not be a benchmark for code correctness. However, there are equally valid counters to these arguments (I won't rehash them here, but Aaron and others have made several of them in the libp2p discussion and many other discussions related to it). I do not believe you speak for all core devs in this matter.
Rust's core libraries are also full of unsafe code... While popularity isn't a measure of code correctness, it is indicative of the number of extra eyes which have been on the project's codebase, increasing the likelihood of potentially dangerous usages of unsafe code being discovered and corrected (if not from the sheer userbase finding and reporting undefined behavior). I gave Clap as an example due to its popularity and acceptance; however, there are a number of other alternatives. EDIT: I just had a quick look through Clap's codebase and all but one of their libraries actually forbid unsafe code at the crate level.
Yes, as is libc, the OS kernel, and the chipset firmware atop which the blockchain runs. However, we don't have a realistic alternative but to use these. Specific to the Rust core libraries, a plethora of the data types they support cannot be expressed in safe Rust code in the first place, because the type system itself is simply not expressive enough. So we're kinda stuck with it :P However, we are not stuck with gratuitous unsafe code.
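For a concrete example of that last point (a simplified re-implementation of a standard-library routine, not the actual std source): the borrow checker cannot prove that two mutable halves of the same slice are disjoint, so even a std-style API like `split_at_mut` needs an `unsafe` block internally.

```rust
// Simplified sketch of a std-style API; the real std implementation differs.
fn split_at_mut<T>(slice: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = slice.len();
    assert!(mid <= len);
    let ptr = slice.as_mut_ptr();
    // SAFETY: the ranges [0, mid) and [mid, len) never overlap,
    // and both stay inside the original allocation.
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut data = [1, 2, 3, 4];
    let (left, right) = split_at_mut(&mut data, 2);
    left[0] = 10;
    right[0] = 30;
    assert_eq!(data, [10, 2, 30, 4]);
}
```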
I used to believe this, until Heartbleed was discovered in OpenSSL, which is far more widely used and gets way more eyeballs than any cryptocurrency project (as measured by usage). More generally, I don't think the question of whether or not more eyeballs increases the likelihood of discovering catastrophic safety failures like Heartbleed is supported by data -- I don't think this question has ever been rigorously and systematically studied across the open source landscape (but please share papers if you know of any!). Of particular interest would be learning how the discovery rate of safety failures relates to each additional marginal eyeball. In the absence of this data, we need to instead focus on the correctness of the code itself.
Yes -- it's `clap_lex`. Maybe I'll fork it.
@jcnelson I'm happy to work with you to remove gratuitous unsafe code from popular rust crates. I'm really good at getting stuff merged upstream; I have a number of commits in openssl and dalek. So if you can make a fork, I'll do the work to get the upstream PR through the gauntlet.
Yeah, I was just making the obvious point that unsafe code (in general) is rather unavoidable. Heartbleed is a bit different than unsafe in Rust IMO, as OpenSSL is written in an inherently unsafe language where a simple bounds check can easily go unnoticed, whereas unsafe usage in Rust is very explicit... so while I don't have any data to back it up, I do believe that with popularity/usage the number of people like yourself who raise critique of gratuitous usages of unsafe also increases, since they are so easy to spot in Rust, which I'd like to hope leads to a reduction in such occurrences - but that's just an optimistic theory ;) Clap's unsafe usage is confined to its `OsStr` handling in `clap_lex`.

But back to the issue and question at hand - the CLIs and Clap - I believe that there is room for risk-based pragmatism here, as we're talking about usage at invocation of the program, where any "untrusted input" is coming from something/someone who more than likely has some sort of elevated authorization. So before going and rewriting 3rd-party libraries, I think the question we should answer first is "what are we trying to protect [at this stage of program execution]?".
I just checked out the current `clap_lex` code. The unsafe code is all in its `OsStr` handling. Looking at the comments, all of this unsafe exists just to convert between `OsStr` and raw bytes. Specifically, they define a trait with separate Unix and Windows implementations of those conversions, so making the Unix side safe looks straightforward.

We really, really should fork and fix this. It would be less effort than all of the back-and-forth, and an overall contribution to the open source community at large. If the maintainers won't take the fix upstream, we can carry our own fork.
@xoloki I was just about to reply to your last comment and you beat me to the punch 👊 I'm glad to hear that making the code safe is so trivial! In the unlikely case that upstream won't accept it, I'd definitely be in favor of maintaining a fork.
@cylewitruk I think @xoloki's discovery addresses the immediate problem at hand, but I just wanted to respond to your point about risk-based pragmatism, because it underpins my overall disdain for gratuitous unsafe code:
I don't think there is any such thing as "risk-based pragmatism," because it only takes a single out-of-bounds write to lead to arbitrarily bad consequences. Out-of-bounds writes don't happen in isolation -- it doesn't matter how "early" in the program execution it happens, because the write can affect how other subsystems in the process work down the road. A clever person would weaponize an out-of-bounds write in the argument parser to trigger other out-of-bounds writes in the code that executes afterwards, which in turn could be weaponized to overwrite parts of the program's internal state to change its behavior.

For example, the argument parser could overwrite the region in memory that corresponds to the string containing the path where blocks get stored (since this is "close to" the argument parser state in memory), which in turn would enable the attacker to deposit blocks on the victim's computer in an attacker-controlled location (such as, perhaps, the "$HOME/bin" directory). Then, the attacker could mine a block that contains shell code, and cause your node to store that shell code in a place where it could be executed. This is all hypothetical of course, but the point is that attackers often chain together seemingly-unrelated out-of-bounds reads and writes to attack the host computer. Therefore, we must prevent them at all costs -- we can't predict in advance how they'll be exploited!
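To make the chaining idea concrete, here is a contrived simulation written in safe Rust; the field layout, sizes, names, and paths are all invented for illustration and have nothing to do with the actual Stacks codebase. An unchecked copy of "argument" bytes spills into an adjacent "blocks path" field that happens to live next to it in memory.

```rust
// Contrived simulation only: two logically separate fields share one buffer,
// and a length-unchecked copy of parser input clobbers the neighboring field.
struct NodeState {
    // bytes 0..16: last parsed argument, bytes 16..32: configured blocks path
    arena: [u8; 32],
}

fn main() {
    let mut state = NodeState { arena: [b' '; 32] };

    // The node's blocks path lives in the second half of the arena.
    state.arena[16..27].copy_from_slice(b"/var/blocks");

    // A buggy parser copies attacker-controlled input into the first half
    // without checking that it fits in the 16-byte argument field...
    let attacker_arg = b"--blocks-dir=xyz/home/a/bin"; // 27 bytes, not 16
    state.arena[..attacker_arg.len()].copy_from_slice(attacker_arg);

    // ...and the tail of the "argument" has silently replaced the path.
    let path = String::from_utf8_lossy(&state.arena[16..27]);
    println!("blocks will be written to: {path}"); // prints /home/a/bin
}
```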
So it's actually a pretty simple fix, then.
Do you think it's that simple? Looks like the Unix version of the trait would work pretty much ootb, but the Windows version is a little more convoluted in that it requires dealing with the platform's internal `OsStr` encoding.

Tbh this isn't high-prio, we've all got a lot on our to-do lists, so I would start with simply creating a ticket explaining the above and seeing what they say. But I would be all for a PR if they don't agree or want to tackle it :)
Yeah, I do understand your point and position and I don't disagree with anything you're saying - but realistically, following this mindset, I shouldn't even dare turn on my computer ;) Unsafe doesn't necessarily mean unsound, just as safe doesn't necessarily mean sound. I am probably more likely to introduce a critical bug in safe Rust than in unsafe Rust - I can still accidentally index out of bounds and crash a node (or even the network, in a worst-case bug in p2p). What I mean is that the safety "guarantees" of Rust can give a false sense of security as well.

But I'm curious what you see as the big risk in the scenario you wrote above? I mean, I already have access to an executable environment and most probably also the config with key info - if following any sort of best practice and running the node sandboxed/containerized, then I have a hard time imagining what more damage I could do via the long-winded example exploiting the Stacks node's CLI vs. via the shell. Note that I have a different stance when it comes to untrusted input from unknown sources (such as network peers).
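As a trivial illustration of that point (a hypothetical snippet, not code from the node): perfectly safe Rust can still take the process down on malformed input. The failure is contained to a panic rather than memory corruption, but it is still a bug.

```rust
// Hypothetical snippet: safe Rust cannot corrupt memory here, but an
// out-of-range index still panics and kills the process.
fn main() {
    let args: Vec<String> = std::env::args().collect();
    let subcommand = &args[1]; // panics with "index out of bounds" if missing
    println!("running subcommand: {subcommand}");
}
```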
@cylewitruk as you noted, it would be trivial to get a safe unix clap_lex using conditional compilation, but the windows version would still be hand waving away some safety guarantees. If you read the comments in the existing code though, that's what they're doing now. It just requires unsafe code in the unix path too, which frankly is what we care about anyway. |
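A rough sketch of the conditional-compilation idea being discussed (the function name and shape here are invented, not clap_lex's actual API): on Unix, std already exposes safe byte access to `OsStr`, so an unsafe path would only be needed on other platforms.

```rust
use std::ffi::OsStr;

/// Illustrative only: safe byte access to an OsStr where the platform allows it.
#[cfg(unix)]
fn os_str_bytes(s: &OsStr) -> Option<&[u8]> {
    use std::os::unix::ffi::OsStrExt; // safe on Unix: OsStr is just bytes
    Some(s.as_bytes())
}

#[cfg(not(unix))]
fn os_str_bytes(s: &OsStr) -> Option<&[u8]> {
    // Elsewhere (e.g. Windows, where OsStr is WTF-8 internally), only valid
    // UTF-8 can be viewed as bytes without reaching for unsafe code.
    s.to_str().map(str::as_bytes)
}

fn main() {
    let arg = OsStr::new("--config=node.toml");
    assert_eq!(os_str_bytes(arg), Some("--config=node.toml".as_bytes()));
}
```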
@xoloki @jcnelson just FYI rust-lang/rust#109698 (merged PR in rust-lang) was written by an author of Clap based on their work with Clap: clap-rs/clap#5209 (comment)
I'm well aware :) However, in memory-unsafe languages, a whopping 70% of bugs are memory-safety issues [1] [2]. This is independently validated by both Microsoft and Google -- the former over a 12-year period. This finding also seems to extend to `unsafe` code in the Rust ecosystem [3].
Again, memory safety errors are often chained together and weaponized in ways that are very hard to predict, with arbitrarily bad consequences. That's not a risk I'm keen to force users to take on if it can be avoided.
First, not everyone does, and I'd rather put in a little bit more effort on the developer side of things to meet users where they are instead of requiring users to become opsec experts overnight. Second, why should we defer securing the host (and its private keys) to the sandbox when the programming language can do so for us at compile-time? The latter is far more robust. Third, these sandboxes / containers are not nearly as secure as you seem to think [4].

[1] https://www.zdnet.com/article/microsoft-70-percent-of-all-security-bugs-are-memory-safety-issues/
[2] https://www.chromium.org/Home/chromium-security/memory-safety/
[3] https://dl.acm.org/doi/10.1145/3477132.3483570
[4] https://www.container-security.site/attackers/container_breakout_vulnerabilities.html
While spending time in the code I've noticed that (I think) all CLI entrypoints are using manual parsing of command line arguments.
The code could be much cleaner and the CLIs more uniform if we refactored these to use a CLI argument-parsing library, such as clap (155M downloads, one of the most popular libraries in the Rust ecosystem).
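For illustration, a minimal sketch of what one entrypoint could look like with clap's derive API (assuming `clap = { version = "4", features = ["derive"] }`; the binary name, subcommands, and flags below are invented, not the node's actual CLI surface):

```rust
// Sketch of a clap-derive entrypoint; names are placeholders for illustration.
use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(name = "stacks-node", version, about = "Run or inspect a Stacks node")]
struct Cli {
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Start a node from a TOML configuration file.
    Start {
        #[arg(long)]
        config: std::path::PathBuf,
    },
    /// Parse the configuration, print it, and exit.
    CheckConfig {
        #[arg(long)]
        config: std::path::PathBuf,
    },
}

fn main() {
    let cli = Cli::parse();
    match cli.command {
        Command::Start { config } => println!("starting with {}", config.display()),
        Command::CheckConfig { config } => println!("checking {}", config.display()),
    }
}
```

The derive API keeps the argument definitions next to the types they populate, which is where most of the uniformity win would come from.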
Not high prio, adding this as a placeholder so it exists and people can comment.