-
Notifications
You must be signed in to change notification settings - Fork 672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Standard grpc mTLS #3909
base: change-permissions-on-instances
Are you sure you want to change the base?
Standard grpc mTLS #3909
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## change-permissions-on-instances #3909 +/- ##
===================================================================
+ Coverage 89.11% 89.19% +0.07%
===================================================================
Files 256 256
Lines 14643 14684 +41
===================================================================
+ Hits 13049 13097 +48
+ Misses 1594 1587 -7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
5957e41
to
c26c20c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick pass with some nitpicks, didn't test functionally yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except for one small nit.
EDIT: Updated review for the new code changes.
1b18395
to
2108790
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, @georgeliao! So now, in the local environment, the GUI works fine both when upgrading from the old certificates and when starting with no certificates. But in the case of the CLI, it works when starting with no certificates, but if I upgrade from old certificates, I get the following error
E0211 14:23:49.406977684 36809 ssl_transport_security.cc:1316] Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.
And in the snap, both when starting fresh or when upgrading, I get the following
[error] [client] Caught an unhandled exception: failed to open file '/var/snap/multipass/commondata/multipassd/certificates/multipass_root_cert.pem': No such file or directory(2)
About this root certificate is unsync with server certificate issue in development environment, maybe we can add a check at here, which checks not only the existence of the certificates but also whether the root certificate is the one who signed the server certificate. If not, everything will be re-created. This check can be done by openssl c-api. However, not sure the juice worth the squeeze.
This is fixed in the latest version of the PR. |
Yeah, for the dev environment, I don't think it's worth doing the effort. We already have to remove stuff manually, so this is just an extra thing to remove. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, now both the CLI and the GUI work as expected, both locally and in the snap, both when upgrading or installing fresh. Thanks, @georgeliao!
Now, one thing to discuss with everyone: the purpose of these changes was so that we could remove our grpc patches, and that goal is accomplished by this PR. And now the CLI verifies the server certificate, which couldn't be avoided without the patches. But the GUI can avoid verifying the server certificate, using the plain vanilla grpc dart library. My question is: should the GUI also verify the server certificate, to be in line with the CLI, or should we keep the existing, functioning behavior of not having the GUI verify the server certificate?
As discussed elsewhere, let's have the GUI also adhere to the new scheme, preferably in a separate PR. |
291dc64
to
10ecc25
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @georgeliao, lots of good work in here!
I am only doing a tertiary review in this case, so I glossed over most of the scary low-level certificate stuff. @andrei-toterman and @xmkg, we're relying on you on that front 💪 If you could provide assurance that you've verified sanity of all the nitty-gritty, that would be much appreciated. All of this would be for nothing if security were somehow broken on the foundation...
Other than that, I have some proposals for path derivation and the initialization. Let me know what you think.
ac0e4ee
to
ee42308
Compare
…ter instead of raw pointer
…an alias on unique pointer type.
72123e8
to
b277f18
Compare
include/multipass/utils.h
Outdated
} | ||
else | ||
{ | ||
if (result <= 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this check will depend on the C API being used. Some of the C API's use 0 to indicate success. We can decouple the checker implementation to have more flexibility:
// tag type to distinguish checker type
struct checker_base{};
// generic enough, can be placed in utils
struct nullptr_checker : private checker_base {
bool operator()(const void* v){
return v == nullptr;
}
};
// openssl specific checker, better to be placed in a more localized file
struct ossl_checker : private checker_base{
bool operator()(int v){
return v == 1;
}
};
template <typename CheckerT = nullptr_checker, typename T>
auto check(T result, const std::string& errorMessage) -> typename std::enable_if<std::is_base_of_v<checker_base, CheckerT>, void>::type
{
if(!CheckerT{}(result)){
throw std::runtime_error{errorMessage};
}
}
usage:
int test(){
return 5;
}
void* test_2(){
return nullptr;
}
int main(void){
check<ossl_checker>(test(), "");
check<>(test_2(), "failure");
// check<int>(test(), ""); -> wont compile
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd also be good for troubleshooting if we could incorporate the error code into the thrown error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error code can be added. However, about custom checker injection. It becomes that the user have to specify the checker even within the openssl c-api return pointer or integers cases. That defies one of the purpose of the utility function, abstraction. For now, I will move the utility function into ssl_cert_provider.cpp
as a dedicated openssl c-api checker instead of trying to be a global c-api checker. Making a general checker for all c-apis (openssl, hyperv, libssh) can be done in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a very good concern you brought up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It becomes that the user have to specify the checker even within the openssl c-api return pointer or integers cases
We can create local aliases for such cases, but the gist of it is that we have to leave a customization point for the condition (e.g., a predicate or a dedicated type) if we want it to be a global utility, so different API result codes can be accommodated too.
For now, I will move the utility function into
ssl_cert_provider.cpp
as a dedicated openssl c-api checker instead of trying to be a global c-api checker.
I'm OK with that, we can migrate to the global one later if we decide to implement one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... or we can adopt a rather simple approach:
struct PreconditionFailureError : std::runtime_error {
using std::runtime_error::runtime_error;
};
void expects(bool what, const std::string& failure_msg) {
if (!what) {
throw PreconditionFailureError(failure_msg);
}
}
int test() { return 5; }
int main(void) {
expects(test() == 4, "test failed.");
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that definitely can be a candidate for a global C-api checker.
….cpp
bd1143f
to
f17700d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @georgeliao, I've reviewed the permissions part and it looks mostly good to me, provided you're in sync with @Sploder12 on it. Only a couple of things inline
src/daemon/daemon_config.cpp
Outdated
@@ -109,7 +109,7 @@ std::unique_ptr<const mp::DaemonConfig> mp::DaemonConfigBuilder::build() | |||
auto multiplexing_logger = std::make_shared<mpl::MultiplexingLogger>(std::move(logger)); | |||
mpl::set_logger(multiplexing_logger); | |||
|
|||
MP_PLATFORM.setup_permission_inheritance(); | |||
MP_PLATFORM.setup_permission_inheritance(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why keep a line of code that has no effect though?
Also, couldn't we keep this true and override selectively?
src/daemon/daemon_config.cpp
Outdated
MP_PLATFORM.set_permissions(storage_path.toStdU16String(), fs::perms::owner_all | fs::perms::others_exec); | ||
} | ||
else | ||
{ | ||
MP_PERMISSIONS.restrict_permissions(data_directory.toStdU16String()); | ||
MP_PLATFORM.set_permissions(data_directory.toStdU16String(), fs::perms::owner_all | fs::perms::others_exec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens to the "takeown" part of the replaced calls? Is it no longer necessary? is it applied somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not get this question. What is "takeown" part ? You mean setting owner only permission?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The calls to restrict_permissions
, which are now replaced, used to take ownership of the files (beside setting permissions). I wonder if you really meant to skip that and, if so, why. Do we no longer need it, or are we doing the same somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, now I get the question. That is a very good point got brought up. I am also confused the fact that why we need it in the first place. Were all files and folders in the storage location owned by root user already? @Sploder12 Maybe you can share some context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If multipassd created the files and folders, they'd be owned by the root user already. So it's usually redundant and probably unneeded
Co-authored-by: Ricardo Abreu <[email protected]> Signed-off-by: George Liao <[email protected]>
…p the certificates folder and cert files.
…ts sub-folder." This reverts commit 26415bc. Revert "[qemu platform] enforce ower_all permission to network sub-folder." This reverts commit 6a1879f. Revert "[open ssh] enforce permission to ssh-keys sub-folder." This reverts commit b2afed4. Revert "[vault] enforce ower_all permission to vault sub-folder." This reverts commit 1ad954d. Revert "[daemon] enforce ower_all permission to multipassd-vm-instances.json file." This reverts commit 72f55a9.
Step1: recursively restrict everything before the creation of certificates directory, opens up the data directory only for other user read. Step 2: create certificate directory and certificates with the right permission. Or overwrite the permissions in the case of existing certificates directory and certificate files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍 Good work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, looks good to me, modulo a small request inline. Thanks Jia!
src/cert/ssl_cert_provider.cpp
Outdated
const std::filesystem::path cert_path_std = cert_path.toStdU16String(); | ||
MP_PLATFORM.set_permissions(cert_path_std, | ||
std::filesystem::perms::owner_all | std::filesystem::perms::others_read); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add group_all
just to be safe. Otherwise, a user who happened to be in the group but wasn't root would be unable to read, I think.
The same reasoning would apply in all other cases where you're giving others permissions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add group_all just to be safe. Otherwise, a user who happened to be in the group but wasn't root would be unable to read, I think.
I did not get this part. Whether the user is in the group or not, he is still the other user and std::filesystem::perms::others_read
guarantees that he can read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking into some literature, I think I have a misconception and the group read permission can be added.
1a67a88
to
180db32
Compare
45a6b64
to
b4bfdce
Compare
…ssing root certificate.
MULTI-1765
Public side of https://github.com/canonical/multipass-private/pull/719
A few things to note
The
make_cert_key_pair
function originally handles the generation of both server and client key-certificate pairs . Theserver_name
parameter determines the use case, whenserver_name
is an empty string, it indicated the client key-certificate case. In the standard gRPC TLS scheme, the root key-certificate pair is added.Now, the role of the
make_cert_key_pair
function is still generating and holding server and client key-certificate pairs. However, the server certificate has evolved into a signed server certificate. As the result, the server side branch must first generate root key-certificate pair and use it to sign the newly created server certificate.Previously the constructor of
X509Cert
only handled server and client certifcate generation, distinguishing between them based on whetherserver_address
was empty. Now, with the introduction of root, server and client certifcate, the dispatch is managed by using theCertType
enumeration.Additionally, the
X509Cert
constructor has been refined to generate certificates in a standard format. Key differences between before and after include adjustments to the serial number format and the inclusion of X509v3 extensions. To inspect a certificate, you can use the following command:openssl x509 -in <cert_path> -noout -text
The certificate paths are as follows:
/root/.local/share/multipassd/certificates/localhost.pem
/home/<user name>/.local/share/multipass-client-certificate/multipass_cert.pem
/usr/local/share/ca-certificates/multipass_root_cert.pem
Snap environment considerations
In the snap environment, the root certificate is stored at
/var/snap/multipass/common/data/multipassd/certificates/multipass_root_cert.pem
Unlike
/root/
,/var/snap/
directory allows other users view its files, making this setup feasible.Certificate regeneration and migration
The root certificate and server key-certificate pairs area automatically regenerated if either is missing. This mechanism ensures a smooth server key-certificate pair migration when updating Multipass. Upon upgrading, the server startup process will automatically generates root key-certificate pair and use it to sign a fresh server certificate. Consequently, the original server key-certificate pair will be overwritten, enabling successful verification under the new standard grpc TLS.
The Multipass upgrade process should be included in the functional testing as well, both the cmd and gui clients should be tested.
Unit test adaptations
The unit tests have been modified to accommodate changes in the gRPC TLS verification process. The key adjustments include:
get_root_cert_path
to allow usage of string-based certificates.MockCertProvider
being used on both server side and client to provide key-certificate pair. In the unit testing environment, they can be the same.Following the commit history and messages is also a helpful way to understand changes that have been made.