Vector leaks SSL_CERT_FILE and SSL_CERT_DIR env variables to exec processes #18103
Comments
As a proof of concept, I implemented approach (3) above:
diff --git a/src/app.rs b/src/app.rs
index 63d2a53bb..c7413270d 100644
--- a/src/app.rs
+++ b/src/app.rs
@@ -387,7 +387,9 @@ impl FinishedApplication {
 }
 
 pub fn init_global() {
-    openssl_probe::init_ssl_cert_env_vars();
+    if !openssl_probe::has_ssl_cert_env_vars() {
+        openssl_probe::init_ssl_cert_env_vars();
+    }
 
     #[cfg(not(feature = "enterprise-tests"))]
     metrics::init_global().expect("metrics initialization failed");
And it works as expected. No
Defining any of the
Defining any of the
If approach (2), the ideal way, is not viable or is difficult to implement, I think approach (3) would be good enough.
Thanks for trying that out @hhromic
I agree with the assertion that Vector should execute subprocesses in the same environment it was started in rather than a modified one. Given that, option (1) seems like the "most correct" to me. Is the lack of a
Absent the ability to do (1), then (2) seems more correct, but (3) seems like a definite, and smaller, improvement so I'm happy to see a PR with it if it is sufficient for you.
Incidentally it looks like the
Hi @jszwedko !
Yup, unfortunately it will take a while for it to land in some package repositories, especially Debian. Nevertheless, I would like to see this behaviour fixed in Vector anyway to avoid future confusion.
I investigated this more and I was wrong with option (1). It turns out that the
For option (2) I actually thought of a better approach. On startup, before running
This would effectively shield
If the Vector team is willing to accept such a contribution, this looks like a fun challenge for me to pursue. Let me know!
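The idea in this comment (snapshot the environment at startup, then hand that snapshot to child processes) can be sketched with the standard library alone. This is a minimal illustration, not Vector's implementation: the names EnvSnapshot, capture_env and command_with_env are hypothetical, and how the snapshot would be threaded through to the exec source is left out.

```rust
use std::collections::HashMap;
use std::ffi::OsString;
use std::process::Command;

/// Hypothetical type: a copy of the process environment, meant to be taken
/// at startup before anything (such as an SSL probe) mutates it.
pub type EnvSnapshot = HashMap<OsString, OsString>;

/// Capture the current process environment.
pub fn capture_env() -> EnvSnapshot {
    std::env::vars_os().collect()
}

/// Build a command that sees only the variables in `snapshot`,
/// ignoring any variables added to the process environment later.
pub fn command_with_env(program: &str, snapshot: &EnvSnapshot) -> Command {
    let mut cmd = Command::new(program);
    cmd.env_clear(); // do not inherit the (possibly modified) environment
    cmd.envs(snapshot); // restore exactly what was captured at startup
    cmd
}
```

A caller would invoke capture_env() once, as early as possible, and pass the result to every place that spawns a child process.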
I like this as it should guard against any other environment changes by Vector as well. I'd be happy to see a PR for it if you are up to the challenge :) It will require some threading through of the environment captured at start-up to components.
Ah, I see you already opened a PR taking a slightly different approach 😄
Hehe yes and no :) I already found an easy way to capture and store the original environment at Vector startup, and now I'm trying to devise how to best pass this to the
This is indeed a fun challenge, but not so pleasant to work on because Vector truly takes a while to compile on my machine :( So far my plan is to propagate the original env HashMap over this chain, but I definitely want to do it as cleanly as possible:
Aha 😄 Agreed it is a useful feature in its own right.
One common tip that can cut down compilation times a lot is to only include the components you are interacting with. For example,
Otherwise that plan makes sense to me!
Ah that's good to know. What features do I need to enable to compile
I also realised that for some reason,
I think those will be compiled even if you just disable all features with
Ah, that's odd, but 👍 to that workaround 😄
@jszwedko after thinking about this a lot more, I decided to drop the idea of copying the original environment to pass it to the
The truth is that it is very unfortunate that the global truststore in OpenSSL can only be configured via process environment variables, forcing things like
That being said, I think for now the best solution is what I proposed in this comment (only calling the openssl-probe function if the respective SSL variables are not already set). However, I decided on a slightly different implementation. See the linked PR.
@dsmith3197 @jszwedko How about an opt-in environment variable (with a corresponding CLI option if you want) that specifically enables (default) or disables the
If the value is set to
I thought of that variable name due to (1) this being an OpenSSL-specific functionality and (2) the existence of another OpenSSL-related env variable in Vector:
Of course, this would be documented properly in the Vector environment docs including its use-case. What do you think of that idea? I can update the PR accordingly if you prefer that approach.
I'm in favor of this approach, as I think it is the clearest in terms of the behavior. Also, it is self-documented, meaning that others will be able to discover the feature, whereas that was not true in the approach where we would conditionally probe based upon the existence of other env variables. Thanks for all your effort on this issue so far @hhromic!
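The opt-in switch discussed in the last two comments could look roughly like the sketch below. The variable name and the accepted values are not recoverable from this thread, so EXAMPLE_OPENSSL_NO_PROBE and the "true"/"1" values are placeholders chosen for illustration only, not the names used in the linked PR.

```rust
/// Placeholder variable name, for illustration only (an assumption).
pub const NO_PROBE_VAR: &str = "EXAMPLE_OPENSSL_NO_PROBE";

/// Decide whether the SSL certificate probe should run.
/// Probing stays enabled (the default) unless the raw value is an
/// explicit opt-out; "true" and "1" are assumed opt-out values.
pub fn probing_enabled(raw: Option<&str>) -> bool {
    let normalized = raw.map(|v| v.trim().to_ascii_lowercase());
    !matches!(normalized.as_deref(), Some("true") | Some("1"))
}

/// Read the switch from the process environment.
pub fn probing_enabled_from_env() -> bool {
    probing_enabled(std::env::var(NO_PROBE_VAR).ok().as_deref())
}
```

Keeping the parsing in a pure function makes the default-on behaviour easy to test without touching the real process environment.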
A note for the community
Problem
During the development of a data ingestion component with Vector and the exec source, we discovered that Vector is leaking the SSL_CERT_FILE and SSL_CERT_DIR environment variables (used by OpenSSL) to the child processes started by exec. Note that Vector sets each of these variables independently, and only if it is not already set in the environment.
In our use case, we are using curl inside a shell script as the exec process and setting a custom SSL_CERT_FILE variable in the environment. While Vector passes the correct value down to the child process, it also still passes the SSL_CERT_DIR variable. Due to an issue in curl, if both SSL_CERT_FILE and SSL_CERT_DIR are passed, curl will not honour SSL_CERT_FILE.
While the actual bug is in curl, which does not honour SSL_CERT_FILE alone, I think Vector should not be leaking those variables to child processes of exec, and should instead run the child process using the same environment in which Vector is running.
Potential Solution Discussion
This behaviour was introduced in #904, where the openssl-probe crate was added to Vector and init_ssl_cert_env_vars() is called during Vector init to populate these variables for OpenSSL: https://github.com/vectordotdev/vector/blob/0f13b22a4cebbba000444bdb45f02bc820730a13/src/app.rs#L387C23-L392
This is necessary because Vector is using the vendored feature of the openssl crate, which does not load system CAs:
There are three potential approaches to address the described problem:
1. Instead of setting the SSL_CERT_FILE and SSL_CERT_DIR variables within Vector with init_ssl_cert_env_vars(), Vector could use the probe() function in openssl-probe, which only probes suitable SSL CA file/dir locations on the system but does not actually set any variables itself. Then, with the probing results, Vector should configure OpenSSL directly.
   This would use the SSL_CTX_load_verify_locations() function.
   However, the openssl crate does not provide complete access to this function (see source code), because it only sets the CAfile argument and leaves CApath always set to null.
2. For the exec source, Vector could unset SSL_CERT_FILE or SSL_CERT_DIR before launching the child process, if and only if they were set by init_ssl_cert_env_vars().
   The openssl-probe crate has a try_init_ssl_cert_env_vars() function to help with this.
   A potential problem with this approach is that Vector would need to propagate this information (whether or not the variables were set by openssl-probe) down to the exec source implementation.
3. Vector can conditionally call init_ssl_cert_env_vars() if and only if SSL_CERT_FILE and SSL_CERT_DIR are not already present in the environment. This is equivalent to Vector fully honouring those variables.
   The openssl-probe crate has a has_ssl_cert_env_vars() function to easily accomplish this.
   This approach would still leak SSL_CERT_FILE and SSL_CERT_DIR into exec child processes, but only if neither of them was already set in the environment. This is better than leaking one or the other partially.
While approach (2) is the optimal solution, approach (3) could be a good-enough solution in the meantime.
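To make approach (2) more concrete, here is a rough standard-library sketch of the bookkeeping it needs: record which of the two variables were absent before probing, then remove exactly those from the child command. The helper names vars_added_by_probe and strip_probed_vars are hypothetical, and the sketch deliberately does not call openssl-probe or show how the information would reach the exec source.

```rust
use std::process::Command;

/// The two variables that the probe may populate.
const SSL_VARS: [&str; 2] = ["SSL_CERT_FILE", "SSL_CERT_DIR"];

/// Given a lookup that reports whether a variable was already set before
/// probing, return the variables that the probe may have added.
pub fn vars_added_by_probe<F: Fn(&str) -> bool>(was_set_before: F) -> Vec<&'static str> {
    SSL_VARS
        .iter()
        .copied()
        .filter(|&name| !was_set_before(name))
        .collect()
}

/// Remove the probe-added variables from a child command, so the child
/// only sees values that the user set explicitly.
pub fn strip_probed_vars(cmd: &mut Command, added: &[&str]) {
    for name in added {
        cmd.env_remove(name);
    }
}
```

In real use, the lookup passed to vars_added_by_probe would consult the environment before the probe runs, for example with std::env::var_os(name).is_some().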
Configuration
Version
vector 0.31.0 (x86_64-unknown-linux-gnu 0f13b22 2023-07-06 13:52:34.591204470)
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response