Import 'time/tzdata' in service or collector #9991
Comments
This also came up in open-telemetry/opentelemetry-collector-contrib#32479 |
I would be inclined to load this in something like the If we don't want that level of granularity, I would suggest putting this inside the builder and offering a setting to disable it with a nice warning about knowing what you're doing by toggling the setting. |
As I see it, there are several ways to solve this:
Personally, I think (2) is the best option here. The timezone database is frequently updated and it should be kept separate from the binary for that reason. In cases where a system database exists (most of the time outside Docker), compiling a 400 KB database into the binary is wasted space that will never be used. |
This is similar to a combination (3) but I think introduces an unnecessary level of indirection. Components can just as easily import I also thought of a 5th option (sort of a combo of 2 and 3) which is to document this issue and leave it to component owners to make their own decision if they want their component to be the one that bloats the binary size or if they want to depend on the host. This would also likely require the published docker images to have the tz database though. edit: I still like option (2) the best |
The conversation linked in the OP states that this is an issue on Windows. I added this comment there, but I think it is also relevant to this conversation so I'll paste it here:
If this is an issue for windows only it can also be done as a build flag in the release pipeline. Windows builds could include it by adding the This could be combined with solution (2) above to solve it everywhere with minimal impact to binary sizes while still ensuring tz data is available for our users. |
Fair point on the binary size. I agree distros should have a choice.
I don't think it does anything for the original issue I mentioned, where a custom build of the collector is running into strange behaviors because one component is loading the database on its own. I think any solution or recommendation we come up with should be geared towards finding the database consistently regardless of which components are included.
I should clarify that my suggestion to import
The problem here again is that it's not just a matter of binary size, but that if a user adds/removes a component to their builder manifest, it should not change the behavior of another component.
This is pretty user unfriendly but I think it's compatible with @evan-bradley's suggestion to provide an option in the builder manifest. Essentially, we would make a best effort to give users full control, while also giving them an easy way to build in a default if they want it. Additionally, I think any approach we choose should be prescriptive to component authors that they SHOULD NOT import |
Sorry if it wasn't obvious but I would say in the case of (2) we would prescribe that components don't import the DB. I think it's a documentation issue. I see it as very similar to the certs. We don't compile those in for very similar reasons, even though it would be theoretically possible. I think if it was my choice this is what I would do:
This would accomplish:
|
This all seems reasonable to me except (3). Adding an entire layer to the build process which attempts to analyze code and generate additional code based on findings sounds fragile and burdensome. Easy to find documentation with expected error messages and remediation options sounds reasonable though. |
I think I'm aligned with Dan here. Opt-out. This makes sure that we're functioning well on all systems by default, but we give users an option to say: I really don't need this and I want to spare some bytes. From a usability perspective opt-out makes more sense to me. |
I'm fine with either opt-in or out. If the tz data is included in the build but present on the host, it will just be ignored. A bit wasteful, but I think an ok tradeoff for being sure the default config works in as many places as possible. It is just one more collector "tuning" step that a user may do if they want. |
Just as another point to consider, the |
It sounds like we're all in agreement on putting it in the builder to make it easily accessible to users. I'm going to suggest including it as an opt-out flag in the builder since it's only a 0.5% increase in binary size, looking at a recent core binary. I'd like to have consensus before possibly adding ~400 KB to most Collector binaries. cc @open-telemetry/collector-approvers @open-telemetry/collector-contrib-approvers 👍 Include an enabled by default option in the builder to pass the |
I also prefer to have the database on the container image.
Our container images should be seen as a recipe for what to include in distributions, and downstream/custom distributions should mirror that, similar to what we do with the CAs used with TLS. |
Is there an easy way to include it on the official image? And to keep it updated over time? (Maybe an alpine package that we can copy files from?) |
I don't think only including it in the image is sufficient as users can use the collector we release in other formats. If we don't put it in ocb, it needs to be in all the artifacts we release, not only the docker image. |
I think this was mentioned above: on Unix systems at least, tzdata is provided by the operating system, and Go will pick it up automatically, so it's not a problem. On Windows we don't (yet?) release a package per se, so that would be the only place where this would be a problem. I see tzdata as an external dependency, similar to the CA certificate bundle. It has a lot of similarities (updated relatively infrequently, commonly used, usually provided by the operating system...). We have other components that need external dependencies for which our stance is to document them (e.g. open-telemetry/opentelemetry-collector-releases/issues/462, Docker components where you need to mount the Docker socket, JMX receiver...). Virtually every single component uses the CA certificate bundle, so we add it to the container by default, and depend on the operating system for it in other cases. If we see this as something as common/essential as the certificate bundle, then we should do something similar: add it to the Docker image. I don't see how the situation is different from this prior case, or why we should do something different. |
That's my thinking as well, and why I think including the database on the container images should be sufficient.
+1 |
just a small note, from my experience, claims like from end user perspective (and this is a question as it's not clear to me), when building a collector of my own or using binary that was released how do I know there's something wrong before it's misbehaving on test in better case or prod (because i haven't caught any error in test). |
If you are interested in an upstream Windows MSI package, we have open-telemetry/opentelemetry-collector-releases/issues/157 to track this. I should have linked to this issue explicitly in my message, sorry. Once we come to a conclusion on tzdata, if the outcome is to have tzdata included in the packages, I can add a note to the collector-releases issue so that we make sure to provide tzdata, or solve this problem in some other way, on Windows. Not sure if this fully addresses your concern, let me know if not.
Ideally every component:
Sometimes it's hard to detect if the requirements are not met (e.g. impossible to determine if it's a temporary or permanent error), and sometimes validation is hard to do without actually running the component, but I think this is a summary of what we have historically strived for. |
It is a simple matter to build it into Windows binaries using goreleaser with a flag though, without affecting the Linux/Mac builds. I'm still of the opinion that it belongs as a host package for all the reasons already mentioned.
# Stage that provides the CA certificate bundle
FROM alpine:3.19 AS certs
RUN apk --update add ca-certificates
# Stage that provides the IANA time zone database
FROM alpine:3.19 AS zones
RUN apk --update add tzdata
# Final image: copy only the certificate bundle and the zoneinfo files
FROM scratch
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY --from=zones /usr/share/zoneinfo /usr/share/zoneinfo
|
I would be okay with keeping the database inside container images and not baking it into the binary by default, but if we take that route, could we offer tooling to handle this, through the builder or elsewhere?
I agree with what @jpkrohling said here, but I think the process of going from an OCB-generated binary to an image isn't straightforward to someone who just wants simple defaults. Our goreleaser config for our Kubernetes distribution, for example, includes a lot of flags, architectures, etc. that may not be useful to a standard user. I would say the same for our Docker image: it's not immediately obvious how much of this is a situation where upstream needs to cover edge cases or whether it's good for me as an end-user to also build into my distro. Overall, I'd like to make it obvious when to include these files in custom distributions. Could OCB output optional Docker and Goreleaser files? Or do we expect users will want to deeply customize these and documentation on our build configuration files will suffice? |
It could output a Dockerfile, but not sure about goreleaser templates. While also possible, I'm not sure we'd get much value from it. |
Discussion from the 2024-04-24 SIG meeting:
|
Is your feature request related to a problem? Please describe.
Components sometimes (often indirectly) make use of Go's time.LoadLocation. This depends on an IANA Time Zone Database, which, as the documentation for LoadLocation describes, may be provided in various ways.
Importantly, there is at least one mechanism by which one component can affect the behavior of another. See open-telemetry/opentelemetry-collector-contrib#32506.
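As a minimal sketch of the dependency described above, the following shows how a call to time.LoadLocation succeeds or fails depending on whether a zone can be resolved. The helper name resolveZone and the zone names are hypothetical and only for illustration; they do not come from the collector codebase.

```go
package main

import (
	"fmt"
	"time"
)

// resolveZone is a hypothetical helper, not part of the collector.
// It wraps time.LoadLocation so that a missing database or an unknown
// zone name surfaces as a descriptive error instead of failing later.
func resolveZone(name string) (*time.Location, error) {
	loc, err := time.LoadLocation(name)
	if err != nil {
		return nil, fmt.Errorf("cannot load time zone %q (is a tz database available?): %w", name, err)
	}
	return loc, nil
}

func main() {
	// "UTC" resolves without any database; the other names depend on
	// what the host (or the binary) provides.
	for _, name := range []string{"UTC", "Europe/Prague", "Not/A_Zone"} {
		loc, err := resolveZone(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("loaded:", loc)
	}
}
```

Calling a helper like this while validating configuration is one way a component could report a missing database at startup rather than at the first parse.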
Describe the solution you'd like
Go 1.15 introduced the time/tzdata package, which can basically serve as a default when finding a database using all other mechanisms is unsuccessful. I think we should import this package at a high level in the collector so that all components which rely on time.LoadLocation have the same default behavior.
Describe alternatives you've considered
Import time/tzdata in contrib's timeutils package, which solves the immediate issue but leaves us open to future recurrences of the same problem depending on component implementations.
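For illustration, the proposal to import time/tzdata at a high level could look roughly like the sketch below. The function name zoneAvailable is hypothetical, and where exactly such an import would live in the collector is an assumption, not something decided in this issue.

```go
package main

import (
	"fmt"
	"time"

	// The blank import embeds a copy of the IANA database in the binary.
	// time.LoadLocation consults it only after the ZONEINFO environment
	// variable and the system locations have been tried.
	_ "time/tzdata"
)

// zoneAvailable is a hypothetical helper reporting whether a zone name
// can be resolved by any of the lookup mechanisms.
func zoneAvailable(name string) bool {
	_, err := time.LoadLocation(name)
	return err == nil
}

func main() {
	// With the embedded database, this no longer depends on the host
	// providing zoneinfo files.
	fmt.Println(zoneAvailable("America/New_York"))
}
```

The Go documentation for time/tzdata notes that building with the timetzdata build tag has the same effect as the blank import, which is what makes a per-distribution or per-platform choice possible without touching component code.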