Is there an option to create a hostPort service using the operator? #252
Comments
Sorry, but could you give an example of the service YAML you need? I'm not quite sure what a service with a hostPort means.
@jpkrohling - If I am not wrong, @sanjaygopinath89 is looking to create a NodePort service.
Wouldn't that make the node redirect the traffic to an undetermined pod backing this service, not necessarily the one located on the same node? That kind of defeats the purpose of DaemonSets...
Correct me if I am wrong: you would be sending traces from apps running within the cluster, right? Why can't we use a ClusterIP service? https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#communicating-with-daemon-pods
Correct. Note that a DaemonSet is a resource telling Kubernetes to run one replica per node. Agents are typically created as DaemonSets to receive data generated by workloads on the same node. In that case, you know the IP of the collector responsible for receiving your data: it's the same IP as the node. The only thing left to discover is the port. You can use the downward API to tell the pods which agent to use (field status.hostIP; see https://www.jaegertracing.io/docs/1.22/operator/#installing-the-agent-as-daemonset).

If you don't need a collector instance on every Kubernetes node in your cluster, you should use Deployments instead, and Kubernetes will pick the best nodes to run the collector pods. You then discover the collector instances by referencing the Service name backed by those pods, and Kubernetes decides which pod ends up receiving the connection. It might or might not be a collector on the same node.

In other words: DaemonSets are good for agents when you don't want to run agents as sidecars. Otherwise, use Deployments instead.
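A minimal sketch of that downward API pattern, assuming the default OTLP gRPC port 4317 and the standard `OTEL_EXPORTER_OTLP_ENDPOINT` SDK variable (the `HOST_IP` name is just an illustration):

```yaml
# Fragment of an application pod spec: expose the node's IP to the app
# and build the agent endpoint from it.
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP        # downward API: IP of the node running this pod
  - name: OTEL_EXPORTER_OTLP_ENDPOINT   # standard OpenTelemetry SDK variable
    value: "http://$(HOST_IP):4317"     # 4317 assumed: default OTLP gRPC port
```

With this, every pod talks to the agent on its own node, regardless of where it is scheduled.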
We are looking for an option to open a hostPort in the daemonset configuration. We used to set the port configuration below in the daemonset to open the hostPort (not in the service; I wrote that by mistake in my initial comment).

We used to make this configuration in the daemonset directly. When we use the OpenTelemetry Operator, what changes do we need to make so that the hostPort is opened on the daemonset? Below is a sample port configuration for our daemonset (not the entire collector config). How can we specify the hostPort?
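A hedged sketch of the kind of DaemonSet port configuration being described (container name, image, and port values are assumptions, not the original snippet):

```yaml
# Fragment of a DaemonSet pod template: bind the receiver port on the node itself
containers:
  - name: otel-agent
    image: otel/opentelemetry-collector:latest   # image/tag assumed
    ports:
      - name: otlp-grpc
        containerPort: 4317   # port the collector receiver listens on
        hostPort: 4317        # also expose this port on the worker node (hostIP:4317)
        protocol: TCP
```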
@sanjaygopinath89, are you trying to reach the agent (daemonset) from outside the cluster using NodeIP:hostPort? Are you pushing traces from applications that are not running in the current cluster? Basically, if you run an application on a hostPort, you bind to one of the ports on the worker node and can reach the application from the outside world using workerNodeIP:hostPort. This is generally done when one doesn't have an ingress controller set up. I am not sure this would be needed for an OpenTelemetry setup: traces would be sent only by apps running inside the cluster, not from the outside world, right? Correct me if I am wrong.
@bhiravabhatla We are sending traces from inside the cluster only. How can we make sure that a trace generated by an application reaches the OpenTelemetry collector agent (daemonset) on the same host? Currently, by using hostIP:port (as described by @jpkrohling, using the field status.hostIP), the load from one host reaches the otel-agent on the same host. If we use the service endpoint for sending traces, it will reach any one of the pods of the daemonset, right? We are expecting high load, so if we want to distribute the load, is there any other approach we can follow?
This is correct. If you use a ClusterIP service, kube-proxy will load balance, and there is no guarantee that the trace reaches the pod running on the same node as the application.

Let me ask you this: if we want to do this, you have to configure the agent IP:port before deploying the applications. That would be different for different nodes, so you would have to know exactly which worker node your application pod will be scheduled on. How are we planning to achieve this?

This won't be the case unless you set the hostPort while deploying the daemonset, right? Correct me if I am wrong.
The best way to distribute load is to have collectors as sidecars, so that you have one collector per pod. Those sidecars would then make a connection to a central collector cluster.
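A minimal sketch of that sidecar approach with the operator, assuming the sidecar.opentelemetry.io/inject annotation and default OTLP ports; the names, namespace, and exporter endpoint below are illustrative:

```yaml
# Collector definition in sidecar mode
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar-agent
spec:
  mode: sidecar
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      otlp:
        # assumed Service of the central collector cluster; depending on the
        # collector version, TLS/insecure settings may also be required
        endpoint: central-collector.observability.svc:4317
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp]
---
# Workloads opt in to sidecar injection via an annotation on the pod template
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        sidecar.opentelemetry.io/inject: "true"
    spec:
      containers:
        - name: my-app
          image: my-app:latest   # placeholder image
```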
@jpkrohling If we have to use sidecars to handle high load, can you give a good use case for deploying the collector as a DaemonSet? @jpkrohling @bhiravabhatla Can we conclude that there is no option to get a hostPort opened on a DaemonSet when deployed using the OpenTelemetry Operator?
Yep, my bad, I missed this. I don't see any reason why we can't have hostPort supported in the operator. @jpkrohling, any specific reason not to do it? I see the Jaeger Operator supporting this using AgentSpec.Options, used while creating the DaemonSet.
I have nothing against the hostPort option.
Both modes can handle a high load, but if the connection between your collector and the backend uses a long-lived connection (gRPC, Thrift, HTTP/2, ...), then you need good load balancing on the client side; otherwise, a few backend nodes might end up with too much traffic while others are idle. Or worse, you might end up having more replicas of your backend than you have nodes, meaning that your throughput is limited by the number of nodes you have.

If you do have enough nodes in your cluster and have no multi-tenancy needs, DaemonSets might be better, as there are fewer agents to manage: with sidecars, if you have 1000 pods, you might potentially have 1000 agents, each adding an overhead of a few MBs. Besides, a configuration change to your agent won't trigger a new deployment of your workload, as would be the case with sidecars. Again: imagine rolling out 1000 new pods just because you changed the agent configuration.
@jpkrohling - If we decide to support hostPort in daemonsets: I think we already have the code to extract the ports from the receiver config, which we are using for building the Service. We would need to modify Container.go to include ContainerPorts for daemonsets.

I could work on this, but I have a few other issues I signed up for; I can pick this up after those are merged. Thoughts?
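To make that concrete, a rough sketch (the receiver endpoint and derived port below are assumptions): the operator already parses entries like this to build the Service ports, and the same derived list could populate ContainerPorts on the daemonset's container, optionally with hostPort set and disabled by default.

```yaml
# Receiver section inside the collector config; the operator derives the
# Service port (here 4317, otlp-grpc) from this today, and the proposal is
# to reuse the same derived ports as ContainerPorts on the DaemonSet.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
```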
Sounds reasonable to me. Given that you pretty much gave the solution here, I'll label this as good-first-issue. If nobody volunteers, you can pick this up later.
Sure, @jpkrohling. @sanjaygopinath89, would you be interested in raising a PR for this? Please feel free to.
@sanjaygopinath89, you are creating a single point of failure (i.e., when the otel collector agent on that host is down, none of the applications on that host will be able to forward traces), even though Kubernetes provides you HA when using daemonsets. Also, exposing a port on the host is not recommended from a security perspective. I'm curious why you think connections across hosts aren't scalable enough to handle high load while connections within the same host are.
@jpkrohling, any thoughts regarding this topic? Should we proceed?
I personally would not use a host port with my deployments or daemonsets. But if there are users interested enough in this to contribute code to the project to support it, then I'd accept a contribution, given there are no bad side effects (i.e., this should be disabled by default).
I've read that ideally each node should run an agent that receives traces from applications running on the same node, in order to add metadata about the node. I'm not sure whether this makes sense inside Kubernetes, but there are some articles promoting this pattern: https://opensearch.org/blog/distributed-tracing-pipeline-with-opentelemetry/ To me, this sounds more like a solution for servers that aren't part of an orchestrated cluster, but it would be nice to get more input here.

@jpkrohling What's the difference between deployment (agent) + deployment (gateway) and sidecar (agent) + deployment (gateway)? If the node doesn't matter, then why use a sidecar instead of just another deployment for the agent?
I am also interested in advice on which deployment mode should be used when. There are already good suggestions here, which could be written down somewhere in the docs so we can point to them from the readme, maybe? I am sure there are more people out there who will have this question, especially when they are new to the whole observability topic, like me.
@jaronoff97, @frzifus, I'd like to see your opinions about it.
I went with deployment+deployment and it works just fine. No need to make things more complex than they need to be.
@jaronoff97, should we close this issue?
Yes, I think for now we should close this in favor of this issue, which is a bit more descriptive about why we want what we do.
We are implementing the OpenTelemetry Collector as an agent (DaemonSet) and a collector (Deployment). For the daemonset we need to open a hostPort; we are using hostIP:PORT to send traces to the agent.
https://github.com/open-telemetry/opentelemetry-operator/blob/main/config/crd/bases/opentelemetry.io_opentelemetrycollectors.yaml
As seen in this CRD, there is no option to specify a hostPort.
Please let us know if there is an option for it, or whether there is a plan to implement it in the future.
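For reference, a minimal daemonset-mode collector CR of the kind described above, assuming default OTLP settings (names and the placeholder exporter are illustrative); note that the spec offers no field for a hostPort, which is what this issue asks for:

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-agent
spec:
  mode: daemonset          # one collector pod per node
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging: {}          # placeholder; a real agent would export to the gateway collector
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]
```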