
Is there an option to create hostport service using operator #252

Closed
sanjaygopinath89 opened this issue Apr 22, 2021 · 25 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@sanjaygopinath89

sanjaygopinath89 commented Apr 22, 2021

We are implementing the OpenTelemetry Collector as an agent (DaemonSet) and a collector (Deployment). For the DaemonSet we need to open a hostPort, since we use hostIP:PORT to send traces to the agent.

https://github.com/open-telemetry/opentelemetry-operator/blob/main/config/crd/bases/opentelemetry.io_opentelemetrycollectors.yaml

In this CRD there is no option to specify a "hostPort".

Please let us know if there is an option for it, or whether there is a plan to implement it in the future.

@sanjaygopinath89 changed the title from "Is there an option to create hostport open service using operator" to "Is there an option to create hostport service using operator" on Apr 22, 2021
@jpkrohling
Member

service with Hostport

Sorry, but could you give an example of the service YAML you need? Not quite sure what a service with hostport means.

@bhiravabhatla
Contributor

@jpkrohling - If I am not wrong, @sanjaygopinath89 is looking to create a NodePort service.

@jpkrohling
Member

jpkrohling commented Apr 22, 2021

Wouldn't that make the node redirect the traffic to an arbitrary pod backing this service, not necessarily the one located on the same node? That kind of defeats the purpose of DaemonSets...

@bhiravabhatla
Contributor

bhiravabhatla commented Apr 23, 2021

Correct me if I am wrong - you would be sending traces from apps running within the cluster, right? Why can't we use a ClusterIP service?

https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#communicating-with-daemon-pods

@jpkrohling
Member

Correct. Note that a DaemonSet is a resource telling Kubernetes to run one replica per node. Agents are typically created as DaemonSets to receive data that is generated by workloads on the same node. In that case, you'd know the IP for the collector that is responsible for receiving your data: it's the same IP as the node. The only thing left to discover is the port. You can use the downward API to tell the pods which agent to use (field status.hostIP), as in this example from the Jaeger Operator:

https://www.jaegertracing.io/docs/1.22/operator/#installing-the-agent-as-daemonset

If you don't need a collector instance on every Kubernetes node in your cluster, you should use Deployments instead, and Kubernetes will pick suitable nodes to run the collector pods. You then discover the collector instances by referencing the Service name that is backed by the pods, and Kubernetes will decide which pod ends up receiving the connection. It might or might not be a collector on the same node.

In other words: DaemonSets are good for agents when you don't want to use agents as sidecars. Otherwise, use Deployments instead.

@sanjaygopinath89
Author

@jpkrohling @bhiravabhatla

We are looking for an option to open a "hostPort" in the DaemonSet configuration. We used to give the port configuration below in the DaemonSet to open the hostPort (not in the service; I said that by mistake in my initial comment).

```yaml
ports:
  - containerPort: 55681
    hostPort: 55681
    name: sfx-receiver
    protocol: TCP
```

We used to make this configuration in the DaemonSet directly. When we use the OpenTelemetry Operator, what changes do we need to make so that the "hostPort" will be open in the DaemonSet?

Below is a sample configuration for our DaemonSet (not the entire collector config):

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-agent
spec:
  mode: daemonset
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: "0.0.0.0:55683"
```

How can we specify the hostPort?

@bhiravabhatla
Contributor

@sanjaygopinath89, are you trying to reach the agent (DaemonSet) from outside the cluster, using NodeIP:hostPort?

Are you pushing traces from applications which are not running in the current cluster?

Basically, if you run any application on a hostPort, you are binding to one of the ports on the worker node, and you can reach the application from the external world using workerNodeIP:hostPort. This is generally done if one doesn't have an ingress-controller set up.

I am not sure this would be needed for an OpenTelemetry setup. Traces would be sent by apps running inside the cluster only, right, not from the external world? Correct me if I am wrong.

@sanjaygopinath89
Author

@bhiravabhatla We are sending the traces from inside the cluster only. How can we make sure that a trace generated by an application reaches the OpenTelemetry Collector agent (DaemonSet) on the same host? Currently, by using hostIP:port (as described by @jpkrohling, using the field status.hostIP), the load from one host reaches that same host's otel-agent. If we use the service endpoint for sending traces, it will reach any one of the pods of the DaemonSet, right?

We are expecting high load, so if we want to distribute the load, is there any other approach we can follow?

@bhiravabhatla
Contributor

If we use the service endpoint for sending traces, it will reach any one of the pods of the DaemonSet, right?

This is correct. If you use a ClusterIP service, kube-proxy will load balance, and there is no guarantee that the trace reaches the pod running on the same node as the application.

How can we make sure that a trace generated by an application reaches the OpenTelemetry Collector agent (DaemonSet) on the same host?

Let me ask you this: if we want to do this, then you have to configure the agent IP:port before deploying the applications. That would be different for different nodes, so you would have to know exactly which worker node your application pod will be scheduled on. How are we planning to achieve this?

@jpkrohling

In that case, you'd know the IP for the collector that is responsible for receiving your data: it's the same IP as the node.

This won't be the case unless you specify the hostPort while deploying the DaemonSet, right? Correct me if I am wrong.

@jpkrohling
Member

We are expecting high load, so if we want to distribute the load, is there any other approach we can follow?

The best way to distribute load is to have collectors as sidecars so that you have one collector per pod. Those sidecars would then make a connection to a central collector cluster (Deployment instead of DaemonSet), with several replicas, exporting data to your backends.
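
As a rough sketch of that pattern (the names, ports, and the gateway endpoint below are illustrative placeholders, not something prescribed in this issue), a sidecar-mode collector forwarding to a central gateway could look like this:

```yaml
# Sketch only: sidecar collector that forwards OTLP data to a
# central gateway collector running as a Deployment.
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar
spec:
  mode: sidecar
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      otlp:
        # Placeholder: the Service of a separate mode: deployment collector.
        endpoint: "otel-gateway-collector:4317"
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp]
```

Workloads then opt in to injection with the pod annotation sidecar.opentelemetry.io/inject: "true".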

@sanjaygopinath89
Author

@bhiravabhatla

Let me ask you this: if we want to do this, then you have to configure the agent IP:port before deploying the applications. That would be different for different nodes, so you would have to know exactly which worker node your application pod will be scheduled on. How are we planning to achieve this?

  • If we use an env configuration like the one given below, we will be able to get the host IP of the node on which the pod is running. We just have to add the default port to it, so we can use this from the application to send traces to the DaemonSet agent (pod) running on the same host (see the sketch below for wiring in the port).
```yaml
env:
  - name: K8S_HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
```
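
The application can then expand that variable into its exporter endpoint in the same env list; a minimal sketch, assuming the app's SDK reads the standard OTEL_EXPORTER_OTLP_ENDPOINT variable (port 55681 is only the example used earlier in this thread):

```yaml
env:
  - name: K8S_HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  # Kubernetes expands $(K8S_HOST_IP) because it is declared earlier
  # in the same env list.
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://$(K8S_HOST_IP):55681"
```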

@jpkrohling If we have to use a "sidecar" to handle high load, can you give a good use case for deploying the collector as a DaemonSet?

@jpkrohling @bhiravabhatla Can we conclude that there is no option to get a hostPort opened on a DaemonSet when deployed using the OpenTelemetry Operator?

@bhiravabhatla
Contributor

bhiravabhatla commented May 13, 2021

@sanjaygopinath89

  • If we use an env configuration like the one given below, we will be able to get the host IP of the node on which the pod is running. We just have to add the default port to it, so we can use this from the application to send traces to the DaemonSet agent (pod) running on the same host.

Yep, my bad. I missed this.

I don't see any reason why we can't have hostPort supported in the operator. @jpkrohling, any specific reason not to do it?

I see Jaeger operator supporting this using AgentSpec.Options

https://github.com/jaegertracing/jaeger-operator/blob/03f5722996b4462c70cc21b3dfb972ef32073755/pkg/apis/jaegertracing/v1/jaeger_types.go#L452

https://github.com/jaegertracing/jaeger-operator/blob/03f5722996b4462c70cc21b3dfb972ef32073755/pkg/apis/jaegertracing/v1/options.go#L11

Used while creating Daemonset here -

https://github.com/jaegertracing/jaeger-operator/blob/03f5722996b4462c70cc21b3dfb972ef32073755/pkg/deployment/agent.go#L55-L59

https://github.com/jaegertracing/jaeger-operator/blob/03f5722996b4462c70cc21b3dfb972ef32073755/pkg/deployment/agent.go#L128-L155

@jpkrohling
Member

I don't see any reason why we can't have hostPort supported in the operator. @jpkrohling, any specific reason not to do it?

I have nothing against the hostPort in the DaemonSet. Note that this started with a hostPort Service, which did sound odd to me.

If we have to use a "sidecar" to handle high load, can you give a good use case for deploying the collector as a DaemonSet?

Both modes can handle a high load, but if the connection between your collector and the backend uses a long-lived connection (gRPC, Thrift, HTTP/2, ...), then you need good load balancing on the client side, otherwise a few backend nodes might end up with too much traffic while others are idle. Or worse, you might end up having more replicas of your backend than you have nodes, meaning that your throughput is limited by the number of nodes you have.

If you do have enough nodes in your cluster and have no multi-tenancy needs, DaemonSets might be better, as there are fewer agents to manage: if you have 1000 pods, you might potentially have 1000 agents, each causing an overhead of a few MBs. Besides, a configuration change to your agent won't trigger a new deployment of your workload, as would be the case with sidecars. Again: imagine rolling out 1000 new pods just because you changed the agent configuration.

@bhiravabhatla
Contributor

@jpkrohling - If we decide to support hostPort in DaemonSets, I think we already have the code to extract the ports from the receiver config, which we are using for building the Service:

func ConfigToReceiverPorts(logger logr.Logger, config map[interface{}]interface{}) ([]corev1.ServicePort, error) {

We would need to modify Container.go to include ContainerPorts for DaemonSets (see the sketch below):

func Container(cfg config.Config, logger logr.Logger, otelcol v1alpha1.OpenTelemetryCollector) corev1.Container {

I could work on this, but I have a few other issues I signed up for. I can pick this up after the other issues are merged. Thoughts?
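
For illustration only (not the operator's actual output; the port is taken from the sample config earlier in this thread), the idea would be for the rendered DaemonSet container to expose each receiver port as both a containerPort and a hostPort:

```yaml
# Hypothetical rendered ports for mode: daemonset with hostPort enabled,
# derived from the otlp/grpc receiver endpoint 0.0.0.0:55683.
ports:
  - name: otlp-grpc
    containerPort: 55683
    hostPort: 55683
    protocol: TCP
```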

@jpkrohling
Member

Sounds reasonable to me. Given that you pretty much gave the solution here, I'll label this as good-first-issue. If nobody volunteers, you can pick this up later.

@jpkrohling added the enhancement, good first issue, and help wanted labels on May 17, 2021
@bhiravabhatla
Contributor

Sure, @jpkrohling. @sanjaygopinath89, would you be interested in raising a PR for this? Please feel free to.

@chhetripradeep

Currently, by using hostIP:port (via the field status.hostIP), the load from one host reaches that same host's otel-agent.

@sanjaygopinath89 you are creating a single point of failure (i.e., when the otel collector agent on that host is down, none of the applications on that host will be able to forward their traces), even though Kubernetes provides you HA while using DaemonSets. Also, exposing a port on the host is not recommended from a security perspective.

Curious: why do you think connections across hosts aren't scalable enough to handle a high load, while connections within the same host are?

@yuriolisa
Contributor

@jpkrohling, any thoughts regarding this topic? Should we proceed?

@jpkrohling
Member

I personally would not use a host port with my Deployments or DaemonSets. But if there are users interested enough in this to the point of contributing code to the project to support it, then I'd accept a contribution, given there are no bad side effects (i.e., this should be disabled by default).

@AndresPineros

AndresPineros commented Dec 29, 2022

I've read that ideally each node should run an agent that should receive traces from applications running in the same node in order to add metadata about the node. I'm not sure if this makes sense inside Kubernetes, but there are some articles promoting this pattern:

https://opensearch.org/blog/distributed-tracing-pipeline-with-opentelemetry/

To me, this sounds more like a solution for servers that aren't part of an orchestrated cluster, but would be nice to get more input here.

@jpkrohling What's the difference between deployment (agent) + deployment (gateway) and sidecar (agent) + deployment (gateway)? If the node doesn't matter, then why use a sidecar instead of just another deployment for the agent?

@616b2f

616b2f commented May 11, 2023

I am also interested in advice on which deployment mode should be used when. There are already good suggestions here, which could be written down somewhere in the docs so we can point to them from the README, maybe?

I am sure there are more people out there who will have this question, especially when they are new to the whole observability topic, like me.

@yuriolisa
Contributor

@jaronoff97, @frzifus, I'd like to see your opinions about it.

@AndresPineros

I went with deployment+deployment and it works just fine.

No need to make things more complex than they need to be.

@yuriolisa
Contributor

@jaronoff97, should we close this issue?

@jaronoff97
Contributor

Yes, I think for now we should close this in favor of this issue, which is a bit more descriptive about why we want what we do.
