Netreap does not reapply the labels #20

Closed
iamredbull opened this issue Jul 14, 2023 · 1 comment · Fixed by #21
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments

iamredbull commented Jul 14, 2023

Before host restart:

[3 screenshots]

After host restart:

Netreap does not reapply the labels after the host is restarted.
[3 screenshots]

Netreap debug logs:

2023-07-14T12:21:38.298Z	DEBUG	netreap/main.go:124	Starting node reaper
2023-07-14T12:21:38.298Z	DEBUG	reapers/nodes.go:107	Beginning reconciliation
2023-07-14T12:21:38.298Z	DEBUG	reapers/nodes.go:108	Getting nomad node list
2023-07-14T12:21:38.303Z	DEBUG	reapers/nodes.go:119	Finished constructing list of all nodesnodesmap[ax51-host131:{} cn6-host48:{} cpx31-host58:{}]
2023-07-14T12:21:38.303Z	DEBUG	reapers/nodes.go:121	Fetching cilium nodes from consul
2023-07-14T12:21:38.308Z	DEBUG	netreap/main.go:135	Starting endpoint reaper
2023-07-14T12:21:38.308Z	DEBUG	reapers/endpoints.go:155	Starting reconciliation
2023-07-14T12:21:38.310Z	DEBUG	reapers/endpoints.go:169	Finished fetching service list, constructing set of IP addresses from servicesservice_list[{nomad-clients} {nomad-servers} {consul} {netreap}]
2023-07-14T12:21:38.312Z	INFO	reapers/nodes.go:56	Waiting for leader election
2023-07-14T12:21:38.318Z	DEBUG	reapers/endpoints.go:203	Finished generating current IP list. Fetching endpoints from ciliumip_listmap[]
2023-07-14T12:21:38.320Z	DEBUG	reapers/endpoints.go:211	Checking all endpoints
2023-07-14T12:21:38.320Z	DEBUG	reapers/endpoints.go:219	Endpoint is not an init service, skipping	{"labels": ["reserved:host"]}
2023-07-14T12:21:38.320Z	DEBUG	reapers/endpoints.go:219	Endpoint is not an init service, skipping	{"labels": ["reserved:health"]}
2023-07-14T12:21:38.320Z	DEBUG	reapers/endpoints.go:265	Finished reconciliationnum_errors0
2023-07-14T12:21:38.324Z	DEBUG	netreap/main.go:146	starting policy poller
2023-07-14T12:21:38.324Z	INFO	policy_poller	policy/policy.go:41	starting Consul watch for key: netreap.io/policy
2023-07-14T12:21:38.326Z	INFO	policy_poller	policy/policy.go:98	loaded new policy
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:93	Got 21 job events. Handling...
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.330Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.413Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-07-14T12:21:38.413Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.413Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.698Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-07-14T12:21:38.698Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.698Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.914Z	DEBUG	reapers/endpoints.go:93	Got 3 job events. Handling...
2023-07-14T12:21:38.914Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.914Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:38.914Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.174Z	DEBUG	reapers/endpoints.go:93	Got 3 job events. Handling...
2023-07-14T12:21:39.174Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.174Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.174Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.467Z	DEBUG	reapers/endpoints.go:93	Got 3 job events. Handling...
2023-07-14T12:21:39.467Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.467Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.467Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.713Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-07-14T12:21:39.713Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.713Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.977Z	DEBUG	reapers/endpoints.go:93	Got 4 job events. Handling...
2023-07-14T12:21:39.977Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.977Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.977Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:39.977Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:40.600Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-07-14T12:21:40.600Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:40.600Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:44.210Z	DEBUG	reapers/endpoints.go:93	Got 1 job events. Handling...
2023-07-14T12:21:44.210Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of NodeRegistration
2023-07-14T12:21:50.058Z	DEBUG	reapers/endpoints.go:93	Got 4 job events. Handling...
2023-07-14T12:21:50.058Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.058Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.058Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.058Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.737Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-07-14T12:21:50.737Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-07-14T12:21:50.737Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
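
One way to read a log like the one above is to tally the event types that Netreap reports as ignored. The following is a minimal sketch that assumes the message format shown in this log (`Ignoring Job event with type of <Type>`); `tally_ignored_events` is a hypothetical helper and not part of Netreap.

```python
import re
from collections import Counter

# Message format taken from the debug log above; the helper name is hypothetical.
IGNORED_EVENT = re.compile(r"Ignoring Job event with type of (\w+)")


def tally_ignored_events(log_text: str) -> Counter:
    """Count, per event type, how often the log reports an ignored event."""
    return Counter(IGNORED_EVENT.findall(log_text))


# Three lines copied from the log above.
sample_log = "\n".join([
    "2023-07-14T12:21:40.600Z\tDEBUG\treapers/endpoints.go:104\tIgnoring Job event with type of AllocationUpdated",
    "2023-07-14T12:21:40.600Z\tDEBUG\treapers/endpoints.go:104\tIgnoring Job event with type of AllocationUpdated",
    "2023-07-14T12:21:44.210Z\tDEBUG\treapers/endpoints.go:104\tIgnoring Job event with type of NodeRegistration",
])

for event_type, count in tally_ignored_events(sample_log).most_common():
    print(f"{event_type}: {count}")
```

In the full log above, every event received after the host restart is reported as ignored: all of them are AllocationUpdated except for a single NodeRegistration.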

In some cases, some jobs get their labels back, but not all of them:
[screenshot]

For a job to get its labels (and sometimes its IP) back, you have to stop and start the job:

Before restarting the job:
[screenshot]
After restarting the job:
[2 screenshots]
Netreap logs:
[screenshot]

Cilium and Netreap were deployed following this guide: https://cosmonic.com/blog/engineering/netreap-a-practical-guide-to-running-cilium-in-nomad. I don't think this behavior of Netreap is correct. What causes it, and how can I fix it? @deverton @protochron

Cilium - v1.13.4
Netreap - v0.1.0

@iamredbull
Author

I noticed that when a Nomad job with several groups is started, only one of the groups receives the labels.
Nomad job:

job "example-job" {
  datacenters = ["dc1"]
  namespace   = "dedicated"

  constraint {
    attribute = "${attr.unique.consul.name}"
    operator  = "="
    value     = "cn6-host48"
  }

  meta = {
    "example.com/app_name" = "service-echo"
  }

  group "http-echo-group" {
    network {
      mode = "cni/cilium"

      dns {
        servers = ["172.17.0.1"]
      }
    }

    restart {
      attempts = 3
      interval = "15m"
      delay    = "20s"
      mode     = "fail"
    }

    service {
      name         = "http-echo"
      port         = "80"
      tags         = ["http-echo"]
      address_mode = "alloc"
    }

    task "http-echo" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo"
        args = [
          "--text=hello world",
          "--listen=:80"
        ]
        auth_soft_fail = true
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }

  group "network-multitool-group" {
    network {
      dns {
        servers = ["172.17.0.1"]
      }
      mode = "cni/cilium"
    }

    restart {
      attempts = 3
      interval = "15m"
      delay    = "20s"
      mode     = "fail"
    }

    service {
      name         = "network-multitool"
      port         = "80"
      tags         = ["network-multitool"]
      address_mode = "alloc"
    }

    task "network-multitool" {
      driver = "docker"

      config {
        image          = "wbitt/network-multitool"
        auth_soft_fail = true
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}

Cilium endpoint list:

ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])               IPv6   IPv4            STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                   
141        Enabled            Enabled           4          reserved:health                                  172.16.171.94   ready   
1418       Disabled           Disabled          1          reserved:host                                                    ready   
1535       Enabled            Enabled           5          reserved:init                                    172.16.6.61     ready   
2641       Enabled            Enabled           28939      netreap:nomad.job_id=example-job                 172.16.44.243   ready   
                                                           netreap:nomad.namespace=dedicated                                        
                                                           nomad:example.com/app_name=service-echo                                  
                                                           reserved:init                                                            

Netreap-job logs:

2023-08-01T09:58:39.847Z	DEBUG	netreap/main.go:124	Starting node reaper
2023-08-01T09:58:39.847Z	DEBUG	reapers/nodes.go:107	Beginning reconciliation
2023-08-01T09:58:39.847Z	DEBUG	reapers/nodes.go:108	Getting nomad node list
2023-08-01T09:58:39.865Z	DEBUG	reapers/nodes.go:119	Finished constructing list of all nodes	{"nodes": {"cn6-host48":{},"cpx31-host58":{}}}
2023-08-01T09:58:39.866Z	DEBUG	reapers/nodes.go:121	Fetching cilium nodes from consul
2023-08-01T09:58:39.902Z	DEBUG	netreap/main.go:135	Starting endpoint reaper
2023-08-01T09:58:39.902Z	DEBUG	reapers/endpoints.go:155	Starting reconciliation
2023-08-01T09:58:39.911Z	DEBUG	reapers/endpoints.go:169	Finished fetching service list, constructing set of IP addresses from servicesservice_list[{consul} {netreap} {nomad-clients} {nomad-servers}]
2023-08-01T09:58:39.918Z	INFO	reapers/nodes.go:56	Waiting for leader election
2023-08-01T09:58:39.945Z	DEBUG	reapers/endpoints.go:203	Finished generating current IP list. Fetching endpoints from cilium	{"ip_list": {}}
2023-08-01T09:58:39.949Z	DEBUG	reapers/endpoints.go:211	Checking all endpoints
2023-08-01T09:58:39.949Z	DEBUG	reapers/endpoints.go:219	Endpoint is not an init service, skipping	{"labels": ["reserved:health"]}
2023-08-01T09:58:39.949Z	DEBUG	reapers/endpoints.go:219	Endpoint is not an init service, skipping	{"labels": ["reserved:host"]}
2023-08-01T09:58:39.949Z	DEBUG	reapers/endpoints.go:265	Finished reconciliation	{"num_errors": 0}
2023-08-01T09:58:39.982Z	DEBUG	netreap/main.go:146	starting policy poller
2023-08-01T09:58:39.983Z	INFO	policy_poller	policy/policy.go:41	starting Consul watch for key: netreap.io/policy
2023-08-01T09:58:39.988Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-08-01T09:58:39.988Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:58:39.988Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:58:39.994Z	INFO	policy_poller	policy/policy.go:98	loaded new policy
2023-08-01T09:58:40.261Z	DEBUG	reapers/endpoints.go:93	Got 3 job events. Handling...
2023-08-01T09:58:40.261Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:58:40.261Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:58:40.261Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:30.969Z	DEBUG	elector/mod.go:108	Unable to acquire lock. Retrying up to 6 times
2023-08-01T09:59:33.305Z	DEBUG	reapers/endpoints.go:93	Got 1 job events. Handling...
2023-08-01T09:59:33.307Z	DEBUG	reapers/endpoints.go:416	Job was empty	{"event_type": "JobDeregistered"}
2023-08-01T09:59:33.384Z	DEBUG	reapers/endpoints.go:93	Got 1 job events. Handling...
2023-08-01T09:59:33.384Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of EvaluationUpdated
2023-08-01T09:59:40.982Z	DEBUG	elector/mod.go:115	Lock retry 1 did not succeed
2023-08-01T09:59:49.182Z	DEBUG	reapers/endpoints.go:93	Got 1 job events. Handling...
2023-08-01T09:59:49.182Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-08-01T09:59:49.183Z	DEBUG	reapers/endpoints.go:416	Job was empty	{"event_type": "JobRegistered"}
2023-08-01T09:59:49.209Z	DEBUG	reapers/endpoints.go:327	Fetching services from consul for job	{"job_id": "example-job", "retry_num": 1}
2023-08-01T09:59:49.210Z	DEBUG	reapers/endpoints.go:327	Fetching services from consul for job	{"job_id": "example-job", "retry_num": 1}
2023-08-01T09:59:49.218Z	DEBUG	reapers/endpoints.go:334	Did not find a ready service in consul	{"job_id": "example-job", "retry_num": 1}
2023-08-01T09:59:49.218Z	DEBUG	reapers/endpoints.go:334	Did not find a ready service in consul	{"job_id": "example-job", "retry_num": 1}
2023-08-01T09:59:49.483Z	DEBUG	reapers/endpoints.go:93	Got 5 job events. Handling...
2023-08-01T09:59:49.483Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.483Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.483Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.483Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.483Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of PlanResult
2023-08-01T09:59:49.536Z	DEBUG	reapers/endpoints.go:93	Got 1 job events. Handling...
2023-08-01T09:59:49.536Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of EvaluationUpdated
2023-08-01T09:59:50.295Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-08-01T09:59:50.295Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:50.295Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:50.600Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-08-01T09:59:50.600Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:50.600Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:50.993Z	DEBUG	elector/mod.go:115	Lock retry 2 did not succeed
2023-08-01T09:59:51.218Z	DEBUG	reapers/endpoints.go:327	Fetching services from consul for job	{"job_id": "example-job", "retry_num": 2}
2023-08-01T09:59:51.218Z	DEBUG	reapers/endpoints.go:327	Fetching services from consul for job	{"job_id": "example-job", "retry_num": 2}
2023-08-01T09:59:51.228Z	DEBUG	reapers/endpoints.go:344	Found services for new jobjob_idexample-job
2023-08-01T09:59:51.228Z	DEBUG	reapers/endpoints.go:356	Finding related cilium endpoint for job	{"job_id": "example-job"}
2023-08-01T09:59:51.228Z	DEBUG	reapers/endpoints.go:344	Found services for new jobjob_idexample-job
2023-08-01T09:59:51.228Z	DEBUG	reapers/endpoints.go:356	Finding related cilium endpoint for job	{"job_id": "example-job"}
2023-08-01T09:59:51.840Z	DEBUG	reapers/endpoints.go:93	Got 3 job events. Handling...
2023-08-01T09:59:51.840Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:51.840Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T09:59:51.840Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.033Z	DEBUG	elector/mod.go:115	Lock retry 3 did not succeed
2023-08-01T10:00:01.751Z	DEBUG	reapers/endpoints.go:93	Got 3 job events. Handling...
2023-08-01T10:00:01.751Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.751Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.751Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.998Z	DEBUG	reapers/endpoints.go:93	Got 3 job events. Handling...
2023-08-01T10:00:01.998Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.998Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:01.998Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:00:02.986Z	DEBUG	reapers/endpoints.go:93	Got 1 job events. Handling...
2023-08-01T10:00:02.986Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdateDesiredStatus
2023-08-01T10:00:03.242Z	DEBUG	reapers/endpoints.go:93	Got 3 job events. Handling...
2023-08-01T10:00:03.242Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of PlanResult
2023-08-01T10:00:03.242Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of PlanResult
2023-08-01T10:00:03.242Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of PlanResult
2023-08-01T10:00:03.334Z	DEBUG	reapers/endpoints.go:93	Got 1 job events. Handling...
2023-08-01T10:00:03.334Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of EvaluationUpdated
2023-08-01T10:00:11.044Z	DEBUG	elector/mod.go:115	Lock retry 4 did not succeed
2023-08-01T10:00:21.077Z	DEBUG	elector/mod.go:115	Lock retry 5 did not succeed
2023-08-01T10:00:31.097Z	DEBUG	elector/mod.go:115	Lock retry 6 did not succeed
2023-08-01T10:00:31.097Z	DEBUG	elector/mod.go:117	Never acquired lock after retry

The second group remains in the init state.

But if I restart Netreap in the cluster, both groups immediately get the labels.
Cilium endpoint list:

ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])               IPv6   IPv4            STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                   
141        Enabled            Enabled           4          reserved:health                                  172.16.171.94   ready   
1418       Disabled           Disabled          1          reserved:host                                                    ready   
1535       Enabled            Enabled           28939      netreap:nomad.job_id=example-job                 172.16.6.61     ready   
                                                           netreap:nomad.namespace=dedicated                                        
                                                           nomad:example.com/app_name=service-echo                                  
                                                           reserved:init                                                            
2641       Enabled            Enabled           28939      netreap:nomad.job_id=example-job                 172.16.44.243   ready   
                                                           netreap:nomad.namespace=dedicated                                        
                                                           nomad:example.com/app_name=service-echo                                  
                                                           reserved:init                                                            

Netreap-job logs:

2023-08-01T10:05:21.560Z	DEBUG	netreap/main.go:124	Starting node reaper
2023-08-01T10:05:21.561Z	DEBUG	reapers/nodes.go:107	Beginning reconciliation
2023-08-01T10:05:21.561Z	DEBUG	reapers/nodes.go:108	Getting nomad node list
2023-08-01T10:05:21.578Z	DEBUG	reapers/nodes.go:119	Finished constructing list of all nodes	{"nodes": {"cn6-host48":{},"cpx31-host58":{}}}
2023-08-01T10:05:21.578Z	DEBUG	reapers/nodes.go:121	Fetching cilium nodes from consul
2023-08-01T10:05:21.617Z	DEBUG	netreap/main.go:135	Starting endpoint reaper
2023-08-01T10:05:21.618Z	DEBUG	reapers/endpoints.go:155	Starting reconciliation
2023-08-01T10:05:21.626Z	DEBUG	reapers/endpoints.go:169	Finished fetching service list, constructing set of IP addresses from servicesservice_list[{network-multitool} {nomad-clients} {nomad-servers} {consul} {http-echo} {netreap}]
2023-08-01T10:05:21.628Z	INFO	reapers/nodes.go:56	Waiting for leader election
2023-08-01T10:05:21.674Z	DEBUG	reapers/endpoints.go:203	Finished generating current IP list. Fetching endpoints from cilium	{"ip_list": {"172.16.212.128":{"ID":"df8a0bec-b718-d91e-9f8d-0e5ef3b7e077","Namespace":""},"172.16.242.70":{"ID":"52777fb2-ac22-749a-f709-57a5ecddb881","Namespace":""}}}
2023-08-01T10:05:21.680Z	DEBUG	reapers/endpoints.go:211	Checking all endpoints
2023-08-01T10:05:21.680Z	DEBUG	reapers/endpoints.go:219	Endpoint is not an init service, skipping	{"labels": ["netreap:nomad.job_id=example-job","netreap:nomad.namespace=dedicated","nomad:example.com/app_name=service-echo"]}
2023-08-01T10:05:21.680Z	DEBUG	reapers/endpoints.go:219	Endpoint is not an init service, skipping	{"labels": ["reserved:host"]}
2023-08-01T10:05:21.680Z	DEBUG	reapers/endpoints.go:219	Endpoint is not an init service, skipping	{"labels": ["reserved:health"]}
2023-08-01T10:05:21.680Z	DEBUG	reapers/endpoints.go:222	Checking if endpoint still exists	{"endpoint_id": 1500}
2023-08-01T10:05:21.680Z	DEBUG	reapers/endpoints.go:227	Got ip	{"ip": {"ipv4":"172.16.212.128"}}
2023-08-01T10:05:21.680Z	DEBUG	reapers/endpoints.go:250	Found an endpoint missing labels. Updating with current job labels	{"endpoint_id": 1500}
2023-08-01T10:05:21.705Z	DEBUG	reapers/endpoints.go:265	Finished reconciliation	{"num_errors": 0}
2023-08-01T10:05:21.740Z	DEBUG	netreap/main.go:146	starting policy poller
2023-08-01T10:05:21.740Z	INFO	policy_poller	policy/policy.go:41	starting Consul watch for key: netreap.io/policy
2023-08-01T10:05:21.746Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-08-01T10:05:21.746Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:05:21.747Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:05:21.752Z	INFO	policy_poller	policy/policy.go:98	loaded new policy
2023-08-01T10:05:21.753Z	DEBUG	reapers/endpoints.go:93	Got 2 job events. Handling...
2023-08-01T10:05:21.753Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:05:21.753Z	DEBUG	reapers/endpoints.go:104	Ignoring Job event with type of AllocationUpdated
2023-08-01T10:06:09.807Z	DEBUG	elector/mod.go:108	Unable to acquire lock. Retrying up to 6 times
2023-08-01T10:06:19.817Z	DEBUG	elector/mod.go:115	Lock retry 1 did not succeed
2023-08-01T10:06:29.831Z	DEBUG	elector/mod.go:115	Lock retry 2 did not succeed
2023-08-01T10:06:39.847Z	DEBUG	elector/mod.go:115	Lock retry 3 did not succeed
2023-08-01T10:06:49.879Z	DEBUG	elector/mod.go:115	Lock retry 4 did not succeed
2023-08-01T10:06:59.898Z	DEBUG	elector/mod.go:115	Lock retry 5 did not succeed

Is this a bug, or am I doing something wrong? In my case, Nomad jobs usually consist of several groups. Please take a look, @deverton @protochron.
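
For reference, the behavior I would expect for a multi-group job can be sketched as follows. This is only an illustration, not Netreap's implementation: `Endpoint` and `label_job_endpoints` are hypothetical names, and treating an endpoint as unlabeled when it carries no `netreap:` label is my assumption. The IPs and labels are taken from the first Cilium endpoint list above.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical stand-in for one row of the Cilium endpoint list."""
    id: int
    ipv4: str
    labels: set = field(default_factory=set)


def label_job_endpoints(endpoints, job_ips, job_labels):
    """Label every endpoint of the job that has no netreap label yet.

    Returns the IDs of the endpoints that were updated.
    """
    updated = []
    for ep in endpoints:
        # Assumption: "unlabeled" means no label with the netreap: prefix.
        already_labeled = any(label.startswith("netreap:") for label in ep.labels)
        if ep.ipv4 in job_ips and not already_labeled:
            ep.labels |= set(job_labels)
            updated.append(ep.id)
    return updated


job_labels = {
    "netreap:nomad.job_id=example-job",
    "netreap:nomad.namespace=dedicated",
    "nomad:example.com/app_name=service-echo",
}
endpoints = [
    Endpoint(1535, "172.16.6.61", {"reserved:init"}),
    Endpoint(2641, "172.16.44.243", job_labels | {"reserved:init"}),
]
# One allocation IP per group of example-job.
job_ips = {"172.16.6.61", "172.16.44.243"}

updated_ids = label_job_endpoints(endpoints, job_ips, job_labels)
print(updated_ids)  # prints [1535]
```

The loop visits every endpoint whose IP belongs to the job, so each group ends up labeled no matter how many groups the job has.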

Netreap - 0.1.2 (also 0.1.0)
Cilium - 1.13.4
Nomad - v1.5.6
Consul - v1.14.7
