This repository has been archived by the owner on Apr 2, 2023. It is now read-only.

Scheduling failed due to Duplicate key HostOffer #442

Closed
lenhattan86 opened this issue May 14, 2022 · 1 comment

Comments

@lenhattan86
Collaborator

In our clusters, we observe a duplicate HostOffer key error, after which Aurora is unable to proceed with scheduling other tasks.

W0513 00:15:35.981 [TaskGroupBatchWorker, TaskSchedulerImpl] Task scheduling unexpectedly failed, will be retried java.lang.IllegalStateException: Duplicate key HostOffer{offer=id {
  value: "704c1042-f056-4582-b0b4-30231ca4ce96-O11593022"
}
framework_id {
  value: "9f48d831-63e7-4556-86ab-463a69389e4d-0000"
}
agent_id {
  value: "704c1042-f056-4582-b0b4-30231ca4ce96-S1890"
}
hostname: "******"
resources {
  name: "ports"
  type: RANGES
  ranges {
    range {
      begin: 10000
      end: 10150
    }
  }
  role: "*"
}

, hostAttributes=IHostAttributes{host=*******, attributes=[IAttribute{name=hostname, values=[*****]}, IAttribute{name=az, values=[us-central1-b]}, IAttribute{name=dedicated, values=[test/onboard]}, IAttribute{name=host, values=[10.180.21.192]}, IAttribute{name=nodeID, values=[autoscaler-gp]}], mode=NONE, slaveId=704c1042-f056-4582-b0b4-30231ca4ce96-S1890}, nonZeroCpuAndMem=true}
            at java.util.stream.Collectors.lambda$throwingMerger$0(Collectors.java:133)
            at java.util.HashMap.merge(HashMap.java:1254)
            at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1320)
            at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
            at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
            at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
            at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
            at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
            at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
            at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
            at io.github.aurora.scheduler.offers.HttpOfferSetImpl.processResponse(HttpOfferSetImpl.java:334)
            at io.github.aurora.scheduler.offers.HttpOfferSetImpl.getOrdered(HttpOfferSetImpl.java:265)
            at org.apache.aurora.scheduler.offers.HostOffers.getAllMatching(HostOffers.java:176)
            at org.apache.aurora.scheduler.offers.OfferManagerImpl.getAllMatching(OfferManagerImpl.java:173)
            at org.apache.aurora.scheduler.scheduling.TaskAssignerImpl.lambda$findMatches$4(TaskAssignerImpl.java:234)
            at java.lang.Iterable.forEach(Iterable.java:75)
            at org.apache.aurora.scheduler.scheduling.TaskAssignerImpl.findMatches(TaskAssignerImpl.java:224)
            at org.apache.aurora.scheduler.scheduling.TaskAssignerImpl.maybeAssign(TaskAssignerImpl.java:260)
            at io.github.aurora.scheduler.scheduling.ProbabilisticPriorityAssigner.maybeAssign(ProbabilisticPriorityAssigner.java:105)
            at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
            at org.apache.aurora.scheduler.scheduling.TaskSchedulerImpl.scheduleTasks(TaskSchedulerImpl.java:154)
            at org.apache.aurora.scheduler.scheduling.TaskSchedulerImpl.schedule(TaskSchedulerImpl.java:108)
            at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
            at org.apache.aurora.scheduler.scheduling.TaskGroups$1.lambda$run$0(TaskGroups.java:174)
            at org.apache.aurora.scheduler.BatchWorker$Work.apply(BatchWorker.java:117)
            at org.apache.aurora.scheduler.BatchWorker.lambda$processBatch$3(BatchWorker.java:210)
            at org.apache.aurora.scheduler.storage.Storage$MutateWork$NoResult.apply(Storage.java:146)
            at org.apache.aurora.scheduler.storage.Storage$MutateWork$NoResult.apply(Storage.java:141)
            at org.apache.aurora.scheduler.storage.durability.DurableStorage.lambda$doInTransaction$0(DurableStorage.java:202)
            at org.apache.aurora.scheduler.storage.mem.MemStorage.write(MemStorage.java:96)
            at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
            at org.apache.aurora.scheduler.storage.durability.DurableStorage.doInTransaction(DurableStorage.java:201)
            at org.apache.aurora.scheduler.storage.durability.DurableStorage.write(DurableStorage.java:224)
            at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.write(CallOrderEnforcingStorage.java:132)
            at org.apache.aurora.scheduler.BatchWorker.processBatch(BatchWorker.java:207)
            at org.apache.aurora.scheduler.BatchWorker.run(BatchWorker.java:199)
            at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:66)
            at com.google.common.util.concurrent.Callables$4.run(Callables.java:119)
            at java.lang.Thread.run(Thread.java:748)

Expected: Aurora skips this error and tries to schedule other tasks.
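
For context, the top frames of the trace (`Collectors.throwingMerger` reached via `Collectors.toMap`) are what the two-argument `Collectors.toMap` produces when two stream elements map to the same key. The sketch below reproduces that failure in isolation and shows one way a collector can tolerate duplicates by supplying a merge function. The `Offer` class, the choice of hostname as the map key, and the "keep the first offer" policy are all assumptions made for illustration; this is not the code in `HttpOfferSetImpl`, and it is not necessarily how #445 addresses the problem. Note also that on Java 8, which the frames here appear to come from, the exception message prints the colliding value rather than the key, so the `HostOffer` in the log may be the map value and not the key itself.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateKeyDemo {
    /** Hypothetical stand-in for a host offer: only an offer id and a hostname. */
    static final class Offer {
        final String id;
        final String host;

        Offer(String id, String host) {
            this.id = id;
            this.host = host;
        }
    }

    // Two-argument toMap: throws IllegalStateException as soon as a key repeats.
    static Map<String, Offer> strictByHost(List<Offer> offers) {
        return offers.stream().collect(Collectors.toMap(o -> o.host, o -> o));
    }

    // toMap with an explicit merge function: a repeated key no longer throws.
    // Assumed policy for this sketch: keep the first offer seen for each host.
    static Map<String, Offer> tolerantByHost(List<Offer> offers) {
        return offers.stream().collect(
            Collectors.toMap(o -> o.host, o -> o, (first, second) -> first, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Two of the three offers share the same (made-up) hostname.
        List<Offer> offers = Arrays.asList(
            new Offer("O1", "host-a"),
            new Offer("O2", "host-b"),
            new Offer("O3", "host-a"));

        try {
            strictByHost(offers);
        } catch (IllegalStateException e) {
            System.out.println("strict collector failed: " + e.getMessage());
        }

        System.out.println("tolerant collector kept " + tolerantByHost(offers).size() + " offers");
    }
}
```

Whether keeping the first offer, keeping the latest one, or dropping both is appropriate depends on how the offers are consumed afterwards, so the merge policy above should be read as a placeholder.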

@lenhattan86
Collaborator Author

Fixed by #445.
