
IllegalArgumentException: Values less than -1 bytes are not supported on DiskThresholdDecider #48380

Closed
nachogiljaldo opened this issue Oct 23, 2019 · 1 comment · Fixed by #48392
Labels
>bug · :Distributed Coordination/Allocation · v7.4.0

Comments

@nachogiljaldo

Elasticsearch version (bin/elasticsearch --version): 7.4.0

Plugins installed: [repository-s3]

JVM version (java -version):

[root@c1be5a9fd961 /]# /elasticsearch/jdk/bin/java --version
openjdk 13 2019-09-17
OpenJDK Runtime Environment AdoptOpenJDK (build 13+33)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 13+33, mixed mode, sharing)

OS version (uname -a if on a Unix-like system):

Linux c1be5a9fd961 4.15.0-1027-aws #27-Ubuntu SMP Fri Nov 2 15:14:20 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

running on ESS

Description of the problem including expected versus actual behavior:
During a plan migration, plans suddenly started to fail with the following exception:

[instance-0000000104] unexpected failure during [cluster_reroute(async_shard_fetch)], current state version [1212336]
java.lang.IllegalArgumentException: Values less than -1 bytes are not supported: -192978829196b
    at org.elasticsearch.common.unit.ByteSizeValue.<init>(ByteSizeValue.java:72) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.common.unit.ByteSizeValue.<init>(ByteSizeValue.java:67) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDecider.canRemain(DiskThresholdDecider.java:312) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.routing.allocation.decider.AllocationDeciders.canRemain(AllocationDeciders.java:108) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$Balancer.decideMove(BalancedShardsAllocator.java:668) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$Balancer.moveShards(BalancedShardsAllocator.java:628) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator.allocate(BalancedShardsAllocator.java:123) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:405) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:370) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.routing.BatchedRerouteService$1.execute(BatchedRerouteService.java:112) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324) ~[elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219) [elasticsearch-7.4.0.jar:7.4.0]
    at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73) [elasticsearch-7.4.0.jar:7.4.0]

instance-0000000104 is the master node
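
For context, a hedged reconstruction of the check that throws here: ByteSizeValue rejects any size below -1 bytes, so a large negative free-space figure reaching it produces exactly this message. The standalone snippet below only mimics that validation and is not the actual Elasticsearch source.

// Minimal sketch mimicking the validation in org.elasticsearch.common.unit.ByteSizeValue
// (an illustration, not the real source): any size below -1 bytes is rejected.
public class ByteSizeValueCheckSketch {

    static void checkSize(long sizeInBytes) {
        if (sizeInBytes < -1) {
            throw new IllegalArgumentException(
                "Values less than -1 bytes are not supported: " + sizeInBytes + "b");
        }
    }

    public static void main(String[] args) {
        checkSize(-192978829196L); // the negative free-space value from the stack trace above
    }
}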

Steps to reproduce:

Sorry, it's the first and only time I have seen this, so I have no steps.

Provide logs (if relevant):
The exception is shown above.

Additionally, following @ywelsch's suggestion, I enabled DEBUG logging on org.elasticsearch.cluster.routing.allocation.decider, which revealed:

[instance-0000000106] less than the required 0b free bytes threshold (-194290383982 bytes free) on node aIG71toLQTeY-FIvHziBag, shard cannot remain

I identified that node using _cluster/state and restarted it, and the problem seems to be gone.
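
For reference, one way to enable that DEBUG level is a transient logger setting via the cluster settings API. The sketch below uses the 7.x Java high-level REST client against localhost:9200; the client, host, and port are assumptions, since the issue does not say how the level was actually changed.

import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.cluster.settings.ClusterUpdateSettingsRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.settings.Settings;

// Sketch only: turn on DEBUG logging for the allocation deciders via a
// transient cluster setting (host/port are assumptions, not from the issue).
public class EnableAllocationDeciderDebug {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            ClusterUpdateSettingsRequest request = new ClusterUpdateSettingsRequest()
                .transientSettings(Settings.builder()
                    .put("logger.org.elasticsearch.cluster.routing.allocation.decider", "DEBUG")
                    .build());
            client.cluster().putSettings(request, RequestOptions.DEFAULT);
        }
    }
}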

@ywelsch added the :Distributed Coordination/Allocation label on Oct 23, 2019
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (:Distributed/Allocation)

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Oct 23, 2019
Today it is possible that the total size of all relocating shards exceeds the
total amount of free disk space. For instance, this may be caused by another
user of the same disk increasing their disk usage, or may be due to how
Elasticsearch double-counts relocations that are nearly complete particularly
if there are many concurrent relocations in progress.

The `DiskThresholdDecider` treats negative free space similarly to zero free
space, but it then fails when rendering the messages that explain its decision.
This commit fixes its handling of negative free space.

Fixes elastic#48380
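
For illustration, a minimal sketch of the kind of handling the commit message describes: clamping a possibly-negative free-space figure before wrapping it in a ByteSizeValue. This is an assumption about the shape of the fix, not the actual patch in #48392.

import org.elasticsearch.common.unit.ByteSizeValue;

// Sketch only (assumes the elasticsearch core jar on the classpath): clamp a
// possibly-negative free-space value to zero before rendering it, since
// ByteSizeValue rejects anything below -1 bytes.
public class NegativeFreeSpaceSketch {
    public static void main(String[] args) {
        long freeBytes = -192978829196L;                  // relocations can exceed the free space
        ByteSizeValue rendered = new ByteSizeValue(Math.max(0L, freeBytes));
        System.out.println(rendered);                     // prints "0b" instead of throwing
    }
}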
DaveCTurner added four more commits referencing this issue on Oct 23, 2019, each with the same commit message as above.