"whereis" output can be wrong when the storage node is slow #1090

Closed
vstax opened this issue Jul 30, 2018 · 5 comments

vstax (Contributor) commented Jul 30, 2018

Situation: in a cluster of 6 nodes running 1.4.2, a new node (node02) is being recovered via a takeover operation (node09 -> node02). This appears in the log of node02:

[E] [email protected]  2018-07-30 16:32:20.639683 +0300    1532957540leo_storage_handler_object:get/4  191 [{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,3952335},{etag,13886495139313085388585308446383676488},{cause,"Recover failure"}]

To check what's going on, I execute whereis. It is very slow to respond; after some time it produces this output:

[root@leofs-mgr01 ~]# leofs-adm whereis production/parts/used/23/03/80/36038005/1.jpg
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

I execute whereis again and get this output:

[root@leofs-mgr01 ~]# leofs-adm whereis production/parts/used/23/03/80/36038005/1.jpg
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300       
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |

There are two problems. The first (minor) one is that the object hasn't been recovered on node02 yet. However, I suspect it will be recovered at some point in the future (maybe it's in some queue on node02 right now that will be processed later, or maybe it will be pushed to this node by node05 or node08 in the future, I have no idea). The queues on all nodes (except for node02) are currently filled with millions of messages as part of the recovery process, so maybe read-repair just hasn't had time to work yet.

The second, main problem is that the first whereis output was lying. I'm pretty sure that the object was present on node05 at that moment; it's just that the request from the manager to check its state failed. The output of "whereis" should clearly differentiate between a missing object and an inability to contact the node (e.g. because it's not running at all) or a timed-out request, and produce different output for these two cases.
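
To make the distinction concrete, here is a minimal, hypothetical sketch in Python (not LeoFS's actual Erlang code; NodeReply, classify and the node names are made up for illustration) of how each node's reply could be classified so that "object not found" and "node unreachable / timed out" are never reported the same way:

from dataclasses import dataclass
from typing import Optional

# Hypothetical per-node reply; not the real LeoFS RPC structure.
@dataclass
class NodeReply:
    node: str
    metadata: Optional[dict]  # object metadata if the node holds a copy
    error: Optional[str]      # e.g. "timeout" or "nodedown"; None on success

def classify(reply: NodeReply) -> str:
    """Distinguish the cases that the first whereis output conflated."""
    if reply.error is not None:
        return "unreachable"   # node is down or the request timed out: state unknown
    if reply.metadata is None:
        return "not_found"     # node answered and definitely has no copy
    return "found"             # node answered with the object's metadata

print(classify(NodeReply("node05@host", None, "timeout")))  # -> unreachable
print(classify(NodeReply("node02@host", None, None)))       # -> not_found

With such a classification, an empty row in the table would only ever mean "the node answered and does not have the object", while a timeout or a down node would be reported explicitly.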

Just in case, here are the logs related to this object (the timeout I got during the first "whereis" operation is probably in there somewhere as well). Error logs from node02 (nothing in the info logs):

[W]	[email protected]	2018-07-30 16:31:50.638010 +0300	1532957510leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W]	[email protected]	2018-07-30 16:31:56.281774 +0300	1532957516leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W]	[email protected]	2018-07-30 16:32:20.638857 +0300	1532957540leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[E]	[email protected]	2018-07-30 16:32:20.639683 +0300	1532957540leo_storage_handler_object:get/4	191	[{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,3952335},{etag,13886495139313085388585308446383676488},{cause,"Recover failure"}]
[W]	[email protected]	2018-07-30 16:32:26.90911 +0300	1532957546	leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W]	[email protected]	2018-07-30 16:32:26.282894 +0300	1532957546leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[E]	[email protected]	2018-07-30 16:32:26.283730 +0300	1532957546leo_storage_handler_object:get/4	191	[{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,57890444},{etag,13886495139313085388585308446383676488},{cause,"Recover failure"}]
[W]	[email protected]	2018-07-30 16:32:56.91867 +0300	1532957576	leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[E]	[email protected]	2018-07-30 16:32:56.92533 +0300	1532957576	leo_storage_handler_object:get/3	159	[{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,35201166},{cause,"Recover failure"}]
[W]	[email protected]	2018-07-30 16:33:10.514741 +0300	1532957590leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W]	[email protected]	2018-07-30 16:33:40.515824 +0300	1532957620leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[E]	[email protected]	2018-07-30 16:33:40.516669 +0300	1532957620leo_storage_handler_object:get/3	159	[{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,70841817},{cause,"Recover failure"}]

Error logs from node05:

[W] [email protected]  2018-07-30 16:31:20.790656 +0300    1532957480leo_storage_handler_object:read_and_repair_3/3    1434    [{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W] [email protected]  2018-07-30 16:31:26.435556 +0300    1532957486leo_storage_handler_object:read_and_repair_3/3    1434    [{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W] [email protected]  2018-07-30 16:31:56.243559 +0300    1532957516leo_storage_handler_object:read_and_repair_3/3    1434    [{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W] [email protected]  2018-07-30 16:32:40.666584 +0300    1532957560leo_storage_handler_object:read_and_repair_3/3    1434    [{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]

Info logs from node05:

[I] [email protected]  2018-07-30 16:34:14.196621 +0300    1532957654null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,143404}]}
[I] [email protected]  2018-07-30 16:34:29.497739 +0300    1532957669null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,153253}]}
[I] [email protected]  2018-07-30 16:34:44.65305 +0300 1532957684  null:null   0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,167629}]}
[I] [email protected]  2018-07-30 16:34:44.93413 +0300 1532957684  null:null   0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,137848}]}
[I] [email protected]  2018-07-30 16:34:44.129532 +0300    1532957684null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,123462}]}
[I] [email protected]  2018-07-30 16:34:44.143682 +0300    1532957684null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,102028}]}
[I] [email protected]  2018-07-30 16:34:44.169612 +0300    1532957684null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,93501}]}
[I] [email protected]  2018-07-30 16:34:44.184279 +0300    1532957684null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,91182}]}

Info logs from node08 (nothing in error logs):

[I] [email protected]  2018-07-30 16:31:44.46812 +0300 1532957504  null:null   0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,83260}]}
[I] [email protected]  2018-07-30 16:32:17.765815 +0300    1532957537null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,111333}]}
[I] [email protected]  2018-07-30 16:32:31.449430 +0300    1532957551null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,100660}]}
[I] [email protected]  2018-07-30 16:32:31.450507 +0300    1532957551null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,95209}]}
[I] [email protected]  2018-07-30 16:32:31.451198 +0300    1532957551null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,95017}]}
[I] [email protected]  2018-07-30 16:32:59.654923 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,98865}]}
[I] [email protected]  2018-07-30 16:32:59.656474 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,93414}]}
[I] [email protected]  2018-07-30 16:32:59.657418 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,93222}]}
[I] [email protected]  2018-07-30 16:32:59.661405 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,78997}]}
[I] [email protected]  2018-07-30 16:32:59.665188 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,63422}]}
[I] [email protected]  2018-07-30 16:33:12.936779 +0300    1532957592null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,62271}]}
[I] [email protected]  2018-07-30 16:33:12.972406 +0300    1532957592null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,40859}]}
[I] [email protected]  2018-07-30 16:33:12.997936 +0300    1532957592null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,32331}]}
[I] [email protected]  2018-07-30 16:33:13.263 +0300   1532957593  null:null   0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,19107}]}
mocchira self-assigned this Jul 31, 2018
mocchira added this to the 1.4.3 milestone Jul 31, 2018

mocchira (Member) commented Aug 2, 2018

@yosukehara

The second, main problem is that the first whereis output was lying. I'm pretty sure that the object was present on node05 at that moment; it's just that the request from the manager to check its state failed. The output of "whereis" should clearly differentiate between a missing object and an inability to contact the node (e.g. because it's not running at all) or a timed-out request, and produce different output for these two cases.

IMHO, this should be fixed. How should leofs-adm whereis show those differences?
Currently, in any failure case, whereis outputs a value only in the node column and leaves the other columns empty, as shown below:

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

Since an additional error column would make the table too wide, and outputting the error detail in an existing column looks weird, how about showing the error detail at the bottom, apart from the table, like below?

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

[Failure Nodes]
[email protected]: Connection timeout.
[email protected]: File not found.

yosukehara (Member) commented

@mocchira I agree with clarifying the result of the leofs-adm whereis command:

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

[Failure Nodes]
[email protected]: Connection timeout.
[email protected]: File not found.

vstax (Contributor, Author) commented Aug 2, 2018

I have a suggestion. I think that the output

[Failure Nodes]
[email protected]: File not found.

is confusing, because it's in the list of failures and looks like some kind of internal error ("File not found" sounds like a typical error where some actual file is missing on the filesystem), so the real meaning - that the object is simply not present on that node - is a bit hard to grasp.

I suggest either not listing "File not found" in "Failure Nodes" at all - just keep the old output, where all-empty fields mean the object is not present - or at least keeping it separate from "Failure Nodes" and using some other wording, for example:

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

[email protected]: Object missing.
[Failure Nodes]
[email protected]: Connection timeout.

mocchira (Member) commented Aug 2, 2018

@vstax Thanks for your suggestion.

is confusing, because it's in the list of failures and looks like some kind of internal error ("File not found" sounds like a typical error where some actual file is missing on the filesystem), so the real meaning - that the object is simply not present on that node - is a bit hard to grasp.

Makes sense to me.

I suggest either not listing "File not found" in "Failure Nodes" at all - just keep the old output, where all-empty fields mean the object is not present - or at least keeping it separate from "Failure Nodes" and using some other wording, for example

The former sounds good to me.

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

[Failure Nodes]
[email protected]: Connection timeout.

@yosukehara Thoughts?

yosukehara (Member) commented

@mocchira Sorry for replying so late. LGTM.
