"whereis" output can be wrong when the storage node is slow #1090

Closed
vstax opened this issue Jul 30, 2018 · 5 comments

vstax (Contributor) commented Jul 30, 2018

Situation: in a cluster of 6 nodes running 1.4.2, a new node (node02) is being recovered via a takeover operation (node09 -> node02). This appears in the log of node02:

[E] [email protected]  2018-07-30 16:32:20.639683 +0300    1532957540leo_storage_handler_object:get/4  191 [{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,3952335},{etag,13886495139313085388585308446383676488},{cause,"Recover failure"}]

To check what's going on, I execute whereis. It is very slow to respond; after some time it produces this output:

[root@leofs-mgr01 ~]# leofs-adm whereis production/parts/used/23/03/80/36038005/1.jpg
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

I execute whereis again and get this output:

[root@leofs-mgr01 ~]# leofs-adm whereis production/parts/used/23/03/80/36038005/1.jpg
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300       
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |

There are two problems. The first (minor) one is that the object hasn't been recovered on node02 yet. However, I suspect it will be recovered at some point in the future (maybe it's in some queue on node02 right now that will be processed later, or maybe it will be pushed to this node by node05 or node08 in the future, I have no idea). The queues on all nodes (except for node02) are currently filled with millions of messages as part of the recovery process, so maybe read-repair just hasn't had time to work yet.

The second, main problem is that the first whereis output was lying. I'm pretty sure that the object was present on node05 at that moment; it's just that the request from the manager to check its state failed. The output of "whereis" should clearly differentiate between a missing object and an inability to contact the node (e.g. because it's not running at all) or a timed-out request, and produce different output for these two cases.
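
To make the distinction concrete, here is a minimal, hypothetical sketch in Python (not LeoFS's actual Erlang code; NodeReply, classify and the node names are made up for illustration) of how each node's reply could be classified so that "object not found" and "node unreachable / timed out" are never reported the same way:

from dataclasses import dataclass
from typing import Optional

# Hypothetical per-node reply; not the real LeoFS RPC structure.
@dataclass
class NodeReply:
    node: str
    metadata: Optional[dict]  # object metadata if the node holds a copy
    error: Optional[str]      # e.g. "timeout" or "nodedown"; None on success

def classify(reply: NodeReply) -> str:
    """Distinguish the cases that the first whereis output conflated."""
    if reply.error is not None:
        return "unreachable"   # node is down or the request timed out: state unknown
    if reply.metadata is None:
        return "not_found"     # node answered and definitely has no copy
    return "found"             # node answered with the object's metadata

print(classify(NodeReply("node05@host", None, "timeout")))  # -> unreachable
print(classify(NodeReply("node02@host", None, None)))       # -> not_found

With such a classification, an empty row in the table would only ever mean "the node answered and does not have the object", while a timeout or a down node would be reported explicitly.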

Just in case, here are the logs related to this object (the timeout I got during the first "whereis" operation is probably in there somewhere as well). Error logs from node02 (nothing in the info logs):

[W]	[email protected]	2018-07-30 16:31:50.638010 +0300	1532957510leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W]	[email protected]	2018-07-30 16:31:56.281774 +0300	1532957516leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W]	[email protected]	2018-07-30 16:32:20.638857 +0300	1532957540leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[E]	[email protected]	2018-07-30 16:32:20.639683 +0300	1532957540leo_storage_handler_object:get/4	191	[{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,3952335},{etag,13886495139313085388585308446383676488},{cause,"Recover failure"}]
[W]	[email protected]	2018-07-30 16:32:26.90911 +0300	1532957546	leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W]	[email protected]	2018-07-30 16:32:26.282894 +0300	1532957546leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[E]	[email protected]	2018-07-30 16:32:26.283730 +0300	1532957546leo_storage_handler_object:get/4	191	[{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,57890444},{etag,13886495139313085388585308446383676488},{cause,"Recover failure"}]
[W]	[email protected]	2018-07-30 16:32:56.91867 +0300	1532957576	leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[E]	[email protected]	2018-07-30 16:32:56.92533 +0300	1532957576	leo_storage_handler_object:get/3	159	[{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,35201166},{cause,"Recover failure"}]
[W]	[email protected]	2018-07-30 16:33:10.514741 +0300	1532957590leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W]	[email protected]	2018-07-30 16:33:40.515824 +0300	1532957620leo_storage_handler_object:read_and_repair_3/3	1434	[{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[E]	[email protected]	2018-07-30 16:33:40.516669 +0300	1532957620leo_storage_handler_object:get/3	159	[{from,gateway},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{req_id,70841817},{cause,"Recover failure"}]

Error logs from node05:

[W] [email protected]  2018-07-30 16:31:20.790656 +0300    1532957480leo_storage_handler_object:read_and_repair_3/3    1434    [{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W] [email protected]  2018-07-30 16:31:26.435556 +0300    1532957486leo_storage_handler_object:read_and_repair_3/3    1434    [{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W] [email protected]  2018-07-30 16:31:56.243559 +0300    1532957516leo_storage_handler_object:read_and_repair_3/3    1434    [{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]
[W] [email protected]  2018-07-30 16:32:40.666584 +0300    1532957560leo_storage_handler_object:read_and_repair_3/3    1434    [{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{cause,timeout}]

Info logs from node05:

[I] [email protected]  2018-07-30 16:34:14.196621 +0300    1532957654null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,143404}]}
[I] [email protected]  2018-07-30 16:34:29.497739 +0300    1532957669null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,153253}]}
[I] [email protected]  2018-07-30 16:34:44.65305 +0300 1532957684  null:null   0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,167629}]}
[I] [email protected]  2018-07-30 16:34:44.93413 +0300 1532957684  null:null   0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,137848}]}
[I] [email protected]  2018-07-30 16:34:44.129532 +0300    1532957684null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,123462}]}
[I] [email protected]  2018-07-30 16:34:44.143682 +0300    1532957684null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,102028}]}
[I] [email protected]  2018-07-30 16:34:44.169612 +0300    1532957684null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,93501}]}
[I] [email protected]  2018-07-30 16:34:44.184279 +0300    1532957684null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,91182}]}

Info logs from node08 (nothing in error logs):

[I] [email protected]  2018-07-30 16:31:44.46812 +0300 1532957504  null:null   0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,83260}]}
[I] [email protected]  2018-07-30 16:32:17.765815 +0300    1532957537null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,111333}]}
[I] [email protected]  2018-07-30 16:32:31.449430 +0300    1532957551null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,100660}]}
[I] [email protected]  2018-07-30 16:32:31.450507 +0300    1532957551null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,95209}]}
[I] [email protected]  2018-07-30 16:32:31.451198 +0300    1532957551null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,95017}]}
[I] [email protected]  2018-07-30 16:32:59.654923 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,98865}]}
[I] [email protected]  2018-07-30 16:32:59.656474 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,93414}]}
[I] [email protected]  2018-07-30 16:32:59.657418 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,93222}]}
[I] [email protected]  2018-07-30 16:32:59.661405 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,78997}]}
[I] [email protected]  2018-07-30 16:32:59.665188 +0300    1532957579null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,63422}]}
[I] [email protected]  2018-07-30 16:33:12.936779 +0300    1532957592null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,62271}]}
[I] [email protected]  2018-07-30 16:33:12.972406 +0300    1532957592null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,40859}]}
[I] [email protected]  2018-07-30 16:33:12.997936 +0300    1532957592null:null 0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,get},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,32331}]}
[I] [email protected]  2018-07-30 16:33:13.263 +0300   1532957593  null:null   0   {module,"leo_object_storage_msg_collector"},{function,"notify/4"},{line,107},{body,[{cause,"slow operation"},{method,head},{key,<<"production/parts/used/23/03/80/36038005/1.jpg">>},{processing_time,19107}]}
mocchira self-assigned this Jul 31, 2018
mocchira added this to the 1.4.3 milestone Jul 31, 2018

mocchira (Member) commented Aug 2, 2018

@yosukehara

The second, main problem is that the first whereis output was lying. I'm pretty sure that the object was present on node05 at that moment; it's just that the request from the manager to check its state failed. The output of "whereis" should clearly differentiate between a missing object and an inability to contact the node (e.g. because it's not running at all) or a timed-out request, and produce different output for these two cases.

IMHO, this should be fixed. How should leofs-adm whereis show those differences?
Currently, in any failure case, whereis outputs a value only in the node column and leaves the other columns empty, as shown below:

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

Since an additional error column would make the table too wide, and outputting the error detail in an existing column looks weird, how about showing the error detail at the bottom, apart from the table, like below?

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

[Failure Nodes]
[email protected]: Connection timeout.
[email protected]: File not found.

yosukehara (Member) commented

@mocchira I agree with clarifying the result of the leofs-adm whereis command:

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

[Failure Nodes]
[email protected]: Connection timeout.
[email protected]: File not found.

vstax (Contributor, Author) commented Aug 2, 2018

I have a suggestion. I think that the output

[Failure Nodes]
[email protected]: File not found.

is confusing, because it's in the list of failures and looks like some kind of internal error ("File not found" sounds like a typical error where some actual file is missing on the filesystem), so the real meaning - that the object is simply not present on that node - is a bit hard to grasp.

I suggest either not listing "File not found" in "Failure Nodes" at all - just keep the old output, where all-empty fields mean the object is not present - or at least keeping it separate from "Failure Nodes" and using some other wording, for example:

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

[email protected]: Object missing.
[Failure Nodes]
[email protected]: Connection timeout.

mocchira (Member) commented Aug 2, 2018

@vstax Thanks for your suggestion.

is confusing, because it's in the list of failures and looks like some kind of internal error ("File not found" sounds like a typical error where some actual file is missing on the filesystem), so the real meaning - that the object is simply not present on that node - is a bit hard to grasp.

Makes sense to me.

I suggest either not listing "File not found" in "Failure Nodes" at all - just keep the old output, where all-empty fields mean the object is not present - or at least keeping it separate from "Failure Nodes" and using some other wording, for example

The former sounds good to me.

-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node           |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when  
-------+-------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | [email protected]      | 888f50b0ec29fca904f07d77d0823543     |       119K |   a72711203d | false          |              0 | 5702b35286880  | 2018-07-04 15:02:41 +0300
       | [email protected]      |                                      |            |              |                |                |                |
       | [email protected]      |                                      |            |              |                |                |                |

[Failure Nodes]
[email protected]: Connection timeout.

@yosukehara Thoughts?

yosukehara (Member) commented

@mocchira Sorry for replying so late. LGTM.
