[infiniband] collect info from all active ports of all IB devices #1262

Closed

pmoravec
Contributor

@pmoravec pmoravec commented Apr 3, 2018

currently just the very first active port is checked

Resolves: #1262

Signed-off-by: Pavel Moravec [email protected]


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname [email protected]?

@pmoravec
Contributor Author

pmoravec commented Apr 3, 2018

Thanks to Hongang Li for the initial version of the patch.

@bmr-cymru
Member

I'm not really in favour of this approach if the root cause is a deficiency in the tools that we are calling.

If this is confusing to users of sos, how can it not be confusing for users who are running these tools themselves?

@pmoravec
Contributor Author

pmoravec commented Apr 3, 2018

I don't know if or how confusing the behaviour is, or whether a fallback behaviour is worthwhile when no options are provided. See e.g. https://linux.die.net/man/8/infiniband-diags and the "Multiple port/Multiple CA support:" paragraph.

It makes sense to collect that info from all active ports. Whether sos or the infiniband tools should provide the "get data from all ports" feature is up for discussion.
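
As a rough illustration of the "all active ports" idea, the sketch below builds one command line per (CA, port) pair using the `-C` (CA name) and `-P` (port) selectors described in the infiniband-diags man page linked above. The function name `per_port_commands`, the sample tool list and the device names are illustrative assumptions, not taken from the patch.

```python
def per_port_commands(ports, commands):
    """Return one command string per (ca, port) pair and per command.

    `ports` is an iterable of (ca_name, port_number) tuples and
    `commands` is a list of tool names. Both are supplied by the
    caller; nothing here probes the system.
    """
    result = []
    for ca, port in ports:
        # -C selects the channel adapter, -P the port on that adapter.
        opts = "-C %s -P %s" % (ca, port)
        for cmd in commands:
            result.append("%s %s" % (cmd, opts))
    return result


# Hypothetical device names and an illustrative choice of tools.
example = per_port_commands([("mlx4_0", "1"), ("mlx4_0", "2")],
                            ["ibhosts", "perfquery"])
```

The design question raised in this thread is where the list of `(ca, port)` pairs should come from: the tools themselves or a scan performed by sos.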

s = open(IB_SYS_DIR + ca + "/ports/" + port + "/state")
state = s.readline()
s.close()
except:
Member

s is not closed in the event of an exception in s.readline().
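
One way to address this is a `with` block, which closes the file even if `readline()` raises. A minimal sketch, assuming a hypothetical helper name `read_first_line` and catching only I/O errors instead of using a bare `except:`:

```python
def read_first_line(path):
    """Return the first line of `path`, stripped of surrounding
    whitespace, or None if the file cannot be opened or read."""
    try:
        # The context manager closes the file on every exit path,
        # including an exception raised by readline().
        with open(path) as f:
            return f.readline().strip()
    except (IOError, OSError):
        return None
```

The same helper would cover both the `state` and the `link_layer` reads, so the close-on-error logic lives in one place.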

p = open(IB_SYS_DIR + ca + "/ports/" + port +
"/link_layer")
link_layer = p.readline()
p.close()
Member

p is not closed in the event of an exception in p.readline().
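
Taking both review comments into account, the port discovery could look roughly like the sketch below. It is not the code that was merged. It assumes that the directory passed in is laid out as `<ca>/ports/<port>/state` and `<ca>/ports/<port>/link_layer` (as the snippets above suggest), that an active port's `state` line contains the word `ACTIVE`, and that an InfiniBand port's `link_layer` line contains `InfiniBand`. The function names are hypothetical.

```python
import os


def _first_line(path):
    """Return the first line of `path`, or "" if it cannot be read."""
    try:
        # `with` closes the file even if readline() raises.
        with open(path) as f:
            return f.readline()
    except (IOError, OSError):
        return ""


def active_ib_ports(ib_sys_dir):
    """Return (ca, port) pairs for every active InfiniBand port found
    under `ib_sys_dir`, in lexicographic order of the directory names."""
    found = []
    try:
        cas = sorted(os.listdir(ib_sys_dir))
    except OSError:
        # No such directory: treat as "no IB devices present".
        return found
    for ca in cas:
        ports_dir = os.path.join(ib_sys_dir, ca, "ports")
        try:
            ports = sorted(os.listdir(ports_dir))
        except OSError:
            continue
        for port in ports:
            base = os.path.join(ports_dir, port)
            state = _first_line(os.path.join(base, "state"))
            link_layer = _first_line(os.path.join(base, "link_layer"))
            # Assumed file contents: see the lead-in above.
            if "ACTIVE" in state and "InfiniBand" in link_layer:
                found.append((ca, port))
    return found
```

Because the directory is a parameter, the function can be exercised against a temporary tree instead of the live `/sys`.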

@bmr-cymru
Member

It just seems like a bad design choice in these tools: pretty much all system-level tools of this kind that I can think of default to printing information for all devices when no option is given:

lsblk, lsscsi, ifconfig, ip, lvs, etc. etc. ...

What I really dislike here is the need to poke around /sys in order to discover the list of parameters to pass to the commands. The structure of /sys does occasionally change and we don't have great infrastructure to deal with that at the moment.

If there's no choice but to work around this in sos then we will, but it seems sub-optimal for both users and support that there is not a single, simple command that just lists all the available ports.

@bmr-cymru
Member

(but the exception handling in the currently proposed patch does need to be fixed).

Member

@bmr-cymru bmr-cymru left a comment

Let me take a look at this tomorrow: I don't think we'll see changes in the upstream IB tools given how longstanding this behaviour is (although I still think that it is a defect: it's crazy to not give users a simple way to list the devices and ports on their machine).

currently just the very first active port is checked

Resolves: sosreport#1262

Signed-off-by: Pavel Moravec <[email protected]>
@pmoravec pmoravec force-pushed the sos-pmoravec-infiniband-all-ports branch from 40de592 to b5b9965 Compare April 16, 2018 15:42
@bmr-cymru bmr-cymru closed this in 0eb9153 May 22, 2018
davemulford pushed a commit to davemulford/sos that referenced this pull request Jun 15, 2018
currently just the very first active port is checked

Resolves: sosreport#1262

Signed-off-by: Pavel Moravec <[email protected]>
Signed-off-by: Bryn M. Reeves <[email protected]>