thermalctld: Add support for fans on non-CPU modules #555

@patrickmacarthur patrickmacarthur commented Oct 30, 2024

Description

This adds support for showing fans that are located on modules in the output of the show platform fan command.

Motivation and Context

In the current Arista chassis model, the chassis fans are returned by Module.get_all_fans() rather than FanDrawer.get_all_fans(), and thermalctld currently makes no provision for that. This change allows fans that sit under modules to be listed in the command output.
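
A minimal sketch of the gap described above, using simplified stand-in classes rather than the real platform API types. The `Fan`, `FanDrawer`, `Module` and `Chassis` stubs and the `collect_fans` helper are hypothetical; only the `get_all_fans()` and `get_all_modules()` method names come from this PR, and `get_all_fan_drawers()` is assumed to be the drawer-side counterpart. It shows that a loop which walks only fan drawers never sees fans exposed through modules.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Fan:
    name: str


@dataclass
class FanDrawer:
    name: str
    fans: List[Fan] = field(default_factory=list)

    def get_all_fans(self) -> List[Fan]:
        return self.fans


@dataclass
class Module:
    name: str
    fans: List[Fan] = field(default_factory=list)

    def get_all_fans(self) -> List[Fan]:
        return self.fans


@dataclass
class Chassis:
    fan_drawers: List[FanDrawer] = field(default_factory=list)
    modules: List[Module] = field(default_factory=list)

    def get_all_fan_drawers(self) -> List[FanDrawer]:
        return self.fan_drawers

    def get_all_modules(self) -> List[Module]:
        return self.modules


def collect_fans(chassis: Chassis, include_modules: bool = True) -> List[Tuple[str, str]]:
    """Return (parent name, fan name) pairs for every fan that is visited."""
    found = []
    # Fans that live in fan drawers (the path that already existed).
    for drawer in chassis.get_all_fan_drawers():
        for fan in drawer.get_all_fans():
            found.append((drawer.name, fan.name))
    # Fans that live on modules (the path this PR describes adding).
    if include_modules:
        for module in chassis.get_all_modules():
            for fan in module.get_all_fans():
                found.append((module.name, fan.name))
    return found


# Hypothetical chassis whose fans are all reported through one module.
chassis = Chassis(modules=[Module("module0", [Fan("fan0/1"), Fan("fan0/2")])])
print(collect_fans(chassis, include_modules=False))
print(collect_fans(chassis))
```

With `include_modules=False` the stub chassis reports no fans at all, which mirrors the missing output that motivated the change.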

How Has This Been Tested?

This has been tested internally on a chassis, and the fan output now includes all fans on the chassis as opposed to just PSU fans:

admin@cmp206:~$ show platform fan
  Drawer    LED     FAN    Speed    Direction    Presence    Status          Timestamp
--------  -----  ------  -------  -----------  ----------  --------  -----------------
     N/A    off  fan0/1      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan0/2      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    red  fan0/3      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    red  fan0/4      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    red  fan0/5      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    red  fan0/6      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    red  fan0/7      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    red  fan0/8      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan1/1      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan1/2      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan1/3      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan1/4      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan1/5      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan1/6      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan1/7      29%      exhaust     Present        OK  20241030 15:54:08
     N/A    off  fan1/8      29%      exhaust     Present        OK  20241030 15:54:08
....
     N/A    off  psu2/1      49%       intake     Present        OK  20241030 15:54:08
     N/A    off  psu4/1      44%       intake     Present        OK  20241030 15:54:08
     N/A    off  psu6/1      44%       intake     Present        OK  20241030 15:54:08
     N/A    off  psu8/1      46%       intake     Present        OK  20241030 15:54:09

Additional Information (Optional)

Platform library support change sonic-net/sonic-buildimage#20929 should be merged before this change.

@abdosi
Contributor

abdosi commented Oct 30, 2024

@bmridul and @mlok-nokia, can you help check this? Does this need a SONiC change, or should we push this into the platform implementation?

@gechiang gechiang left a comment

LGTM

@gechiang
Contributor

Just want to highlight that this change depends on the PR mentioned by the author in the PR description:
Platform library support change sonic-net/sonic-buildimage#20603 should be merged before this change.

@rlhui

rlhui commented Nov 27, 2024

> Just want to highlight that this change depends on the PR mentioned by the author in the PR description: Platform library support change sonic-net/sonic-buildimage#20603 should be merged before this change.

> @bmridul and @mlok-nokia, can you help check this? Does this need a SONiC change, or should we push this into the platform implementation?

@bmridul, @mlok-nokia: ping again

@spilkey-cisco
Contributor

I like this change! But at the same time, I think support should be added to include fans from all possible locations, not just Modules. Similar to Module, fans may exist directly on the chassis, and not in a fan drawer. Can a type and handling for that be added in this PR as well?

@patrickmacarthur
Contributor Author

> I like this change! But at the same time, I think support should be added to include fans from all possible locations, not just Modules. Similar to Module, fans may exist directly on the chassis, and not in a fan drawer. Can a type and handling for that be added in this PR as well?

This already exists today; any fan in a fan drawer on the chassis is already displayed. This PR is just adding missing output for fans that are attached to modules.

@spilkey-cisco
Contributor

> > I like this change! But at the same time, I think support should be added to include fans from all possible locations, not just Modules. Similar to Module, fans may exist directly on the chassis, and not in a fan drawer. Can a type and handling for that be added in this PR as well?

> This already exists today; any fan in a fan drawer on the chassis is already displayed. This PR is just adding missing output for fans that are attached to modules.

I'm referring specifically to fans not in a physical fan drawer (adding CHASSIS as a type in addition to DRAWER). Today, vendors must configure some notion of a 'logical' fan drawer to house fans connected directly to the chassis (not in a physical fan drawer), essentially treating the chassis itself as a fan drawer. This could be further confused by a chassis that could have both physical fan drawers with fans, and fans directly connected to the chassis without fan drawers. This feels like an unnecessary limitation, but perhaps there is some reason I'm missing as to why this should not be done?
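
To make the suggestion concrete, here is a rough sketch of what distinguishing the two sources might look like. It is not what this PR implements: the `FanSource` enum, the `FanRow` tuple and the `build_rows` helper are all hypothetical names, and the "N/A" drawer label is borrowed from the command output shown in the description.

```python
from enum import Enum
from typing import Dict, List, NamedTuple


class FanSource(Enum):
    """Hypothetical tag for where a fan is physically attached."""
    DRAWER = "drawer"
    CHASSIS = "chassis"


class FanRow(NamedTuple):
    drawer: str
    fan: str
    source: FanSource


def build_rows(drawer_fans: Dict[str, List[str]], chassis_fans: List[str]) -> List[FanRow]:
    """Combine fans from physical drawers with fans attached directly to the chassis."""
    rows = []
    # Fans housed in a physical fan drawer keep their drawer name.
    for drawer_name, fans in drawer_fans.items():
        for fan in fans:
            rows.append(FanRow(drawer_name, fan, FanSource.DRAWER))
    # Fans attached directly to the chassis need no 'logical' drawer.
    for fan in chassis_fans:
        rows.append(FanRow("N/A", fan, FanSource.CHASSIS))
    return rows


# A chassis with one physical drawer plus one directly attached fan.
rows = build_rows({"drawer1": ["fan1", "fan2"]}, ["fan3"])
for row in rows:
    print(row.drawer, row.fan, row.source.value)
```

Tagging each row with its source would let a mixed chassis report both kinds of fans without treating the chassis itself as a drawer.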

except Exception as e:
    self.log_warning('Failed to update fan status - {}'.format(repr(e)))

for module_index, module in enumerate(self.chassis.get_all_modules()):

The thermal code first checks is_chassis_system before looping over module details; should that be done here as well? Or is it perhaps an unnecessary check there? https://github.com/sonic-net/sonic-platform-daemons/blob/master/sonic-thermalctld/scripts/thermalctld#L599 That code also looks at PSUs connected to the modules, which may themselves have fans.

The PR title describes this as only for non-CPU modules, but it looks like that is not actually a limitation; won't this get all fans in any module where they are configured? Not necessarily a bad thing, but worth noting.
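
A small sketch of the question being asked, under stated assumptions. The stub classes and both helper functions are hypothetical, and the fixed-system stub assumes that `get_all_modules()` returns an empty list on non-chassis platforms. Under that assumption the guarded and unguarded loops give the same result, so the guard would only matter on a platform that reports modules without being a chassis system.

```python
class StubModule:
    """Hypothetical module that exposes a list of fan names."""

    def __init__(self, fans):
        self._fans = list(fans)

    def get_all_fans(self):
        return self._fans


class StubChassis:
    """Hypothetical chassis; an empty module list models a fixed system."""

    def __init__(self, modules=()):
        self._modules = list(modules)

    def get_all_modules(self):
        return self._modules


def module_fans_unguarded(chassis):
    """Walk every module and gather its fans, with no chassis check."""
    return [fan for module in chassis.get_all_modules() for fan in module.get_all_fans()]


def module_fans_guarded(chassis, is_chassis_system):
    """Same walk, but skipped entirely when not on a chassis system."""
    if not is_chassis_system:
        return []
    return module_fans_unguarded(chassis)


fixed = StubChassis()
modular = StubChassis([StubModule(["fan0/1", "fan0/2"])])

print(module_fans_unguarded(fixed), module_fans_guarded(fixed, False))
print(module_fans_unguarded(modular), module_fans_guarded(modular, True))
```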

@mlok-nokia
Contributor

mlok-nokia commented Nov 29, 2024

> > Just want to highlight that this change depends on the PR mentioned by the author in the PR description: Platform library support change sonic-net/sonic-buildimage#20603 should be merged before this change.

> > @bmridul and @mlok-nokia, can you help check this? Does this need a SONiC change, or should we push this into the platform implementation?

> @bmridul, @mlok-nokia: ping again

I think these two PRs can be pushed independently. PR 20603 is just Arista's platform-specific code.

@gechiang gechiang requested a review from prgeor December 2, 2024 23:52
@patrickmacarthur
Contributor Author

> > > I like this change! But at the same time, I think support should be added to include fans from all possible locations, not just Modules. Similar to Module, fans may exist directly on the chassis, and not in a fan drawer. Can a type and handling for that be added in this PR as well?

> > This already exists today; any fan in a fan drawer on the chassis is already displayed. This PR is just adding missing output for fans that are attached to modules.

> I'm referring specifically to fans not in a physical fan drawer (adding CHASSIS as a type in addition to DRAWER). Today, vendors must configure some notion of a 'logical' fan drawer to house fans connected directly to the chassis (not in a physical fan drawer), essentially treating the chassis itself as a fan drawer. This could be further confused by a chassis that could have both physical fan drawers with fans, and fans directly connected to the chassis without fan drawers. This feels like an unnecessary limitation, but perhaps there is some reason I'm missing as to why this should not be done?

We discussed this during the chassis meeting: it should be implemented in a follow-up PR, as it widens the scope to affect not just chassis but also fixed systems.

@gechiang gechiang requested a review from assrinivasan December 4, 2024 21:05
@gechiang
Contributor

gechiang commented Dec 4, 2024

@assrinivasan, please take a quick look at this PR to see if you have any concerns, especially for non-chassis platforms, while @prgeor is out of office.
Thanks!

@rlhui rlhui merged commit 60e7224 into sonic-net:master Dec 6, 2024
5 checks passed
@bingwang-ms

@patrickmacarthur, can you please confirm that this change will not break the CLI on single-ASIC platforms?
