May 2018 Committee meeting

jhansonhpe edited this page May 4, 2018 · 3 revisions

May 4, 2018 Here is the agenda for today's Power API Committee Meeting at 9AM Pacific:

  1. Specification – one ticket (#3) to discuss, see https://github.com/pwrapi/powerapi_spec/pulls/
  2. Update from HPE on SLURM and discussion of Resource Managers and interoperability with Power API and potential API interfaces
  3. AOB

Attendees

Ryan Grant, SNL
Jeff Hanson, HPE
Sid Jana, Intel
Steve Martin, Cray
Matt Kappel, Cray
Ram Nagappan, Intel
Barry Rountree, LLNL
Kevin Pedretti, SNL
Andrew Younge, SNL
Todd Rosendahl, IBM
Steve Leak, NERSC

General updates: the spec is moving forward. Please submit new tickets, especially Cray on reporting. Steve asked if this was the statistics stuff. Ryan and Cray will work it up.

Discussion on pull request #3. Ryan explained that it came from using Power API in an application, where the function was difficult to use. Short of a failure, the return would be either a success or a warning that the buffer wasn't long enough, which led to a guessing game over how long the buffer should be. The proposal is to fix this: the user is told what to allocate, which avoids the guessing game.

Barry asked whether there is ever a situation where the user would want a truncated name. Steve M. thought there might be, in an interface with limited space. Barry would rather have the interface truncate than Power API. Barry asked: what if I pass in just a pointer and have the library do the allocation? The user would have to free it later. Steve M. said the pointer makes it ugly, as the library has to have frees for wherever it allocates. Ryan said the spec was trying to allow more language bindings. Barry said it seems weird to pass in a buffer without knowing how big it would have to be, so add another function to get the size. Steve M. gave more details on the idea. No existing code has to change. Barry asked whether we can have both. Kevin recalled some complexity.

Steve M. thought the metadata functions could give the length. Kevin thought that might be true and will go look. Ryan said Lee (from SNL) said the current method leads to a lot of extra code. Ryan thinks the extra function in Barry's idea will give Lee what he wants. Maybe get rid of the warning, because a correctly sized buffer will be used. Note: warnings are return codes > 0; failure is -1. Kevin writes "I don't think there's any way to get the object name length via metadata currently". Ryan asked whether he was sure. Kevin explored some more, live in the source code. He says a metadata item would need to be added. He likes the new function idea better.

Steve M. asked whether this is the only place or just the first one to annoy. Kevin thinks this might be the only place and will look. Ryan asked whether we should keep the warning. Matt wants to leave the warning, on the basis of getting more information. Steve M. wants to leave existing code working. Ryan will prototype the new function for review at the next meeting.
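The "ask for the size first, then fetch" idea discussed above can be sketched roughly as follows. This is only an illustration, not the prototype Ryan will write: every name here (`demo_obj`, `demo_obj_get_name`, `demo_obj_get_name_len`, `demo_fetch_name`, and the `DEMO_RET_*` codes) is hypothetical and not taken from the Power API specification. The return-code convention follows the note above (failure is -1, warnings are > 0); treating success as 0 is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical return codes: failure is -1 and warnings are > 0, as in the
   notes. Success as 0 is an assumption made for this sketch. */
#define DEMO_RET_SUCCESS     0
#define DEMO_RET_FAILURE   (-1)
#define DEMO_RET_WARN_TRUNC  1

/* Hypothetical stand-in for an object handle; it only carries a name. */
typedef struct {
    const char *name;
} demo_obj;

/* Existing-style call: copy the name into a caller-supplied buffer.
   If the buffer is too short, the name is truncated and a warning returned. */
int demo_obj_get_name(const demo_obj *obj, char *dest, size_t len)
{
    if (obj == NULL || dest == NULL || len == 0)
        return DEMO_RET_FAILURE;

    size_t need = strlen(obj->name) + 1; /* bytes including the NUL */
    if (need > len) {
        memcpy(dest, obj->name, len - 1);
        dest[len - 1] = '\0';
        return DEMO_RET_WARN_TRUNC;
    }
    memcpy(dest, obj->name, need);
    return DEMO_RET_SUCCESS;
}

/* Added-style call: report how many bytes the caller must allocate,
   including the terminating NUL. Existing callers are unaffected. */
int demo_obj_get_name_len(const demo_obj *obj, size_t *len)
{
    if (obj == NULL || len == NULL)
        return DEMO_RET_FAILURE;
    *len = strlen(obj->name) + 1;
    return DEMO_RET_SUCCESS;
}

/* Caller-side pattern: query the size, allocate, then fetch.
   The caller owns the returned string and must free() it. */
char *demo_fetch_name(const demo_obj *obj)
{
    size_t len = 0;
    if (demo_obj_get_name_len(obj, &len) != DEMO_RET_SUCCESS)
        return NULL;

    char *buf = malloc(len);
    if (buf == NULL)
        return NULL;

    int rc = demo_obj_get_name(obj, buf, len);
    /* With a correctly sized buffer the truncation warning cannot occur. */
    assert(rc != DEMO_RET_WARN_TRUNC);
    if (rc != DEMO_RET_SUCCESS) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

In this shape the allocation stays on the caller's side, so the library needs no matching free routine, and a caller with limited space can still pass a short buffer and accept the truncation warning.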

We (HPE) presented the Power API design for SLURM. Jeff to find out if this document can be released. Jeff asked who from Cray talked to SchedMD. It was Steve M. He said that the CAPMC model might be usable as is for an interface to Power API calls. There is a defined RESTful API already. A full Power API backend could be made as a prototype. Ryan said HPE might want to state more strongly that we are looking for SLURM API stability. Ryan thinks SchedMD ought to be interested in a Power API plugin. Ryan asked whether there would be interest in the community. Sid said he thinks they would be interested.

Sid asked Steve what the stack is. When SLURM wants to do power capping, it calls into CAPMC for the nodes it cares about. Steve knew about Power API when they wrote CAPMC, so they have compatible names for a lot of it. The aim was to be Power API friendly, but to do so in a RESTful interface. CAPMC is used to do the actual settings. This is at least similar to the flow chart HPE showed. Steve M. explained how the looping is done: keep the nodes grouped so they have relatively similar power. Sid asked Steve for the paper. Kevin said SLURM lacks an interface to the node level, hence Power API could do that. Ryan said the API has no specialized interface as is. Core would seem to be useful already. Sid asked for the HPE SGI power method and how Altair uses it. Jeff to provide.

https://cug.org/proceedings/cug2015_proceedings/includes/files/pap132.pdf

Steve M. will find a pointer to the CAPMC design. Kevin said a RESTful interface could be done if needed. Jeff said the current Altair to HPE 8600 power method is a Python API call.

https://pubs.cray.com/content/S-2553/CLE%206.0.UP06/capmc-api-documentation