Support FI_ADDR_GNIX enough for bootstrapping #7

sungeunchoi · 2015-02-02T18:27:43Z

No description provided.

hppritcha · 2015-02-06T22:31:02Z

What I have for FI_ADDR_GNIX for now is here.

A filled-in struct gni_ep_name would be returned from
fi_getname for a GNI provider endpoint of type FI_EP_RDM.
The same structure would be filled into the fi_info
returned from a call to fi_getinfo(non-null node, ...),
in which case it would hold the address of the "service" at the
remote node.

The structure would contain the minimum needed to
use the GNI PostData API to contact a process for more
information on how to set up an SMSG connection, etc.
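
As a rough illustration of what such a minimal name might carry, here is a sketch. The struct name, its three fields, and the helper are all hypothetical (they are not the provider's actual struct gni_ep_name); the fields are guessed from the node/cdm_id/cookie triple mentioned later in this comment. The helper only mimics the in/out addrlen convention that fi_getname documents and returns -1 where a provider would return an error code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, for illustration only. */
struct sketch_ep_name {
    uint32_t device_addr; /* assumed: NIC address of the target node */
    uint32_t cdm_id;      /* assumed: communication domain id of the process */
    uint32_t cookie;      /* assumed: protection cookie of the job */
};

/*
 * Copy the name into the caller's buffer. *addrlen is in/out: on return it
 * always holds the size required. Returns 0 on success, -1 if the buffer
 * was too small (nothing is copied in that case).
 */
static int sketch_getname(const struct sketch_ep_name *src,
                          void *addr, size_t *addrlen)
{
    assert(src != NULL && addr != NULL && addrlen != NULL);

    size_t need = sizeof(*src);
    int too_small = (*addrlen < need);

    if (!too_small)
        memcpy(addr, src, need);
    *addrlen = need;
    return too_small ? -1 : 0;
}
```

A caller would pass a buffer and its size, then check the return value and the updated length before using the copied name.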

The tricky thing is the cdm_id. In the fi_getinfo case it will
need to be obtained through some kind of nameservice
mechanism.

Here's an idea I had for a nameservice for jobs launched
by aprun:

- The first "rank" on the node (we can determine that using ALPS
  LLI calls) fires off a "nameserver thread", which creates a
  CDM with cdm_id 0 and attaches it to the NIC. It would then post
  one or more wildcard datagrams.
- The first rank would also create a file with a known name in
  TMPDIR and add an entry to it of the form alps_rank:cdm_id
  (plus any other info we might want to add, up to the size of
  a GNI post datagram payload).
- Other ranks would open the file and write in their own
  rank and cdm_id.
- When a rank at a remote node calls fi_getinfo with the
  nodename of a target node and a service string
  something like "alpsjob:apid:job_rank", the GNI
  provider at that node would send a GNI post datagram
  to the target node/cdm_id/correct cookie and wait
  for a response.
- The nameserver thread on the target node would wake up
  from its blocking GNI_EpPostData wait, crack the incoming
  datagram for the request, read the info in the local
  file in TMPDIR, and return it using a second GNI_EpPostData.
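
The file and string handling in this scheme can be sketched as below. This is only a sketch under assumptions: the function names are hypothetical, the file is assumed to hold one "alps_rank:cdm_id" entry per line, and the service string is assumed to be exactly "alpsjob:<apid>:<job_rank>" with decimal fields. The datagram exchange itself is not shown, and neither is any locking between ranks that write the file concurrently.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Append one "rank:cdm_id" entry; returns 0 on success, -1 on write error. */
static int register_rank(FILE *fp, uint32_t rank, uint32_t cdm_id)
{
    assert(fp != NULL);
    return fprintf(fp, "%" PRIu32 ":%" PRIu32 "\n", rank, cdm_id) < 0 ? -1 : 0;
}

/* Scan from the current file position for `rank`; returns 0 if found. */
static int lookup_cdm_id(FILE *fp, uint32_t rank, uint32_t *cdm_id)
{
    char line[128];
    uint32_t r, id;

    assert(fp != NULL && cdm_id != NULL);
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (sscanf(line, "%" SCNu32 ":%" SCNu32, &r, &id) == 2 && r == rank) {
            *cdm_id = id;
            return 0;
        }
    }
    return -1;
}

/* Parse "alpsjob:<apid>:<job_rank>"; returns 0 on success, -1 if malformed. */
static int parse_service(const char *service, uint64_t *apid, uint32_t *job_rank)
{
    char tail; /* catches trailing characters after the rank */

    assert(service != NULL && apid != NULL && job_rank != NULL);
    if (sscanf(service, "alpsjob:%" SCNu64 ":%" SCNu32 "%c",
               apid, job_rank, &tail) != 2)
        return -1;
    return 0;
}
```

In this sketch the nameserver thread would call parse_service on the request, rewind the file, and call lookup_cdm_id with the parsed rank before replying.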

@ztiffany
@bturrubiates
@sungeunchoi

sungeunchoi · 2015-02-06T23:00:01Z

Couple questions on the name service stuff.

Can we rely on a shared file system being available?
Can this be made launcher independent? This will need to work with slurm also.

hppritcha · 2015-02-06T23:15:41Z

Good question. We could enhance ugni/kgni to have a scratchpad in the
kernel. I thought gni already used something like this for user space
synchronization.

I'm pretty sure that, owing to the need to hide native slurm from
craypich/craypmi, the entire alps lli and alps util interface was
carried forward in the slurm nativization effort.

I have an account on tiger (still need to log in, though). I thought tiger
runs native slurm.

Here is the deadlock scenario:

#0 0x00007fed3a439495 in pthread_spin_lock ()
#1 0x00007fed37ad7cfd in fastlock_acquire ()
#2 0x00007fed37ad80a4 in psmx2_lock ()
#3 0x00007fed37ad8361 in psmx2_am_trx_ctxt_handler_ext ()
#4 0x00007fed37b084e7 in psmx2_am_trx_ctxt_handler_0 ()
#5 0x00007fed373c08c5 in self_am_short_request ()
#6 0x00007fed3739bf83 in __psm2_am_request_short ()
#7 0x00007fed37ad84ee in psmx2_trx_ctxt_disconnect_peers ()

A lock has been held in psmx2_trx_ctxt_disconnect_peers before psm2_am_request_short is called. While making progress inside this function, the execution is redirected to the AM handler due to the arrival of an incoming disconnection request. The AM handler tries to acquire the same lock that has already been held and reaches a deadlock.

Fix by avoiding calling psm2_am_request_short while holding the lock.

Signed-off-by: Jianxin Xiong <[email protected]>
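
The lock ordering problem in this commit message can be modeled in a few lines. This is a single-threaded sketch, not the psmx2 code: every name below is a hypothetical stand-in, and a try-lock is used so that the model counts the would-be deadlock instead of hanging. Collecting the work under the lock and sending after releasing it is one way to apply the stated fix, assumed here for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are PSM2 or psmx2 functions. */
static atomic_flag peer_lock = ATOMIC_FLAG_INIT;
static int handled = 0;        /* requests the handler processed */
static int would_deadlock = 0; /* times the handler found the lock held */

static bool try_lock(void) { return !atomic_flag_test_and_set(&peer_lock); }
static void unlock(void)   { atomic_flag_clear(&peer_lock); }

/* Models the AM handler, which needs the same lock as its caller. */
static void am_handler(void)
{
    if (!try_lock()) {      /* a real spin lock would spin forever here */
        would_deadlock++;
        return;
    }
    handled++;
    unlock();
}

/* Models a send that makes progress and may run the handler inline. */
static void am_request_short(void) { am_handler(); }

/* Buggy ordering: the request is issued while the lock is held. */
static void disconnect_peers_buggy(void)
{
    bool ok = try_lock();
    assert(ok);             /* single-threaded model: lock is free on entry */
    (void)ok;
    am_request_short();
    unlock();
}

/* Fixed ordering: gather the work under the lock, release, then send. */
static void disconnect_peers_fixed(void)
{
    bool ok = try_lock();
    assert(ok);
    (void)ok;
    int pending = 1;        /* e.g. a snapshot of peers to notify */
    unlock();
    while (pending-- > 0)
        am_request_short();
}
```

Calling the buggy variant increments would_deadlock and leaves handled unchanged; calling the fixed variant lets the handler take the lock and increments handled.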

hppritcha self-assigned this Feb 2, 2015

sungeunchoi closed this as completed Jun 9, 2015

hppritcha mentioned this issue Jun 29, 2015

Provider variable changes break gnitest #258

Closed

bcernohous mentioned this issue Sep 3, 2015

After FI_WRITE the send cq is null unless FI_SEND was used on bind #329

Closed

tenbrugg mentioned this issue Jun 27, 2016

Running mini apps with 1k ranks and OpenMPI causes seg fault #883

Open

epaulson10 mentioned this issue Feb 10, 2017

Crash in GASNet testalign running large number of asynchronous RMA read requests. #1199

Closed

jshimek mentioned this issue Apr 25, 2017

prov/gni: Deadlock in __gnix_wait_nic_prog_thread_fn on gnix_nic_list_lock #1337

Closed

ofi-cray-test pushed a commit that referenced this issue Aug 30, 2017

OFI/MLX: fixed warnings reported by GCC 7.1 and updated documentation.

ee512f3

fixed HPCX <=v1.9.7 support (#7) Signed-off-by: Sannikov, Alexander <[email protected]> Signed-off-by: Dmitry Gladkov <[email protected]>
