Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial build support for GNI provider #2

Merged
merged 2 commits into from
Jan 29, 2015

Conversation

sungeunchoi
Copy link

@hppritcha
Changes to enable gni build.

First commit is for changes need to build (not in prov/gni). Second commit is the stubs.

hppritcha added a commit that referenced this pull request Jan 29, 2015
Initial build support for GNI provider
@hppritcha hppritcha merged commit 2b5bd03 into ofi-cray:master Jan 29, 2015
@sungeunchoi sungeunchoi deleted the initial-stubs branch February 3, 2015 01:12
ofi-cray-test pushed a commit that referenced this pull request Aug 5, 2017
=================================================================
==849267== ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff4caa7230 at pc 0x7ffdf8608687 bp 0x7fff4caa71b0 sp 0x7fff4caa71a0
READ of size 8 at 0x7fff4caa7230 thread T0
    #0 0x7ffdf8608686 in fi_tostr_ libfabric-current/src/fi_tostr.c:618
    #1 0x402f3a in run_test_set ofi/libfabric-current/fabtest/unit/size_left_test.c:262
    #2 0x403457 in main libfabric-current/fabtest/unit/size_left_test.c:317
    #3 0x7ffdf4819b14 in __libc_start_main (/usr/lib64/libc.so.6+0x21b14)
    #4 0x401988 in _start (libfabric-1.4.0/ofi_inst/bin/fi_size_left_test+0x401988)
Address 0x7fff4caa7230 is located at offset 32 in frame <run_test_set> of T0's stack:
  This frame has 2 object(s):
    [32, 36) 'ep_type'
    [96, 104) 'info'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow libfabric-current/src/fi_tostr.c:618 fi_tostr_
Shadow bytes around the buggy address:
  0x10006994cdf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10006994ce00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10006994ce10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10006994ce20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10006994ce30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10006994ce40: 00 00 f1 f1 f1 f1[04]f4 f4 f4 f2 f2 f2 f2 00 f4
  0x10006994ce50: f4 f4 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
  0x10006994ce60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10006994ce70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10006994ce80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10006994ce90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==849267== ABORTING

Signed-off-by: Sylvain Didelot <[email protected]>
ofi-cray-test pushed a commit that referenced this pull request Dec 13, 2017
Here is the deadlock scenario:

  #0  0x00007fed3a439495 in pthread_spin_lock ()
  #1  0x00007fed37ad7cfd in fastlock_acquire ()
  #2  0x00007fed37ad80a4 in psmx2_lock ()
  #3  0x00007fed37ad8361 in psmx2_am_trx_ctxt_handler_ext ()
  #4  0x00007fed37b084e7 in psmx2_am_trx_ctxt_handler_0 ()
  #5  0x00007fed373c08c5 in self_am_short_request ()
  #6  0x00007fed3739bf83 in __psm2_am_request_short ()
  #7  0x00007fed37ad84ee in psmx2_trx_ctxt_disconnect_peers ()

A lock has been held in psmx2_trx_ctxt_disconnect_peers before
psm2_am_request_short is called. While making progress inside
this function, the execution is redirected to the AM handler
due to the arrival of an incoming disconnection request. The AM
handler tries to acquire the same lock that has already been
held and reaches a deadlock.

Fix by avoiding calling psm2_am_request_short while holding the lock.

Signed-off-by: Jianxin Xiong <[email protected]>
ofi-cray-test pushed a commit that referenced this pull request Sep 6, 2018
Fix crash in uber test:

#0  0x00000000004036fb in fts_cur_info (series=<value optimized out>,
    info=0x60c920) at complex/ft_config.c:613
#1  0x0000000000402568 in ft_fw_client (argc=<value optimized out>,
    argv=<value optimized out>) at complex/ft_main.c:412
#2  main (argc=<value optimized out>, argv=<value optimized out>)
    at complex/ft_main.c:511

The opts strings may be overrun.

Set the default destination port, so that the test can run.

Signed-off-by: Sean Hefty <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants