Raise an exception if remote open fails #4069

Merged: 7 commits, Mar 7, 2024

eisenhauer
Member

No description provided.

eisenhauer and others added 4 commits March 3, 2024 15:58
Code extracted from:

    https://github.com/GTkorvo/EVPath.git

at commit c52dee24e25f25468459ce8760870e21a6a4bf63 (master).

Upstream Shortlog
-----------------
# By EVPath Upstream
* upstream-EVPath:
  EVPath 2024-03-03 (c52dee24)
@eisenhauer eisenhauer requested a review from pnorbert March 3, 2024 21:30
@eisenhauer
Member Author

This removes the "Errno = " printf (EVPath upstream) and adds an exception if we try to open a file on the remote server and fail for any reason. At the moment, there is no difference between failing to contact a remote server, the file not existing, not being readable, etc. All result in the same exception. We can fine-tune.
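The "one exception for every cause" behavior described above can be sketched as follows. This is a minimal illustration, not the ADIOS2 implementation: `OpenFailure`, `OpenRemoteOrThrow`, and `ThrowsOnOpen` are hypothetical names, and the failure cause is simulated rather than detected.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical failure causes. The PR as described does not distinguish them.
enum class OpenFailure { None, ServerUnreachable, FileMissing, FileUnreadable };

// Hypothetical stand-in for the remote open call. Every failure cause
// produces the same exception type and the same message.
void OpenRemoteOrThrow(const std::string &fileName, OpenFailure simulated)
{
    if (simulated != OpenFailure::None)
    {
        throw std::runtime_error("Failed to open remote file " + fileName);
    }
}

// Helper for callers that only need to know whether the open failed.
bool ThrowsOnOpen(const std::string &fileName, OpenFailure simulated)
{
    try
    {
        OpenRemoteOrThrow(fileName, simulated);
    }
    catch (const std::runtime_error &)
    {
        return true;
    }
    return false;
}
```

Fine-tuning, as mentioned above, would presumably mean mapping each cause to a distinct message or exception type; the sketch deliberately does not.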

@pnorbert
Contributor

pnorbert commented Mar 3, 2024

I am not sure this is what I wanted. bpls -l will fail to list the content without a valid remote connection.

@pnorbert
Contributor

pnorbert commented Mar 3, 2024

Maybe throw the error in PerformGetRemote?

@eisenhauer
Member Author

Possible of course, but requires a few more hoops to jump through. Currently we don't get to PerformRemoteGets() unless we have a remote connection. m_remote.Init() is the only place where we know if we should try to get a connection. So, rather than using the m_remote to keep state, we'd have to track more info, like if we were supposed to have a remote connection, but don't. Or I suppose we could not do the remote open when we do it, but instead do it on-demand or something... Thoughts?

@pnorbert
Contributor

pnorbert commented Mar 3, 2024

I favor an on-demand connection. Just opening a file and using its metadata does not require a remote connection, and we probably would not want one with lots of files/campaign files. Imagine a single campaign file with files from multiple locations. A simple bpls would initiate multiple connections even if the user has no intention of getting data from all of the files.
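To make the campaign argument concrete, here is a rough count of connections under the two policies. It is only a model: `Campaign`, `EagerConnections`, and `OnDemandConnections` are hypothetical names, and a campaign is reduced to a map from data set name to the host that serves it.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical model of a campaign: data set name -> host that serves it.
using Campaign = std::map<std::string, std::string>;

// Eager policy: opening the campaign contacts every distinct host listed in it.
std::size_t EagerConnections(const Campaign &campaign)
{
    std::set<std::string> hosts;
    for (const auto &entry : campaign)
    {
        hosts.insert(entry.second);
    }
    return hosts.size();
}

// On-demand policy: only hosts of data sets that are actually read are contacted.
std::size_t OnDemandConnections(const Campaign &campaign,
                                const std::set<std::string> &dataSetsRead)
{
    std::set<std::string> hosts;
    for (const auto &name : dataSetsRead)
    {
        const auto it = campaign.find(name);
        if (it != campaign.end())
        {
            hosts.insert(it->second);
        }
    }
    return hosts.size();
}
```

With three data sets on three hosts, a metadata-only listing costs three connections under the eager policy and none under the on-demand one.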

@eisenhauer
Member Author

OK, let me give that a shot tomorrow...

@eisenhauer
Member Author

OK, this PR pushes the Open() back to PerformGets() and it checks to see that PerformGets() has something to do before it'll bother with the open. It works fine with the existing tests. BUT, those tests were really placeholders for a more detailed test suite. We should probably beef that up at some point...
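The deferred open described in this comment might look roughly like the sketch below. It is not the BP5Reader code: `LazyRemote`, `QueueGet`, and the injected `connect` callback are hypothetical, chosen so the example runs without a server. The point is that the open is attempted only when `PerformGets()` has pending work, and a failed open raises an exception at that moment.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a remote connection; not the ADIOS2 Remote class.
class LazyRemote
{
public:
    // `connect` returns true on success. It is injected so the sketch can
    // simulate both a reachable and an unreachable server.
    LazyRemote(std::string fileName, std::function<bool(const std::string &)> connect)
    : m_FileName(std::move(fileName)), m_Connect(std::move(connect))
    {
    }

    // Record a deferred read request; no connection is made here.
    void QueueGet(const std::string &varName) { m_Pending.push_back(varName); }

    // Serve pending requests, opening the remote file first if needed.
    // Returns the number of requests served.
    std::size_t PerformGets()
    {
        if (m_Pending.empty())
        {
            return 0; // nothing to do, so never bother with the open
        }
        if (!m_Open)
        {
            ++m_OpenAttempts;
            if (!m_Connect(m_FileName))
            {
                throw std::runtime_error("Failed to open remote file " + m_FileName);
            }
            m_Open = true;
        }
        const std::size_t served = m_Pending.size();
        m_Pending.clear(); // a real reader would fetch the data here
        return served;
    }

    int OpenAttempts() const { return m_OpenAttempts; }

private:
    std::string m_FileName;
    std::function<bool(const std::string &)> m_Connect;
    std::vector<std::string> m_Pending;
    bool m_Open = false;
    int m_OpenAttempts = 0;
};
```

A metadata-only client that never queues a Get never triggers the open, which is the behavior requested earlier in the thread for bpls.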

Contributor

@anagainaru anagainaru left a comment

Just out of curiosity, why is the open taking a layout?

We don't care right now, but we might want GPU-aware for remote reading in the future so I just want to understand. I'm going to let Norbert review.

@eisenhauer
Member Author

> Just out of curiosity, why is the open taking a layout?
>
> We don't care right now, but we might want GPU-aware for remote reading in the future so I just want to understand. I'm going to let Norbert review.

I assume you're talking about the row-vs-column major stuff. This comes into ADIOS from the bindings and represents the programmatic context of the client. I believe that the server needs to know that too, because the system's behavior depends upon it. So on the server we don't always open as if we are in C++ (despite the server being a C++ client). Instead we open a file as if we were the remote client...

@anagainaru
Contributor

That makes sense. The row/column-major ordering should stay at the program level, and the buffer layout is dealt with at every individual Get, not in the open.
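One way to picture why the open carries the ordering is the sketch below. All names (`ArrayOrdering`, `RemoteOpenRequest`, `ShapeForClient`) are hypothetical, and the thread does not say how the server actually uses the value. The sketch only assumes the common convention that a column-major program indexes the same data with its dimensions in reverse order.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

enum class ArrayOrdering { RowMajor, ColumnMajor };

// Hypothetical open request: it carries the ordering of the client program,
// so the server can open the file as if it were that client.
struct RemoteOpenRequest
{
    std::string fileName;
    ArrayOrdering clientOrdering;
};

// Hypothetical server-side helper: report a shape held in row-major terms
// the way the client's language would index it (assumed: reversed dimensions
// for a column-major client).
std::vector<std::size_t> ShapeForClient(std::vector<std::size_t> rowMajorShape,
                                        const RemoteOpenRequest &request)
{
    if (request.clientOrdering == ArrayOrdering::ColumnMajor)
    {
        std::reverse(rowMajorShape.begin(), rowMajorShape.end());
    }
    return rowMajorShape;
}
```

The ordering is fixed once per open because it is a property of the client program, whereas a per-Get buffer layout (for example a GPU memory space) would be passed with each individual request.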

@eisenhauer eisenhauer merged commit 269f1f7 into ornladios:master Mar 7, 2024
39 checks passed
@eisenhauer eisenhauer deleted the ErrorOnRemote branch March 7, 2024 20:11
vicentebolea pushed a commit to vicentebolea/ADIOS2 that referenced this pull request Apr 3, 2024
Raise an exception if remote open fails
dmitry-ganyushin added a commit to dmitry-ganyushin/ADIOS2 that referenced this pull request Apr 16, 2024
* upstream/master:
  Fix links to tutorial materials (ornladios#4086)
  BlockIndex.Evaluate() made branch based on BP5 vs BP4. To support CampaignReader engine, decision is made based on whether MinBlocksInfo is supported by engine.
  Update documentation for 2.10 changes to the GPU-backend (ornladios#4083)
  Add test for single string attribute vs string array attribute with a single element
  Raise an exception if remote open fails (ornladios#4069)
  - Python: fix for scalar reading. If a global value has 1 step (i.e. always in streaming), read returns a 0-dim numpy array (single value). If the variable has multiple steps (only in ReadRandomAccess mode), read returns a 1-dim numpy array even if the step selection is a single step. This way, read of a certain variable always results in the same type of array no matter the number of steps selected. - Python: fix for string attributes: return a string, not a list of one element which is a string, to be consistent with string global values and with other APIs.
  format more
  format
  Python: add the same treatment to attributes as to variables before: return scalars (0-dim ndarray) for single value attributes.

# Conflicts:
#	source/adios2/engine/bp5/BP5Reader.cpp
#	source/adios2/engine/bp5/BP5Reader.h
dmitry-ganyushin added a commit to dmitry-ganyushin/ADIOS2 that referenced this pull request Apr 16, 2024
* origin/adios-xrootd: (164 commits)
  Fixes for FreeBSD, including upstream (ornladios#4138)
  Bump version to v2.10.0
  Setting the derived variable support OFF by default
  Add the CURL function to derived variables (ornladios#4114)
  Add -f file1 [file2...] option to process adios files from a list instead of a campaign recording
  DataPlane Configuration changes (ornladios#4121)
  update doc
  Add attribute support to campaign
  Add a random string to each database name to avoid collision when running multiple applications in the same directory at the same time. Fixes issues with CI that runs ctest in parallel
  Initialize ADIOS::m_UserOption before tentaively calling ProcessUserConfig()
  - set ACA version to  0.1 - remove debug prints - add doc on Campaign Management in Advanced section - change static struct of UserOptions to class member of ADIOS class to make it work with gcc 4.8
  used size_t not int for map indexing to avoid type conversion
  fix remote server test, use the new binary name
  clang-format
  Use the name of the campaign in the cache path to avoid name collision
  rename remote_server to adios2_remote_server
  bug fix: the order of entries in bpdataset table is undefined but the campaign data reader relied on calculating the index as if it was sorted by the insertion order. Use a map instead to store the rowid and use that as index for the bpfile elements.
  Use yaml parser in campaign manager python script
  change a long variable to int64_t to avoid size confusion on windows
  do not include unistd.h
  Fix compiler error Remove extra file not needed
  Add support for user options in ~/.config/adios2/adios2.yaml Currently supported options:
  cmake: add sqlite3 and zlib dep in adios2 cmake pkg
  Different names for MPI and Serial tests (ornladios#4118)
  EVpath upstream to make NO_RDMA more robust (ornladios#4116)
  Warnings (ornladios#4113)
  Don't use assert() in tests (ornladios#4108)
  - Only add campaign store to file name if that is not absolute path - list command supports second argument as path
  flake8 fixes
  Update campaign manager script to handle config file, time in nanosecond format, and avoiding conflict when updating database
  dill 2024-03-12 (ebc98c4d) (ornladios#4091)
  Don't run derived test in MPI mode, it's not written for that (ornladios#4104)
  Fix static blosc2 build (ornladios#4093)
  ci: add ccache job summary (ornladios#4101)
  Fix typo in fortran.rst (ornladios#4102)
  WIP: Make Fortran tests fail with a non-zero exit code (ornladios#4097)
  Bison 3.8 Parser (ornladios#4062)
  Do not create adios-campaign/ unless there is something to record
  Add setup for Aurora (load adios2 as e4s package)
  Completely hide derived variables in C API if not enabled. Print warning inside Fortran F2C function.
  adios2_define_derived_variable C/Fortran API. C is compiled conditionally, like the C++ API. The Fortran function is always available, it will print a WARNING if derived variable is not supported. Added Fortran test for magnitude().
  Fix links to tutorial materials (ornladios#4086)
  BlockIndex.Evaluate() made branch based on BP5 vs BP4. To support CampaignReader engine, decision is made based on whether MinBlocksInfo is supported by engine.
  Update documentation for 2.10 changes to the GPU-backend (ornladios#4083)
  Add test for single string attribute vs string array attribute with a single element
  - Python: fix for scalar reading. If a global value has 1 step (i.e. always in streaming), read returns a 0-dim numpy array (single value). If the variable has multiple steps (only in ReadRandomAccess mode), read returns a 1-dim numpy array even if the step selection is a single step. This way, read of a certain variable always results in the same type of array no matter the number of steps selected. - Python: fix for string attributes: return a string, not a list of one element which is a string, to be consistent with string global values and with other APIs.
  format more
  format
  Python: add the same treatment to attributes as to variables before: return scalars (0-dim ndarray) for single value attributes.
  Raise an exception if remote open fails (ornladios#4069)
  Fortran bindings for memory space related functions (ornladios#4077)
  consolidate (ornladios#4078)
  Making the Detect memory space available regardless of the backend used
  Testing code for the C bindings with memory space API
  Adding c bindings for getting the shape of a variable based on a memory space
  Adding c bindings for setting and getting the memory space
  Small typo fixes
  - Restructure python API doc in separate main topic, add working example to it. - What's new for 2.10 - Usage on DOE machines
  Fix Reord to use MinBlocksInfo where appropriate (ornladios#4071)
  Using the correct flag to detect the CUDA backend in ZFP
  clang-format fix
  fixed warning
  added support to read back from H5T_STRING VARIABLES it turns out strings written out through h5py are all variable strings
  format
  fixes to still be able to build with gcc 4.8.2. Needed for OLCF DTN nodes
  Add minmax and shape functions to CampaignReader, so that per-block info is complete when listing campaign archives
  do not flush io and adios in read mode. BP5 reader does not like it.
  Campaign engine is recognized by file extension.
  WIP. Changed the name of the campaign config file.
  Added reading of configuration parametyers from ./config/adios2
  Use GetEstimatedSize in encryption operator plugin
  Allow plugin operators to take advantage of the estimated size API
  Setting the derived variable support OFF by default
  Add the CURL function to derived variables (ornladios#4114)
  Add -f file1 [file2...] option to process adios files from a list instead of a campaign recording
  DataPlane Configuration changes (ornladios#4121)
  update doc
  Add attribute support to campaign
  Add a random string to each database name to avoid collision when running multiple applications in the same directory at the same time. Fixes issues with CI that runs ctest in parallel
  Initialize ADIOS::m_UserOption before tentaively calling ProcessUserConfig()
  - set ACA version to  0.1 - remove debug prints - add doc on Campaign Management in Advanced section - change static struct of UserOptions to class member of ADIOS class to make it work with gcc 4.8
  used size_t not int for map indexing to avoid type conversion
  fix remote server test, use the new binary name
  clang-format
  Use the name of the campaign in the cache path to avoid name collision
  rename remote_server to adios2_remote_server
  bug fix: the order of entries in bpdataset table is undefined but the campaign data reader relied on calculating the index as if it was sorted by the insertion order. Use a map instead to store the rowid and use that as index for the bpfile elements.
  Use yaml parser in campaign manager python script
  change a long variable to int64_t to avoid size confusion on windows
  do not include unistd.h
  Fix compiler error Remove extra file not needed
  Add support for user options in ~/.config/adios2/adios2.yaml Currently supported options:
  cmake: add sqlite3 and zlib dep in adios2 cmake pkg
  Different names for MPI and Serial tests (ornladios#4118)
  EVpath upstream to make NO_RDMA more robust (ornladios#4116)
  Warnings (ornladios#4113)
  Don't use assert() in tests (ornladios#4108)
  - Only add campaign store to file name if that is not absolute path - list command supports second argument as path
  flake8 fixes
  Update campaign manager script to handle config file, time in nanosecond format, and avoiding conflict when updating database
  ...
3 participants