Implementation for mpio driver with selection I/O. #3058

Closed
vchoi-hdfgroup wants to merge 46 commits into HDFGroup:develop from vchoi-hdfgroup:mpio_develop
Changes from 10 commits
Commits
3168cae
Implementation for mpio driver with selection I/O.
vchoi-hdfgroup Jun 6, 2023
d298b10
Committing clang-format changes
github-actions[bot] Jun 6, 2023
02adecf
Correct CI errors.
vchoi-hdfgroup Jun 6, 2023
6180c41
Merge branch 'mpio_develop' of https://github.com/vchoi-hdfgroup/hdf5…
vchoi-hdfgroup Jun 6, 2023
96e11a5
Committing clang-format changes
github-actions[bot] Jun 6, 2023
e655a8f
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 7, 2023
045b313
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 8, 2023
a64ca62
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 10, 2023
d383ad3
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 13, 2023
ae4cf79
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 15, 2023
28d3774
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 18, 2023
7359426
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 20, 2023
7c3c746
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 23, 2023
9c1f825
Fix use_vector on read/write_vector() != NULL and skip_vector_cb.
vchoi-hdfgroup Jun 26, 2023
046a9eb
Committing clang-format changes
github-actions[bot] Jun 26, 2023
ec50a83
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 26, 2023
11e9fe6
1) Restore link_piece_collective_io() back to its original state in H…
vchoi-hdfgroup Jun 27, 2023
40e572a
Merge branch 'mpio_develop' of https://github.com/vchoi-hdfgroup/hdf5…
vchoi-hdfgroup Jun 27, 2023
5fb29a5
Committing clang-format changes
github-actions[bot] Jun 27, 2023
5bbdadd
Correct errors made for previous checkin.
vchoi-hdfgroup Jun 27, 2023
bbf78af
Merge branch 'mpio_develop' of https://github.com/vchoi-hdfgroup/hdf5…
vchoi-hdfgroup Jun 27, 2023
fbf7f47
Committing clang-format changes
github-actions[bot] Jun 27, 2023
5bb536f
Correct CI errors from running autogen.sh.
vchoi-hdfgroup Jun 27, 2023
6a37764
Merge branch 'mpio_develop' of https://github.com/vchoi-hdfgroup/hdf5…
vchoi-hdfgroup Jun 27, 2023
564fe53
Committing clang-format changes
github-actions[bot] Jun 27, 2023
4c0fe6c
This is the fix for a failure from one of the tests in testpar/t_2Gio…
vchoi-hdfgroup Jun 27, 2023
d62be34
Merge branch 'mpio_develop' of https://github.com/vchoi-hdfgroup/hdf5…
vchoi-hdfgroup Jun 27, 2023
61df96f
Committing clang-format changes
github-actions[bot] Jun 27, 2023
081ebf6
This is the fix for the test failure from running testpar/t_pread.c
vchoi-hdfgroup Jun 27, 2023
a90f382
Committing clang-format changes
github-actions[bot] Jun 27, 2023
3a12e6c
This is the fix for a failure from dataset_writeAll() test in testpar…
vchoi-hdfgroup Jun 27, 2023
35f9fed
Merge branch 'HDFGroup:develop' into mpio_develop
vchoi-hdfgroup Jun 27, 2023
7052ecd
Fix CI errors.
vchoi-hdfgroup Jun 27, 2023
1c4d042
Merge branch 'mpio_develop' of https://github.com/vchoi-hdfgroup/hdf5…
vchoi-hdfgroup Jun 27, 2023
d03c4b1
Committing clang-format changes
github-actions[bot] Jun 27, 2023
e962b8e
Corrections to the previous fix and commit message:
vchoi-hdfgroup Jun 28, 2023
3d3b88f
Merge branch 'mpio_develop' of https://github.com/vchoi-hdfgroup/hdf5…
vchoi-hdfgroup Jun 28, 2023
9e47124
Committing clang-format changes
github-actions[bot] Jun 28, 2023
5ca9b55
Merge branch 'develop' into mpio_develop
vchoi-hdfgroup Jun 28, 2023
d34769c
Committing clang-format changes
github-actions[bot] Jun 28, 2023
48d3304
Change HDassert, HDqsort, HDfprintf to assert, qsort, and fprintf acc…
vchoi-hdfgroup Jun 28, 2023
853fed6
Committing clang-format changes
github-actions[bot] Jun 28, 2023
93cd897
Add note about HDF5_VOL_CONNECTOR to tools usage (#3159)
mattjala Jun 28, 2023
ebd95ae
Rename HDsystem() to system() (#3197)
derobins Jun 28, 2023
0e306c2
Merge branch 'develop' into mpio_develop
vchoi-hdfgroup Jun 30, 2023
1f8903d
Change HDmalloc and HDfree to malloc and free.
vchoi-hdfgroup Jun 30, 2023
242 changes: 68 additions & 174 deletions src/H5Dmpio.c
@@ -1447,26 +1447,27 @@ H5D__link_piece_collective_io(H5D_io_info_t *io_info, int mpi_rank)
H5D__link_piece_collective_io(H5D_io_info_t *io_info, int H5_ATTR_UNUSED mpi_rank)
#endif
{
MPI_Datatype chunk_final_mtype; /* Final memory MPI datatype for all chunks with selection */
hbool_t chunk_final_mtype_is_derived = FALSE;
MPI_Datatype chunk_final_ftype; /* Final file MPI datatype for all chunks with selection */
hbool_t chunk_final_ftype_is_derived = FALSE;
MPI_Datatype chunk_final_mtype; /* Final memory MPI datatype for all chunks with selection */
hbool_t chunk_final_mtype_is_derived = FALSE;
MPI_Datatype chunk_final_ftype; /* Final file MPI datatype for all chunks with selection */
hbool_t chunk_final_ftype_is_derived = FALSE;

H5D_storage_t ctg_store; /* Storage info for "fake" contiguous dataset */
MPI_Datatype *chunk_mtype = NULL;
MPI_Datatype *chunk_ftype = NULL;
MPI_Aint *chunk_file_disp_array = NULL;
MPI_Aint *chunk_mem_disp_array = NULL;
hbool_t *chunk_mft_is_derived_array =
NULL; /* Flags to indicate each chunk's MPI file datatype is derived */
hbool_t *chunk_mbt_is_derived_array =
NULL; /* Flags to indicate each chunk's MPI memory datatype is derived */
int *chunk_mpi_file_counts = NULL; /* Count of MPI file datatype for each chunk */
int *chunk_mpi_mem_counts = NULL; /* Count of MPI memory datatype for each chunk */
int mpi_code; /* MPI return code */

int mpi_code; /* MPI return code */
H5D_mpio_actual_chunk_opt_mode_t actual_chunk_opt_mode = H5D_MPIO_LINK_CHUNK;
H5D_mpio_actual_io_mode_t actual_io_mode = 0;
size_t i; /* Local index variable */
herr_t ret_value = SUCCEED;

H5S_t **file_spaces = NULL;
H5S_t **mem_spaces = NULL;
haddr_t *addrs = NULL;
H5_flexible_const_ptr_t *bufs = NULL;
size_t *src_type_sizes = NULL;
size_t *dst_type_sizes = NULL;
hbool_t io_op_write;

herr_t ret_value = SUCCEED;

FUNC_ENTER_PACKAGE

@@ -1492,6 +1493,7 @@ H5D__link_piece_collective_io(H5D_io_info_t *io_info, int H5_ATTR_UNUSED mpi_ran
{
hsize_t mpi_buf_count; /* Number of MPI types */
size_t num_chunk; /* Number of chunks for this process */
int size_i;

H5D_piece_info_t *piece_info;

@@ -1527,164 +1529,74 @@ H5D__link_piece_collective_io(H5D_io_info_t *io_info, int H5_ATTR_UNUSED mpi_ran
HDqsort(io_info->sel_pieces, io_info->pieces_added, sizeof(io_info->sel_pieces[0]),
H5D__cmp_piece_addr);

/* Allocate chunking information */
if (NULL == (chunk_mtype = (MPI_Datatype *)H5MM_malloc(num_chunk * sizeof(MPI_Datatype))))
HGOTO_ERROR(H5E_DATASET, H5E_CANTALLOC, FAIL,
"couldn't allocate chunk memory datatype buffer")
if (NULL == (chunk_ftype = (MPI_Datatype *)H5MM_malloc(num_chunk * sizeof(MPI_Datatype))))
HGOTO_ERROR(H5E_DATASET, H5E_CANTALLOC, FAIL, "couldn't allocate chunk file datatype buffer")
if (NULL == (chunk_file_disp_array = (MPI_Aint *)H5MM_malloc(num_chunk * sizeof(MPI_Aint))))
HGOTO_ERROR(H5E_DATASET, H5E_CANTALLOC, FAIL,
"couldn't allocate chunk file displacement buffer")
if (NULL == (chunk_mem_disp_array = (MPI_Aint *)H5MM_calloc(num_chunk * sizeof(MPI_Aint))))
HGOTO_ERROR(H5E_DATASET, H5E_CANTALLOC, FAIL,
"couldn't allocate chunk memory displacement buffer")
if (NULL == (chunk_mpi_mem_counts = (int *)H5MM_calloc(num_chunk * sizeof(int))))
HGOTO_ERROR(H5E_DATASET, H5E_CANTALLOC, FAIL, "couldn't allocate chunk memory counts buffer")
if (NULL == (chunk_mpi_file_counts = (int *)H5MM_calloc(num_chunk * sizeof(int))))
HGOTO_ERROR(H5E_DATASET, H5E_CANTALLOC, FAIL, "couldn't allocate chunk file counts buffer")
if (NULL == (chunk_mbt_is_derived_array = (hbool_t *)H5MM_calloc(num_chunk * sizeof(hbool_t))))
HGOTO_ERROR(H5E_DATASET, H5E_CANTALLOC, FAIL,
"couldn't allocate chunk memory is derived datatype flags buffer")
if (NULL == (chunk_mft_is_derived_array = (hbool_t *)H5MM_calloc(num_chunk * sizeof(hbool_t))))
HGOTO_ERROR(H5E_DATASET, H5E_CANTALLOC, FAIL,
"couldn't allocate chunk file is derived datatype flags buffer")

/* save lowest file address */
ctg_store.contig.dset_addr = io_info->sel_pieces[0]->faddr;

#ifdef OUT
/* save base mem addr of piece for read/write */
base_buf_addr = io_info->sel_pieces[0]->dset_info->buf;

#ifdef H5Dmpio_DEBUG
H5D_MPIO_DEBUG(mpi_rank, "before iterate over selected pieces\n");
#endif

/* Obtain MPI derived datatype from all individual pieces */
/* Iterate over selected pieces for this process */
for (i = 0; i < num_chunk; i++) {
hsize_t *permute_map = NULL; /* array that holds the mapping from the old,
out-of-order displacements to the in-order
displacements of the MPI datatypes of the
point selection of the file space */
hbool_t is_permuted = FALSE;

/* Assign convenience pointer to piece info */
piece_info = io_info->sel_pieces[i];

/* Obtain disk and memory MPI derived datatype */
/* NOTE: The permute_map array can be allocated within H5S_mpio_space_type
* and will be fed into the next call to H5S_mpio_space_type
* where it will be freed.
*/
if (H5S_mpio_space_type(piece_info->fspace, piece_info->dset_info->type_info.src_type_size,
&chunk_ftype[i], /* OUT: datatype created */
&chunk_mpi_file_counts[i], /* OUT */
&(chunk_mft_is_derived_array[i]), /* OUT */
TRUE, /* this is a file space,
so permute the
datatype if the point
selections are out of
order */
&permute_map, /* OUT: a map to indicate the
permutation of points
selected in case they
are out of order */
&is_permuted /* OUT */) < 0)
HGOTO_ERROR(H5E_DATASPACE, H5E_BADTYPE, FAIL, "couldn't create MPI file type")

/* Sanity check */
if (is_permuted)
HDassert(permute_map);
if (H5S_mpio_space_type(piece_info->mspace, piece_info->dset_info->type_info.dst_type_size,
&chunk_mtype[i], &chunk_mpi_mem_counts[i],
&(chunk_mbt_is_derived_array[i]), FALSE, /* this is a memory
space, so if the file
space is not
permuted, there is no
need to permute the
datatype if the point
selections are out of
order*/
&permute_map, /* IN: the permutation map
generated by the
file_space selection
and applied to the
memory selection */
&is_permuted /* IN */) < 0)
HGOTO_ERROR(H5E_DATASPACE, H5E_BADTYPE, FAIL, "couldn't create MPI buf type")
/* Sanity check */
if (is_permuted)
HDassert(!permute_map);

/* Piece address relative to the first piece addr
* Assign piece address to MPI displacement
* (assume MPI_Aint big enough to hold it) */
chunk_file_disp_array[i] = (MPI_Aint)piece_info->faddr - (MPI_Aint)ctg_store.contig.dset_addr;

if (io_info->op_type == H5D_IO_OP_WRITE) {
chunk_mem_disp_array[i] =
(MPI_Aint)piece_info->dset_info->buf.cvp - (MPI_Aint)base_buf_addr.cvp;
}
else if (io_info->op_type == H5D_IO_OP_READ) {
chunk_mem_disp_array[i] =
(MPI_Aint)piece_info->dset_info->buf.vp - (MPI_Aint)base_buf_addr.vp;
}
} /* end for */
if (NULL == (file_spaces = H5MM_malloc(num_chunk * sizeof(H5S_t *))))
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTALLOC, FAIL, "memory allocation failed for file space list")
if (NULL == (mem_spaces = H5MM_malloc(num_chunk * sizeof(H5S_t *))))
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTALLOC, FAIL,
"memory allocation failed for memory space list")

/* Create final MPI derived datatype for the file */
if (MPI_SUCCESS !=
(mpi_code = MPI_Type_create_struct((int)num_chunk, chunk_mpi_file_counts,
chunk_file_disp_array, chunk_ftype, &chunk_final_ftype)))
HMPI_GOTO_ERROR(FAIL, "MPI_Type_create_struct failed", mpi_code)
if (NULL == (addrs = H5MM_malloc(num_chunk * sizeof(haddr_t))))
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTALLOC, FAIL,
"memory allocation failed for piece address list")

if (MPI_SUCCESS != (mpi_code = MPI_Type_commit(&chunk_final_ftype)))
HMPI_GOTO_ERROR(FAIL, "MPI_Type_commit failed", mpi_code)
chunk_final_ftype_is_derived = TRUE;
if (NULL == (bufs = H5MM_malloc(num_chunk * sizeof(H5_flexible_const_ptr_t))))
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTALLOC, FAIL,
"memory allocation failed for bufs buffer list")

/* Create final MPI derived datatype for memory */
if (MPI_SUCCESS !=
(mpi_code = MPI_Type_create_struct((int)num_chunk, chunk_mpi_mem_counts, chunk_mem_disp_array,
chunk_mtype, &chunk_final_mtype)))
HMPI_GOTO_ERROR(FAIL, "MPI_Type_create_struct failed", mpi_code)
if (MPI_SUCCESS != (mpi_code = MPI_Type_commit(&chunk_final_mtype)))
HMPI_GOTO_ERROR(FAIL, "MPI_Type_commit failed", mpi_code)
chunk_final_mtype_is_derived = TRUE;

/* Free the file & memory MPI datatypes for each chunk */
for (i = 0; i < num_chunk; i++) {
if (chunk_mbt_is_derived_array[i])
if (MPI_SUCCESS != (mpi_code = MPI_Type_free(chunk_mtype + i)))
HMPI_DONE_ERROR(FAIL, "MPI_Type_free failed", mpi_code)

if (chunk_mft_is_derived_array[i])
if (MPI_SUCCESS != (mpi_code = MPI_Type_free(chunk_ftype + i)))
HMPI_DONE_ERROR(FAIL, "MPI_Type_free failed", mpi_code)
} /* end for */
if (NULL == (src_type_sizes = H5MM_malloc(num_chunk * sizeof(size_t))))
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTALLOC, FAIL,
"memory allocation failed for src type size list")
if (NULL == (dst_type_sizes = H5MM_malloc(num_chunk * sizeof(size_t))))
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTALLOC, FAIL,
"memory allocation failed for dst type size list")
}

/* We have a single, complicated MPI datatype for both memory & file */
mpi_buf_count = (hsize_t)1;
} /* end if */
else { /* no selection at all for this process */
ctg_store.contig.dset_addr = 0;
#ifdef H5Dmpio_DEBUG
H5D_MPIO_DEBUG(mpi_rank, "before iterate over selected pieces\n");
#endif
for (i = 0; i < num_chunk; i++) {
piece_info = io_info->sel_pieces[i];

file_spaces[i] = piece_info->fspace;
mem_spaces[i] = piece_info->mspace;
addrs[i] = piece_info->faddr;
src_type_sizes[i] = piece_info->dset_info->type_info.src_type_size;
dst_type_sizes[i] = piece_info->dset_info->type_info.dst_type_size;
if (io_info->op_type == H5D_IO_OP_WRITE)
bufs[i].cvp = piece_info->dset_info->buf.cvp;
else if (io_info->op_type == H5D_IO_OP_READ)
bufs[i].vp = piece_info->dset_info->buf.vp;
}

/* just provide a valid mem address. no actual IO occur */
base_buf_addr = io_info->dsets_info[0].buf;
io_op_write = (io_info->op_type == H5D_IO_OP_WRITE) ? TRUE : FALSE;

/* Set the MPI datatype */
chunk_final_ftype = MPI_BYTE;
chunk_final_mtype = MPI_BYTE;
if (H5FD_selection_build_types(io_op_write, (uint32_t)num_chunk, file_spaces, mem_spaces, addrs, bufs,
src_type_sizes, dst_type_sizes, &chunk_final_ftype,
&chunk_final_ftype_is_derived, &chunk_final_mtype,
&chunk_final_mtype_is_derived, &size_i, &base_buf_addr) < 0)
HGOTO_ERROR(H5E_DATASET, H5E_CANTGET, FAIL, "couldn't build type for MPI-IO")

/* No chunks selected for this process */
mpi_buf_count = (hsize_t)0;
} /* end else */
/* We have a single, complicated MPI datatype for both memory & file */
mpi_buf_count = (hsize_t)size_i;

#ifdef H5Dmpio_DEBUG
H5D_MPIO_DEBUG(mpi_rank, "before coming to final collective I/O");
#endif
/* Set up the base storage address for this piece */
io_info->store_faddr = ctg_store.contig.dset_addr;
io_info->base_maddr = base_buf_addr;
/* For num_chunk == 0 (no selection at all for this process),
 * just provide a valid memory address; no actual I/O occurs.
 */

io_info->store_faddr = (num_chunk == 0) ? 0 : ctg_store.contig.dset_addr;
io_info->base_maddr = (num_chunk == 0) ? io_info->dsets_info[0].buf : base_buf_addr;

/* Perform final collective I/O operation */
if (H5D__final_collective_io(io_info, mpi_buf_count, chunk_final_ftype, chunk_final_mtype) < 0)
@@ -1697,24 +1609,6 @@ H5D__link_piece_collective_io(H5D_io_info_t *io_info, int H5_ATTR_UNUSED mpi_ran
ret_value);
#endif

/* Release resources */
if (chunk_mtype)
H5MM_xfree(chunk_mtype);
if (chunk_ftype)
H5MM_xfree(chunk_ftype);
if (chunk_file_disp_array)
H5MM_xfree(chunk_file_disp_array);
if (chunk_mem_disp_array)
H5MM_xfree(chunk_mem_disp_array);
if (chunk_mpi_mem_counts)
H5MM_xfree(chunk_mpi_mem_counts);
if (chunk_mpi_file_counts)
H5MM_xfree(chunk_mpi_file_counts);
if (chunk_mbt_is_derived_array)
H5MM_xfree(chunk_mbt_is_derived_array);
if (chunk_mft_is_derived_array)
H5MM_xfree(chunk_mft_is_derived_array);

/* Free the MPI buf and file types, if they were derived */
if (chunk_final_mtype_is_derived && MPI_SUCCESS != (mpi_code = MPI_Type_free(&chunk_final_mtype)))
HMPI_DONE_ERROR(FAIL, "MPI_Type_free failed", mpi_code)
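
The core of this change: the removed code above built one MPI derived datatype per selected piece and then combined them into a single file/memory datatype pair with MPI_Type_create_struct, using each piece's file address relative to the lowest piece address as its displacement; the new code delegates all of that to a single H5FD_selection_build_types() call. For readers unfamiliar with the retired pattern, a standalone sketch using hypothetical piece lengths and displacements rather than real HDF5 pieces:

/* Sketch of the per-piece datatype pattern the old code used
 * (hypothetical displacements/lengths, not HDF5 internals). */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Two "pieces": per-piece types, counts, and displacements
     * relative to the lowest file address, as in the removed code. */
    MPI_Datatype piece_types[2];
    int          counts[2] = {1, 1};
    MPI_Aint     disps[2]  = {0, 4096}; /* piece_faddr - lowest_faddr */

    MPI_Type_contiguous(512, MPI_BYTE, &piece_types[0]);
    MPI_Type_contiguous(256, MPI_BYTE, &piece_types[1]);

    /* Combine into one "final" derived datatype and commit it. */
    MPI_Datatype final_type;
    MPI_Type_create_struct(2, counts, disps, piece_types, &final_type);
    MPI_Type_commit(&final_type);

    /* Per-piece types can be freed once the struct type is built. */
    MPI_Type_free(&piece_types[0]);
    MPI_Type_free(&piece_types[1]);

    /* ... use final_type as an MPI file view / buffer type ... */

    MPI_Type_free(&final_type);
    MPI_Finalize();
    return 0;
}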
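
One subtlety both the old and new versions preserve: collective I/O requires every rank to make the call, so a rank with no selection (num_chunk == 0) still reaches H5D__final_collective_io(), just with a zero buffer count and a valid placeholder address. A minimal sketch of that pattern with plain MPI-IO (hypothetical file name and payload, not HDF5 code):

/* Sketch: ranks with nothing selected still join the collective write,
 * passing a count of zero (illustrative example only). */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    char buf[8] = "payload";
    /* Even ranks write 8 bytes at their offset; odd ranks have no
     * selection and pass count 0, but must still make the call. */
    int count = (rank % 2 == 0) ? 8 : 0;
    MPI_File_write_at_all(fh, (MPI_Offset)rank * 8, buf, count, MPI_BYTE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}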