When is the next release? #3708
Not sure what @roystgnr prefers, but IMO we could start a release branch now. I was doing somewhat regular releases for a while there, but it wasn't clear that anyone was actually using them... as far as I know, most big projects are using libmesh as a git submodule at this time and updating it as needed.
Speaking as a package manager/software installer for large HPC clusters, we very much prefer having releases (with sane versioning schemes) to having commit hashes and making up our own release scheme. All software on an HPC cluster is installed through modules, and having a sane versioning scheme with actual releases makes it a lot simpler for everyone.
It would be a good time to start a release branch, IMHO.
Any idea when we might see a new release? Edit: Okay, I see that there is actually a 1.7.2 tag, but no release was made on GitHub.
There was an issue with the 1.7.2 tag where the version number was not incremented correctly before the tag was created. So, there won't be a 1.7.2 release tarball, but we can make a 1.7.3 release tarball. Note, though, that not much has changed since 1.7.1 was tagged, although a lot of time has gone by. After discussing things a bit with @roystgnr, our plan is to tag a 1.8.0 release and create tarballs once we get a passing devel -> master merge here.
Would you mind trying out one of the 1.7.3 release tarballs here? As I mentioned before, it won't be hugely different from 1.7.1, but if there's something wrong with the files, it would be good to know before we start in on the 1.8.x series.
Thanks for the 1.7.3 release. When I try to build it, it fails because it can't find the …
and sure enough, I find several …
There is nothing in the output of ./configure that would explain this. Full configure line: …
What is the most recent version of HDF5 that you have tested with libMesh?
These errors went away when I replaced it with hdf5/1.10.11. I would like to avoid having to install another version of HDF5, because users can't have two different hdf5 modules (versions) loaded at the same time, and there are so many other modules that depend on HDF5 (and are using our default version).
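When several hdf5 modules can end up on the compile and link lines, it helps to confirm exactly which HDF5 a toolchain resolves before blaming the libMesh sources. A minimal check along these lines (a sketch only, nothing libMesh-specific; it just assumes a C compiler with the intended hdf5 module loaded) might look like:

```c
/* Compare the HDF5 version baked into the headers against the version
 * reported by the library actually linked at run time. */
#include <stdio.h>
#include <hdf5.h>

int main(void)
{
  unsigned maj, min, rel;

  if (H5get_libversion(&maj, &min, &rel) < 0)
    {
      fprintf(stderr, "H5get_libversion failed\n");
      return 1;
    }

  printf("HDF5 headers: %d.%d.%d\n", H5_VERS_MAJOR, H5_VERS_MINOR, H5_VERS_RELEASE);
  printf("HDF5 library: %u.%u.%u\n", maj, min, rel);

  /* H5check() aborts by default if the header and library versions disagree,
   * which is exactly the kind of module mix-up described above. */
  return (H5check() < 0) ? 1 : 0;
}
```

If the two versions disagree, the module environment, not the libMesh tarball, is the first thing to fix.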
And this doesn't happen when you try building the 1.7.1 tar archive? Nothing has changed with our bundled Exodus between releases 1.7.1 and 1.7.3, so I don't know why it would work for one but not the other. I found an exodus_config.h file in our source tree (./contrib/exodusii/v8.11/exodus/sierra/exodus_config.h), but it doesn't contain anything important (just a bunch of defines inside …).
I'm using HDF5 1.10.0 myself.
Those errors are coming from the bundled NetCDF sources, which are on v4.9.2. So, in order to use newer HDF5 versions, we'd first have to update our internal NetCDF to something newer.
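For what it's worth, the patch posted further down in this thread suggests what that incompatibility looks like concretely: HDF5 1.12 replaced H5O_info_t / H5Oget_info_by_idx with H5O_info2_t / H5Oget_info_by_idx3, so the bundled NetCDF test code needs version guards along these lines (a compressed sketch; the helper name object_info_at is made up, the calls mirror the ones in the patch):

```c
#include <hdf5.h>

/* Fetch basic info for the i-th object of a group, in creation order,
 * against either the pre-1.12 or the 1.12+ HDF5 object API. */
#if H5_VERSION_GE(1, 12, 0)
typedef H5O_info2_t obj_info_t;

static herr_t object_info_at(hid_t grpid, hsize_t i, obj_info_t *info)
{
  return H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
                             i, info, H5O_INFO_BASIC, H5P_DEFAULT);
}
#else
typedef H5O_info_t obj_info_t;

static herr_t object_info_at(hid_t grpid, hsize_t i, obj_info_t *info)
{
  return H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
                            i, info, H5P_DEFAULT);
}
#endif
```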
OK, I confirmed that I get the same error as you. We'll look into a fix for this and create a new release if/when we find one.
We realized there was already a fix for this issue on master (69d99e0); it just needed to be cherry-picked to the 1.7.x release branch. I have now done that and created the new tarballs here; would you mind giving those a try? Unfortunately, I don't think we can fix the "does not work with recent HDF5" issue on this old release branch, and likely not on 1.8.x either, since we are trying to tag that very soon, but we'll try to keep it on the radar for future releases.
I tested running … and 23 others; all errors come from the call to …
Thanks,
I'm currently in the "make check" phase for my build of …. When I first noticed the crashes two days ago (compiling with HDF5 1.10.11), I looked through the make check logs and didn't see any failing tests.
I recompiled in dbg mode and reran the failing tests, and this time I got more useful information. It looks like there is some issue with the way we have configured NetCDF on the 1.7.x branch that prevents it from writing "netcdf-4" format files, which I believe is what we try to do when HDF5 is enabled: …
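If anyone wants to check that on their own build, a minimal standalone probe (a sketch, assuming the same compiler and NetCDF installation the libMesh build used; the output file name is arbitrary) is to ask nc_create for an HDF5-backed file directly:

```c
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
  int ncid;
  /* Ask for a "netcdf-4" (HDF5-backed) file; a NetCDF configured without
   * netCDF-4 support typically fails here with NC_ENOTBUILT. */
  int status = nc_create("probe_netcdf4.nc", NC_CLOBBER | NC_NETCDF4, &ncid);

  if (status != NC_NOERR)
    {
      fprintf(stderr, "nc_create(NC_NETCDF4) failed: %s\n", nc_strerror(status));
      return 1;
    }

  nc_close(ncid);
  printf("netcdf-4 file creation works\n");
  return 0;
}
```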
For me it suddenly fails with: …
I don't even reach the EXODUS tests. Do the optimization tests need a lot of memory? I'm running on a build cluster, and the build job has 8 cores and about 28 GiB of memory available.
OK, sorry for the red herring. I'm pretty sure the examples run after the unit tests, though, so if you made it to them you likely passed all the unit testing. The optimization_ex2 test runs on a 10x10 mesh by default; it should not require much memory at all. On my system it runs in about 1-2 seconds, but I noted that it doesn't converge well with the default arguments that we are using, and the solve stops because it reaches the maximum number of function evaluations: …
This is with PETSc 3.17; you might see other behavior with different versions of PETSc.
Thanks to the help of a colleague, I now know that the crash is related to PETSc. I'm just not sure whether it's due to the fact that I'm trying to build against …. Also:
I noticed that you already have a 1.8.0 branch that includes netcdf-4.9.2 (the same version as our current default). Next week I'm going to create a tarball from that branch and test it.
Oliver
How do I tell …?
Are you talking about in the 1.7.x release series? I don't think we bundled NetCDF 4.9.2 with libmesh at that time.
There was an attempt to let the user select a system NetCDF installation in the past, but it wasn't merged. I don't recall exactly why, but I don't think it worked on all our supported platforms. I think it would be a good improvement, but I don't have time to work on it myself.
No idea; that would probably be a good bug report for the NetCDF/HDF5 people if you can narrow down exactly what the issue is.
No, today I was giving … a try. I'll try again tomorrow.
On master and …
Turns out that it was only the netCDF tests that were failing with HDF5-1.12.0 or newer. I've ported that patch so that it can be applied to libMesh-1.7.5, and now the netCDF tests all pass even when using our current default module, HDF5/1.14.2.
Patch libMesh's NetCDF tests for compatibility with HDF5 >= 1.12.
The below patch has been taken from [1] and modified so that it can be
applied to the netcdf-c-4.6.2 sources included with libMesh.
[1]: https://github.com/Unidata/netcdf-c/commit/9f9b125028b28d8e94f2c990c8d92a7df76fde78
diff --git a/contrib/netcdf/v4/h5_test/tst_h_atts3.c b/contrib/netcdf/v4/h5_test/tst_h_atts3.c
index 7976821b0..4fb672798 100644
--- a/contrib/netcdf/v4/h5_test/tst_h_atts3.c
+++ b/contrib/netcdf/v4/h5_test/tst_h_atts3.c
@@ -46,7 +46,11 @@ main()
hid_t file_typeid1[NUM_OBJ], native_typeid1[NUM_OBJ];
hid_t file_typeid2, native_typeid2;
hsize_t num_obj;
+#if H5_VERSION_GE(1,12,0)
+ H5O_info2_t obj_info;
+#else
H5O_info_t obj_info;
+#endif
char obj_name[STR_LEN + 1];
hsize_t dims[1] = {ATT_LEN}; /* netcdf attributes always 1-D. */
struct s1
@@ -148,8 +152,14 @@ main()
for (i = 0; i < num_obj; i++)
{
/* Get the name, and make sure this is a type. */
+
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ i, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i, &obj_info, H5P_DEFAULT) < 0) ERR;
+#endif
if (H5Lget_name_by_idx(grpid, ".", H5_INDEX_NAME, H5_ITER_INC, i,
obj_name, STR_LEN + 1, H5P_DEFAULT) < 0) ERR;
if (obj_info.type != H5O_TYPE_NAMED_DATATYPE) ERR;
@@ -267,8 +277,13 @@ main()
for (i = 0; i < num_obj; i++)
{
/* Get the name, and make sure this is a type. */
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ i, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i, &obj_info, H5P_DEFAULT) < 0) ERR;
+#endif
if (H5Lget_name_by_idx(grpid, ".", H5_INDEX_NAME, H5_ITER_INC, i,
obj_name, STR_LEN + 1, H5P_DEFAULT) < 0) ERR;
if (obj_info.type != H5O_TYPE_NAMED_DATATYPE) ERR;
diff --git a/contrib/netcdf/v4/h5_test/tst_h_atts4.c b/contrib/netcdf/v4/h5_test/tst_h_atts4.c
index 6228dd661..d70f4a497 100644
--- a/contrib/netcdf/v4/h5_test/tst_h_atts4.c
+++ b/contrib/netcdf/v4/h5_test/tst_h_atts4.c
@@ -49,7 +49,11 @@ main()
hid_t file_typeid1[NUM_OBJ_2], native_typeid1[NUM_OBJ_2];
hid_t file_typeid2, native_typeid2;
hsize_t num_obj;
+#if H5_VERSION_GE(1,12,0)
+ H5O_info2_t obj_info;
+#else
H5O_info_t obj_info;
+#endif
char obj_name[STR_LEN + 1];
hsize_t dims[1] = {ATT_LEN}; /* netcdf attributes always 1-D. */
struct s1
@@ -139,8 +143,13 @@ main()
for (i = 0; i < num_obj; i++)
{
/* Get the name, and make sure this is a type. */
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ i, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i, &obj_info, H5P_DEFAULT) < 0) ERR;
+#endif
if (H5Lget_name_by_idx(grpid, ".", H5_INDEX_NAME, H5_ITER_INC, i,
obj_name, STR_LEN + 1, H5P_DEFAULT) < 0) ERR;
if (obj_info.type != H5O_TYPE_NAMED_DATATYPE) ERR;
diff --git a/contrib/netcdf/v4/h5_test/tst_h_compounds2.c b/contrib/netcdf/v4/h5_test/tst_h_compounds2.c
index 2f885a57a..9707b801d 100644
--- a/contrib/netcdf/v4/h5_test/tst_h_compounds2.c
+++ b/contrib/netcdf/v4/h5_test/tst_h_compounds2.c
@@ -48,7 +48,11 @@ main()
hsize_t dims[1];
hsize_t num_obj, i_obj;
char obj_name[STR_LEN + 1];
+#if H5_VERSION_GE(1,12,0)
+ H5O_info2_t obj_info;
+#else
H5O_info_t obj_info;
+#endif
hid_t fapl_id, fcpl_id;
htri_t equal;
char file_in[STR_LEN * 2];
@@ -131,8 +135,13 @@ main()
if (H5Gget_num_objs(grpid, &num_obj) < 0) ERR;
for (i_obj = 0; i_obj < num_obj; i_obj++)
{
+#if H5_VERSION_GE(1, 12, 0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ i_obj, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i_obj, &obj_info, H5P_DEFAULT) < 0) ERR;
+#endif
if (H5Lget_name_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i_obj, obj_name, STR_LEN + 1, H5P_DEFAULT) < 0) ERR;
@@ -194,8 +203,13 @@ main()
if (H5Gget_num_objs(grpid, &num_obj) < 0) ERR;
for (i_obj = 0; i_obj < num_obj; i_obj++)
{
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC, i_obj, &obj_info,
+ H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC, i_obj, &obj_info,
H5P_DEFAULT) < 0) ERR;
+#endif
if (H5Lget_name_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC, i_obj, obj_name,
STR_LEN + 1, H5P_DEFAULT) < 0) ERR;
diff --git a/contrib/netcdf/v4/h5_test/tst_h_files4.c b/contrib/netcdf/v4/h5_test/tst_h_files4.c
index eef3e1608..7ea991acd 100644
--- a/contrib/netcdf/v4/h5_test/tst_h_files4.c
+++ b/contrib/netcdf/v4/h5_test/tst_h_files4.c
@@ -44,14 +44,23 @@ with the H5Lvisit function call
*/
herr_t
-op_func (hid_t g_id, const char *name, const H5L_info_t *info,
+op_func (hid_t g_id, const char *name,
+#if H5_VERSION_GE(1,12,0)
+ const H5L_info2_t *info,
+#else
+ const H5L_info_t *info,
+#endif
void *op_data)
{
hid_t id;
H5I_type_t obj_type;
strcpy((char *)op_data, name);
+#if H5_VERSION_GE(1,12,0)
+ if ((id = H5Oopen_by_token(g_id, info->u.token)) < 0) ERR;
+#else
if ((id = H5Oopen_by_addr(g_id, info->u.address)) < 0) ERR;
+#endif
/* Using H5Ovisit is really slow. Use H5Iget_type for a fast
* answer. */
@@ -169,7 +178,11 @@ main()
{
hid_t fapl_id, fileid, grpid;
H5_index_t idx_field = H5_INDEX_CRT_ORDER;
+#if H5_VERSION_GE(1,12,0)
+ H5O_info2_t obj_info;
+#else
H5O_info_t obj_info;
+#endif
hsize_t num_obj;
ssize_t size;
char obj_name[STR_LEN + 1];
@@ -186,8 +199,13 @@ main()
if (H5Gget_num_objs(grpid, &num_obj) < 0) ERR;
for (i = 0; i < num_obj; i++)
{
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ i, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT)) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i, &obj_info, H5P_DEFAULT)) ERR;
+#endif
if ((size = H5Lget_name_by_idx(grpid, ".", idx_field, H5_ITER_INC, i,
NULL, 0, H5P_DEFAULT)) < 0) ERR;
if (H5Lget_name_by_idx(grpid, ".", idx_field, H5_ITER_INC, i,
diff --git a/contrib/netcdf/v4/h5_test/tst_h_vars2.c b/contrib/netcdf/v4/h5_test/tst_h_vars2.c
index 49158ba86..2b731b3c9 100644
--- a/contrib/netcdf/v4/h5_test/tst_h_vars2.c
+++ b/contrib/netcdf/v4/h5_test/tst_h_vars2.c
@@ -31,7 +31,11 @@ main()
hsize_t num_obj;
hid_t fileid, grpid, spaceid;
int i;
+#if H5_VERSION_GE(1,12,0)
+ H5O_info2_t obj_info;
+#else
H5O_info_t obj_info;
+#endif
char names[NUM_ELEMENTS][MAX_SYMBOL_LEN + 1] = {"H", "He", "Li", "Be", "B", "C"};
char name[MAX_SYMBOL_LEN + 1];
ssize_t size;
@@ -79,8 +83,13 @@ main()
if (num_obj != NUM_ELEMENTS) ERR;
for (i = 0; i < num_obj; i++)
{
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ i, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i, &obj_info, H5P_DEFAULT) < 0) ERR;
+#endif
if (obj_info.type != H5O_TYPE_DATASET) ERR;
if ((size = H5Lget_name_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC, i,
NULL, 0, H5P_DEFAULT)) < 0) ERR;
@@ -106,7 +115,11 @@ main()
hid_t fileid, grpid;
hsize_t num_obj;
int i;
+#if H5_VERSION_GE(1,12,0)
+ H5O_info2_t obj_info;
+#else
H5O_info_t obj_info;
+#endif
char names[NUM_DIMSCALES][MAX_SYMBOL_LEN + 1] = {"b", "a"};
char name[MAX_SYMBOL_LEN + 1];
hid_t dimscaleid;
@@ -152,8 +165,13 @@ main()
if (num_obj != NUM_DIMSCALES) ERR;
for (i = 0; i < num_obj; i++)
{
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ i, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i, &obj_info, H5P_DEFAULT) < 0) ERR;
+#endif
if (obj_info.type != H5O_TYPE_DATASET) ERR;
if ((size = H5Lget_name_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC, i,
NULL, 0, H5P_DEFAULT)) < 0) ERR;
@@ -178,7 +196,11 @@ main()
hsize_t num_obj;
hid_t fileid, grpid, spaceid;
float val = 3.1495;
+#if H5_VERSION_GE(1,12,0)
+ H5O_info2_t obj_info;
+#else
H5O_info_t obj_info;
+#endif
char name[MAX_NAME_LEN + 1];
ssize_t size;
@@ -238,8 +260,14 @@ main()
if (H5Gget_num_objs(grpid, &num_obj) < 0) ERR;
if (num_obj != 1) ERR;
+
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ 0, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
0, &obj_info, H5P_DEFAULT) < 0) ERR;
+#endif
if (obj_info.type != H5O_TYPE_DATASET) ERR;
if ((size = H5Lget_name_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC, 0,
NULL, 0, H5P_DEFAULT)) < 0) ERR;
diff --git a/contrib/netcdf/v4/nc_test4/tst_xplatform2.c b/contrib/netcdf/v4/nc_test4/tst_xplatform2.c
index 6b6e1ab24..acefe1807 100644
--- a/contrib/netcdf/v4/nc_test4/tst_xplatform2.c
+++ b/contrib/netcdf/v4/nc_test4/tst_xplatform2.c
@@ -564,7 +564,11 @@ main(int argc, char **argv)
hid_t file_typeid1[NUM_OBJ], native_typeid1[NUM_OBJ];
hid_t file_typeid2, native_typeid2;
hsize_t num_obj, i;
+#if H5_VERSION_GE(1,12,0)
+ H5O_info2_t obj_info;
+#else
H5O_info_t obj_info;
+#endif
char obj_name[NC_MAX_NAME + 1];
/* Open one of the netCDF test files. */
@@ -579,8 +583,13 @@ main(int argc, char **argv)
for (i = 0; i < num_obj; i++)
{
/* Get the name. */
+#if H5_VERSION_GE(1,12,0)
+ if (H5Oget_info_by_idx3(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
+ i, &obj_info, H5O_INFO_BASIC, H5P_DEFAULT) < 0) ERR_RET;
+#else
if (H5Oget_info_by_idx(grpid, ".", H5_INDEX_CRT_ORDER, H5_ITER_INC,
i, &obj_info, H5P_DEFAULT) < 0) ERR_RET;
+#endif
if (H5Lget_name_by_idx(grpid, ".", H5_INDEX_NAME, H5_ITER_INC, i,
obj_name, NC_MAX_NAME + 1, H5P_DEFAULT) < 0) ERR_RET;
printf(" reading type %s ", obj_name);
Still two libMesh tests/examples are crashing:
So far I see the same two examples crash whether I use PETSc 3.19.2 or 3.20.0, though in both cases I've used HDF5 1.14.2 for compiling both libMesh and PETSc.
I'm attaching the configure log and the output from the two failing examples from the test with HDF5 1.14.2 & PETSc 3.19.6:
Let me know if I should create a separate issue for that.
I had some issues applying your patch because the file paths contain symbolic links (…).
Ahh sorry, I forgot to mention that. Yesterday I had the reverse issue, because the tarball (which I use for the build) doesn't have the …
Confirmed! The examples …
OK, that makes it seem like we are missing a …. PETSc 3.20 came out in September 2023, so it did not even exist when libmesh 1.7.0 was first released. It's definitely possible that these examples worked OK with a PETSc from around the same time, but that more/better error checking was added to PETSc subsequently. If you can get a stack trace leading up to that "Object is in wrong state" error, that would be helpful in pointing us to the place where (I assume) we already made the fix on master. As far as I'm aware, those examples work fine with libmesh master and new PETSc.
OK, I tested my 1.7.5 build with PETSc 3.17, and these two examples definitely don't fail for me. I tested …
How can I run them to get a full stack trace?
You can run any of the examples from the build directory via …
You might be able to get more information by running the executable in the debugger. For …
Once in …
That unfortunately didn't work. I've compiled libMesh with …. I will be off for the next few weeks but will try to get back to this later.
The last release of libMesh is 1.5 years old, and there has been a lot of development since. I have projects that depend on specific commits, but that is not appropriate to install on a large cluster (we want specific versions instead).