Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replay mode for ASTE #100

Merged
merged 17 commits into from
Apr 21, 2022

Conversation

kursatyurt
Copy link
Collaborator

Make up for wrongly merged PR #85

@kursatyurt kursatyurt requested review from fsimonis and davidscn March 29, 2022 00:00
@davidscn
Copy link
Member

The config file format is good, I like it. I ran the replay mode case using our quickstart. Some comments:

  1. I still run into an MPI error in the last iteration *** The MPI_Comm_get_attr() function was called after MPI_FINALIZE was invoked.
  2. We should definitely print out in each iteration, which file is actually used during the coupling time-window.
  3. Initially, we print out the cryptic Names: 100. We should print out the mesh names we found and the corresponding dt files.
  4. The replay mode seems not to work with the vtp exports. We should either inform the user properly or make it work with vtp
  5. In case read and write data are wrong (swapped) in the json config file, I run into a weird PETSc. If handled properly, preCICE should give a descriptive error message. I suspect that we need to ask for the IDs before loading data from the vtk files or similar.

@kursatyurt
Copy link
Collaborator Author

  1. We should definitely print out in each iteration, which file is actually used during the coupling time-window.

Definitely a good idea.

  1. Initially, we print out the cryptic Names: 100. We should print out the mesh names we found and the corresponding dt files.

Actually, these ones would need a "debug" level logger in my perspective. Let's add more clear explanation now and change it to debug level later

  1. The replay mode seems not to work with the vtp exports. We should either inform the user properly or make it work with vtp

We are now only supporting vtu/vtk files for all taste tools. vtp format is another level probably we need to touch after the merge, let's discuss in #101

  1. In case read and write data are wrong (swapped) in the json config file, I run into a weird PETSc. If handled properly, preCICE should give a descriptive error message. I suspect that we need to ask for the IDs before loading data from the vtk files or similar.

Both meshes should be able to get dataID from preCICE. Then we need to load the data from vtk file if is it a write type, and if not available on the mesh, it should throw an error (Which is missing RN :) ). Since in replay mode actually both mesh have all the required data this step eventually doesn't create any error. Then the next step will be to write the data which is actually read type. I am not sure but this seems to be happening at the write/read data calls preCICE interface. I will try it, find the point and discuss the solution here.

@kursatyurt
Copy link
Collaborator Author

kursatyurt commented Apr 6, 2022

Now starting from arbitrary timestep[dt] (>=1) is possible, example configuration file is

{
  "participant": "Fluid",
  "starttime": 1,
  "meshes": [
    {
      "mesh": "Fluid-Mesh",
      "meshfileprefix": "/path/to/fluid/meshes/Fluid-Mesh-Fluid",
      "read-data": {
        "vector": [
          "Displacement"
        ]
      },
      "write-data": {
        "vector": [
          "Force"
        ]
      }
    }
  ],
  "precice-config": "/path/to/precice/confg/precice-config.xml"
}

@davidscn
Copy link
Member

davidscn commented Apr 7, 2022

The MPI error is gone now, thanks.

Further remarks:

  • using <exchange ... initialize="true" /> and the verbose option in preciceMap leads to a crash.
  • using an explicit coupling in the original case leads to crash in the very first aste run (it might work in the second where an init file might be generated). The reason for both errors is probably that aste tries to locate the init file, which is not unconditionally generated.
  • when using a different starttime, aste prints Total number of detected meshes: 101, but not all of them are relevant, we should give some information about the relevant meshes. I guess a log message similar to StartMesh ... EndtMehs ... or start time and end time might be very valuable.
  • starttime is actually not a good expression for the start, because preCICE exports according to the respective iteration. Either we call this option Startiteration or we we leave it as it is and use the time-step size in order to compute the iteration.
  • aste bails out of the time loop and stalls in case it runs out of meshes. We should exit the loop in a clean way and inform the user that no more meshes are available.
  • we still need to safeguard against a swap of read and write data names. This might be a common misconfiguration and leads to an immediate crash.

@kursatyurt
Copy link
Collaborator Author

* using `<exchange ... initialize="true" />` and the verbose option in `preciceMap` leads to a crash.

This seems strange I will try to reproduce it

* using an explicit coupling in the original case leads to crash in the very first aste run (it might work in the second where an `init` file might be generated). The reason for both errors is probably that aste tries to locate the `init` file, which is not unconditionally generated.

Yes, I will change the logic definitely. Since the parsing is always in order I just need to locate position of current timestep.

* when using a different starttime, aste prints `Total number of detected meshes: 101`, but not all of them are relevant, we should give some information about the relevant meshes. I guess a log message similar to `StartMesh ... EndtMehs ...` or start time and end time might be very valuable.

StartMesh and EndMesh are good idea

* `starttime` is actually not a good expression for the start, because preCICE exports according to the respective iteration. Either we call this option `Startiteration` or we we leave it as it is and use the time-step size in order to compute the iteration.

How about startmeshdt ?

* aste bails out of the time loop and stalls in case it runs out of meshes. We should exit the loop in a clean way and inform the user that no more meshes are available.

Ah it seems while moving to modes I missed finalize preCICE :)

* we still need to safeguard against a swap of read and write data names. This might be a common misconfiguration and leads to an immediate crash.

I literally have no idea how we can check this.

@kursatyurt
Copy link
Collaborator Author

The problem with initialize true is solved. If the mesh required for initialization cannot be found it will thrown on error

---[precice]  iteration: 1, time-window: 1, time: 0 of 5, time-window-size: 0.01, max-timestep-length: 0.01, ongoing: yes, time-window-complete: no, write-initial-data 
2022-04-10 01:59:42,166 VERBOSE-1 [default] Looking for dt = 1
2022-04-10 01:59:42,166 VERBOSE-1 [default] Found in position 0

2022-04-10 01:59:42,166 VERBOSE-1 [default] ASTE Start mesh is /home/kursatyurt/tutorials/perpendicular-flap/fluid-nutils/VTK/Fluid-Mesh-Fluid.dt1.vtk
2022-04-10 01:59:42,166 VERBOSE-1 [default] ASTE Final mesh is /home/kursatyurt/tutorials/perpendicular-flap/fluid-nutils/VTK/Fluid-Mesh-Fluid.dt500.vtk
terminate called after throwing an instance of 'std::runtime_error'
  what():  Starting from dt = 1 but previous timestep ".init" or 0 is cannot be found check your mesh folder

Stall problem is solved, it won't stall anymore, finalize correctly.

Start end and mesh filenames for first mesh will be printed to screen.

2022-04-10 01:58:26,499 VERBOSE-1 [default] ASTE Start mesh is /home/kursatyurt/tutorials/perpendicular-flap/fluid-nutils/VTK/Fluid-Mesh-Fluid.dt1.vtk
2022-04-10 01:58:26,499 VERBOSE-1 [default] ASTE Final mesh is /home/kursatyurt/tutorials/perpendicular-flap/fluid-nutils/VTK/Fluid-Mesh-Fluid.dt500.vtk

@davidscn
Copy link
Member

With the current configuration, I just get an error

preciceMap --aste-config config.json 
terminate called after throwing an instance of 'nlohmann::detail::type_error'
  what():  [json.exception.type_error.302] type must be number, but is null

what(): Starting from dt = 1 but previous timestep ".init" or 0 is cannot be found check your mesh folder

Looks good, but I would prefer if we get the #73 also done for the cpp part (as discussed using the boost logger) so that we can print an error message formatted similar to preCICE.

@kursatyurt
Copy link
Collaborator Author

kursatyurt commented Apr 12, 2022

starttime in the config file is changed to startdt probably this is the reason you have this error

I am also looking for the boost logger PR for that one will come soon after the merging of this one, since this PR actually has a huge change in the code with current version changing the logging does not make any sense to me

@davidscn
Copy link
Member

starttime in the config file is changed to startdt probably this is the reason you have this error

No, I actually changed this part. Even if this would be the case, we should exit in an understandable way stating that an option was missing or unknown to the parser.

Copy link
Member

@davidscn davidscn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parsing error is now much better, thanks. One we still need to catch properly would be 'file not found'.

src/modes.cpp Outdated Show resolved Hide resolved
src/mesh.cpp Outdated
void readData(Mesh &mesh, const std::string &filename)
{
if (!fs::is_regular_file(filename)) {
throw std::invalid_argument{"The mesh file does not exist: " + filename};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using throw at this place is in principle fine. However, due to precice/precice#964, this will currently lead to a deadlock. Can we use a macro instead (and maybe for the moment rather exit instead of throw?), where we change this easily in the future?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed all them to print to std::cerr which will be replaced later on logging error. Depend on location I use MPI_Abort or std::exit.

Comment on lines 41 to 43
* @param interface
* @param mesh
* @param meshID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some descriptions here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More explicit descriptions are added. Done.

@davidscn davidscn merged commit 338d8d7 into precice:develop Apr 21, 2022
@davidscn
Copy link
Member

Thanks!

@kursatyurt kursatyurt deleted the set_configs_from_preciceconfig branch April 25, 2022 23:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants