Speed up simulations to make milestone completion feasible. #98
Comments
Ilker tried to create a fresh ExaWind driver (without Trilinos solvers), but is hitting an error on Frontier. He will start profiling the code once he gets it running. He will talk to Jon Rood about getting the Frontier build working, hopefully tomorrow (5/10), or he might ask Phil.
We need to pin the OpenFAST develop branch to a specific commit for our ExaWind work.
Ilker was able to build, but is out sick today (5/14). Jon: Marc HdF and I will work on Tioga performance. Ilker can run and profile this to see which parts of Tioga are slow. Tioga might have MPI issues that need to be addressed so that it waits less or communicates more efficiently. Some base-case issues were resolved, but with a bunch of Nalu-Wind instances, things get worse. It is certainly an MPI issue in Tioga. Ilker and I will continue to work on this. Making Tioga faster for this case would also alleviate time bottlenecks in other cases.
Jon: still mostly focused on Tioga, with Ilker working on this. Ilker has a profile for this on Frontier, so we just need to look at it. For AMR-Wind, we would need help from LBNL. Nalu-Wind isn't the bottleneck here.
Ilker is back at work. He has run multiple cases and just needs to start profiling.
Ilker: looked at my results; the runs had debug flags on.
Ilker: I am having some exawind-manager trouble with a debug symbols build on Frontier, and I am dealing with that right now. Jon: I will reach out.
Ilker: there are problems with compiling on Frontier. We can run simulations, but debug flag outputs are not working, and there are segfaults on the 16 turbine case. Jon: spent a lot of time figuring out the right configuration to run on Frontier. There's an OpenFAST commit up in the air. We couldn't build with debug symbols with certain versions of rocm. Using rocm-6 seems to build with debug symbols. We're looking into perhaps a new version of HPC-toolkit. The segfault is in OpenFAST, so we may be on the wrong OpenFAST commit. Tried OpenFAST-dev, which doesn't seem to work currently. Posted in the ExaWind channel, but should tag someone. Going to try a simpler case without OpenFAST to profile. Ganesh and his post-doc are willing to help get this going since it is a roadblock for him now.
Ganesh: just recompiled and ran, but hasn't done more debug cases yet. Marc: spent time running the 16 turbine case with the 3 blade mesh on Frontier, looking for ways to speed it up. The problem is load imbalance. Got a 10% speedup, but we are looking for Nx speedups. Ilker is still working on this. It could be an issue with 3 meshes at the hub causing a lot of overset and MPI communication, leading to load imbalances. Ganesh: has an idea to break apart each blade into 2 blocks to make TIOGA faster.
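One common way to put a number on the load imbalance mentioned above is the ratio of the slowest rank's time to the mean time across ranks. The sketch below is a minimal illustration of that metric, not the team's actual analysis: the function names and the per-rank timings are hypothetical and are not taken from the Frontier profile.

```python
def load_imbalance_factor(rank_times):
    """Return max/mean of per-rank times; 1.0 means perfectly balanced."""
    mean_time = sum(rank_times) / len(rank_times)
    return max(rank_times) / mean_time


def idle_fraction(rank_times):
    """Average fraction of the step that ranks spend waiting on the slowest rank."""
    mean_time = sum(rank_times) / len(rank_times)
    return 1.0 - mean_time / max(rank_times)


# Hypothetical per-rank times (seconds) for one overset connectivity update.
example_times = [10.0, 10.0, 10.0, 30.0]

factor = load_imbalance_factor(example_times)
print(f"imbalance factor: {factor:.2f}")
print(f"idle fraction:    {idle_fraction(example_times):.2f}")
```

Under the simplifying assumption that the work could be redistributed perfectly, the imbalance factor is also an upper bound on the speedup available from rebalancing alone, which helps judge whether rebalancing by itself can deliver an Nx gain.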
Ilker: Ganesh has a specific case that showed a slowdown. Visualized the mesh and saw that it was the issue. Discussed with Ganesh. There are 3 blocks per blade that physically intersect, and TIOGA does a search at this intersection. Ganesh has an idea for working around this issue, expected to be complete by tomorrow.
Ganesh: still in progress on this. Ilker: no updates on this one. Waiting for the OpenFAST issue to be resolved and the build to be complete on Frontier. Nate: I've got a branch that I know works with FSI, and have it going on Flight at Sandia with exawind_manager and the classic Intel compiler. It starts and restarts with FSI and CFD with 2 turbines, so I know it is working. Now we just need to build on Frontier. It might be old enough that it doesn't include Derek's new changes. Ilker should try to build it on Frontier, and there's a chance it might work. Got a sizeable speedup by setting AMR-Wind's blocking-factor=32 and max-grid-size=64, which achieved better load balancing and/or MPI performance overall.
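When experimenting with these two settings, it can help to sanity-check a candidate pair before submitting a job. The sketch below assumes the usual AMReX gridding constraints (blocking factor a power of 2, max grid size divisible by the blocking factor, domain cell counts divisible by the blocking factor) and gives only a rough level-0 box count. The function names and the domain size are hypothetical and are not taken from the milestone case.

```python
import math


def check_grid_params(domain_cells, blocking_factor, max_grid_size):
    """Return a list of constraint violations (empty list means the pair looks valid)."""
    problems = []
    # A power of 2 has exactly one bit set.
    if blocking_factor < 1 or blocking_factor & (blocking_factor - 1):
        problems.append("blocking factor is not a power of 2")
        return problems
    if max_grid_size % blocking_factor:
        problems.append("max grid size is not divisible by blocking factor")
    for axis, n_cells in enumerate(domain_cells):
        if n_cells % blocking_factor:
            problems.append(f"domain axis {axis} ({n_cells} cells) is not divisible by blocking factor")
    return problems


def estimate_level0_boxes(domain_cells, max_grid_size):
    """Rough count of level-0 boxes if each axis is chopped into chunks of at most max_grid_size."""
    return math.prod(math.ceil(n / max_grid_size) for n in domain_cells)


# Hypothetical level-0 domain; the real case's dimensions are not given in this thread.
domain = (512, 512, 256)
print(check_grid_params(domain, blocking_factor=32, max_grid_size=64))
print(estimate_level0_boxes(domain, max_grid_size=64))
```

Comparing the estimated box count against the number of MPI ranks gives a first hint of whether there are enough boxes per rank to balance the load; the actual distribution still has to be confirmed through profiling.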
Ilker: have built Nate's branch and the 16 turbine case. Now building with debug flags, and will then look at profiling data. Ganesh: I split the mesh and am exporting it; it should be available in a little bit. Nate: change the AMR-Wind blocking size and see if this speeds things up. Observe which part speeds up due to this change. There may be optimal blocking factors. The defaults showed a big slowdown. We should be able to see this through profiling.
Ilker: tried profiling on 1024 nodes. Looked at the case files. Once this runs, I will have a better idea of what to look for. The mesh might not be set up correctly, but I need to confirm with the creator of the mesh (unclear who made it, maybe Ashesh). It has 3 blades and a tower with some overlap. Ganesh: done exporting the mesh.
Ilker: seeing a segfault on 1024 nodes. Now running a smaller case. (Had been running the larger case.) No overset between blades and tower? There may be an error here. Ganesh: the split mesh is on Kestrel, but I haven't used it yet. It reduces the volume that needs to be searched by half (or more).
Nate: Lawrence posted timings with different numbers of nodes. Discussed yesterday. Lawrence: I tested the production case on varying numbers of nodes. Submitted the milestone case on 384 nodes.
Ilker: nothing specific, but runs are in the queue. Blocking-factor may make a difference.
Ilker is going to profile the simulation and maybe make changes to Tioga to make it faster.
Nate: I fear we'd still be too slow if we don't get a massive speedup. The timestep size we're forced to run at with FSI is going to make us too slow. We need 500 - 600 s of simulation time, but it looks like we can only get to 14 s in a 12 hour run (the max allowed on Frontier right now).
Could we throw more nodes at it? Nalu-Wind is currently segfaulting on GPUs.
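To make the size of the gap concrete, the arithmetic can be worked through with the numbers quoted above (14 s of simulated time per 12 hour run, 500 - 600 s needed). This is a back-of-the-envelope sketch that assumes throughput stays constant from run to run and ignores queue wait and restart overhead; the function names are illustrative only.

```python
import math

SIM_SECONDS_PER_RUN = 14.0   # simulated seconds achieved per run (from the estimate above)
WALL_HOURS_PER_RUN = 12.0    # max wall-clock hours per run on Frontier (from the estimate above)


def runs_needed(target_sim_seconds):
    """Number of back-to-back max-length runs needed at the current throughput."""
    return math.ceil(target_sim_seconds / SIM_SECONDS_PER_RUN)


def speedup_for_single_run(target_sim_seconds):
    """Speedup factor required to fit the whole target into one max-length run."""
    return target_sim_seconds / SIM_SECONDS_PER_RUN


for target in (500.0, 600.0):
    n_runs = runs_needed(target)
    print(
        f"{target:.0f} s target: {n_runs} runs, "
        f"{n_runs * WALL_HOURS_PER_RUN:.0f} wall hours, "
        f"{speedup_for_single_run(target):.1f}x speedup to fit in one run"
    )
```

At the current rate this works out to 36 to 43 chained 12 hour runs, or roughly a 36x to 43x speedup to finish in a single run.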