fix issues in two phase build #72
Hi @jedwards4b, I've been banging my head against this for several hours. The approach I was most hopeful about was to create an extra Case instance at the beginning of nck.build and then flush it at the end if sharedlib_only. Unfortunately, this failed along with everything else I tried. Many thanks for this fix. Would you mind briefly explaining what the core problem was and why this fixes it?
…ements fix issues in two phase build
Oh, and build.py already creates the sharedlibroot, so preview_namelists does not need to do it. I'll make that change. You are correct: it is not safe to have preview_namelists do anything to the shared area, because they run in parallel.
The problem was that we were creating two build directories for the cpl/drv

On Sun, Feb 28, 2016 at 3:44 PM, James Foucar [email protected]

Jim Edwards, CESM Software Engineer
Hi @jedwards4b, I tried your fix and I'm still getting the same segfault that I was getting before. I'm gathering some additional info now.
Hi @jedwards4b, the process of using these new tools for this problem in particular is as follows:

    % ./create_test NCK_Ld3.f45_g37_rx1.A -t jgf_broken --no-run
    % * cd to test_root *
    % cd NCK_Ld3.f45_g37_rx1.A.melvin_gnu.jgf_works
    % ./case.test_build
    % cd ..
    % normalize_cases NCK_Ld3.f45_g37_rx1.A.melvin_gnu.jgf_broken NCK_Ld3.f45_g37_rx1.A.melvin_gnu.jgf_works
    % case_diff NCK_Ld3.f45_g37_rx1.A.melvin_gnu.jgf_broken NCK_Ld3.f45_g37_rx1.A.melvin_gnu.jgf_works

Let me know what you think! I've pushed these new tools to the same branch on which I'm doing the two-phase work.
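The idea behind a normalize-then-diff workflow like the one above can be illustrated with a small sketch. The actual `normalize_cases` and `case_diff` scripts are not shown in this thread, so everything here is an assumption: the function names `normalize` and `diff_normalized` and the `CASE` placeholder are hypothetical. The sketch only shows why replacing the case-specific name first makes the remaining differences easier to read.

```python
import difflib


def normalize(text, case_name, placeholder="CASE"):
    """Replace every occurrence of the case name with a neutral placeholder.

    Hypothetical helper: not the real normalize_cases tool.
    """
    return text.replace(case_name, placeholder)


def diff_normalized(text_a, name_a, text_b, name_b):
    """Return unified-diff lines between two texts after normalization.

    Lines that differ only by the case name drop out of the diff.
    """
    lines_a = normalize(text_a, name_a).splitlines()
    lines_b = normalize(text_b, name_b).splitlines()
    return list(difflib.unified_diff(lines_a, lines_b, "a", "b", lineterm=""))
```

With two files that embed their own case name in a path, only the lines that differ for other reasons remain in the output.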
now MPI_abort does not overwrite ret_val
This fixes the issue with the NCK test. There still appears to be an intermittent problem - it looks like the mkdir command in preview_namelist at line 137 occasionally fails. Is it because multiple threads arrive here at the same time?
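If the intermittent failure is indeed a race (two processes or threads both see that the directory is missing and both try to create it), one common way to make directory creation tolerant of it is to attempt the creation unconditionally and treat "already exists" as success. The sketch below is a generic pattern, not the code in preview_namelist; the helper name `safe_makedirs` is hypothetical.

```python
import errno
import os


def safe_makedirs(path):
    """Create path and any missing parents.

    If another process or thread creates the directory first, the
    resulting EEXIST error is ignored. Any other error, or an existing
    non-directory at that path, is still raised.
    """
    try:
        os.makedirs(path)
    except OSError as exc:
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

This avoids the check-then-create pattern (`if not os.path.isdir(p): os.makedirs(p)`), which leaves a window between the check and the creation. On Python 3.2 and later, `os.makedirs(path, exist_ok=True)` gives similar behavior.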