-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes in temperature not being accurately reported #110
Comments
I have now run the same model in the single processor mode. It does not show the same behavior there. So this is associated with parallel processing. I suspect something got divided by two or thre when running with 3 processors, when is should not have. |
It turns out this is quite a fiddly one but I've figured it out. There are a number of problems here, actually. It all centres around the loop in which information on cells is exchanged between threads. What happens here for (n_mpi=0; n_mpi < np_mpi_global; n_mpi++) // loop over all processors
{
if (rank_global == n_mpi) // it's this processor. Pack your stuff into memory
{
[...do the packing]
}
[...some MPI broadcast and barrier commands]
if (rank_global != n_mpi) // it's another processor. unpack your stuff from memory
{
[...do the unpacking]
t_r_ave_old += plasmamain[n].t_r;
t_e_ave_old += plasmamain[n].t_e;
iave++;
//Perform checks to see how much temperatures have changed in this iteration
if ((fabs (t_r_old - plasmamain[n].t_r)) > fabs (dt_r))
{
dt_r = plasmamain[n].t_r - t_r_old;
nmax_r = n;
}
if ((fabs (t_e_old - plasmamain[n].t_e)) > fabs (dt_e))
{
dt_e = plasmamain[n].t_e - t_e_old;
nmax_e = n;
}
t_r_ave += plasmamain[n].t_r;
t_e_ave += plasmamain[n].t_e;
}
} Basically what we were doing at the bottom there was all wrong - perhaps copy and pastes had not been removed or done properly here:
I've fixed this so now, each thread does the following
This is done in commit agnwinds/python@74364c5 - @kslong perhaps you could just double check on your model that this is behaving as in serial mode? |
I confirm that this looks to be resolved. |
Running a fairly standard SV model of a CV, I find afteer 20 cycles of 1e6 pothons that the follwoing is being reported on the last cycle:
!!wind_update: Max change in t_r 13583 at cell 143 (4,23)
!!wind_update: Ave change in t_r 77549 from 37014 to 114563
!!wind_update: Max change in t_e -14924 at cell 238 (7,28)
!!wind_update: Ave change in t_e 16735 from 18764 to 35499
!!Check_converging: 235 (0.983) converged and 170 (0.711) converging of 239 cells
!!Check_convergence_breakdown: t_r 239 t_e(real) 236 t_e(maxed) 0 hc(real) 237
Although the model appears converged, the average temperatures appear to be changing by a large amount. This seemed odd to say the least. The history of the changes for t_e are:
v_2.diag:!!wind_update: Ave change in t_e 53335 from 35868 to 89203
sv_2.diag:!!wind_update: Ave change in t_e 44617 from 30492 to 75109
sv_2.diag:!!wind_update: Ave change in t_e 37179 from 26020 to 63198
sv_2.diag:!!wind_update: Ave change in t_e 31206 from 22732 to 53939
sv_2.diag:!!wind_update: Ave change in t_e 26614 from 20891 to 47505
sv_2.diag:!!wind_update: Ave change in t_e 23187 from 19949 to 43137
sv_2.diag:!!wind_update: Ave change in t_e 20807 from 19490 to 40298
sv_2.diag:!!wind_update: Ave change in t_e 19157 from 19192 to 38349
sv_2.diag:!!wind_update: Ave change in t_e 18245 from 18993 to 37238
sv_2.diag:!!wind_update: Ave change in t_e 17631 from 18989 to 36620
sv_2.diag:!!wind_update: Ave change in t_e 16948 from 19034 to 35982
sv_2.diag:!!wind_update: Ave change in t_e 16815 from 18985 to 35801
sv_2.diag:!!wind_update: Ave change in t_e 16917 from 18979 to 35897
sv_2.diag:!!wind_update: Ave change in t_e 16732 from 19022 to 35753
sv_2.diag:!!wind_update: Ave change in t_e 16799 from 19001 to 35801
sv_2.diag:!!wind_update: Ave change in t_e 16775 from 19005 to 35780
sv_2.diag:!!wind_update: Ave change in t_e 16756 from 18993 to 35750
sv_2.diag:!!wind_update: Ave change in t_e 16766 from 18991 to 35756
sv_2.diag:!!wind_update: Ave change in t_e 16787 from 18996 to 35783
sv_2.diag:!!wind_update: Ave change in t_e 16951 from 18981 to 35932
It seems obvious that something is quite odd. Both the old values and the new values are changing, but what we are not seeing is that the value from one cycle is moving the old value for the next cycle.
I am surprised that this has not been noticed before.
The text was updated successfully, but these errors were encountered: