When re-running the testcase from #2 to compare the performance after the parallelization of kpath (see #5), the initialization phase is very inefficient again:
Testcase parameters
32sm, running on 64 processes:
kpath = T
kpath_task = curv
kpath_num_points = 500
kpath_bands_colour = spin
kslice = F
berry = T
berry_task = ahc
berry_kmesh = 48 48 48
Analysis
The reason for the inefficiency is that, for these testcase parameters, get_ahc_R is called instead of the optimized routine get_morb_R (see #3). get_ahc_R still contains the inefficient loop (get_oper.F90, lines 402ff.):
! Wannier-gauge overlap matrix S in the projected subspace
!
call get_win_min(ik,winmin_q)
call get_win_min(nnlist(ik,nn),winmin_qb)
S=cmplx_0
do m=1,num_wann
   do n=1,num_wann
      do i=1,num_states(ik)
         ii=winmin_q+i-1
         do j=1,num_states(nnlist(ik,nn))
            jj=winmin_qb+j-1
            S(n,m)=S(n,m)&
                 +conjg(v_matrix(i,n,ik))*S_o(ii,jj)&
                 *v_matrix(j,m,nnlist(ik,nn))
         end do
      end do
   end do
end do
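For illustration, the quadruple loop above computes the product S = V_q^H · S_o(window) · V_qb, so it can be written as two matrix products. The following is a minimal, self-contained sketch of that equivalence, not the actual get_gauge_overlap_matrix routine (its interface is not shown in this issue). The names v_q, v_qb, ns_q and ns_qb are hypothetical stand-ins for v_matrix(:,:,ik), v_matrix(:,:,nnlist(ik,nn)), num_states(ik) and num_states(nnlist(ik,nn)), and the array sizes are arbitrary.

```fortran
program overlap_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Arbitrary illustrative sizes (assumptions, not taken from the testcase)
  integer, parameter :: num_bands = 6, num_wann = 3
  integer, parameter :: ns_q = 4, ns_qb = 5          ! states in the window at q and q+b
  integer, parameter :: winmin_q = 2, winmin_qb = 1  ! first band inside each window
  complex(dp) :: S_o(num_bands, num_bands)
  complex(dp) :: v_q(num_bands, num_wann), v_qb(num_bands, num_wann)
  complex(dp) :: S_loop(num_wann, num_wann), S_mm(num_wann, num_wann)
  integer :: m, n, i, j, ii, jj
  real(dp) :: dev

  call fill_random(S_o)
  call fill_random(v_q)
  call fill_random(v_qb)

  ! Reference: the nested loop from get_ahc_R, O(num_wann**2 * ns_q * ns_qb)
  S_loop = (0.0_dp, 0.0_dp)
  do m = 1, num_wann
    do n = 1, num_wann
      do i = 1, ns_q
        ii = winmin_q + i - 1
        do j = 1, ns_qb
          jj = winmin_qb + j - 1
          S_loop(n, m) = S_loop(n, m) &
               + conjg(v_q(i, n))*S_o(ii, jj)*v_qb(j, m)
        end do
      end do
    end do
  end do

  ! Same result as two matrix products:
  ! O(ns_q*ns_qb*num_wann + num_wann**2*ns_q)
  S_mm = matmul(conjg(transpose(v_q(1:ns_q, :))), &
         matmul(S_o(winmin_q:winmin_q + ns_q - 1, &
                    winmin_qb:winmin_qb + ns_qb - 1), &
                v_qb(1:ns_qb, :)))

  dev = maxval(abs(S_loop - S_mm))
  print *, 'max deviation between loop and matmul:', dev
  if (dev > 1.0e-12_dp) error stop 'loop and matmul results differ'

contains

  ! Fill a complex matrix with uniform random real and imaginary parts
  subroutine fill_random(a)
    complex(dp), intent(out) :: a(:, :)
    real(dp) :: re(size(a, 1), size(a, 2)), im(size(a, 1), size(a, 2))
    call random_number(re)
    call random_number(im)
    a = cmplx(re, im, kind=dp)
  end subroutine fill_random

end program overlap_demo
```

Evaluating the inner product first removes one factor of num_wann from the dominant cost term. In production code the matmul calls could presumably be replaced by BLAS calls.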
TODO
Clean up get_oper.F90 and minimize duplicated code.
Possible approach: Merge the different get_* routines into a single routine that takes logical flags as parameters indicating which matrices need to be initialized.
Consistently use get_gauge_overlap_matrix instead of nested loops similar to the above code snippet.
Optional: Think about better names for get_gauge_overlap_matrix and its parameters.
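One way the "single routine with logical flags" idea could look is sketched below. Everything here is hypothetical: the routine name get_operators_R, the flags need_A, need_B and need_S, and the stub initializer are placeholders chosen for illustration and do not correspond to existing code in get_oper.F90.

```fortran
module get_oper_sketch
  implicit none
  private
  public :: get_operators_R, n_init_calls

  ! Track which (hypothetical) matrices have already been set up
  logical, save :: have_A = .false., have_B = .false., have_S = .false.
  integer, save :: n_init_calls = 0

contains

  ! Single entry point: callers request only the matrices they need.
  ! Omitted flags are treated as .false.
  subroutine get_operators_R(need_A, need_B, need_S)
    logical, intent(in), optional :: need_A, need_B, need_S

    if (wanted(need_A) .and. .not. have_A) then
      call init_stub('A')
      have_A = .true.
    end if
    if (wanted(need_B) .and. .not. have_B) then
      call init_stub('B')
      have_B = .true.
    end if
    if (wanted(need_S) .and. .not. have_S) then
      call init_stub('S')
      have_S = .true.
    end if
  end subroutine get_operators_R

  ! An absent optional argument may be forwarded to another optional dummy
  logical function wanted(flag)
    logical, intent(in), optional :: flag
    wanted = .false.
    if (present(flag)) wanted = flag
  end function wanted

  ! Placeholder for the real initialization work (shared code would live here)
  subroutine init_stub(name)
    character(len=*), intent(in) :: name
    n_init_calls = n_init_calls + 1
    print *, 'initializing matrix ', name
  end subroutine init_stub

end module get_oper_sketch

program flags_demo
  use get_oper_sketch
  implicit none

  call get_operators_R(need_A=.true.)                 ! sets up A
  call get_operators_R(need_A=.true., need_B=.true.)  ! A is skipped, sets up B
  if (n_init_calls /= 2) error stop 'unexpected number of initializations'
end program flags_demo
```

With this layout the shared setup, such as the gauge overlap computation, would live in one place, and each caller states which matrices it requires instead of picking among near-duplicate get_* routines.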