Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with library mode and excluded bands #186

Closed
giovannipizzi opened this issue Jun 21, 2018 · 5 comments · Fixed by #196
Closed

Problem with library mode and excluded bands #186

giovannipizzi opened this issue Jun 21, 2018 · 5 comments · Fixed by #196
Assignees
Labels
Milestone

Comments

@giovannipizzi
Copy link
Member

[Originally reported by @rmargine]

The fix in #170 has still some issues.

The idea in #170: setting num_bands before to the correct number of bands (https://github.com/wannier-developers/wannier90/pull/170/files#diff-bda432ec9c162faca707e04d733436e5R138 )
and not changing anything in the parameters.F90 (https://github.com/wannier-developers/wannier90/pull/170/files#diff-9e21907dd756ab9320b75aa5752d2022R622 )

Roxana's suggestion/fix, instead, w.r.t. v2.1:

--- /home/pizzi/Desktop/parameters_orig21.F90	2018-06-21 14:46:48.882043511 +0200
+++ /home/pizzi/Desktop/parameters.F90	2018-06-21 14:45:45.922712361 +0200
@@ -565,15 +565,20 @@
             call io_error('Error: exclude_bands must contain positive numbers')
     end if
 
-    ! AAM_2016-09-16: some changes to logic to patch a problem with uninitialised num_bands in library mode
-!    num_bands       =   -1   
+    ! AAM_2016-09-16: some changes to logic to patch a problem with
+    ! uninitialised num_bands in library mode
+!    num_bands       =   -1
     call param_get_keyword('num_bands',found,i_value=i_temp)
     if(found.and.library) write(stdout,'(/a)') ' Ignoring <num_bands> in input file'
     if (.not. library .and. .not.effective_model) then
        if(found) num_bands=i_temp
        if(.not.found) num_bands=num_wann
     end if
-    if (library) num_bands = num_bands - num_exclude_bands
+    ! RM_2018-03-21: this should only be done once, but param_read is called
+    ! both in wannier_setup and wannier_run
+    ! RM_2018-03-21: commented line below, as now the correct value for
+    ! num_bands is set in library mode after calling param_read
+!    if (library) num_bands = num_bands - num_exclude_bands
     if (.not. effective_model) then
        if(found .and. num_bands<num_wann) then
           call io_error('Error: num_bands must be greater than or equal to num_wann')

and

--- /home/pizzi/Desktop/wannier_lib_orig21.F90	2018-06-21 14:47:05.865863010 +0200
+++ /home/pizzi/Desktop/wannier_lib.F90	2018-06-21 14:45:45.954712022 +0200
@@ -135,10 +135,13 @@
 
   ! AAM_2016-09-14: initialise num_bands as it's used in param_read()
   num_bands = num_bands_tot
-  call param_read()
   ! set num_bands and cell_volume as they are written to output in param_write
   ! AAM_2016-09-14: num_exclude_bands is now subtracted in param_read
-  ! num_bands = num_bands_tot - num_exclude_bands
+  ! RM_2018-03-21: num_exclude_bands is now subtracted below
+  ! this is to avoid subtracting it twice since param_read is called
+  ! both in wannier_setup and wannier_run
+  call param_read()
+  num_bands = num_bands_tot - num_exclude_bands
   cell_volume = real_lattice(1,1)*(real_lattice(2,2)*real_lattice(3,3)-real_lattice(3,2)*real_lattice(2,3)) +&
                 real_lattice(1,2)*(real_lattice(2,3)*real_lattice(3,1)-real_lattice(3,3)*real_lattice(2,1)) +& 
                 real_lattice(1,3)*(real_lattice(2,1)*real_lattice(3,2)-real_lattice(3,1)*real_lattice(2,2))
@@ -168,7 +171,7 @@
   nntot_loc       = nntot        
   nnlist_loc(:,1:nntot)   =  nnlist(:,1:nntot)       
   nncell_loc(:,:,1:nntot) =  nncell(:,:,1:nntot)       
-  num_bands_loc=num_bands_tot-num_exclude_bands
+  num_bands_loc=num_bands
   num_wann_loc=num_wann
   if(allocated(proj_site)) then
      proj_site_loc(:,1:num_proj)   = proj_site(:,1:num_proj)

Reason from Roxana:

  1. The problem is that num_exclude_bands=0 in wannier_lib.F90/wannier_setup before CALL param_read() and
num_bands = num_bands_tot - num_exclude_bands = num_bands_tot - 0

As a result output variable in wannier_setup num_bands_loc is also equal to num_bands_tot since

num_bands_loc=num_bands 

This causes a crash in setup_nnkp with the error

     Error in routine setup_nnkp (1):
      something wrong with num_bands
  1. Setting num_bands = num_bands_tot - num_exclude_bands after CALL param_read() in wannier_lib.F90/wannier_setup fixes the problem because num_exclude_bands was found in parameters.F90/param_read at lines:
    num_exclude_bands=0
    call param_get_range_vector('exclude_bands',found,num_exclude_bands,lcount=.true.)
@giovannipizzi giovannipizzi self-assigned this Jun 21, 2018
@giovannipizzi giovannipizzi added this to the v2.2 milestone Jun 21, 2018
@giovannipizzi
Copy link
Member Author

Note that this (partially) reverts commit 26eb0a9 (but not completely, e.g. num_bands_loc=num_bands wasn't in that commit)

@giovannipizzi
Copy link
Member Author

As a reference for me, there are quite a number of bugs in the current implementation of library, also some due to the new introduction of parallelism.

I'm testing it in a branch in my fork (called fix_186_...).
I wrote a minimal test implementation of a code using the library, and I'm testing it. There is a run.sh in the test-suite/library-mode-test that should run this program.

At the moment where I stopped now (I will continue in a few days), I get

At line 378 of file ../disentangle.F90
Fortran runtime error: Index '1' of dimension 4 of array 'm_matrix_orig_local' above upper bound of 0

because in internal_slim_m() the array m_matrix_orig_local is actually not allocated (because allocation currently happens in overlap_read but this is never called from the library mode).

Idea: split overlap_read in a overlap_alloc, and call overlap_alloc from wannier_run.

@giovannipizzi
Copy link
Member Author

After some work today, as of now, the branch seems to be working (note: only in serial!).
This probably just needs some cleanup, and adding the test to the testsuite, and then can be merged.
Probably it would be good to test it with an actual code using the library mode first...

@giovannipizzi
Copy link
Member Author

Note that probably it would be good to check if the reference output of the test example is correct after the fix of #192 (note that this branch already contains that fix, so if #192 is not approved or changed, we should probably revert that commit).

I've also asked the EPW guys to give a look to this branch (thanks if you could! :-) )

@giovannipizzi
Copy link
Member Author

Ok, Samuel from EPW confirmed that this version works properly (in serial, as expected).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant