Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improper use of pointers in the code #177

Open
sangallidavide opened this issue Jan 23, 2025 · 6 comments
Open

Improper use of pointers in the code #177

sangallidavide opened this issue Jan 23, 2025 · 6 comments
Assignees

Comments

@sangallidavide
Copy link
Member

sangallidavide commented Jan 23, 2025

Issue spot by Cineca team

Following code is not correct

  • Original code
       KERNEL_p => DEV_VAR(KERNEL%blc)(:,:,:)                           
       Xo_p     => DEV_VAR(Xo%blc)(:,:,:)                               

From my understanding, it should be one of this options

  • Option 1
       KERNEL_p => DEV_VAR(KERNEL%blc)(1,1,1)
       Xo_p     => DEV_VAR(Xo%blc)(1,1,1)
  • Option 2
       KERNEL_p => DEV_VAR(KERNEL%blc)
       Xo_p     => DEV_VAR(Xo%blc)
  • Option 3
       KERNEL_p(:,:,:) => DEV_VAR(KERNEL%blc)(:,:,:)
       Xo_p(:,:,:)     => DEV_VAR(Xo%blc)(:,:,:)

Is this true ? Any documentation where the original code is reported as not correct?

@sangallidavide
Copy link
Member Author

The following would be a possible yambo patch

diff --git a/src/bse/K_correlation_kernel_dir.F b/src/bse/K_correlation_kernel_dir.F
index b800ca05ed..dd16b1a340 100644
--- a/src/bse/K_correlation_kernel_dir.F
+++ b/src/bse/K_correlation_kernel_dir.F
@@ -41,8 +41,8 @@ function K_correlation_kernel_dir(i_block,i_p,i_pmq,
  iO1 = BS_blk(i_block)%O_table(i_k_s,i_p_s,1,1,i_n_k,i_n_p,i_k_sp_pol_n)
  iO2 = BS_blk(i_block)%O_table(i_kmq_s,i_pmq_s,i_kmq_t,i_pmq_t,i_m_k,i_m_p,i_k_sp_pol_m)
  !
- O_c_d_iO1 => DEV_VAR(BS_blk(i_block)%O_c)(:,iO1)
- O_c_d_iO2 => DEV_VAR(BS_blk(i_block)%O_c)(:,iO2)
+ O_c_d_iO1 => DEV_VAR(BS_blk(i_block)%O_c)(1,iO1)
+ O_c_d_iO2 => DEV_VAR(BS_blk(i_block)%O_c)(1,iO2)
  !
  !DEV_ACC_DEBUG data present(O_c_d_iO1,O_c_d_iO2,G_m_G,g_rot,O1,O2)
  !DEV_ACC parallel loop
diff --git a/src/bse/K_correlation_kernel_std.F b/src/bse/K_correlation_kernel_std.F
index 6d39cd92c1..b792cd9f94 100644
--- a/src/bse/K_correlation_kernel_std.F
+++ b/src/bse/K_correlation_kernel_std.F
@@ -63,8 +63,8 @@ function K_correlation_kernel_std(i_block,i_p,i_pmq,
  if (PHASE_2==-99._SP) PHASE_2=1._SP
  !
  !
- O_c1_p => DEV_VAR(BS_blk(i_block_1)%O_c)(:,:)
- O_c2_p => DEV_VAR(BS_blk(i_block_2)%O_c)(:,:)
+ O_c1_p => DEV_VAR(BS_blk(i_block_1)%O_c)
+ O_c2_p => DEV_VAR(BS_blk(i_block_2)%O_c)
  !
  !DEV_ACC_DEBUG data present(O_c1_p,O_c2_p,G_m_G,g_rot,O1,O2)
  !DEV_ACC parallel loop private(i_g2,i_g3)
diff --git a/src/dipoles/DIPOLE_kb_sum.F b/src/dipoles/DIPOLE_kb_sum.F
index 060be44b4c..40075923dc 100644
--- a/src/dipoles/DIPOLE_kb_sum.F
+++ b/src/dipoles/DIPOLE_kb_sum.F
@@ -32,7 +32,7 @@ subroutine DIPOLE_kb_project(pp_range,ib_range,ib,i_wf,wf_ncx,nbndx,npp,&
  complex(SP) DEV_ATTR, pointer :: kbv_p(:,:),XX_p(:,:)
  complex(SP) DEV_ATTR, pointer :: WF_p(:,:)
 
- WF_p => WF(:,:,i_wf)
+ WF_p => WF(1,1,i_wf)
  dim_flat=n_spinor*4*npp
  !
  ! GPU treatment
diff --git a/src/dipoles/DIPOLE_overlaps.F b/src/dipoles/DIPOLE_overlaps.F
index c18c83e903..377b1863c9 100644
--- a/src/dipoles/DIPOLE_overlaps.F
+++ b/src/dipoles/DIPOLE_overlaps.F
@@ -104,7 +104,7 @@ subroutine DIPOLE_overlaps(Xk,Dip)
        !
        call WF_symm_kpoint_gpu((/ib,ib/),ikbz,i_sp_pol,Xk,WF_symm)
        !
-       WF_tmp=>WF_ik(:,:,ib:ib)
+       WF_tmp=>WF_ik(1,1,ib)
        !  
        call WF_shift_kpoint_gpu((/ib,ib/),1,ikbz,i_sp_pol,WF_shifts(ikbz,:),Xk,WF_symm,WF_tmp)
        !
diff --git a/src/pol_function/X_redux.F b/src/pol_function/X_redux.F
index 682dfc9b71..8c685b4338 100644
--- a/src/pol_function/X_redux.F
+++ b/src/pol_function/X_redux.F
@@ -393,8 +393,8 @@ end subroutine
      !
      if (compute_on_gpu) then
        !
-       KERNEL_p => DEV_VAR(KERNEL%blc)(:,:,:)
-       Xo_p     => DEV_VAR(Xo%blc)(:,:,:)
+       KERNEL_p => DEV_VAR(KERNEL%blc)
+       Xo_p     => DEV_VAR(Xo%blc)
        !
        !DEV_ACC_DEBUG data present(KERNEL_p,Xo_p,bare_qpg)
        !DEV_ACC parallel loop collapse(2)
diff --git a/src/qp/XCo_Hartree_Fock.F b/src/qp/XCo_Hartree_Fock.F
index cd74be3e4d..32322294d1 100644
--- a/src/qp/XCo_Hartree_Fock.F
+++ b/src/qp/XCo_Hartree_Fock.F
@@ -168,7 +168,7 @@ subroutine XCo_Hartree_Fock(E,k,Xk,q,mode)
    !
    call scatter_Gamp_gpu(isc,'x')
    !
-   gamp_p => DEV_VAR(isc%gamp)(:,1)
+   gamp_p => DEV_VAR(isc%gamp)(1,1)
    isc_rhotw_p => DEV_VAR(isc%rhotw) 
    iscp_rhotw_p => DEV_VAR(iscp%rhotw) 
    !

@muralidhar-nalabothula
Copy link
Contributor

muralidhar-nalabothula commented Jan 24, 2025

Hi Davide,

Last year I was looking in the fortran standard manual (when writing the ydiago part in yambo), As far as I remember, Original code and Option 2 are same. and Option-1 and 3 are not valid (I doubt if they even compile). Do you see any compiler error for the code?

Edit: Here is part in the FORTRAN-2003 standard:
see 7.4.2 in (https://j3-fortran.org/doc/year/04/04-007.pdf) (page 143). The standard clearly says option-3 is invalid and option-1 is rank mismatch (DEV_VAR(KERNEL%blc)(1,1,1) is a scalar```). The standard also has examples with Option 2 and original code (see page 145 NOTE 7.43)

Option-3 is allowed with the following change KERNEL_p(1:,1:,1:) = DEV_VAR(KERNEL%blc)(:,:,:). Again the standard says that KERNEL_p(1:N,1:N,1:N) = DEV_VAR(KERNEL%blc)(:,:,:) is allowed only if DEV_VAR(KERNEL%blc)(:,:,:) is reshape to rank 1 array.

@sangallidavide
Copy link
Member Author

Thanks Murali. The Cineca team got errors with the original code and suggested option 2 in X_redux. Maybe it's an nvfortran issue. I'll look into the manual.

Please notice, however, that here we are discussing pointers mapping, not data transfer between variables, e.g. the pointers are not allocated here.

@so07
Copy link

so07 commented Jan 27, 2025

Hi Davide,

We have encountered issues with the original code in X_redux on Leonardo with nvhpc-24.9 compiler and CUDA-Fortran version.
The original code is not working, we fix the problem with the option2

       KERNEL_p => DEV_VAR(KERNEL%blc)
       Xo_p     => DEV_VAR(Xo%blc)

In this way everything works properly.

@sangallidavide
Copy link
Member Author

Hi Davide,

We have encountered issues with the original code in X_redux on Leonardo with nvhpc-24.9 compiler and CUDA-Fortran version. The original code is not working, we fix the problem with the option2

       KERNEL_p => DEV_VAR(KERNEL%blc)
       Xo_p     => DEV_VAR(Xo%blc)

In this way everything works properly.

Yeah. This was clear from the discussion.

Just I'd like to understand if this was a yambo error or a compiler bug, to decide if we need to change other points in the code or not.

For the fortran documentation

       KERNEL_p => DEV_VAR(KERNEL%blc)(:,:,:)                           
       Xo_p     => DEV_VAR(Xo%blc)(:,:,:)                               

this is valid, so I tend to think it is an nvfortran issue.

@muralidhar-nalabothula
Copy link
Contributor

Hi @so07,

Could you please post the compiler error message? Original syntax ( a => b%c(:,:,:)) can be perfectly compiled with nvfortran 24.9 (see https://godbolt.org/z/fW6hd65oe) . The problems seems to be somewhere else IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants