Merge branch 'fix.surf.comm' into 'master.dev'
[fix.surf.comm] Separate communicator for processors with a BC side

Closes #210

See merge request piclas/piclas!839
pnizenkov committed Oct 23, 2023
2 parents 32a64cd + 5b56602 commit 81c3e47
Showing 113 changed files with 1,351 additions and 1,380 deletions.
2 changes: 1 addition & 1 deletion CMakeListsMachine.txt
@@ -290,7 +290,7 @@ IF (USE_PGO)
SET(CMAKE_Fortran_FLAGS_RELEASE "${CMAKE_Fortran_FLAGS_RELEASE} -fprofile-use")
SET(CMAKE_Fortran_FLAGS_PROFILE "${CMAKE_Fortran_FLAGS_PROFILE} -fprofile-generate")
ELSE()
MESSAGE( SEND_ERROR "Profile-guided optimization (PGO) currently only supported for GNU compiler. Either set USE_GPO=OFF or use the GNU compiler." )
MESSAGE( SEND_ERROR "Profile-guided optimization (PGO) currently only supported for GNU compiler. Either set USE_PGO=OFF or use the GNU compiler." )
ENDIF()
ENDIF()

67 changes: 62 additions & 5 deletions docs/documentation/developerguide/mpi.md
@@ -2,12 +2,7 @@

This chapter describes how PICLas subroutines and functions are parallelized.


## General Remarks: Things to consider
In case any new communicator (e.g. `SurfCOMM%COMM`) is built during init or anywhere else, e.g. with
`CALL MPI_COMM_SPLIT(COMM, color, key, NEWCOMMUNICATOR, iERROR)`, it is necessary to finalize it with `CALL MPI_COMM_FREE(NEWCOMMUNICATOR, iERROR)`.

Otherwise, load balancing will produce undefined errors that are almost impossible to track down.

Debug MPI

@@ -121,3 +116,65 @@ In addition to conventional sides, mappings for the sides that belong to a boundary
PEM%GlobalElemID(iPart) ! Global element ID
PEM%CNElemID(iPart) ! Compute-node local element ID (GlobalElem2CNTotalElem(PEM%GlobalElemID(iPart)))
PEM%LocalElemID(iPart) ! Core local element ID (PEM%GlobalElemID(iPart) - offsetElem)
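As a brief illustration of how these three IDs are used (a minimal sketch; `ElemVolume_Shared` and `ElemData` are placeholder array names for compute-node shared and processor-local data, respectively, not guaranteed PICLas identifiers):
```
! Illustrative only: each array type is indexed with its matching element ID
GlobalElemID = PEM%GlobalElemID(iPart)              ! unique over all processors
CNElemID     = GlobalElem2CNTotalElem(GlobalElemID) ! index into compute-node shared arrays
LocalElemID  = GlobalElemID - offsetElem            ! index into processor-local arrays
volume = ElemVolume_Shared(CNElemID)                ! shared-memory array access
ElemData(LocalElemID) = ElemData(LocalElemID) + 1.  ! processor-local array access
```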

## Custom communicators

To limit the number of communicating processors, feature-specific communicators can be built. In the following, an example is given
for a communicator that only contains processors with a local surface side (part of the `InitParticleBoundarySurfSides` routine). First, a global variable `SurfCOMM`, based on the `tMPIGROUP` type, is required:
```
TYPE tMPIGROUP
INTEGER :: UNICATOR=MPI_COMM_NULL !< MPI communicator for surface sides
INTEGER :: nProcs !< number of MPI processes
INTEGER :: MyRank !< MyRank, ordered from 0 to nProcs - 1
END TYPE
TYPE (tMPIGROUP) :: SurfCOMM
```
To create a subset of processors, a condition is required, which is defined by the `color` variable:
```
color = MERGE(1337, MPI_UNDEFINED, nSurfSidesProc.GT.0)
```
Here, every processor with the same `color` will be part of the same communicator. The condition `nSurfSidesProc.GT.0` in this case includes every processor with a surface side. Every other processor receives `MPI_UNDEFINED` and consequently ends up with `MPI_COMM_NULL` as its communicator. Now, the communicator itself can be created:
```
CALL MPI_COMM_SPLIT(MPI_COMM_PICLAS, color, 0, SurfCOMM%UNICATOR, iError)
```
`MPI_COMM_PICLAS` denotes the global PICLas communicator containing every processor (but it can also be a previously created subset). The third argument is the `key`, which controls the rank ordering within the new communicator; passing the same value (here 0) for every processor retains the parent ordering (default: numbering from 0 to nProcs - 1). Additional information can be stored within the created variable:
```
IF(SurfCOMM%UNICATOR.NE.MPI_COMM_NULL) THEN
! Stores the rank within the given communicator as MyRank
CALL MPI_COMM_RANK(SurfCOMM%UNICATOR, SurfCOMM%MyRank, iError)
! Stores the total number of processors of the given communicator as nProcs
CALL MPI_COMM_SIZE(SurfCOMM%UNICATOR, SurfCOMM%nProcs, iError)
END IF
```
Through the IF clause, only processors that are part of the communicator are addressed. Finally, it is important to free the communicator during the finalization routine:
```
IF(SurfCOMM%UNICATOR.NE.MPI_COMM_NULL) CALL MPI_COMM_FREE(SurfCOMM%UNICATOR,iERROR)
```
This works for communicators that have been initialized with `MPI_COMM_NULL`, either initially during the variable definition or during the split call.
If the communicator is not initialized this way, you have to make sure that the freeing call is only performed if the respective split routine has been called,
guaranteeing that every processor either holds a valid communicator or has been set to `MPI_COMM_NULL`.
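Putting the pieces together, a minimal sketch of the complete lifecycle might look as follows (the subroutine scaffolding is schematic and assumes `SurfCOMM`, `nSurfSidesProc` and `MPI_COMM_PICLAS` are accessible via the respective modules):
```
SUBROUTINE InitSurfCommunicator()
! Sketch: build a communicator containing only processors with a surface side
INTEGER :: color, iError
! Processors without a surface side receive MPI_UNDEFINED and thus MPI_COMM_NULL
color = MERGE(1337, MPI_UNDEFINED, nSurfSidesProc.GT.0)
CALL MPI_COMM_SPLIT(MPI_COMM_PICLAS, color, 0, SurfCOMM%UNICATOR, iError)
IF(SurfCOMM%UNICATOR.NE.MPI_COMM_NULL) THEN
  CALL MPI_COMM_RANK(SurfCOMM%UNICATOR, SurfCOMM%MyRank, iError)
  CALL MPI_COMM_SIZE(SurfCOMM%UNICATOR, SurfCOMM%nProcs, iError)
END IF
END SUBROUTINE InitSurfCommunicator

SUBROUTINE FinalizeSurfCommunicator()
! Sketch: guard the free with the MPI_COMM_NULL check described above
INTEGER :: iError
IF(SurfCOMM%UNICATOR.NE.MPI_COMM_NULL) CALL MPI_COMM_FREE(SurfCOMM%UNICATOR, iError)
END SUBROUTINE FinalizeSurfCommunicator
```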

### Available communicators

| Handle | Description | Derived from |
| ----------------------- | --------------------------------------------- | ----------------------- |
| MPI_COMM_WORLD | Default global communicator | - |
| MPI_COMM_PICLAS         | Duplicate of MPI_COMM_WORLD                   | MPI_COMM_WORLD          |
| MPI_COMM_NODE | Processors on a node | MPI_COMM_PICLAS |
| MPI_COMM_LEADERS | Group of node leaders | MPI_COMM_PICLAS |
| MPI_COMM_WORKERS        | All remaining processors that are not leaders | MPI_COMM_PICLAS         |
| MPI_COMM_SHARED | Processors on a node | MPI_COMM_PICLAS |
| MPI_COMM_LEADERS_SHARED | Group of node leaders (myComputeNodeRank = 0) | MPI_COMM_PICLAS |
| MPI_COMM_LEADERS_SURF | Node leaders with surface sides | MPI_COMM_LEADERS_SHARED |

#### Feature-specific

| Handle | Description | Derived from |
| ----------------------------------- | ---------------------------------------------------------------------- | --------------- |
| PartMPIInitGroup(nInitRegions)%COMM | Emission groups | MPI_COMM_PICLAS |
| SurfCOMM%UNICATOR | Processors with a surface side (e.g. reflective), including halo sides | MPI_COMM_PICLAS |
| CPPCOMM%UNICATOR | Coupled power potential | MPI_COMM_PICLAS |
| EDC%COMM(iEDCBC)%UNICATOR | Electric displacement current (per BC) | MPI_COMM_PICLAS |
| FPC%COMM(iUniqueFPCBC)%UNICATOR | Floating potential (per BC) | MPI_COMM_PICLAS |
| EPC%COMM(iUniqueEPCBC)%UNICATOR | Electric potential (per BC) | MPI_COMM_PICLAS |
| BiasVoltage%COMM%UNICATOR | Bias voltage | MPI_COMM_PICLAS |
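The advantage of these feature-specific communicators is that collective operations only involve the participating processors. A hedged sketch (the quantity `SampWallFluxSum` and the reduction itself are purely illustrative, not actual PICLas code):
```
! Illustrative only: sum a sampled surface quantity over all processors
! that actually own a surface side, instead of over MPI_COMM_PICLAS
REAL    :: SampWallFluxSum
INTEGER :: iError
IF(SurfCOMM%UNICATOR.NE.MPI_COMM_NULL) THEN
  CALL MPI_ALLREDUCE(MPI_IN_PLACE, SampWallFluxSum, 1, MPI_DOUBLE_PRECISION, MPI_SUM, SurfCOMM%UNICATOR, iError)
END IF
```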
12 changes: 12 additions & 0 deletions docs/documentation/userguide/workflow.md
@@ -193,6 +193,18 @@ the mesh is simply divided into parts along the space filling curve. Thus, domain decomposition is
not limited by e.g. an integer factor between the number of cores and elements. The only limitation is that the number of cores
may not exceed the number of elements.

### Profile-guided optimization (PGO)

To further increase performance for production runs, profile-guided optimization can be utilized with the GNU compiler. This requires a representative simulation run with PICLas compiled using profiling instrumentation. For this purpose, the code has to be configured and compiled with the following additional settings and the `Profile` build type:

-DPICLAS_PERFORMANCE=ON -DUSE_PGO=ON -DCMAKE_BUILD_TYPE=Profile

A short, representative simulation has to be performed, during which additional files containing the profiling information are stored. Note that the test run should be relatively short as the code will be substantially slower than the regular `Release` build type. Afterwards, the code can be configured and compiled again for the production runs, using the `Release` build type:

-DPICLAS_PERFORMANCE=ON -DUSE_PGO=ON -DCMAKE_BUILD_TYPE=Release

Warnings regarding missing profiling files (`-Wmissing-profile`) can be ignored if they concern modules not relevant for the current simulation method (e.g. `bgk_colloperator.f90` will be missing profile information if only a DSMC simulation has been performed).
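A possible end-to-end sequence is sketched below (the build directory, case file and process count are placeholders; with the GNU compiler the profile data is written next to the object files, which is why the same build directory is reconfigured for the second pass):

    # 1st pass: instrumented build and a short, representative training run
    cmake -B build -DPICLAS_PERFORMANCE=ON -DUSE_PGO=ON -DCMAKE_BUILD_TYPE=Profile
    cmake --build build
    mpirun -np 4 build/bin/piclas parameter.ini

    # 2nd pass: reconfigure and rebuild with the recorded profiles
    cmake -B build -DPICLAS_PERFORMANCE=ON -DUSE_PGO=ON -DCMAKE_BUILD_TYPE=Release
    cmake --build build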

## Post-processing

**PICLas** comes with a tool for visualization. The piclas2vtk tool converts the HDF5 files generated by **PICLas** to the binary
Expand Down
@@ -1,12 +1,17 @@
001-TIME,002-Current-Spec-001-SF-001
0.0000000000000000E+000,0.0000000000000000E+000
0.1000000000000000E-011,0.1467593701480000E+001
0.2000000000000000E-011,0.1467433483827000E+001
0.3000000000000000E-011,0.1467433483827000E+001
0.4000000000000000E-011,0.1467433483827000E+001
0.5000000000000000E-011,0.1467593701480000E+001
0.5999999999999999E-011,0.1467593701480000E+001
0.6999999999999999E-011,0.1467433483827000E+001
0.8000000000000000E-011,0.1467593701480000E+001
0.9000000000000000E-011,0.1467433483827000E+001
0.9999999999999999E-011,0.1467593701480000E+001
0.2000000000000000E-010,0.1467513592653500E+001
0.4000000000000000E-010,0.1467513592653500E+001
0.6000000000000000E-010,0.1467513592653500E+001
0.8000000000000000E-010,0.1467433483827000E+001
0.1000000000000000E-009,0.1467513592653499E+001
0.1200000000000000E-009,0.1467593701480000E+001
0.1400000000000000E-009,0.1467433483827000E+001
0.1600000000000000E-009,0.1467513592653500E+001
0.1800000000000000E-009,0.1467593701480000E+001
0.2000000000000000E-009,0.1467513592653501E+001
0.2200000000000000E-009,0.1467433483827000E+001
0.2400000000000000E-009,0.1467513592653500E+001
0.2600000000000000E-009,0.1467513592653500E+001
0.2800000000000000E-009,0.1467513592653500E+001
0.3000000000000000E-009,0.1467513592653498E+001
@@ -0,0 +1,17 @@
001-TIME,002-Flux-Spec-001-BC_Xplus,003-TotalElectricCurrent-BC_Xplus
0.0000000000000000E+000,0.0000000000000000E+000,0.0000000000000000E+000
0.2000000000000000E-010,0.0000000000000000E+000,0.0000000000000000E+000
0.4000000000000000E-010,0.0000000000000000E+000,0.0000000000000000E+000
0.6000000000000000E-010,0.0000000000000000E+000,0.0000000000000000E+000
0.8000000000000000E-010,0.0000000000000000E+000,0.0000000000000000E+000
0.1000000000000000E-009,0.0000000000000000E+000,0.0000000000000000E+000
0.1200000000000000E-009,0.0000000000000000E+000,0.0000000000000000E+000
0.1400000000000000E-009,0.0000000000000000E+000,0.0000000000000000E+000
0.1600000000000000E-009,0.0000000000000000E+000,0.0000000000000000E+000
0.1800000000000000E-009,0.0000000000000000E+000,0.0000000000000000E+000
0.2000000000000000E-009,0.0000000000000000E+000,0.0000000000000000E+000
0.2200000000000000E-009,0.0000000000000000E+000,0.0000000000000000E+000
0.2400000000000000E-009,0.2500000000000002E+017,-.4005441325000004E-002
0.2600000000000000E-009,0.6887500000000006E+019,-.1103499085037501E+001
0.2800000000000000E-009,0.9169500000000008E+019,-.1469115769183501E+001
0.3000000000000000E-009,0.9144999999999985E+019,-.1465190436684997E+001
@@ -1,5 +1,12 @@
! compare the last line of PartAnalyze.csv with a reference file
compare_data_file_name = PartAnalyze.csv
compare_data_file_reference = PartAnalyze_ref.csv
compare_data_file_tolerance = 5e-3
! compare the current entering through the surface flux
compare_column_file = PartAnalyze.csv
compare_column_reference_file = PartAnalyze_ref.csv
compare_column_index = 1
compare_column_tolerance_value = 0.01
compare_column_tolerance_type = relative

! compare the current leaving the domain
compare_data_file_name = SurfaceAnalyze.csv
compare_data_file_reference = SurfaceAnalyze_ref.csv
compare_data_file_tolerance = 0.01
compare_data_file_tolerance_type = relative
Binary file not shown.
@@ -1 +1 @@
MPI=1,2,4
MPI=1,4,10
@@ -5,7 +5,7 @@ NVisu =1
Mode =1

DEFVAR = (REAL): minus_x = 0.0
DEFVAR = (REAL): plus_x = 10.0
DEFVAR = (REAL): plus_x = 5.0

DEFVAR = (REAL): minus_y = 0.0
DEFVAR = (REAL): plus_y = 1.0
@@ -14,7 +14,7 @@ DEFVAR = (REAL): minus_z = 0.0
DEFVAR = (REAL): plus_z = 1.0

Corner =(/minus_x,minus_y,minus_z ,, plus_x,minus_y,minus_z ,, plus_x,plus_y,minus_z ,, minus_x,plus_y,minus_z ,, minus_x,minus_y,plus_z ,, plus_x,minus_y,plus_z ,, plus_x,plus_y,plus_z ,, minus_x,plus_y,plus_z /)
nElems =(/10,2,2/)
nElems =(/5,2,2/)
elemtype =108

BCIndex =(/6 ,4 ,1 ,3 ,2 ,5/)
Expand Up @@ -14,27 +14,34 @@ NAnalyze = 1 ! Number of analyze points
MeshFile = channel_mesh.h5
useCurveds = F
! if boundaries have to be changed (else they are used from Mesh directly):
TrackingMethod = triatracking
TrackingMethod = triatracking
CalcMeshInfo = T
CalcHaloInfo = T

Part-FIBGMdeltas=(/0.5E-01,0.1E-01,0.1E-01/)
Part-FactorFIBGM=(/5,2,2/)
! =============================================================================== !
! OUTPUT / VISUALIZATION
! =============================================================================== !
ProjectName = SurfFlux_Tria_EmissionCurrent
IterDisplayStep = 10
IterDisplayStep = 5
Part-AnalyzeStep = 1
Surface-AnalyzeStep = 1
CalcSurfFluxInfo = T
!CalcPartBalance = T
! =============================================================================== !
! CALCULATION
! =============================================================================== !
tend = 1.0E-11
Analyze_dt = 1.0E-11
ManualTimeStep = 1.0000E-12
tend = 3.0E-10
Analyze_dt = 1.0E-10
ManualTimeStep = 2.0000E-11
! Load balance
DoLoadBalance = T
PartWeightLoadBalance = T
! Initial load balance
DoInitialAutoRestart = T
InitialAutoRestart-PartWeightLoadBalance = T
LoadBalanceMaxSteps = 1
DoInitialAutoRestart = F
InitialAutoRestart-PartWeightLoadBalance = F
LoadBalanceMaxSteps = 5
Load-DeviationThreshold = 1E-9
! =============================================================================== !
! BOUNDARY CONDITIONS - FIELD SOLVER
@@ -43,7 +50,7 @@ BoundaryName = BC_Xplus
BoundaryType = (/4,0/)
BoundaryName = BC_Xminus
BoundaryType = (/5,1/)
RefState = (/-1E6 , 0 , 0/)
RefState = (/-5E5 , 0 , 0/)
BoundaryName = BC_Yplus
BoundaryType = (/10,0/)
BoundaryName = BC_Yminus
@@ -52,11 +59,14 @@ BoundaryName = BC_Zplus
BoundaryType = (/10,0/)
BoundaryName = BC_Zminus
BoundaryType = (/10,0/)

epsCG = 1e-3
! =============================================================================== !
! BOUNDARY CONDITIONS - PARTICLES
! =============================================================================== !
Part-maxParticleNumber=500000
Part-nSpecies=1

Part-nBounds=6
Part-Boundary1-SourceName = BC_Xplus
Part-Boundary1-Condition = reflective
@@ -65,8 +75,8 @@ Part-Boundary1-SpeciesSwaps1 = (/1,0/)

Part-Boundary2-SourceName = BC_Xminus
Part-Boundary2-Condition = reflective

Part-Boundary2-WallTemp = 2700, 2635.24

Part-Boundary3-SourceName = BC_Yplus
Part-Boundary3-Condition = symmetric
Part-Boundary4-SourceName = BC_Yminus
@@ -75,7 +85,6 @@ Part-Boundary5-SourceName = BC_Zplus
Part-Boundary5-Condition = symmetric
Part-Boundary6-SourceName = BC_Zminus
Part-Boundary6-Condition = symmetric
Part-FIBGMdeltas=(/1e-2,1e-2,1e-2/)

CalcBoundaryParticleOutput = T
BPO-NPartBoundaries = 1 ! Nbr of boundaries
@@ -87,7 +96,7 @@ BPO-Species = (/1/) ! electrons
! =============================================================================== !
Part-Species1-MassIC = 9.11E-31
Part-Species1-ChargeIC = -1.60217653E-19
Part-Species1-MacroParticleFactor = 1E4
Part-Species1-MacroParticleFactor = 5E4

Part-Species1-nSurfaceFluxBCs=1
Part-Species1-Surfaceflux1-BC=2
@@ -105,7 +114,8 @@ nocrosscombination:Part-Boundary2-WallTemp, Part-Species1-Surfaceflux1-Thermioni
! =============================================================================== !
! DSMC
! =============================================================================== !
Particles-HaloEpsVelo=2.0E+07
Particles-HaloEpsVelo=2.0E+06
Part-NumberOfRandomSeeds=2
Particles-RandomSeed1=1
Particles-RandomSeed2=2
NVisu=1
@@ -4,3 +4,4 @@
* Comparing the calculated current to the expected value of the Richardson-Dushman equation for tungsten
* Input: W = 4.54 eV, A = 60 A/(cm^2 K^2), T_w = 2700 K (without Schottky), T_w = 2635.24 K
* Output: j = 1.47 A / cm^2 -> I = 1.4675 A (A = 1 cm^2)
* In the case with the Schottky effect, the current reduces slightly over time as the electrons in the domain reduce the potential difference
@@ -1,2 +1 @@
MPI = 4,10
!restart_file = 2Dplasma_test_State_000.00000000000000000.h5, 2Dplasma_test_State_000.00000005000000000.h5
Expand Up @@ -71,13 +71,11 @@ PIC-AlgebraicExternalField = 1 ! 1: Charoy 2019 magnetic + electric field
Part-maxParticleNumber = 500000
Part-nSpecies = 2
Part-FIBGMdeltas = (/2.5e-2 , 1.28e-2 , 0.01e-2/)
!Part-FactorFIBGM = (/ 500 , 256 , 1/)
Part-FactorFIBGM = (/ 10 , 10 , 1/)
PIC-DoDeposition = T
PIC-DoInterpolation = T

PIC-Deposition-Type = cell_volweight_mean
!PIC-AlgebraicExternalField = 1
DisplayLostParticles=T

Part-Species1-MacroParticleFactor = 1.67e4 ! 1.67e2 originally used for z=1e-4 m (case2: 75 parts per cell with dx=dy=5e-5 m)
@@ -91,17 +89,9 @@ Part-nBounds = 6

Part-Boundary1-SourceName = BC_ANODE
Part-Boundary1-Condition = open
!Part-Boundary1-NbrOfSpeciesSwaps = 3
!Part-Boundary1-SpeciesSwaps1 = (/1,0/)
!Part-Boundary1-SpeciesSwaps2 = (/2,0/)
!Part-Boundary1-SpeciesSwaps3 = (/3,0/)

Part-Boundary2-SourceName = BC_CATHODE
Part-Boundary2-Condition = open
!Part-Boundary2-NbrOfSpeciesSwaps = 3
!Part-Boundary2-SpeciesSwaps1 = (/1,0/)
!Part-Boundary2-SpeciesSwaps2 = (/2,0/)
!Part-Boundary2-SpeciesSwaps3 = (/3,0/)

Part-Boundary3-SourceName = BC_periodicy+
Part-Boundary3-Condition = periodic