merge develop in master #731

Merged: 65 commits, Jun 24, 2024
Commits
0d599e8
try to improve particle exchange (CPU for now)
Apr 2, 2024
50891ac
repurpose extractParticles
Apr 2, 2024
da1b172
CI on particle_exchange
Apr 2, 2024
593e96c
Fix in the new copyParticles
Apr 2, 2024
f565925
new reference
Apr 3, 2024
5c83d9d
particle exchange GPU
Apr 4, 2024
ecd3be0
forgot to remove function
Apr 4, 2024
c586a90
Merge branch 'develop' into particle_exchange
Apr 17, 2024
3afa356
scatter recvBuffers on CPU instead of GPU
Apr 18, 2024
ccac4ba
Implentation of GPU acceleration for the 1D cartesian geometry
charlesprouveur Apr 18, 2024
530529d
fix analysis
Apr 18, 2024
ff0266e
more
Apr 19, 2024
73b7de8
added the appropriate field1D and projector factory source files + cl…
charlesprouveur Apr 19, 2024
1fc01c7
add smilei_omp_threads in namelist
Apr 25, 2024
348faa0
fix particle exchange
Apr 25, 2024
4b2f648
make happi working with virtualenv
Apr 25, 2024
8bcaeb4
add publication
Apr 29, 2024
152c4be
Fix in coef found on adastra thanks to a different compiler behaviour…
charlesprouveur Apr 29, 2024
27dd743
add publication
May 2, 2024
0cd852a
Merge branch 'particle_exchange' into 'develop'
May 4, 2024
337a1ee
Sort on gpu with thrust::gather
May 7, 2024
8a6b4a8
add article
May 9, 2024
8447b75
add publication
May 11, 2024
47e30b4
add publication
May 13, 2024
18f1e1c
Typo in deprecated error message
beck-llr May 14, 2024
36a2bb3
Merge branch 'develop' of llrgit.in2p3.fr:smilei/smilei into develop
beck-llr May 14, 2024
af0070a
try sorting with zip_iterator
May 15, 2024
227811c
huge simplification of nvidiaParticles using thrust asynchronism
May 16, 2024
0b7d91e
fix for documentation typos (#716)
BrianMarre May 17, 2024
5b60a4d
add article, use extended journal names for each article
May 19, 2024
0974227
add publication
May 21, 2024
443a625
fix for AMD
May 22, 2024
93ba8e1
typos
May 22, 2024
9f362f3
Slightly faster GPU sort
May 24, 2024
3074a9f
add publication
May 24, 2024
1045fd2
Various small fixes
May 27, 2024
a48d556
fix many warnings
May 27, 2024
da51604
update ci
May 27, 2024
2d0474a
test CI
May 28, 2024
83ee20d
retest CI
beck-llr May 28, 2024
b9754d7
support matplotlib 3.9
May 31, 2024
d5eadb4
Fix recent commit for laser offset
May 31, 2024
ba3d0d9
Merge branch 'GPU_1D_implementation' into develop
charlesprouveur Jun 1, 2024
4642c1b
Fix: Adapting new 1D GPU implementation to the change in macro names
charlesprouveur Jun 1, 2024
e43c85f
Merge commit 'd5eadb44ad81b974c52b9d61a9473903c57f33a8' into develop
mccoys Jun 3, 2024
7323b5b
Fixing error in 1D MA solver introduced with its GPU implementation
charlesprouveur Jun 3, 2024
e42a578
Fixing typo in 1D SM BC
charlesprouveur Jun 3, 2024
5fa5204
reversed some changes in the betis part in interpolation 1D order 2 -…
charlesprouveur Jun 4, 2024
04564b0
fixing previous commit
charlesprouveur Jun 4, 2024
7b237fe
Update partners
beck-llr Jun 4, 2024
249404d
:Merge branch 'develop' of llrgit.in2p3.fr:smilei/smilei into develop
beck-llr Jun 4, 2024
59ce009
fixes for sphinx >= 5
Jun 4, 2024
51edda8
cleaning
charlesprouveur Jun 4, 2024
0169ab6
Merge branch 'develop' of https://llrgit.in2p3.fr/smilei/smilei into …
charlesprouveur Jun 4, 2024
13641d4
small change
charlesprouveur Jun 4, 2024
b7c9d9d
took out st1d_24_cir_plane_wave_BTIS3.py for further investigations
charlesprouveur Jun 5, 2024
e41df5e
Persistent buffers for GPU sorting
Jun 5, 2024
d204c5b
correct B-TIS3 implementation in 1D
Jun 5, 2024
aaddbb1
revert persistent buffers until more complete analysis
Jun 7, 2024
d96071f
Merge branch 'develop' into sort_gpu
Jun 7, 2024
b033879
Adding new publications
Z10Frank Jun 24, 2024
70d750d
Merge branch 'new_publications' into 'develop'
Jun 24, 2024
9e76031
Merge branch 'sort_gpu' into 'develop'
Jun 24, 2024
438d43d
prepare v5.1
Jun 24, 2024
b90b3b5
Merge branch 'develop' of /home/fperez/Repositories/smilei-github int…
Jun 24, 2024
8 changes: 8 additions & 0 deletions .gitlab-ci.yml
@@ -15,6 +15,7 @@ install:
stage: install
only:
- develop
- particle_exchange

script:
# Force workdir cleaning in case of retried
@@ -33,6 +34,7 @@ compile_default:
stage: compile_default
only:
- develop
- particle_exchange

script:
# Move in test dir
@@ -44,6 +46,7 @@ runQuick:
stage: run_quick
only:
- develop
- particle_exchange

script:
# Move in test dir
@@ -55,6 +58,7 @@ run1D:
stage: run_default
only:
- develop
- particle_exchange

script:
# Move in test dir
@@ -67,6 +71,7 @@ run2D:
stage: run_default
only:
- develop
- particle_exchange

script:
# Move in test dir
@@ -81,6 +86,7 @@ run3D:
stage: run_default
only:
- develop
- particle_exchange

script:
# Move in test dir
@@ -96,6 +102,7 @@ runAM:
stage: run_default
only:
- develop
- particle_exchange

script:
# Move in test dir
@@ -108,6 +115,7 @@ runCollisions:
stage: run_default
only:
- develop
- particle_exchange

script:
# Move in test dir
2 changes: 1 addition & 1 deletion doc/Sphinx/implementation.rst
@@ -547,7 +547,7 @@ file ``Smilei.cpp`` thought calls to different ``vecPatches`` methods.

.. code-block:: c++

-vecPatches.finalizeAndSortParticles( params, &smpi, simWindow,
+vecPatches.finalizeExchParticlesAndSort( params, &smpi, simWindow,
 time_dual, timers, itime );

* **Particle merging**: merging process for particles (still experimental)
2 changes: 1 addition & 1 deletion makefile
@@ -216,7 +216,7 @@ endif
ifneq (,$(call parse_config,gpu_amd))
CXXFLAGS += -DSMILEI_ACCELERATOR_MODE
GPU_COMPILER ?= $(CC)
-GPU_COMPILER_FLAGS += -x hip -DSMILEI_ACCELERATOR_MODE -std=c++14 $(DIRS:%=-I%) #$(PY_FLAGS)
+GPU_COMPILER_FLAGS += -x hip -DSMILEI_ACCELERATOR_MODE -std=c++14 $(DIRS:%=-I%)
 GPU_COMPILER_FLAGS += -I$(BUILD_DIR)/src/Python $(PY_CXXFLAGS)
GPU_KERNEL_SRCS := $(shell find src/* -name \*.cu)
GPU_KERNEL_OBJS := $(addprefix $(BUILD_DIR)/, $(GPU_KERNEL_SRCS:.cu=.o))
11 changes: 3 additions & 8 deletions src/MovWindow/SimWindow.cpp
@@ -384,14 +384,9 @@ void SimWindow::shift( VectorPatch &vecPatches, SmileiMPI *smpi, Params &params,
} // end loop nSpecies

#if defined ( SMILEI_ACCELERATOR_MODE )
-if ( params.gpu_computing ) {
-// ADD NEW PARTS ON GPU
-for( unsigned int ispec=0 ; ispec<nSpecies ; ispec++ ) {
-mypatch->vecSpecies[ispec]->particles_to_move->clear();
-// mypatch->vecSpecies[ispec]->particles->copyParticles( 0, mypatch->vecSpecies[ispec]->getNbrOfParticles(),
-// *mypatch->vecSpecies[ispec]->particles_to_move, 0 );
-mypatch->vecSpecies[ispec]->particles->initializeDataOnDevice();
-mypatch->vecSpecies[ispec]->particles_to_move->initializeDataOnDevice();
+if( params.gpu_computing ) {
+for( auto spec: mypatch->vecSpecies ) {
+spec->allocateParticlesOnDevice();
}
}
#endif
30 changes: 16 additions & 14 deletions src/ParticleBC/BoundaryConditionType.cpp
@@ -28,9 +28,9 @@ void internal_inf( Species *species, int imin, int imax, int direction, double l
cell_keys /* [imin:imax - imin] */ )
#pragma omp teams distribute parallel for
#endif
-for (int ipart=imin ; ipart<imax ; ipart++ ) {
-if ( position[ ipart ] < limit_inf) {
-cell_keys[ ipart ] = -1;
+for( int ipart=imin ; ipart<imax ; ipart++ ) {
+if( cell_keys[ ipart ] >= 0 && position[ ipart ] < limit_inf ) {
+cell_keys[ ipart ] = -2 - 2 * direction;
}
}
}
@@ -50,9 +50,9 @@ void internal_sup( Species *species, int imin, int imax, int direction, double l
cell_keys /* [imin:imax - imin] */ )
#pragma omp teams distribute parallel for
#endif
-for (int ipart=imin ; ipart<imax ; ipart++ ) {
-if ( position[ ipart ] >= limit_sup) {
-cell_keys[ ipart ] = -1;
+for( int ipart=imin ; ipart<imax ; ipart++ ) {
+if( cell_keys[ ipart ] >= 0 && position[ ipart ] >= limit_sup ) {
+cell_keys[ ipart ] = -3 - 2 * direction;
}
}
}
@@ -63,10 +63,11 @@ void internal_inf_AM( Species *species, int imin, int imax, int /*direction*/, d
double* position_y = species->particles->getPtrPosition(1);
double* position_z = species->particles->getPtrPosition(2);
int* cell_keys = species->particles->getPtrCellKeys();
-for (int ipart=imin ; ipart<imax ; ipart++ ) {
+double limit_inf2 = limit_inf*limit_inf;
+for( int ipart=imin ; ipart<imax ; ipart++ ) {
 double distance2ToAxis = position_y[ipart]*position_y[ipart]+position_z[ipart]*position_z[ipart];
-if ( distance2ToAxis < limit_inf*limit_inf ) {
-cell_keys[ ipart ] = -1;
+if( cell_keys[ ipart ] >= 0 && distance2ToAxis < limit_inf2 ) {
+cell_keys[ ipart ] = -4;
}
}
}
@@ -77,10 +78,11 @@ void internal_sup_AM( Species *species, int imin, int imax, int /*direction*/, d
double* position_y = species->particles->getPtrPosition(1);
double* position_z = species->particles->getPtrPosition(2);
int* cell_keys = species->particles->getPtrCellKeys();
-for (int ipart=imin ; ipart<imax ; ipart++ ) {
+double limit_sup2 = limit_sup*limit_sup;
+for( int ipart=imin ; ipart<imax ; ipart++ ) {
 double distance2ToAxis = position_y[ipart]*position_y[ipart]+position_z[ipart]*position_z[ipart];
-if ( distance2ToAxis >= limit_sup*limit_sup ) {
-cell_keys[ ipart ] = -1;
+if( cell_keys[ ipart ] >= 0 && distance2ToAxis >= limit_sup2 ) {
+cell_keys[ ipart ] = -5;
}
}
}
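
Reading the four `internal_*` hunks above together: a leaving particle is no longer tagged with a single `cell_keys` value of `-1`. The key now records the face it crossed, `-2 - 2*direction` for the lower face and `-3 - 2*direction` for the upper face, with the AM variants using the fixed values `-4` and `-5`. The extra `cell_keys[ ipart ] >= 0` test means a particle that is already tagged is not overwritten. A minimal sketch of that encoding follows; the helper names `encodeLeavingKey`, `bufferIndexFromKey` and `isLeaving` are hypothetical and do not appear in the PR.

```cpp
#include <cassert>

// Hypothetical helpers illustrating the key encoding used in the hunks above.
// side = 0 for the lower ("inf") face, side = 1 for the upper ("sup") face.
inline int encodeLeavingKey( int direction, int side )
{
    assert( direction >= 0 && ( side == 0 || side == 1 ) );
    return -2 - 2 * direction - side;
}

// Inverse mapping: 0 = x lower, 1 = x upper, 2 = y (or radial) lower, ...
inline int bufferIndexFromKey( int cell_key )
{
    assert( cell_key <= -2 );
    return -cell_key - 2;
}

// Keys >= 0 are regular cells; -1 is not treated as "leaving" in this sketch.
inline bool isLeaving( int cell_key )
{
    return cell_key < -1;
}
```

With this mapping the AM values `-4` and `-5` decode to indices 2 and 3, the same slots that `direction = 1` would produce in the Cartesian functions.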
@@ -97,8 +99,8 @@ void reflect_particle_inf( Species *species, int imin, int imax, int direction,
#pragma omp target is_device_ptr( position, momentum )
#pragma omp teams distribute parallel for
#endif
for (int ipart=imin ; ipart<imax ; ipart++ ) {
if ( position[ ipart ] < limit_inf ) {
for( int ipart=imin ; ipart<imax ; ipart++ ) {
if( position[ ipart ] < limit_inf ) {
position[ ipart ] = 2.*limit_inf - position[ ipart ];
momentum[ ipart ] = -momentum[ ipart ];
}
161 changes: 155 additions & 6 deletions src/Particles/Particles.cpp
@@ -413,6 +413,51 @@ void Particles::copyParticles( unsigned int iPart, unsigned int nPart, Particles
}
}

// ---------------------------------------------------------------------------------------------------------------------
//! Copy particles indexed by array 'indices' to dest_id in dest_parts
//! The array 'indices' must be sorted in increasing order
//! cell keys not affected
// ---------------------------------------------------------------------------------------------------------------------
void Particles::copyParticles( vector<size_t> indices, Particles &dest_parts, int dest_id )
{
const size_t transfer_size = indices.size();
const size_t dest_new_size = dest_parts.size() + transfer_size;
const size_t displaced_size = dest_parts.size() - dest_id;

for( unsigned int iprop=0 ; iprop<double_prop_.size() ; iprop++ ) {
// Make space in dest array
dest_parts.double_prop_[iprop]->resize( dest_new_size );
auto loc = dest_parts.double_prop_[iprop]->begin() + dest_id;
move_backward( loc, loc + displaced_size, dest_parts.double_prop_[iprop]->end() );
// Copy data
for( size_t i = 0; i < transfer_size; i++ ) {
( *dest_parts.double_prop_[iprop] )[dest_id+i] = ( *double_prop_[iprop] )[indices[i]];
}
}

for( unsigned int iprop=0 ; iprop<short_prop_.size() ; iprop++ ) {
// Make space in dest array
dest_parts.short_prop_[iprop]->resize( dest_new_size );
auto loc = dest_parts.short_prop_[iprop]->begin() + dest_id;
move_backward( loc, loc + displaced_size, dest_parts.short_prop_[iprop]->end() );
// Copy data
for( size_t i = 0; i < transfer_size; i++ ) {
( *dest_parts.short_prop_[iprop] )[dest_id+i] = ( *short_prop_[iprop] )[indices[i]];
}
}

for( unsigned int iprop=0 ; iprop<uint64_prop_.size() ; iprop++ ) {
// Make space in dest array
dest_parts.uint64_prop_[iprop]->resize( dest_new_size );
auto loc = dest_parts.uint64_prop_[iprop]->begin() + dest_id;
move_backward( loc, loc + displaced_size, dest_parts.uint64_prop_[iprop]->end() );
// Copy data
for( size_t i = 0; i < transfer_size; i++ ) {
( *dest_parts.uint64_prop_[iprop] )[dest_id+i] = ( *uint64_prop_[iprop] )[indices[i]];
}
}
}

// ---------------------------------------------------------------------------------------------------------------------
//! Make a new particle at the position of another
//! cell keys not affected
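
The new `copyParticles( indices, dest_parts, dest_id )` overload in this hunk applies the same three steps to every property array: grow the destination, shift its tail with `move_backward`, then copy the selected source elements into the gap. A minimal sketch of that pattern on a single `std::vector` is below. The function name `insertGathered` and the template form are assumptions made for illustration, and the sketch assumes `dest_id <= dest.size()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone version of the per-property logic in the hunk above:
// insert src[indices[0]], src[indices[1]], ... into dest, starting at dest_id.
template<typename T>
void insertGathered( const std::vector<T> &src,
                     const std::vector<std::size_t> &indices,
                     std::vector<T> &dest,
                     std::size_t dest_id )
{
    assert( dest_id <= dest.size() );
    const std::size_t transfer_size  = indices.size();
    const std::size_t displaced_size = dest.size() - dest_id;

    // Make space in the destination array
    dest.resize( dest.size() + transfer_size );
    auto loc = dest.begin() + dest_id;
    // Shift the displaced tail to the end; move_backward handles the overlap
    std::move_backward( loc, loc + displaced_size, dest.end() );

    // Copy the selected elements into the gap
    for( std::size_t i = 0; i < transfer_size; i++ ) {
        dest[dest_id + i] = src[indices[i]];
    }
}
```

`std::move_backward` is the appropriate choice here because the source and destination ranges overlap and the destination lies further right.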
@@ -529,6 +574,70 @@ void Particles::eraseParticle( unsigned int ipart, unsigned int npart, bool comp

}


// ---------------------------------------------------------------------------------------------------------------------
//! Erase particles indexed by array 'indices' to dest_id in dest_parts
//! The array 'indices' must be sorted in increasing order
//! cell keys not affected
// ---------------------------------------------------------------------------------------------------------------------
void Particles::eraseParticles( vector<size_t> indices )
{
const size_t indices_size = indices.size();
const size_t initial_size = size();

if( indices_size > 0 ) {

for( auto prop : double_prop_ ) {
// Relocate data to fill erased space
size_t j = 1, stop = ( 1 == indices_size ) ? initial_size : indices[1], to = indices[0];
for( size_t from = indices[0]+1; from < initial_size; from++ ) {
if( from < stop ) {
( *prop )[to] = ( *prop )[from];
to++;
} else {
j++;
stop = ( j == indices_size ) ? initial_size : indices[j];
}
}
// Resize
prop->resize( initial_size - indices_size );
}

for( auto prop : short_prop_ ) {
// Relocate data to fill erased space
size_t j = 1, stop = ( 1 == indices_size ) ? initial_size : indices[1], to = indices[0];
for( size_t from = indices[0]+1; from < initial_size; from++ ) {
if( from < stop ) {
( *prop )[to] = ( *prop )[from];
to++;
} else {
j++;
stop = ( j == indices_size ) ? initial_size : indices[j];
}
}
// Resize
prop->resize( initial_size - indices_size );
}

for( auto prop : uint64_prop_ ) {
// Relocate data to fill erased space
size_t j = 1, stop = ( 1 == indices_size ) ? initial_size : indices[1], to = indices[0];
for( size_t from = indices[0]+1; from < initial_size; from++ ) {
if( from < stop ) {
( *prop )[to] = ( *prop )[from];
to++;
} else {
j++;
stop = ( j == indices_size ) ? initial_size : indices[j];
}
}
// Resize
prop->resize( initial_size - indices_size );
}

}
}

// ---------------------------------------------------------------------------------------------------------------------
// Print parameters of particle iPart
// ---------------------------------------------------------------------------------------------------------------------
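
`eraseParticles( indices )` in this hunk removes several particles in one left-to-right pass instead of calling an erase once per particle. A minimal sketch of that compaction on a single `std::vector` is below. It is written slightly differently from the loop in the diff: it compares `from` against the next index to drop rather than tracking a `stop` bound. The name `eraseSorted` is hypothetical, and the sketch assumes `indices` is strictly increasing and in range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-array version of the compaction idea in the hunk above.
// Precondition (assumed): indices is sorted, has no duplicates, and all < v.size().
template<typename T>
void eraseSorted( std::vector<T> &v, const std::vector<std::size_t> &indices )
{
    if( indices.empty() ) {
        return;
    }
    assert( indices.back() < v.size() );

    std::size_t j  = 0;           // position of the next index to drop
    std::size_t to = indices[0];  // next slot to fill
    for( std::size_t from = indices[0]; from < v.size(); from++ ) {
        if( j < indices.size() && from == indices[j] ) {
            j++;                  // element is erased: skip it
        } else {
            v[to++] = v[from];    // element survives: slide it left
        }
    }
    // Drop the now-unused tail
    v.resize( v.size() - indices.size() );
}
```

Each surviving element is moved at most once, so the cost is linear in the array length regardless of how many indices are erased.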
@@ -1190,21 +1299,61 @@ void Particles::copyFromHostToDevice()
{
ERROR( "Device only feature, should not have come here!" );
}
-void Particles::copyFromDeviceToHost()
+void Particles::copyFromDeviceToHost( bool )
 {
 ERROR( "Device only feature, should not have come here!" );
 }

-void Particles::extractParticles( Particles* particles_to_move )
+// Loop all particles and copy the outgoing ones to buffers
+void Particles::copyLeavingParticlesToBuffers( const vector<bool> copy, const vector<Particles*> buffer )
 {
-particles_to_move->clear();
-for ( int ipart=0 ; ipart<size() ; ipart++ ) {
-if ( cell_keys[ipart] == -1 ) {
-copyParticle( ipart, *particles_to_move );
// Leaving particles have a cell_key equal to -2-direction
// where direction goes from 0 to 6 and tells which way the particle escapes.
// If the cell_key is -1, the particle must be destroyed so it is not extracted.

#if defined( SMILEI_ACCELERATOR_GPU_OMP ) || defined( SMILEI_OPENACC_MODE )

// GPU

// Copy leaving particles to buffer[0] on the GPU
copyLeavingParticlesToBuffer( buffer[0] );

// Dispatch between the different buffers on the CPU
// (doing this on the GPU is slower; maybe replacing thrust operations with pure cuda would work)
vector<size_t> indices;
for( size_t ipart = 0; ipart < buffer[0]->size(); ipart++ ) {
int direction = -buffer[0]->cell_keys[ipart] - 2;
if( direction > 0 ) {
if( copy[direction] ) {
buffer[0]->copyParticle( ipart, *buffer[direction] );
}
indices.push_back( ipart );
}
}
buffer[0]->eraseParticles( indices );

#else

// CPU

for( size_t ipart = 0; ipart < size(); ipart++ ) {
if( cell_keys[ipart] < -1 ) {
int direction = -cell_keys[ipart] - 2;
if( copy[direction] ) {
copyParticle( ipart, *buffer[direction] );
}
}
}

#endif
}

void Particles::copyLeavingParticlesToBuffer( Particles* )
{
ERROR( "Device only feature, should not have come here!" );
}


void Particles::savePositions() {
unsigned int ndim = Position.size(), npart = size();
double *p[3], *pold[3];
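
The CPU branch of `copyLeavingParticlesToBuffers` in the hunk above can be sketched in isolation. Every particle whose key is below `-1` is routed to the buffer numbered `-cell_key - 2`, provided the corresponding `copy` flag is set. In the sketch below a single `double` per particle stands in for the full `Particles` object, the function name `dispatchLeaving` is hypothetical, and the bounds check on `direction` is an addition that the PR code does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduction of the CPU dispatch loop to plain vectors.
// Returns the number of particles copied into the buffers.
std::size_t dispatchLeaving( const std::vector<int> &cell_keys,
                             const std::vector<double> &position,
                             const std::vector<bool> &copy,
                             std::vector<std::vector<double>> &buffer )
{
    assert( cell_keys.size() == position.size() );
    assert( buffer.size() >= copy.size() );

    std::size_t ncopied = 0;
    for( std::size_t ipart = 0; ipart < cell_keys.size(); ipart++ ) {
        if( cell_keys[ipart] < -1 ) {
            // Decode which face the particle crossed
            const std::size_t direction = static_cast<std::size_t>( -cell_keys[ipart] - 2 );
            // Bounds check added for the sketch only
            if( direction < copy.size() && copy[direction] ) {
                buffer[direction].push_back( position[ipart] );
                ncopied++;
            }
        }
    }
    return ncopied;
}
```

Particles with a key of `-1` or with a regular cell key (`>= 0`) are left alone by this loop, matching the comment in the hunk that `-1` marks a particle to be destroyed rather than extracted.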