
Misc tasks (in the lack of a better name) #3

Closed
amartinhuertas opened this issue Apr 28, 2020 · 5 comments

@amartinhuertas commented Apr 28, 2020

I would suggest the following tasks for @amartinhuertas:

  • Create a new version of the assembly process that makes use of a GridPortion based on the owned cells (thus generating the owned vefs of lower dimension) and therefore builds a sub-assembled local matrix, which can be used for DD methods. We could provide a flag when creating the distributed FE space to decide which one to use, or even compare both implementations (adding a comm in the sub-assembled one); see the hypothetical sketch after this list. I think he can learn many things about Gridap in this process (possibly related, to some extent, to issue Implement localize function to remove ghost cells of a given triangulation #2).

  • Create a new version of the current approach for multifield that only does two comms, and extend the sub-assembled approach + comm to this case too.
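A purely hypothetical sketch of the GridPortion idea in the first task (the lid_to_owner field and the overall wiring are assumptions for illustration, not existing GridapDistributed code):

# Sketch only: for each part, build a GridPortion that keeps only the owned cells,
# so that integration and assembly over it yield a sub-assembled local matrix
# (to be completed later with a comm in the sub-assembled/DD variant).
# Assumes the cell index set exposes a lid_to_owner vector.
owned_grids = DistributedData(model) do part, (model,gids)
  grid = get_grid(model)
  owned_cells = findall(gids.lid_to_owner .== part)
  GridPortion(grid,owned_cells)
end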

@fverdugo commented

I have implemented the function remove_ghost_cells in order to remove ghost cells from a given distributed triangulation. I have also implemented include_ghost_cells, which reverts a previous removal of ghost cells.

I have implemented them only for volume triangulations, since it was 4 lines of code and I needed them to have a first working driver. @amartinhuertas, you can try to extend the functions to boundary and skeleton triangulations if you wish.
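For what it is worth, a minimal sketch of what remove_ghost_cells could look like for a volume triangulation, assuming the cell index set stores a lid_to_owner vector and that a TriangulationPortion-like view is used to restrict to the selected cells (illustrative only, not necessarily the actual implementation):

# Sketch only: keep the cells of trian owned by part, assuming
# gids.lid_to_owner gives the owner part of each local cell.
function remove_ghost_cells(trian::Triangulation,part::Integer,gids)
  owned_cells = findall(gids.lid_to_owner .== part)
  TriangulationPortion(trian,owned_cells)
end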

@amartinhuertas amartinhuertas self-assigned this May 2, 2020
@amartinhuertas commented

Hi @fverdugo, @santiagobadia,

I have finished a first prototype of the distributed (still sequential, though) algorithm for integration + assembly restricted to owned cells. For now, the code implementing this algorithm is provided in a separate source file that has not yet been integrated into the software architecture of GridapDistributed.jl (for reasons made clear below). In particular, you will find it in the following file:

https://github.com/gridap/GridapDistributed.jl/blob/distributed_integration_restricted_to_owned_cells/distributed_integration_restricted_to_owned_cells.jl

I had to implement the algorithm in a separate file because I could not (or, perhaps, did not find out how to) accommodate it (a 6-step sketch is available here) into the current design pattern chosen for the assemble_matrix_and_vector + DistributedAssembler type-dispatch combination, available here.

Thus, I propose that the central topic of our next meeting, @fverdugo, be to go over this file and to determine whether it can actually be accommodated into the software architecture of GridapDistributed.jl, and if not, how the latter should be adapted to the requirements of this code.

Other issues to be discussed (they will be made clear during the meeting; they are hard to explain here):

  • How to determine the communication pattern within assemble_vector without having access to the raw entries (before local sub-assembly) of the vector.


@fverdugo commented

@amartinhuertas I would say that the integration + assembly restricted to owned cells can be implemented in the current framework, and it can be done in a few lines of code (it only requires writing a very simple specialization of AssemblyStrategy and using the ghost-cell filter that we already have in place).

Probably we had two different things in mind when talking about this topic. We can clarify them in the next meeting.

@fverdugo commented

Sketch of the code I was thinking of (discussed during the meeting with @amartinhuertas on 2020-05-25).

With this we control how and where (i.e., in which part) the nonzero entries of each row are computed. How to assemble a distributed PETSc matrix is the next step.

struct OwnedCellsStrategy <: AssemblyStrategy
  part::Int
  gids::IndexSet
end

function Gridap.FESpaces.row_map(a::OwnedCellsStrategy,row)
  a.gids.lid_to_gid[row]
end

function Gridap.FESpaces.col_map(a::OwnedCellsStrategy,col)
  a.gids.lid_to_gid[col]
end

function Gridap.FESpaces.row_mask(a::OwnedCellsStrategy,row)
  true
end

function Gridap.FESpaces.col_mask(a::OwnedCellsStrategy,col)
  true
end

function remove_ghost_cells_when_needed(strategy::AssemblyStrategy,trian::Triangulation)
  @abstractmethod
end

function remove_ghost_cells_when_needed(strategy::RowsComputedLocally,trian::Triangulation)
  trian
end

function remove_ghost_cells_when_needed(strategy::OwnedCellsStrategy,trian::Triangulation)
  remove_ghost_cells(trian,strategy.part,strategy.gids)
end

function Triangulation(strategy::AssemblyStrategy,model::DiscreteModel,args...)
  trian = Triangulation(model,args...)
  remove_ghost_cells_when_needed(strategy,trian)
end

function BoundaryTriangulation(strategy::AssemblyStrategy,model::DiscreteModel,args...)
  trian = BoundaryTriangulation(model,args...)
  remove_ghost_cells_when_needed(strategy,trian)
end

function SkeletonTriangulation(strategy::AssemblyStrategy,model::DiscreteModel,args...)
  trian = SkeletonTriangulation(model,args...)
  remove_ghost_cells_when_needed(strategy,trian)
end

function OwnedCellsStrategy(V::DistributedFESpace)
  dgids = V.gids
  strategies = DistributedData(dgids) do part, gids
    OwnedCellsStrategy(part,gids)
  end
  DistributedAssemblyStrategy(strategies)
end

module DistributedPoissonDGTests

using Test
using Gridap
using Gridap.FESpaces
using GridapDistributed
using SparseArrays

# Select matrix and vector types for discrete problem
# Note that here we use serial vectors and matrices
# but the assembly is distributed
T = Float64
vector_type = Vector{T}
matrix_type = SparseMatrixCSC{T,Int}

# Manufactured solution
u(x) = x[1]*(x[1]-1)*x[2]*(x[2]-1)
f(x) = - Δ(u)(x)
ud(x) = zero(x[1])

# Discretization
subdomains = (2,2)
domain = (0,1,0,1)
cells = (4,4)
comm = SequentialCommunicator(subdomains)
model = CartesianDiscreteModel(comm,subdomains,domain,cells)
const h = (domain[2]-domain[1]) / cells[1]

# FE Spaces
order = 2
V = FESpace(
  vector_type, valuetype=Float64, reffe=:Lagrangian, order=order,
  model=model, conformity=:L2)

U = TrialFESpace(V)

# Choose parallel assembly strategy
strategy = OwnedCellsStrategy(V)

# Terms in the weak form
terms = DistributedData(model,strategy) do part, (model,gids), strategy

  γ = 10
  degree = 2*order

  trian = Triangulation(strategy,model)
  btrian = BoundaryTriangulation(strategy,model)
  strian = SkeletonTriangulation(strategy,model)

  quad = CellQuadrature(trian,degree)
  bquad = CellQuadrature(btrian,degree)
  squad = CellQuadrature(strian,degree)
  
  bn = get_normal_vector(btrian)
  sn = get_normal_vector(strian)

  a(u,v) = inner(∇(v),∇(u))
  l(v) = v*f
  t_Ω = AffineFETerm(a,l,trian,quad)

  a_Γd(u,v) = (γ/h)*v*u - v*(bn*∇(u)) - (bn*∇(v))*u
  l_Γd(v) = (γ/h)*v*ud - (bn*∇(v))*ud
  t_Γd = AffineFETerm(a_Γd,l_Γd,btrian,bquad)

  a_Γ(u,v) = (γ/h)*jump(v*sn)*jump(u*sn) - jump(v*sn)*mean(∇(u)) - mean(∇(v))*jump(u*sn)
  t_Γ = LinearFETerm(a_Γ,strian,squad)

  (t_Ω,t_Γ,t_Γd)
end


# Assembly
assem = SparseMatrixAssembler(matrix_type, vector_type, U, V, strategy)
op = AffineFEOperator(assem,terms)

# FE solution
uh = solve(op)

# Error norms and print solution
sums = DistributedData(model,uh) do part, (model,gids), uh

  trian = Triangulation(model)

  owned_trian = remove_ghost_cells(trian,part,gids)
  owned_quad = CellQuadrature(owned_trian,2*order)
  owned_uh = restrict(uh,owned_trian)

  writevtk(owned_trian,"results_$part",cellfields=["uh"=>owned_uh])

  e = u - owned_uh

  l2(u) = u*u
  h1(u) = u*u + ∇(u)*∇(u)

  e_l2 = sum(integrate(l2(e),owned_trian,owned_quad))
  e_h1 = sum(integrate(h1(e),owned_trian,owned_quad))

  e_l2, e_h1
end

e_l2_h1 = gather(sums)

e_l2 = sum(map(i->i[1],e_l2_h1))
e_h1 = sum(map(i->i[2],e_l2_h1))

tol = 1.0e-9
@test e_l2 < tol
@test e_h1 < tol

end # module

@fverdugo commented

@amartinhuertas I attach here the other code snippet from today's meeting (how to incorporate a PETSc matrix into the current framework).

struct PetscMat
 # All the required fields here
end

function PetscMat(Alocal,gids_rows,gids_cols,complete_cols)
  # All initialization work here
end

function allocate_coo_vectors(::Type{<:PetscMat},dn)
  DistributedData(dn) do part, n
    I = zeros(Int,n)
    J = zeros(Int,n)
    V = zeros(Float32,n)
    (I,J,V)
  end
end

function sparse_from_coo(
  ::Type{<:PetscMat},dIJV,dgids_rows,dgids_cols)

  I,J,V = dIJV.part

  gids_rows = dgids_rows.part
  gids_cols = dgids_cols.part

  # convert to lids (perhaps not needed if we already assemble with lids)
  _I = [ gids_rows.gid_to_lid[i] for i in I ]
  _J = [ gids_cols.gid_to_lid[i] for i in J ]

  # Compress I,J,V (Sub-assembled matrix)
  nlrows = length(gids_rows.lid_to_gid)
  nlcols = length(gids_cols.lid_to_gid)
  Alocal = sparse_from_coo(SparseMatrixCSR{0,Float64,Int32},_I,_J,V,nlrows,nlcols)

  # This will complete with comms the nnz pattern
  # and eventually call Petsc.Assembly if needed (e.g., when constraints are present)
  complete_cols = false
  PetscMat(Alocal,gids_rows,gids_cols,complete_cols)
end

amartinhuertas added a commit that referenced this issue Jun 24, 2020
MPIPETScCommunicator:

1. The current algorithm does not compute, before the actual numerical assembly, the (full) sparsity pattern of the rows which are locally owned by each processor. Instead, we preallocate some storage for the matrix (with a rough upper estimate of the number of nonzero elements per row), and entries are dynamically introduced into the matrix; see [here](https://github.com/gridap/GridapDistributed.jl/blob/f595e3ebb45c7777a25dded4a205b9a80195c7a9/src/MPIPETScDistributedAssemblersInterfaces.jl#L25) and [here](https://github.com/gridap/GridapDistributed.jl/blob/f595e3ebb45c7777a25dded4a205b9a80195c7a9/src/MPIPETScDistributedAssemblersInterfaces.jl#L35). Our experience with FEMPAR reveals that this has a number of practical performance and memory implications that we want to avoid in the final version of the algorithm (see also issue #9). A hypothetical sketch of this dynamic-insertion strategy is given after point 2 below.

2. As a complementary side note to 1., let me note (so that we do not forget it, and can reuse it later) that: (1) we discussed [here](#3 (comment)) a possible way to accommodate the final/performant algorithm such that it sticks to the current structure of `assembly_matrix_vector`; (2) I already wrote an implementation of this algorithm, but it does not stick to the current structure of `assembly_matrix_vector`, see [here](#3 (comment)) for more details; (3) in the most general case (non-conforming meshes, hanging-DoF constraints), the final/performant algorithm requires performing an ad-hoc communication among nearest neighbours in order to set up the sparsity pattern of the locally owned rows, see [here](https://github.com/gridap/GridapDistributed.jl/blob/distributed_integration_restricted_to_owned_cells/distributed_integration_restricted_to_owned_cells.jl#L275) for more details. At first glance, I do not see how this can be implemented given the limitations of `MPIPETScDistributedVector` mentioned above.
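For the record, a purely hypothetical sketch of the dynamic-insertion strategy described in point 1 (the helpers create_dynamic_matrix and add_entry! below are placeholders, not actual GridapDistributed or PETSc API):

# Sketch only: preallocate each locally owned row with a rough upper estimate of
# its number of nonzeros and insert entries one by one during numerical assembly,
# instead of computing the exact sparsity pattern beforehand.
function assemble_dynamically(I,J,V,n_owned_rows)
  max_nnz_per_row = 32                                      # rough, uniform upper estimate (assumption)
  A = create_dynamic_matrix(n_owned_rows,max_nnz_per_row)   # hypothetical preallocation helper
  for (i,j,v) in zip(I,J,V)
    add_entry!(A,i,j,v)  # dynamic insertion; exceeding the estimate triggers costly reallocation
  end
  A
end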