Misc tasks (in the lack of a better name) #3
I have implemented these functions only for volume triangulations, since it was 4 lines of code and I needed them to have a first working driver. @amartinhuertas you can try to extend the functions to boundary and skeleton triangulations if you wish.
Hi @fverdugo, @santiagobadia, I have finished a first prototype of the distributed (still sequential, though) algorithm for integration+assembly restricted to owned cells. By now, the code implementing this algorithm is provided in a separate source file which has not yet been integrated into the software architecture of the package. I had to implement the algorithm (6-step sketch available here) in a separate file because I could not (or, perhaps, did not find how to) accommodate it into the current design pattern. Thus, I propose that the central topic of our next meeting, @fverdugo, will be to go over this file and to determine whether it can actually be accommodated into the current software architecture. Other issues to be discussed (will be made clear during the meeting; hard to explain here):
---
@amartinhuertas I would say that the integration+assembly restricted to owned cells can be implemented in the current framework, and it can be done in a few lines of code (it only requires writing a very simple specialization of `AssemblyStrategy`). Probably we had two different things in mind when talking about this topic. We can clarify them in the next meeting.
Sketch of the code I was thinking of (commented during the meeting with @amartinhuertas on 2020-05-25). With this we control how and where (i.e., in which part) the nonzero row entries are computed. How to assemble a distributed PETSc matrix is the next step.

```julia
struct OwnedCellsStrategy <: AssemblyStrategy
  part::Int
  gids::IndexSet
end

function Gridap.FESpaces.row_map(a::OwnedCellsStrategy,row)
  a.gids.lid_to_gid[row]
end

function Gridap.FESpaces.col_map(a::OwnedCellsStrategy,col)
  a.gids.lid_to_gid[col]
end

function Gridap.FESpaces.row_mask(a::OwnedCellsStrategy,row)
  true
end

function Gridap.FESpaces.col_mask(a::OwnedCellsStrategy,col)
  true
end

function remove_ghost_cells_when_needed(strategy::AssemblyStrategy,trian::Triangulation)
  @abstractmethod
end

function remove_ghost_cells_when_needed(strategy::RowsComputedLocally,trian::Triangulation)
  trian
end

function remove_ghost_cells_when_needed(strategy::OwnedCellsStrategy,trian::Triangulation)
  remove_ghost_cells(trian,strategy.part,strategy.gids)
end

function Triangulation(strategy::AssemblyStrategy,model::DiscreteModel,args...)
  trian = Triangulation(model,args...)
  remove_ghost_cells_when_needed(strategy,trian)
end

function BoundaryTriangulation(strategy::AssemblyStrategy,model::DiscreteModel,args...)
  trian = BoundaryTriangulation(model,args...)
  remove_ghost_cells_when_needed(strategy,trian)
end

function SkeletonTriangulation(strategy::AssemblyStrategy,model::DiscreteModel,args...)
  trian = SkeletonTriangulation(model,args...)
  remove_ghost_cells_when_needed(strategy,trian)
end

function OwnedCellsStrategy(V::DistributedFESpace)
  dgids = V.gids
  strategies = DistributedData(dgids) do part, gids
    OwnedCellsStrategy(part,gids)
  end
  DistributedAssemblyStrategy(strategies)
end
```
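The crux of `OwnedCellsStrategy` is `remove_ghost_cells`, i.e., restricting a triangulation to the cells owned by the local part. As a minimal self-contained illustration of that operation (hypothetical `ToyGids`/`owned_cells` names; plain vectors stand in for Gridap's index sets and triangulations):

```julia
# Minimal stand-alone sketch of what ghost-cell removal boils down to:
# keep only the local cells whose owner is the current part.
struct ToyGids
  lid_to_owner::Vector{Int}  # owner part of each local cell
end

# Return the local cell ids owned by `part`.
owned_cells(part::Int, gids::ToyGids) =
  findall(owner -> owner == part, gids.lid_to_owner)

gids = ToyGids([1, 1, 2, 1, 2])  # cells 3 and 5 are ghosts on part 1
println(owned_cells(1, gids))    # prints [1, 2, 4]
```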
```julia
module DistributedPoissonDGTests

using Test
using Gridap
using Gridap.FESpaces
using GridapDistributed
using SparseArrays

# Select matrix and vector types for the discrete problem.
# Note that here we use serial vectors and matrices,
# but the assembly is distributed.
T = Float64
vector_type = Vector{T}
matrix_type = SparseMatrixCSC{T,Int}

# Manufactured solution
u(x) = x[1]*(x[1]-1)*x[2]*(x[2]-1)
f(x) = - Δ(u)(x)
ud(x) = zero(x[1])

# Discretization
subdomains = (2,2)
domain = (0,1,0,1)
cells = (4,4)
comm = SequentialCommunicator(subdomains)
model = CartesianDiscreteModel(comm,subdomains,domain,cells)
const h = (domain[2]-domain[1]) / cells[1]

# FE spaces
order = 2
V = FESpace(
  vector_type, valuetype=Float64, reffe=:Lagrangian, order=order,
  model=model, conformity=:L2)
U = TrialFESpace(V)

# Choose the parallel assembly strategy
strategy = OwnedCellsStrategy(V)

# Terms in the weak form
terms = DistributedData(model,strategy) do part, (model,gids), strategy
  γ = 10
  degree = 2*order
  trian = Triangulation(strategy,model)
  btrian = BoundaryTriangulation(strategy,model)
  strian = SkeletonTriangulation(strategy,model)
  quad = CellQuadrature(trian,degree)
  bquad = CellQuadrature(btrian,degree)
  squad = CellQuadrature(strian,degree)
  bn = get_normal_vector(btrian)
  sn = get_normal_vector(strian)
  a(u,v) = inner(∇(v),∇(u))
  l(v) = v*f
  t_Ω = AffineFETerm(a,l,trian,quad)
  a_Γd(u,v) = (γ/h)*v*u - v*(bn*∇(u)) - (bn*∇(v))*u
  l_Γd(v) = (γ/h)*v*ud - (bn*∇(v))*ud
  t_Γd = AffineFETerm(a_Γd,l_Γd,btrian,bquad)
  a_Γ(u,v) = (γ/h)*jump(v*sn)*jump(u*sn) - jump(v*sn)*mean(∇(u)) - mean(∇(v))*jump(u*sn)
  t_Γ = LinearFETerm(a_Γ,strian,squad)
  (t_Ω,t_Γ,t_Γd)
end

# Assembly
assem = SparseMatrixAssembler(matrix_type, vector_type, U, V, strategy)
op = AffineFEOperator(assem,terms)

# FE solution
uh = solve(op)

# Error norms and output of the solution
sums = DistributedData(model,uh) do part, (model,gids), uh
  trian = Triangulation(model)
  owned_trian = remove_ghost_cells(trian,part,gids)
  owned_quad = CellQuadrature(owned_trian,2*order)
  owned_uh = restrict(uh,owned_trian)
  writevtk(owned_trian,"results_$part",cellfields=["uh"=>owned_uh])
  e = u - owned_uh
  l2(u) = u*u
  h1(u) = u*u + ∇(u)*∇(u)
  e_l2 = sum(integrate(l2(e),owned_trian,owned_quad))
  e_h1 = sum(integrate(h1(e),owned_trian,owned_quad))
  e_l2, e_h1
end

e_l2_h1 = gather(sums)
e_l2 = sum(map(i->i[1],e_l2_h1))
e_h1 = sum(map(i->i[2],e_l2_h1))
tol = 1.0e-9
@test e_l2 < tol
@test e_h1 < tol

end # module
```
---
@amartinhuertas I attach here the other code snippet from today's meeting (how to incorporate a PETSc matrix in the current framework):

```julia
struct PetscMat
  # All the required fields here
end

function PetscMat(Alocal,gids_rows,gids_cols,complete_cols)
  # All initialization work here
end

function allocate_coo_vectors(::Type{<:PetscMat},dn)
  DistributedData(dn) do part, n
    I = zeros(Int,n)
    J = zeros(Int,n)
    V = zeros(Float64,n)
    (I,J,V)
  end
end

function sparse_from_coo(
  ::Type{<:PetscMat},dIJV,dgids_rows,dgids_cols)
  I,J,V = dIJV.part
  gids_rows = dgids_rows.part
  gids_cols = dgids_cols.part
  # Convert to lids (perhaps not needed if we already assemble with lids)
  _I = [ gids_rows.gid_to_lid[i] for i in I ]
  _J = [ gids_cols.gid_to_lid[i] for i in J ]
  # Compress I,J,V (sub-assembled matrix)
  nlrows = length(gids_rows.lid_to_gid)
  nlcols = length(gids_cols.lid_to_gid)
  Alocal = sparse_from_coo(SparseMatrixCSR{0,Float64,Int32},_I,_J,V,nlrows,nlcols)
  # This will complete the nnz pattern with comms
  # and eventually call Petsc.Assembly if needed (e.g., when constraints are present)
  complete_cols = false
  PetscMat(Alocal,gids_rows,gids_cols,complete_cols)
end
```
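For reference, a self-contained sketch of the gid-to-lid conversion plus COO compression performed in `sparse_from_coo`, with stdlib `SparseArrays` standing in for `SparseMatrixCSR`/PETSc and `Dict`s modeling the `gid_to_lid` maps (all names and data here are toy values, not the actual API):

```julia
using SparseArrays

# Toy gid_to_lid maps: 3 locally known rows/cols with global ids 10, 20, 30.
gid_to_lid_rows = Dict(10 => 1, 20 => 2, 30 => 3)
gid_to_lid_cols = Dict(10 => 1, 20 => 2, 30 => 3)

I = [10, 20, 20, 20]   # global row ids (note the duplicated (20,10) entry)
J = [10, 10, 10, 30]   # global col ids
V = [1.0, 2.0, 3.0, 4.0]

# Convert global ids to local ids.
_I = [gid_to_lid_rows[i] for i in I]
_J = [gid_to_lid_cols[j] for j in J]

# sparse(...) sums duplicate (i,j) pairs on compression, as assembly requires.
Alocal = sparse(_I, _J, V, length(gid_to_lid_rows), length(gid_to_lid_cols))
println(Alocal[2, 1])  # prints 5.0
```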
---
MPIPETScCommunicator:

1. The current algorithm does not compute, before the actual numerical assembly, the (full) sparsity pattern of the rows locally owned by each processor. Instead, we preallocate some storage for the matrix (with a rough upper estimate of the number of nonzero elements per row), and entries are dynamically introduced into the matrix; see [here](https://github.com/gridap/GridapDistributed.jl/blob/f595e3ebb45c7777a25dded4a205b9a80195c7a9/src/MPIPETScDistributedAssemblersInterfaces.jl#L25) and [here](https://github.com/gridap/GridapDistributed.jl/blob/f595e3ebb45c7777a25dded4a205b9a80195c7a9/src/MPIPETScDistributedAssemblersInterfaces.jl#L35). Our experience with FEMPAR reveals that this has a number of practical performance and memory implications that we want to avoid in the final version of the algorithm (see also issue #9).
2. As a complementary side note to 1., let me recall (so as not to forget, and for later reuse) that: (1) we discussed [here](#3 (comment)) a possible way to accommodate the final/performant algorithm so that it sticks to the current structure of `assembly_matrix_vector`; (2) I already wrote an implementation of this algorithm, but it does not stick to the current structure of `assembly_matrix_vector`; see [here](#3 (comment)) for more details; (3) in the most general case (non-conforming meshes, hanging DoF constraints), the final/performant algorithm requires an ad-hoc communication among nearest neighbours in order to set up the sparsity pattern of the locally owned rows; see [here](https://github.com/gridap/GridapDistributed.jl/blob/distributed_integration_restricted_to_owned_cells/distributed_integration_restricted_to_owned_cells.jl#L275) for more details. At first glance, I do not see how this can be implemented given the aforementioned limitations of `MPIPETScDistributedVector`.
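To make the contrast in point 1 concrete, here is a minimal self-contained sketch of the "precompute the sparsity pattern first" alternative to dynamic insertion: a symbolic pass counts the entries contributed per row so COO storage is allocated exactly, instead of from a rough upper bound. The connectivity is toy data and stdlib `SparseArrays` stands in for PETSc; none of these names come from the actual code.

```julia
using SparseArrays

function assemble_two_pass(cell_to_dofs, nrows)
  # Pass 1 (symbolic): count the (possibly duplicated) entries per row.
  counts = zeros(Int, nrows)
  for dofs in cell_to_dofs, i in dofs
    counts[i] += length(dofs)
  end
  # Pass 2 (numeric): fill COO vectors of exactly the counted total length.
  n = sum(counts)
  I = Vector{Int}(undef, n)
  J = Vector{Int}(undef, n)
  V = Vector{Float64}(undef, n)
  k = 0
  for dofs in cell_to_dofs, i in dofs, j in dofs
    k += 1
    I[k] = i; J[k] = j; V[k] = 1.0
  end
  sparse(I, J, V, nrows, nrows)  # duplicates are summed on compression
end

# Three toy "cells", each coupling two dofs out of four.
A = assemble_two_pass([[1, 2], [2, 3], [3, 4]], 4)
println(nnz(A))  # prints 10
```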
I would suggest the following tasks for @amartinhuertas:

- Create a new version of the assembly process that makes use of a GridPortion based on the owned cells (thus generating the owned VEFs of lower dimension) and thus considers a sub-assembled local matrix, which can be used for DD methods. We could provide a flag when creating the distributed FE space to decide which one to use, or even compare both implementations (adding a comm in the sub-assembled one). I think he can learn many things about Gridap in this process (possibly related, to some extent, to issue "Implement localize function to remove ghost cells of a given triangulation" #2).
- Create a new version of the current approach for multifield that only does two comms, and extend the sub-assembled approach + comm to this case too.
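A minimal self-contained sketch of the sub-assembled idea suggested above: each part keeps its locally integrated matrix, and interface rows become fully assembled only after summing the neighbor contributions (the extra comm). The toy data is a 1D Laplacian on 3 dofs split between two parts sharing dof 2; stdlib `SparseArrays` replaces the distributed data structures.

```julia
using SparseArrays

# Sub-assembled contributions: each part integrated only over its own cells,
# so the diagonal entry of the shared dof 2 is incomplete on both parts.
A_part1 = sparse([1, 1, 2, 2], [1, 2, 1, 2], [2.0, -1.0, -1.0, 1.0], 3, 3)
A_part2 = sparse([2, 2, 3, 3], [2, 3, 2, 3], [1.0, -1.0, -1.0, 2.0], 3, 3)

# Fully assembled matrix = sum of the sub-assembled contributions
# (the sum models the nearest-neighbour communication).
A = A_part1 + A_part2
println(A[2, 2])  # prints 2.0
```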