Intrepid2: Support Full Exact Sequence in Structured Integration #10419
# This is the 1st commit message:

This PR extends previous work in "Structured Integration", a term that generalizes sum factorization algorithms for finite element assembly. The main thing that was missing for full support of the exact sequence was pullbacks in FunctionSpaceTools for spaces other than H(grad). This PR adds those, as well as implementations of some sample formulations -- corresponding to the natural norms in each function space -- in both unit tests and performance tests. There are some changes to the new data structures as well as to the new integration kernels that were required for this full support.

StructuredIntegrationTests has gotten an upgrade; for the assembly comparison tests (of which there are now five distinct formulations assembled: Poisson plus the natural norms for each space), we now use test templates. This simplifies declarations, makes it easier to see what cases are covered, and eliminates a fair amount of code redundancy.

Modified handling of tensor bases, especially with regard to CellTopology.

Initial (draft) implementation of arbitrary-dimensional H^1, L^2 bases on the hypercube. Note: the basis equivalence test added here fails at the moment; an exception is thrown in TensorBasis::getValues() (the five-argument version that subclasses are nominally supposed to implement).

Made TensorBasis default implementation support OPERATOR_VALUE and OPERATOR_GRAD. (TODO: delete implementations in subclasses that handle these for the five-argument getValues().)

Revised quadrature in BasisEquivalenceTests to deal correctly with extrusions. Right now, the new basis equivalence test fails with an exception in getCubatureImpl(), where there's an assertion that there are either two or three direct cubatures. (And the copy logic only handles those two cases.) So we also need (probably) to revise this to support arbitrary dimensions.
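For readers unfamiliar with the term, here is a minimal sketch of what sum factorization buys on a 2D tensor-product basis. It is plain C++ with hypothetical names (`integrateNaive`, `integrateSumFactorized`, `maxAbsDiff`), not the Intrepid2 kernels; it assumes 1D basis values `phi1`, `phi2` at 1D quadrature points and a coefficient `c` (already multiplied by the quadrature weights) at each tensor point.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// phi1[i][q1], phi2[i][q2]: 1D basis values at 1D quadrature points.
// c[q1][q2]: coefficient times quadrature weight at tensor point (q1, q2).
// Result rows/columns are flattened basis ordinals (i1*n2 + i2, j1*n2 + j2).
Matrix integrateNaive(const Matrix& phi1, const Matrix& phi2, const Matrix& c)
{
  assert(!phi1.empty() && !phi2.empty() && !c.empty());
  const std::size_t n1 = phi1.size(), n2 = phi2.size();
  const std::size_t Q1 = c.size(), Q2 = c[0].size();
  Matrix M(n1 * n2, std::vector<double>(n1 * n2, 0.0));
  // One six-deep loop: every entry sums over every tensor quadrature point.
  for (std::size_t i1 = 0; i1 < n1; ++i1)
    for (std::size_t i2 = 0; i2 < n2; ++i2)
      for (std::size_t j1 = 0; j1 < n1; ++j1)
        for (std::size_t j2 = 0; j2 < n2; ++j2)
          for (std::size_t q1 = 0; q1 < Q1; ++q1)
            for (std::size_t q2 = 0; q2 < Q2; ++q2)
              M[i1 * n2 + i2][j1 * n2 + j2] +=
                  c[q1][q2] * phi1[i1][q1] * phi1[j1][q1] * phi2[i2][q2] * phi2[j2][q2];
  return M;
}

Matrix integrateSumFactorized(const Matrix& phi1, const Matrix& phi2, const Matrix& c)
{
  assert(!phi1.empty() && !phi2.empty() && !c.empty());
  const std::size_t n1 = phi1.size(), n2 = phi2.size();
  const std::size_t Q1 = c.size(), Q2 = c[0].size();

  // Stage 1: contract over q2 only.
  // T[q1][i2*n2 + j2] = sum_q2 c[q1][q2] * phi2[i2][q2] * phi2[j2][q2]
  Matrix T(Q1, std::vector<double>(n2 * n2, 0.0));
  for (std::size_t q1 = 0; q1 < Q1; ++q1)
    for (std::size_t i2 = 0; i2 < n2; ++i2)
      for (std::size_t j2 = 0; j2 < n2; ++j2)
        for (std::size_t q2 = 0; q2 < Q2; ++q2)
          T[q1][i2 * n2 + j2] += c[q1][q2] * phi2[i2][q2] * phi2[j2][q2];

  // Stage 2: contract over q1, reusing the intermediate T.
  Matrix M(n1 * n2, std::vector<double>(n1 * n2, 0.0));
  for (std::size_t i1 = 0; i1 < n1; ++i1)
    for (std::size_t j1 = 0; j1 < n1; ++j1)
      for (std::size_t q1 = 0; q1 < Q1; ++q1)
      {
        const double w = phi1[i1][q1] * phi1[j1][q1];
        for (std::size_t i2 = 0; i2 < n2; ++i2)
          for (std::size_t j2 = 0; j2 < n2; ++j2)
            M[i1 * n2 + i2][j1 * n2 + j2] += w * T[q1][i2 * n2 + j2];
      }
  return M;
}

// Largest entrywise difference between two equally sized matrices.
double maxAbsDiff(const Matrix& A, const Matrix& B)
{
  assert(A.size() == B.size());
  double d = 0.0;
  for (std::size_t r = 0; r < A.size(); ++r)
    for (std::size_t s = 0; s < A[r].size(); ++s)
      d = std::fmax(d, std::fabs(A[r][s] - B[r][s]));
  return d;
}
```

Both routines produce the same matrix; the factorized one replaces a single sum over all Q1*Q2 tensor points per entry with two shorter contractions, which is the idea that structured integration generalizes.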
Working to rewrite BasisEquivalenceTests using BasisValues instead of DynRankView; right now, things don't compile…

Dropping implementation of getValues() in tensor-product DerivedBasis_HGRAD and _HVOL, since now TensorBasis handles these. But right now we fail during TensorBasis::allocateBasisValues() for OPERATOR_Dn. Wrote a TODO there regarding what needs to change to fix this.

Implemented a default getSimpleOperatorDecomposition() within TensorBasis.

Extended getDkCardinality() computation within Basis to support spaceDim up to 7.

Revisions, including a TODO discussing one current test failure. (There are almost certainly more.)

Commented out some debugging output.

Switching from fixed-length Kokkos::Array to dynamic Kokkos::vector for vector components, to support OP_Dn use cases.

Implemented TensorBasis::getOperatorDecomposition() built-in support for OP_D1 through OP_D10.

More work on TensorBasis; now all MonolithicExecutable tests pass. Some other test failures:
150 - Intrepid2_unit-test_Projection_Serial_Test_InterpolationProjection_HEX_MPI_1 (Failed)
158 - Intrepid2_unit-test_Projection_Serial_Test_Convergence_HEX_MPI_1 (Failed)
(This is locally on my machine, a Serial Release build. Have not tried OpenMP or CUDA.)

Fixed the logic for setting shards topology and tags for H(curl) and H(div) bases on the hexahedron. Locally (in serial, CPU), all tests now pass.

Added Intrepid2::CellTopology, with support for arbitrarily many tensorial extrusions.

Revised doxygen to reflect the fact that tags are supported by raw TensorBasis now.

Added support for tags to raw TensorBasis, in terms of component tags.

Getting BasisEquivalenceTests into better shape for testing hypercube bases, including tests up to spaceDim = 7, and fixed an issue in which the cellTopo used for checking tags was incorrect (used baseTopo instead of the tensorial extrusion).

Draft implementation of SerendipityBasis. *Not* even a complete draft: refers to BasisValues::setOrdinalFilter(), which is not yet declared, much less implemented. But because most methods here aren't referenced by code in a .cpp file, the changes do not interfere with a successful build.

Added BasisValues::setOrdinalFilter() method, and support within operator() and extent_int() to honor this.

For Serendipity, eliminated the pointType argument, which is ignored for all of our hierarchical basis families anyway.

Fixed definition of HostBasis type.

Added a first test against Serendipity bases: a check that the actual cardinality matches the formula for the cardinality. This test passes.

Fixed a missing return in allocateBasisValues().

Added accessor for ordinal map.

Fixed a couple issues that affect BasisValues when you have a non-trivial ordinal filter.

Fixed a test failure that resulted from comparing on two different cell topologies as though they were the same.

Added tests against Serendipity basis. These pass.

Generalized getDkEnumeration() and getDkEnumerationInverse() to arbitrary dimensions.

Made getTensorDkEnumeration() support up to total dimension of 7.

Added support for POINTTYPE_DEFAULT for nodal H(div) Hex and Quad bases.

Made triangle H(grad) basis fail cleanly when the order exceeds MaxOrder; was a buffer overrun.

Made various bases fail cleanly when the order exceeds MaxOrder; in most cases, this could cause a buffer overrun.

Added a check that points container has the right number of tensor components.

Fixed a couple places where we incorrectly determined the space dim for a basis because we neglected the number of tensorial extrusions, as returned by getNumTensorialExtrusions().

Fixed an error in which a Kokkos::Array was potentially used uninitialized.

Added getDomainDimension() method.
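As background for the Dk work mentioned above: the number of distinct order-k partial derivatives in d space dimensions is the number of multi-indices of length d summing to k, i.e. the binomial coefficient C(k + d - 1, d - 1). The sketch below uses hypothetical helper names (`binomial`, `dkCardinality`, `countByEnumeration`) and does not reproduce the ordering used by getDkEnumeration(); it only cross-checks the count against a brute-force enumeration.

```cpp
#include <cassert>

// Exact binomial coefficient C(n, r); each intermediate value is itself a
// binomial coefficient, so the integer division never truncates.
long long binomial(int n, int r)
{
  assert(n >= 0 && r >= 0 && r <= n);
  if (r > n - r) r = n - r;
  long long result = 1;
  for (int i = 1; i <= r; ++i)
    result = result * (n - r + i) / i;
  return result;
}

// Number of distinct k-th order partial derivatives in spaceDim dimensions.
long long dkCardinality(int spaceDim, int k)
{
  assert(spaceDim >= 1 && k >= 0);
  return binomial(k + spaceDim - 1, spaceDim - 1);
}

// Brute-force count of multi-indices (a_1, ..., a_d) with a_1 + ... + a_d = k.
long long countByEnumeration(int spaceDim, int k)
{
  assert(spaceDim >= 1 && k >= 0);
  if (spaceDim == 1) return 1;   // the single index must equal k
  long long count = 0;
  for (int a = 0; a <= k; ++a)   // choose the first index, recurse on the rest
    count += countByEnumeration(spaceDim - 1, k - a);
  return count;
}
```

For the ranges named in these messages (spaceDim up to 7, OP_D1 through OP_D10) the counts stay small enough for `long long` by a wide margin.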
Made TensorBasis::allocateBasisValues() a bit more flexible with regard to inputPoints argument (allows more tensorial components in points than there are basis components, so long as the basis lines up; this case can arise when you have a hypercube topology for one of the components and a non-tensor basis defined on it; e.g. HGRAD_HEX_Cn_FEM, extruded some number of times).

Added tests against getDkEnumeration and getDkEnumerationInverse, and fixed the issues in getDkEnumerationInverse that these revealed.

Fixed an error in the treatment of tensor points for the first component basis of a TensorBasis (as when a 3-component point input is provided to HGRAD_QUAD x HGRAD_LINE).

Added a TensorPoints constructor that copies selected tensor components from another TensorPoints container.

Tweaked doxygen for the new TensorPoints constructor.

Removed extraneous template parameter from the subset constructor for TensorPoints.

Fixed an error in the TensorData-combining constructor.

Added some sanity checks on TensorPoints components.

Continued refactoring StructuredIntegrationPerformance in preparation for H(div) and H(curl) test cases, as well as adding a choice of basis family. Also, some ExecutionSpace --> DeviceType replacements.

TransformedVectorData class --> TransformedBasisValues (still need to rename the files). Added preliminary support for getHGRADtransformVALUE().

Added preliminary support for H^1 -- that is, (grad,grad) + (value,value) -- assembly in StructuredIntegrationPerformance. Right now, this fails with an exception during the structured assembly of (value,value), due to the transforms being the identity and therefore unset, but integrate() assumes that they are valid Data objects.

# This is the commit message #2:
Tweaked comment to cover the scalar basis value case.

# This is the commit message #3:
Made composedTransform construction cognizant of the invalid-transform-means-identity convention.
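To illustrate the invalid-transform-means-identity convention described above, here is a small sketch. The `Transform` struct, `compose()` and `apply()` are hypothetical stand-ins, not the Intrepid2 Data class or the actual composedTransform code; the point is only that composition and application must treat an unset transform as the identity rather than reading its (meaningless) entries.

```cpp
#include <array>
#include <cstddef>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Hypothetical stand-in: valid == false means "identity, no storage".
struct Transform
{
  bool valid;
  Mat2 m;
  Transform() : valid(false), m() {}
  explicit Transform(const Mat2& mat) : valid(true), m(mat) {}
};

Mat2 multiply(const Mat2& A, const Mat2& B)
{
  Mat2 C{};  // zero-initialized
  for (std::size_t i = 0; i < 2; ++i)
    for (std::size_t j = 0; j < 2; ++j)
      for (std::size_t k = 0; k < 2; ++k)
        C[i][j] += A[i][k] * B[k][j];
  return C;
}

// Composition that honors the convention: identity factors pass through,
// and identity composed with identity stays unset.
Transform compose(const Transform& left, const Transform& right)
{
  if (!left.valid) return right;
  if (!right.valid) return left;
  return Transform(multiply(left.m, right.m));
}

// Applying an unset transform returns the input unchanged.
Vec2 apply(const Transform& t, const Vec2& v)
{
  if (!t.valid) return v;
  Vec2 out{};
  for (std::size_t i = 0; i < 2; ++i)
    for (std::size_t j = 0; j < 2; ++j)
      out[i] += t.m[i][j] * v[j];
  return out;
}
```

The failure mode reported for the (value,value) term corresponds to skipping the `valid` checks and multiplying by whatever the unset object happens to contain.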
# This is the commit message #4:
Fixed H1StandardAssembly to make it tolerant of the case when a workset size does not evenly divide the number of cells.

# This is the commit message #5:
Removed console output indicating that H1StructuredAssembly was not yet fully implemented -- it is now, though we haven't verified the values at all.

# This is the commit message #6:
Added storage of the assembled matrix results, such that they can be compared across algorithms to verify that the various algorithms agree. (Have not yet implemented that comparison, but have added a TODO where it should go.)

# This is the commit message #7:
Added comparisons of assembled stiffness matrices to verify correctness. For Poisson, these pass. For the new H^1 formulation, they do not pass yet. Fixed domainExtents when generating CellGeometry (was 0, leading to NaNs in Jacobians).

# This is the commit message #8:
Some fixes for integrate() to support (value, value) integral (for H^1 values, at least). I've run a couple H^1 test cases here, and these pass.

# This is the commit message #9:
Commented out my debugging test cases in favor of the standard (CI) test cases.

# This is the commit message #10:
Initial effort at H(div) performance tests. At present, fail with an exception during the structured integration. TODO in place for a fix.

# This is the commit message #11:
A bunch of fixes toward getting the H(div) formulation working. From the test I'm running right now (single-cell), it looks like everything is working except the Uniform case, and in that case, we have an incorrect jacobianDividedByJacobianDet (the second and third diagonal entries are missing). The likely culprit is the in-place product computation; maybe it doesn't handle the block-plus diagonal well? This includes some console output that shows the wrong values for jacobianDividedByJacobianDet.
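For context on why a quantity like jacobianDividedByJacobianDet appears in the H(div) path: the standard textbook Piola transforms map reference basis values to physical ones via J/det(J) for H(div) (contravariant) and J^{-T} for H(curl) (covariant). The 2D sketch below uses hypothetical function names and is not the FunctionSpaceTools implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

double determinant(const Mat2& J)
{
  return J[0][0] * J[1][1] - J[0][1] * J[1][0];
}

// H(div) values: u = (J / det J) * uRef  (contravariant Piola transform).
Vec2 contravariantPiola(const Mat2& J, const Vec2& uRef)
{
  const double detJ = determinant(J);
  assert(std::fabs(detJ) > 0.0);
  Vec2 u{};
  u[0] = (J[0][0] * uRef[0] + J[0][1] * uRef[1]) / detJ;
  u[1] = (J[1][0] * uRef[0] + J[1][1] * uRef[1]) / detJ;
  return u;
}

// H(curl) values: u = J^{-T} * uRef  (covariant Piola transform).
// For J = [[a, b], [c, d]], J^{-T} = (1/det J) * [[d, -c], [-b, a]].
Vec2 covariantPiola(const Mat2& J, const Vec2& uRef)
{
  const double detJ = determinant(J);
  assert(std::fabs(detJ) > 0.0);
  Vec2 u{};
  u[0] = ( J[1][1] * uRef[0] - J[1][0] * uRef[1]) / detJ;
  u[1] = (-J[0][1] * uRef[0] + J[0][0] * uRef[1]) / detJ;
  return u;
}
```

A quick consistency check on the pair: for any reference vectors a and b, covariantPiola(J, a) dotted with contravariantPiola(J, b) equals (a · b) / det(J), since J^{-1} J cancels. Missing diagonal entries in J/det(J), as in the Uniform case described above, would break exactly this kind of identity.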
# This is the commit message #12:
Added getWritableEntryWithPassThroughOption() and getEntryWithPassThroughOption(), allowing in-place combinations involving block-diagonal matrices. Fixed a bug in getDataExtent() in which the for loop iterated over the whole activeDims_ container, rather than the first numActiveDims_ entries. If the data there happened to match the dimension d in the argument, this could have led to incorrect behavior. Fixed a bug in Data(std::vector<DimensionInfo> dimInfoVector) constructor, in which blockPlusDiagonal dimensions would not be processed correctly.

# This is the commit message #13:
Added unit test InPlaceProduct_Matrix, which motivated the changes in the previous commit.

# This is the commit message #14:
Got H(div) structured assembly working.

# This is the commit message #15:
Got rid of the "hypercube" modifier in the various assembly routine names: so far, we only test hypercubes, but the assembly is agnostic to that. The fact that it's a hypercube is in the geometry object which is passed in.

# This is the commit message #16:
Added definitions, implementations of getHCURLtransformVALUE and getHCURLtransformCURL.

# This is the commit message #17:
Fixed file creation dates listed in headers (not exact; just don't want to claim that these were there last summer)…

# This is the commit message #18:
Added H(curl) formulations, and filled out support for this in StructuredIntegrationPerformance. These already pass! The H(curl) transformations look a lot like the H(div) ones...

# This is the commit message #19:
Made getDkEnumeration(), getDkTensorIndex() KOKKOS_INLINE_FUNCTIONs (had been merely inline).
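The getDataExtent() fix in commit message #12 is an instance of a common fixed-capacity-array pitfall. The sketch below is a hypothetical reconstruction (the struct, its members, the capacity of 7, and the fallback extent of 1 are all assumptions, not the Intrepid2 Data class): only the first `numActiveDims` slots are meaningful, so a search over the whole array can match stale entries.

```cpp
#include <array>
#include <cassert>

constexpr int kMaxRank = 7;  // assumed capacity, for illustration only

// Hypothetical container: only slots [0, numActiveDims) hold real data.
struct ActiveDims
{
  std::array<int, kMaxRank> activeDims;  // which logical dimension each slot refers to
  std::array<int, kMaxRank> extents;     // stored extent for each slot
  int numActiveDims;
};

// Buggy variant: scans every slot, including stale ones past numActiveDims.
int extentBuggy(const ActiveDims& data, int d)
{
  for (int i = 0; i < kMaxRank; ++i)
    if (data.activeDims[i] == d) return data.extents[i];
  return 1;  // assumed fallback for a dimension with no stored variation
}

// Fixed variant: scans only the first numActiveDims slots.
int extentFixed(const ActiveDims& data, int d)
{
  assert(data.numActiveDims >= 0 && data.numActiveDims <= kMaxRank);
  for (int i = 0; i < data.numActiveDims; ++i)
    if (data.activeDims[i] == d) return data.extents[i];
  return 1;  // same assumed fallback
}
```

With two active slots referring to dimensions 0 and 2 and a stale third slot that happens to hold 1, a query for dimension 1 returns the stale extent in the buggy variant and the fallback in the fixed one, which matches the "if the data there happened to match the dimension d" scenario described in the message.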
# This is the commit message #20:
Eliminating use of Kokkos::vector to store elements in VectorData's vectorComponents_ member; evidently, Kokkos::vector's operator[] requires UVM to execute on device under CUDA; it becomes merely an inline function when KOKKOS_ENABLE_CUDA_UVM is unset.

# This is the commit message #21:
Removing now-incorrect/unnecessary construction of VectorArray objects (now that VectorArray is a fixed-length Kokkos::Array).

# This is the commit message #22:
Eliminated a couple of unused variables.

# This is the commit message #23:
Changed dimToComponent_, dimToComponentDim_, numDimsForComponent_ from Kokkos::vector to Kokkos::Array.

# This is the commit message #24:
Eliminated a few more incorrect/unnecessary VectorArray() constructor calls.

# This is the commit message #25:
Dropped reference (&) from a few Data object definitions; looks like these are causing problems with lambda capture on CUDA.

# This is the commit message #26:
Dropped some const qualifiers, as well as references, for the affine path in IntegrationTools.

# This is the commit message #27:
Added support for outputting timings data to file.

# This is the commit message #28:
StructuredIntegrationPerformance: Reworking the way that meshes get sized, so that the stiffness matrices fit within a 2 GB budget.

# This is the commit message #29:
Reworking calibration to use the "core integration" throughput as the thing to maximize. Also increased the max workset size to match the new max cell count (32768).

# This is the commit message #30:
Fixed a bug in which the various structured assembly methods would undercount flop estimates (we were clobbering one integration's flops with another's).

# This is the commit message #31:
Trying something with Calibration mode, in which we skip over workset sizes that were bigger than the optimal choice for prior polyOrders.
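As a rough illustration of the mesh sizing in commit messages #28 and #29, the sketch below computes how many cells fit a memory budget. It rests on assumptions not stated in the messages: one dense N x N block of doubles per cell, a budget interpreted as 2 * 2^30 bytes, and a hypothetical function name `maxCellsForBudget`; the actual sizing logic in StructuredIntegrationPerformance may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Largest cell count whose per-cell dense stiffness blocks fit in budgetBytes,
// capped at maxCellCount. Assumes basisCardinality^2 doubles per cell.
std::int64_t maxCellsForBudget(std::int64_t basisCardinality,
                               std::int64_t budgetBytes,
                               std::int64_t maxCellCount)
{
  assert(basisCardinality > 0 && budgetBytes > 0 && maxCellCount > 0);
  const std::int64_t bytesPerCell =
      basisCardinality * basisCardinality * static_cast<std::int64_t>(sizeof(double));
  // May return 0 if even a single cell exceeds the budget.
  return std::min(maxCellCount, budgetBytes / bytesPerCell);
}
```

Under these assumptions, an H(grad) hexahedral basis with 64 functions hits the 32768-cell cap, while one with 729 functions is limited by the budget to a few hundred cells.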
# This is the commit message #32:
Implemented L^2 formulation, including getHVOLtransformVALUE() implementation in FunctionSpaceTools. Loosened tolerances. Added some calibration results (these are not yet complete).

# This is the commit message #33:
Added missing fences, which meant that "Standard" timers were undercounting (dramatically for H(vol)).

# This is the commit message #34:
Added calibrations for all formulations/platforms except Serial. Running Serial now...

# This is the commit message #35:
Finished calibrations (BestSerial, BestOpenMP_16, BestCuda).

# This is the commit message #36:
Moved assembly headers to a common location; started refactoring StructuredIntegrationTests to use this. (So far, it only uses Poisson formulation in testing.)

# This is the commit message #37:
An attempt to fix a failure in HexahedronHierarchicalDGVersusHierarchicalCG_HGRAD -- by expanding the max components in VectorData. At the moment, this is leading to segfaults without much clarity in lldb as to what's causing it. Pushing so that I can build on other machines with other debuggers.

# This is the commit message #38:
Attempting to resolve some "missing return statement" warnings from nvcc.

# This is the commit message #39:
Fixing unused variable warning.

# This is the commit message #40:
Turning off tests for OPERATOR_D3 and above for derived basis classes.

# This is the commit message #41:
Set MaxVectorComponents = 7 = MaxTensorComponents. There's a weird issue exhibited by an Apple clang (debug) build of MonolithicExecutable: a seg fault during HexahedronHierarchicalDGVersusHierarchicalCG_HGRAD, with little or no information from the debugger on what went wrong, which evidently happens whenever MaxVectorComponents != MaxTensorComponents.

# This is the commit message #42:
Eliminating certain OPERATOR_Dn tests for compatibility with VectorData limitations.
# This is the commit message #43:
Fixed a couple issues with CUDA build.

# This is the commit message #44:
Got rid of some extraneous "using" lines in Standard Assembly headers.

# This is the commit message #45:
Tweaked doxygen.

# This is the commit message #46:
Reverting some changes that are specific to the Serendipity branch -- want to exclude those from this branch, and do a separate PR for Serendipity.

# This is the commit message #47:
Deleted Serendipity basis.

# This is the commit message #48:
More serendipity reversion.

# This is the commit message #49:
Some tweaks to exception tests, code formatting.

# This is the commit message #50:
More Serendipity reversion (possibly done, but I haven't tried building yet!).
Resolving some CUDA build warnings.

TransformedVectorData.hpp --> TransformedBasisValues.hpp.

Renamed TransformedVectorDataTests -> TransformedBasisValuesTests.

Fixed an error in the copy-like constructor in which the numCells value was not copied.

Fixed another couple issues in BasisValuesTests (had operator GRAD for HVOL).

Added getHCURLtransformCURL2D().

Made BasisValues::tensorData() const, and made it return a const reference.

Added support for 2D integration to HCURLStandardAssembly and HCURLStructuredAssembly.

Added default constructor for TransformedBasisValues.

Changed TensorData objects for leftComponent and rightComponent within integrate() to be copied, rather than const references. This is generally good practice for CUDA especially, but for some reason I was getting a stack use after return error on a serial, CPU build prior to making this change. After the change, there is no error...

Converted the structured-versus-standard quadrature tests to use the new assembly-examples assembly methods. Made these templated tests, and created tests for all relevant formulations (Poisson, H(grad), H(curl), H(div), L^2). The tests pass locally for me.

Reworking the defaults for workset size (in "Best" modes).

NodalBasisFamily --> BasisFamily. (Honor selection of hierarchical vs nodal)

DerivedNodalBasisFamily --> BasisFamily.

Deleted commented-out debugging code.

Removed stale TODOs.

Reverting some changes that I made locally to other packages for the sake of my local builds (things for which I should really issue separate PRs).
152c01a to fee53d4
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test names: Trilinos_pullrequest_gcc_8.3.0, Trilinos_pullrequest_gcc_7.2.0_serial, Trilinos_pullrequest_gcc_7.2.0_debug, Trilinos_pullrequest_intel_17.0.1, Trilinos_pullrequest_clang_10.0.0, python-3, _cuda_10.1.243
Pull Request Author: CamelliaDPG
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED. Builds: Trilinos_pullrequest_gcc_8.3.0 # 7170, Trilinos_pullrequest_gcc_7.2.0_serial # 4666, Trilinos_pullrequest_gcc_7.2.0_debug # 5185, Trilinos_pullrequest_intel_17.0.1 # 12228, Trilinos_pullrequest_clang_10.0.0 # 4929, python-3 # 1492, _cuda_10.1.243 # 727
…e a cell dimension.
I think that in principle, for simple terms like that, you could apply the orientation after you build the local matrix, but that's not what we do: we apply orientations before we transform the basis from reference to physical elements. Also, for nonlinear terms it doesn't seem possible to do that.
I guess one could apply the transposed orientation matrix to the basis coefficients of FE fields and left-multiply the local matrix by the orientation matrix. This would be equivalent to modifying the gather and scatter methods in Panzer, and would avoid performing the mapping of the basis functions at each quadrature point, and for each basis function evaluation (value, gradient, divergence, etc.).
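To make the algebra in this suggestion concrete, here is a minimal sketch for a bilinear term. It assumes each oriented basis function is a fixed linear combination of the reference basis functions, with an orientation matrix O as the coefficient table. Under that assumption the oriented local matrix is O A O^T, and its action on oriented coefficients x is O (A (O^T x)). All names here (`orientLocalMatrix`, `orientedApply`, and the helpers) are hypothetical and are not Intrepid2 or Panzer APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Dense matrix-matrix product C = A * B (illustrative helper).
Matrix matMul(const Matrix& A, const Matrix& B) {
  assert(!A.empty() && A[0].size() == B.size());
  Matrix C(A.size(), Vector(B[0].size(), 0.0));
  for (std::size_t i = 0; i < A.size(); ++i)
    for (std::size_t k = 0; k < B.size(); ++k)
      for (std::size_t j = 0; j < B[0].size(); ++j)
        C[i][j] += A[i][k] * B[k][j];
  return C;
}

// Transpose of a dense matrix (illustrative helper).
Matrix matTranspose(const Matrix& A) {
  Matrix At(A[0].size(), Vector(A.size(), 0.0));
  for (std::size_t i = 0; i < A.size(); ++i)
    for (std::size_t j = 0; j < A[0].size(); ++j)
      At[j][i] = A[i][j];
  return At;
}

// Dense matrix-vector product y = A * x (illustrative helper).
Vector matVec(const Matrix& A, const Vector& x) {
  assert(!A.empty() && A[0].size() == x.size());
  Vector y(A.size(), 0.0);
  for (std::size_t i = 0; i < A.size(); ++i)
    for (std::size_t j = 0; j < x.size(); ++j)
      y[i] += A[i][j] * x[j];
  return y;
}

// Orient an already-assembled local matrix: O * A * O^T.
// Assumption: oriented basis function i = sum_k O[i][k] * (reference basis function k).
Matrix orientLocalMatrix(const Matrix& O, const Matrix& A) {
  return matMul(matMul(O, A), matTranspose(O));
}

// Same result without forming O * A * O^T: transpose-orient the coefficients
// (gather), apply the unoriented local matrix, then orient the result (scatter).
Vector orientedApply(const Matrix& O, const Matrix& A, const Vector& x) {
  return matVec(O, matVec(A, matVec(matTranspose(O), x)));
}
```

For example, with O = diag(1, -1) (a sign flip on the second degree of freedom) and A = [[2, 1], [1, 3]], `orientLocalMatrix` returns [[2, -1], [-1, 3]]. This only covers bilinear terms; it says nothing about the nonlinear case raised above.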
Yes, let's talk further about this sometime soon. I do think you can do this in a perfectly general way at the global assembly stage -- I do it this way in Camellia, and I think I was imitating Demkowicz's code in doing so. But I also think that there's a pretty natural way to add orientations into the new data structures; specifically, the orientation for a tensor-product geometry will decompose into orientations on the tensorial components. One idea would be to include an optional Orientation array in the "transform" methods, and incorporate it into the TransformedBasisValues structure included in this PR. The assembly could then apply the orientations inline, probably without too much change required in the core kernels.
…rrors in the PR testing).
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED
Pull Request Auto Testing has PASSED. Test names: Trilinos_pullrequest_gcc_8.3.0, Trilinos_pullrequest_gcc_7.2.0_serial, Trilinos_pullrequest_gcc_7.2.0_debug, Trilinos_pullrequest_intel_17.0.1, Trilinos_pullrequest_clang_10.0.0, python-3, _cuda_10.1.243
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
:)
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ GrahamBenHarper ]! |
Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR... |
…s:develop' (3ff94a4). * trilinos-develop: Intrepid2: Support Full Exact Sequence in Structured Integration (trilinos#10419) Tpetra: Use numVectors==1 specialization for local parts of norms
…Trilinos:develop' (01a4ee0). * trilinos/develop: Intrepid2: Support Full Exact Sequence in Structured Integration (trilinos#10419) Tpetra: Use numVectors==1 specialization for local parts of norms
@trilinos/intrepid2
Motivation
This PR extends previous work in "Structured Integration", a term that generalizes sum factorization algorithms for finite element assembly. The main thing that was missing for full support of the exact sequence was pullbacks in FunctionSpaceTools for spaces other than H(grad). This PR adds those, as well as implementations of some sample formulations -- corresponding to the natural norms in each function space -- in both unit tests and performance tests. There are some changes to the new data structures as well as to the new integration kernels that were required for this full support.
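For readers unfamiliar with sum factorization, the following is a minimal, self-contained sketch of the idea on a 2D tensor-product basis. It is not the Intrepid2 kernel. The function names, the storage layout (`phi[i][q]`, `c[q1][q2]`), and the choice of a weighted mass matrix are all assumptions made for illustration. The point is that the sum over the second quadrature direction can be done once per (j, l, q1) and reused, instead of being recomputed for every entry.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Conventions (all hypothetical, for illustration only):
//   phi[i][q] : 1D basis function i evaluated at 1D quadrature point q
//   c[q1][q2] : pointwise weight at tensor point (q1, q2), e.g. quadrature
//               weights times a Jacobian determinant
//   2D basis function (i, j) = phi_i(x) * phi_j(y), stored at index i * N + j

// Reference version: visit every 2D quadrature point for every pair of
// 2D basis functions. Cost: O(N^4 Q^2).
Matrix naiveWeightedMass(const Matrix& phi, const Matrix& c) {
  const std::size_t N = phi.size();
  const std::size_t Q = phi[0].size();
  assert(c.size() == Q && c[0].size() == Q);
  Matrix M(N * N, std::vector<double>(N * N, 0.0));
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t j = 0; j < N; ++j)
      for (std::size_t k = 0; k < N; ++k)
        for (std::size_t l = 0; l < N; ++l)
          for (std::size_t q1 = 0; q1 < Q; ++q1)
            for (std::size_t q2 = 0; q2 < Q; ++q2)
              M[i * N + j][k * N + l] +=
                  phi[i][q1] * phi[j][q2] * phi[k][q1] * phi[l][q2] * c[q1][q2];
  return M;
}

// Sum-factorized version. Cost: O(N^2 Q^2 + N^4 Q).
Matrix sumFactorizedWeightedMass(const Matrix& phi, const Matrix& c) {
  const std::size_t N = phi.size();
  const std::size_t Q = phi[0].size();
  assert(c.size() == Q && c[0].size() == Q);
  // Stage 1: T[j * N + l][q1] = sum_{q2} phi_j(q2) * phi_l(q2) * c(q1, q2)
  Matrix T(N * N, std::vector<double>(Q, 0.0));
  for (std::size_t j = 0; j < N; ++j)
    for (std::size_t l = 0; l < N; ++l)
      for (std::size_t q1 = 0; q1 < Q; ++q1)
        for (std::size_t q2 = 0; q2 < Q; ++q2)
          T[j * N + l][q1] += phi[j][q2] * phi[l][q2] * c[q1][q2];
  // Stage 2: M[(i,j),(k,l)] = sum_{q1} phi_i(q1) * phi_k(q1) * T[(j,l)][q1]
  Matrix M(N * N, std::vector<double>(N * N, 0.0));
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t j = 0; j < N; ++j)
      for (std::size_t k = 0; k < N; ++k)
        for (std::size_t l = 0; l < N; ++l)
          for (std::size_t q1 = 0; q1 < Q; ++q1)
            M[i * N + j][k * N + l] += phi[i][q1] * phi[k][q1] * T[j * N + l][q1];
  return M;
}

// Largest entrywise difference between two equally sized matrices.
double maxAbsDifference(const Matrix& A, const Matrix& B) {
  assert(A.size() == B.size());
  double d = 0.0;
  for (std::size_t r = 0; r < A.size(); ++r)
    for (std::size_t s = 0; s < A[r].size(); ++s)
      d = std::fmax(d, std::fabs(A[r][s] - B[r][s]));
  return d;
}
```

The two functions should agree up to floating-point roundoff, which is the same kind of structured-versus-standard comparison the assembly tests in this PR perform at a much larger scale.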
StructuredIntegrationTests has gotten an upgrade; for the assembly comparison tests (of which there are now five distinct formulations assembled: Poisson plus the natural norms for each space), we now use test templates. This simplifies declarations, makes it easier to see what cases are covered, and eliminates a fair amount of code redundancy.
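For reference, the block below lists the standard pullbacks (Piola transforms) and natural norms for the four spaces of the exact sequence in 3D. Here F is the reference-to-physical map, J is its Jacobian, and hats denote reference-element quantities evaluated at the preimage point. I am assuming the FunctionSpaceTools additions and the five test formulations follow these textbook conventions; the PR text above does not spell them out.

```latex
% Standard conventions; assumed (not confirmed by the PR text) to match the implementation.
\begin{align*}
H(\mathrm{grad}):&\quad u = \hat{u},
  &&\nabla u = J^{-T}\,\hat{\nabla}\hat{u},
  &&\|u\|_{H(\mathrm{grad})}^2 = \|u\|_{L^2}^2 + \|\nabla u\|_{L^2}^2 \\
H(\mathrm{curl}):&\quad u = J^{-T}\,\hat{u},
  &&\nabla\times u = \tfrac{1}{\det J}\,J\,\hat{\nabla}\times\hat{u},
  &&\|u\|_{H(\mathrm{curl})}^2 = \|u\|_{L^2}^2 + \|\nabla\times u\|_{L^2}^2 \\
H(\mathrm{div}):&\quad u = \tfrac{1}{\det J}\,J\,\hat{u},
  &&\nabla\cdot u = \tfrac{1}{\det J}\,\hat{\nabla}\cdot\hat{u},
  &&\|u\|_{H(\mathrm{div})}^2 = \|u\|_{L^2}^2 + \|\nabla\cdot u\|_{L^2}^2 \\
L^2:&\quad u = \tfrac{1}{\det J}\,\hat{u},
  && &&\|u\|_{L^2}^2 = \int_\Omega |u|^2 \, dx
\end{align*}
% Poisson uses only the gradient term: a(u, v) = (\nabla u, \nabla v).
```

In 2D the curl of an H(curl) function is a scalar and transforms as a plain division by det J, which is presumably the case handled by getHCURLtransformCURL2D().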
Notably, we observe worse performance for standard Intrepid2 integration under CUDA on Weaver than we previously saw on Ride (which is no longer available). It is not clear to me what the cause is, but I have verified that Trilinos develop exhibits similar behavior. I have seen no evidence that this PR is in any way implicated in the performance degradation, which would be unlikely in any case: this PR makes no changes to the standard integration kernels; its target is the new structured integration kernels.
(A document with full performance results is available on request.)
Testing
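The new features introduced here are covered by unit tests, including the templated assembly comparison tests in StructuredIntegrationTests described above.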