###############################################################################
# #
# Trilinos Release 12.2 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 12.2 general release contains 58 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Domi, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Pike, Piro, Pliris, PyTrilinos, ROL*, RTOp, Rythmos, Sacado,
SEACAS, Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos,
ThreadPool, Thyra, Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra,
Zoltan, Zoltan2.
(* denotes package is being released externally as a part of Trilinos for the
first time.)
Domi
- Input arguments of type const Domi::MDArrayView< int > have been
  changed to be const-correct: they are now
  const Domi::MDArrayView< const int >. This allowed for more logical
  wrappers in PyTrilinos. (See the sketch at the end of this list.)
- A new Domi::DefaultNode class has been added. If Tpetra is enabled,
it uses the Tpetra default node. If Tpetra is not enabled, it uses
the serial wrapper node from Teuchos.
- Domi's required dependencies have been simplified, to Teuchos,
Kokkos, and TeuchosKokkosCompat. Optional dependencies are Epetra,
TpetraClassic, and TpetraCore.
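A minimal sketch of the const-correctness change noted in the first item
above; the function name setBounds is hypothetical:
  // Before: even read-only input required a view of non-const int:
  //   void setBounds (const Domi::MDArrayView< int > & bounds);
  // After: the element type is const too, so views of const data work:
  void setBounds (const Domi::MDArrayView< const int > & bounds);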
MueLu
- Transfer operators can now be switched dynamically between different
  multigrid levels. This can be used in the context of semi-coarsening.
- Enabled semi-coarsening (using Ray's line detection algorithm).
- Add support for line smoothing (Ifpack/Ifpack2) [EXPERIMENTAL]
- New AMGX Adapter [EXPERIMENTAL]
New experimental adapter which allows a user with AMGX installed to utilize
this software for the preconditioning and solution of linear systems. If a
user provides AMGX configuration options instead of a MueLu input deck, the
adapter will be called. Currently supported with Tpetra objects.
- Matlab interface for MueLu [EXPERIMENTAL]
Setup and solve hierarchies from Matlab and use Matlab functions as MueLu
factories.
PyTrilinos
- General
- Updated the Developers Guide
- Teuchos
- Made Teuchos a required dependency for PyTrilinos.
- Domi
- Domi wrappers now use new Domi::DefaultNode class
- Added HAVE_DOMI as a macro in PyTrilinos_config.h
- Fixed docstrings for Domi package
- Added a simple Domi example
- Fixed Domi.MDVector extensions
- Tpetra
- Enabled Tpetra wrappers in the release branch
- Fixed a dynamic typing problem
- Bug fixes in Tpetra.Map
- Added HAVE_TPETRA as a macro in PyTrilinos_config.h
- Fixed docstrings for Tpetra package
- Got wrappers for all required Tpetra constructors to work
- Expanded Tpetra.Vector unit tests
- Added unit tests for Tpetra.MultiVector
ROL
Rapid Optimization Library (ROL) is a C++ package for large-scale
optimization. It is used for the solution of optimal design, optimal control
and inverse problems in large-scale engineering applications. Other uses
include mesh optimization and image processing.
ROL aims to combine flexibility, efficiency and robustness. Key features:
- Matrix-free application programming interfaces (APIs), enabling direct
  use of application data structures and memory spaces, linear solvers,
  nonlinear solvers, and preconditioners.
- State-of-the-art algorithms for unconstrained optimization, constrained
  optimization, and optimization under uncertainty, enabling inexact and
  adaptive function evaluations and iterative linear system solves.
- Special APIs for simulation-based optimization, enabling a streamlined
  embedding into engineering applications, rigorous implementation
  verification, and efficient use.
- Modular interfaces throughout the optimization process, enabling custom
  and user-defined algorithms, stopping criteria, hierarchies of algorithms,
  and selective use of a variety of tools and components.
For a detailed description of user interfaces and algorithms included in this
release, see the presentation ROL-Trilinos-12.2.pptx (or .pdf) in the
doc/presentations directory.
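As a flavor of the matrix-free API described above, here is a minimal
sketch of a user-defined objective (assuming ROL's documented
Objective/Vector interfaces; see the presentation above for the
authoritative API):
  #include "ROL_Objective.hpp"

  // f(x) = 0.5 * ||x||^2, written matrix-free in terms of ROL::Vector.
  template<class Real>
  class NormSquared : public ROL::Objective<Real> {
  public:
    Real value (const ROL::Vector<Real> &x, Real &tol) {
      return static_cast<Real> (0.5) * x.dot (x);
    }
    void gradient (ROL::Vector<Real> &g, const ROL::Vector<Real> &x,
                   Real &tol) {
      g.set (x); // the gradient of 0.5*||x||^2 is x itself
    }
  };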
Tpetra
- Improvements to the "local" part of Tpetra::Map
Tpetra::Details::FixedHashTable implements the "local" part of
Tpetra::Map, where the "local part" is that which does not use MPI
communication. For example, FixedHashTable knows how to convert from
global indices to local indices, for all the global indices known by
the calling process.
FixedHashTable now uses Kokkos for its data structures. Its
initialization is completely Kokkos parallel, and its conversions
between global and local indices are Kokkos device functions. This
achieves an important goal of making the local part of Tpetra::Map
functionality available for Kokkos parallel operations.
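For example, the following sketch converts a global index to a local
index without any MPI communication (numGlobalIndices and comm are
assumed to exist; the Map is contiguous and uniform):
  #include "Tpetra_Map.hpp"

  typedef Tpetra::Map<> map_type; // all default template parameters
  typedef map_type::local_ordinal_type LO;
  typedef map_type::global_ordinal_type GO;

  map_type map (numGlobalIndices, 0, comm);
  const GO gblIndex = map.getMinGlobalIndex ();
  const LO lclIndex = map.getLocalElement (gblIndex); // no communication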
- Many Tpetra classes now split instantiations into multiple files
This matters only when explicit template instantiation (ETI) is ON.
(This _should_ be ON by default, but is not ON by default yet.)
The largest Tpetra classes (e.g., CrsGraph, CrsMatrix, and
MultiVector) now split their explicit instantiations into multiple
.cpp files. This helps reduce build times and memory usage when ETI
is ON, and makes setting ETI ON an even more attractive option for
applications.
- Fixed Bugs 6335, 6336, 6377, and others
- Improved tests to catch errors on processes other than Process 0
- Improved CMake output and internal ETI-related documentation
###############################################################################
# #
# Trilinos Release 12.0 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 12.0 general release contains 57 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Domi, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Pike, Piro, Pliris, PyTrilinos, RTOp, Rythmos, Sacado, SEACAS,
Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos, ThreadPool,
Thyra, Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra, Zoltan,
Zoltan2.
Domi
Domi provides distributed data structures for multi-dimensional data.
The inspirational use case is parallel domain decomposition for finite
difference applications. To that end, Domi provides the following
classes:
- MDArray, MDArrayView, MDArrayRCP
These classes define multi-dimensional arrays, with arbitrary
runtime-defined dimensions, built on top of the Teuchos Array,
ArrayView, and ArrayRCP classes, respectively. These are serial
in nature and provide the mechanism for data manipulation on each
processor of distributed MDVectors.
- Slice
The Slice class is inspired by the Python slice object, which
stores a start, stop, and step index, and can be constructed to
utilize a variety of logical default values. Slices can be used
on all multi-dimensional objects to return views into subsets of
those objects.
- MDComm
Multi-dimensional communicator. This communicator provides a map
between a multi-dimensional array of processors and their ranks,
and can be queried for neighbor ranks. MDComms can be sliced,
which returns a sub-communicator.
- MDMap
An MDMap describes the decomposition of an MDVector on an MDComm.
It stores the start and stop indexes along each dimension,
including boundary padding for algorithms that require extra
indexes along boundaries, and communication padding used to
update values from neighboring processors. MDMaps can be sliced,
and the resulting MDMap may reside on a sub-communicator. An
MDMap can be converted to an equivalent Epetra or Tpetra Map.
- MDVector
An MDVector is a multi-dimensional array, distributed in a
structured manner across multiple processors. This distribution
is described by an MDVector's MDMap. The MDVector's data is
stored locally on each processor with an MDArrayRCP. An MDVector
can update its communication padding from neighboring processors
automatically. An MDVector can be sliced, and the resulting
MDVector may reside on a sub-communicator. An MDVector can be
converted to equivalent Epetra or Tpetra Vectors or MultiVectors.
If there are no stride gaps in the data due to slicing, these
converted Epetra and Tpetra objects may be views of the original
data.
MueLu
- Hierarchy::Iterate now understands tolerance
When MueLu::Hierarchy is being used as a standalone solver, and not as a
preconditioner, a user may now specify a stopping criterion based on a
provided tolerance for the relative residual, in addition to the maximum
number of iterations.
- New reuse option: "tP"
This reuse option allows reuse of only tentative prolongators, while
rebuilding smoothed prolongator and coarse level operators.
- Selected bugfixes:
6301: Operator complexity was computed incorrectly for large size
problems
Pike
PIKE: Physics Integration KErnels
Pike is a blackbox multiphysics coupling tool. It provides basic
interfaces and implementations for building high level multiphysics
coupling strategies. In particular, PIKE provides utilities for
Picard-style couplings. For Newton-based couplings, use the NOX and
Thyra packages to build the block physics systems. In the future,
interoperability tools between NOX, PIKE, and PIRO will be added.
- Initial release!
- Supports block Jacobi and block Gauss-Seidel coupling.
- Supports global and local convergence criteria.
- Supports hierarchical solves.
- Supports subcycled transient solves and steady-state.
- Supports both Parameter and Response interfaces.
- Contains a multiphysics distributor to support parallel
distribution of applications.
- Provides abstract factories for solvers and global status tests.
- Supports observers for user injection of code. Special observers
for logging and debugging are implemented.
- Pure virtual interfaces for applications and data transfers.
- Adapter for nesting a solver as a model evaluator for hierarchical
solves.
PyTrilinos
- General
- Changed to BSD license
- Mpi4Py support has been made optional. Previously, if Mpi4Py was
found, it was automatically enabled within PyTrilinos. Now that
behavior can be turned off.
- LOCA
- The LOCA module has been refactored and has now been demonstrated
to work for the Chan problem. We have two example problems
working: one without preconditioning, and one with.
- Tpetra
- Package is still experimental. The recent refactor has broken
MultiVectors and Vectors.
- Map support has been improved
- Anasazi
- Fixed a bug where returned eigenvalues are converted to a NumPy
array, but the dimension used the wrong type.
- Kokkos
- Fixed a macro issue
- NOX
- Started PETSc compatibility. This is still experimental, and
includes compatibility with petsc4py.
- STK
- Removed PyPercept, as it is currently not a part of the new STK.
- Domi
- Added package
Tpetra
- Tpetra now requires C++11
This requirement comes in part from Tpetra itself, and in part from
the Kokkos package, on which Tpetra depends.
- "Kokkos refactor" (new) version of Tpetra is the only version
We no longer enable or support the old ("classic") version of Tpetra.
The new ("Kokkos refactor") implementation of Tpetra is now the only
supported version.
Do not use any of the Node types in the KokkosClassic namespace. We
do not support any of those Node types anymore. Instead, use any of
the following Node types:
- Kokkos::Compat::KokkosOpenMPWrapperNode (OpenMP)
- Kokkos::Compat::KokkosCudaWrapperNode (NVIDIA CUDA)
- Kokkos::Compat::KokkosSerialWrapperNode (no threads)
- Kokkos::Compat::KokkosThreadsWrapperNode (Pthreads)
Each of these is a typedef for
Kokkos::Compat::KokkosDeviceWrapperNode<ExecSpace>, for the
corresponding Kokkos execution space.
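For example, the following check (a sketch; it requires C++11, which
Tpetra now requires, and assumes the OpenMP backend and the usual
header name for this era) passes by definition:
  #include <type_traits>
  #include "KokkosCompat_ClassicNodeAPI_Wrapper.hpp" // assumed header name

  static_assert (std::is_same<
      Kokkos::Compat::KokkosOpenMPWrapperNode,
      Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP> >::value,
    "Each wrapper Node is a typedef of the corresponding "
    "KokkosDeviceWrapperNode specialization.");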
- Set / rely on the default Node type as much as possible
Tpetra classes have a template parameter, "Node", which determines
what thread-level parallel programming model Tpetra will use. This
corresponds directly to the "execution space" concept in Kokkos.
Tpetra classes have a default Node type. Users do NOT need to specify
this explicitly. I cannot emphasize this enough:
IF YOU ONLY EVER USE THE DEFAULT VALUES OF TEMPLATE PARAMETERS, DO NOT
SPECIFY THEM EXPLICITLY.
If you need to refer to the default values of template parameters, ask
Tpetra classes. For example, 'Tpetra::Map<>::node_type' is the
default Node type.
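For example, this sketch picks up all the default values from Tpetra's
public typedefs rather than hard-coding them:
  typedef Tpetra::Map<>::node_type           NT; // the default Node type
  typedef Tpetra::Map<>::local_ordinal_type  LO;
  typedef Tpetra::Map<>::global_ordinal_type GO;
  // The same type as Tpetra::MultiVector<>, whose Scalar defaults to double:
  typedef Tpetra::MultiVector<double, LO, GO, NT> mv_type;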
Tpetra pays attention to Kokkos' build configuration when determining
the default Node type. For example, it will not use a disabled
execution space. If you do not like the default Node type, but you
only ever use one Node type in your application, you should change the
default Node type at Trilinos configure time. You may do this by
setting the 'KokkosClassic_DefaultNode' CMake option. Here is a list
of reasonable values:
"Kokkos::Compat::KokkosSerialWrapperNode": use Kokkos::Serial
execution space (execute in a single thread on the CPU)
"Kokkos::Compat::KokkosOpenMPWrapperNode": use Kokkos::OpenMP
execution space (use OpenMP for thread-level parallelism on the CPU)
"Kokkos::Compat::KokkosThreadsWrapperNode": use Kokkos::Threads
execution space (use Pthreads (the POSIX Threads library) for
thread-level parallelism on the CPU)
"Kokkos::Compat::KokkosCudaWrapperNode": use Kokkos::Cuda execution
space (use NVIDIA's CUDA programming model for thread-level
parallelism on the CPU)
You must use the above strings with the 'KokkosClassic_DefaultNode'
CMake option. If you choose (unwisely, in many cases) to specify the
Node template parameter directly in your code, you may use those
names. Alternately, you may let the Kokkos execution space determine
the Node type, by using the templated class
Kokkos::Compat::KokkosDeviceWrapperNode. This class is templated on
the Kokkos execution space. The above four types are typedefs to
their corresponding specializations of KokkosDeviceWrapperNode. For
example, KokkosSerialWrapperNode is a typedef of
KokkosDeviceWrapperNode<Kokkos::Serial>. This may be useful if your
code already makes use of Kokkos execution spaces.
- Removed (deprecated classes) Tpetra::VbrMatrix, Tpetra::BlockMap,
Tpetra::BlockCrsGraph, and Tpetra::BlockMultiVector.
All these classes relate to VBR (variable-block-size block sparse
matrix) functionality. We may reimplement that at some point, but for
now it's going away.
- Removed (deprecated class) Tpetra::HybridPlatform
Teuchos
- Fixed Teuchos::Ptr::operator=() to catch dangling references (6 April 2015)
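A minimal sketch of the kind of error this catches (detection happens
in debug builds; the scenario is illustrative, not the exact reported
case):
  #include "Teuchos_Ptr.hpp"
  #include "Teuchos_RCP.hpp"

  void example () {
    Teuchos::Ptr<int> p;
    {
      Teuchos::RCP<int> r (new int (42));
      p = r.ptr (); // p refers to memory owned by r
    }               // r is destroyed here; p now dangles
    Teuchos::Ptr<int> q;
    q = p;          // operator= now catches the dangling reference
  }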
###############################################################################
# #
# Trilinos Release 11.14 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 11.14 general release contains 55 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Piro, Pliris, PyTrilinos, RTOp, Rythmos, Sacado, SEACAS,
Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos, ThreadPool,
Thyra, Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra, Zoltan,
Zoltan2.
MueLu
- Support for Amesos2 native serial direct solver "Basker".
- ML parameters can be used through MueLu::CreateEpetraPreconditioner and
MueLu::CreateTpetraPreconditioner interfaces
- Several bug fixes:
6256: ML parameter "coarse: type" does not work in
MueLu::MLParameterListInterpreter
6255: Multiple issues in MueLu::MLParameterListInterpreter
- Explicit template instantiation (ETI) changes
The new version of MueLu uses Tpetra macros for specifying the desired
template instantiation values (scalars, local ordinals, global ordinals,
and node types). As such, Tpetra instantiation configure options provide
the necessary MueLu instantiations. For instance, instead of the previous
option
-D MueLu_INST_DOUBLE_INT_LONGLONGINT=ON
a user should write
-D Tpetra_INST_INT_LONG_LONG:BOOL=ON
See Tpetra documentation for a full set of options.
- New reuse feature [EXPERIMENTAL]
MueLu introduced a new experimental reuse feature. A user may specify
partial preservation of a multigrid hierarchy through the "reuse: type"
option. A few variants have been implemented:
- "none"
No reuse, the preconditioner is constructed from scratch
- "emin"
Reuse old prolongator as an initial guess to energy minimization, and
reuse the prolongator pattern.
- "RP"
Reuse smoothed prolongator and restrictor. Smoothers and coarse grid
operators are recomputed.
- "RAP"
Recompute only the finest level smoother.
- "full"
Reuse full hierarchy, no changes.
The current user interface is as follows:
// User constructs a hierarchy for the first time
Teuchos::RCP<MueLu::TpetraOperator<SC,LO,GO,NO> > H =
MueLu::CreateTpetraPreconditioner<SC,LO,GO,NO>(A0, xmlFileName);
...
// User reuses existing hierarchy for consequent steps
MueLu::ReuseTpetraPreconditioner(A1, *H);
- Support for user-provided data [EXPERIMENTAL]
This release of MueLu allows the user to provide data for the first few
levels of the multigrid Hierarchy, while allowing MueLu to construct the
remaining levels. At a minimum, the user needs to provide the data for the
fine-level operator (A), the prolongation operator (P), the restriction
operator (R), and the coarse-level operator (Ac). These operators are
required to derive from the Xpetra::Operator class. This scenario is driven
through a ParameterList interface (see muelu/example/advanced/levelwrap for
some use cases).
Tpetra
- Public release of "Kokkos refactor" version of Tpetra
The "Kokkos refactor" version of Tpetra is a new implementation of
Tpetra. It is based on the new Kokkos programming model in the
KokkosCore subpackage. It coexists with the "classic" version of
Tpetra, which has been DEPRECATED and will be removed entirely in the
12.0 major release of Trilinos. Thus, the Kokkos refactor version
will become the /only/ version of Tpetra at that time.
The Kokkos refactor version of Tpetra maintains mostly backwards
compatibility [SEE NOTE BELOW] with the classic version's interface.
Its interface will continue to evolve. For this first public release,
we have prioritized backwards compatibility over interface innovation.
The implementation of the Kokkos refactor version of Tpetra currently
lives in tpetra/core/src/kokkos_refactor. It works by partial
specialization on the 'Node' template parameter, and by a final 'bool'
template parameter (which users must NEVER SPECIFY EXPLICITLY). The
"classic" version of Tpetra uses the old ("classic") Node types that
live in the KokkosClassic namespace. All of the classic Node types
have been DEPRECATED, which is how users can see that classic Tpetra
has been deprecated.
If you wish to disable the Kokkos refactor version of Tpetra, set the
Tpetra_ENABLE_Kokkos_Refactor CMake option to OFF. Please note that
this will result in a large number of warnings about deprecated
classes. This CMake option will go away in the 12.0 release.
- Note on backwards compatibility of Tpetra interface
In the new version of Tpetra, MultiVector and Vector implement /view
semantics/. That is, the one-argument copy constructor and the
assignment operator (operator=) perform shallow copies. (By default,
in the classic version of Tpetra, they did deep copies.) For deep
copies, use one of the following:
- Two-argument "copy constructor" with Teuchos::Copy as the second
argument (to create a new MultiVector or Vector which is a deep
copy of an existing one)
- Tpetra::deep_copy (works like Kokkos::deep_copy)
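A minimal sketch of the difference (map is an assumed, existing
Tpetra::Map RCP):
  #include "Tpetra_MultiVector.hpp"

  Tpetra::MultiVector<> X (map, 2);           // two columns
  Tpetra::MultiVector<> Y (X);                // SHALLOW copy (view semantics)
  Tpetra::MultiVector<> Z (X, Teuchos::Copy); // deep copy (two-arg constructor)
  Tpetra::deep_copy (Y, X);                   // deep copy into an existing object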
- What if I have trouble building with Scalar=std::complex<T>?
The new version of Tpetra should be able to build with Scalar =
std::complex<float> or std::complex<double>. If you have trouble
building, you may disable explicit template instantiation (ETI) and
tests for those Scalar types, using the following CMake options:
Tpetra_INST_COMPLEX_FLOAT:BOOL=OFF
Tpetra_INST_COMPLEX_DOUBLE:BOOL=OFF
- Accessing and changing the default Node type
Tpetra classes have a template parameter, "Node", which determines
what thread-level parallel programming model Tpetra will use. This
corresponds directly to the "execution space" concept in Kokkos.
Tpetra classes have a default Node type. Users do NOT need to specify
this explicitly. I cannot emphasize this enough:
IF YOU ONLY EVER USE THE DEFAULT VALUES OF TEMPLATE PARAMETERS, DO NOT
SPECIFY THEM EXPLICITLY.
If you need to refer to the default values of template parameters, ask
Tpetra classes. For example, 'Tpetra::Map<>::node_type' is the
default Node type.
Tpetra pays attention to Kokkos' build configuration when determining
the default Node type. For example, it will not use a disabled
execution space. If you do not like the default Node type, but you
only ever use one Node type in your application, you should change the
default Node type at Trilinos configure time. You may do this by
setting the 'KokkosClassic_DefaultNode' CMake option. Here is a list
of reasonable values:
"Kokkos::Compat::KokkosSerialWrapperNode": use Kokkos::Serial
execution space (execute in a single thread on the CPU)
"Kokkos::Compat::KokkosOpenMPWrapperNode": use Kokkos::OpenMP
execution space (use OpenMP for thread-level parallelism on the CPU)
"Kokkos::Compat::KokkosThreadsWrapperNode": use Kokkos::Threads
execution space (use Pthreads (the POSIX Threads library) for
thread-level parallelism on the CPU)
"Kokkos::Compat::KokkosCudaWrapperNode": use Kokkos::Cuda execution
space (use NVIDIA's CUDA programming model for thread-level
parallelism on the CPU)
You must use the above strings with the 'KokkosClassic_DefaultNode'
CMake option. If you choose (unwisely, in many cases) to specify the
Node template parameter directly in your code, you may use those
names. Alternately, you may let the Kokkos execution space determine
the Node type, by using the templated class
Kokkos::Compat::KokkosDeviceWrapperNode. This class is templated on
the Kokkos execution space. The above four types are typedefs to
their corresponding specializations of KokkosDeviceWrapperNode. For
example, KokkosSerialWrapperNode is a typedef of
KokkosDeviceWrapperNode<Kokkos::Serial>. This may be useful if your
code already makes use of Kokkos execution spaces.
- Changes to subpackages
Tpetra is now divided into subpackages. What was formerly just
"Tpetra" is now "TpetraCore". Other subpackages of Kokkos have moved,
some into Teuchos and some into Tpetra. Those subpackages have
changed from Experimental (EX) to Primary Tested (PT), so that they
build by default if Tpetra is enabled.
The most important change is that Tpetra now has a required dependency
on the Kokkos programming model. See below.
If your application links against Trilinos using either the
Makefile.export.* system or the CMake FIND_PACKAGE(Trilinos ...)
system, you do not need to worry about this. Just enable Tpetra and
let Trilinos' build system handle the rest.
- New required dependency on Kokkos
Tpetra now has a required dependency on the Kokkos programming model.
In particular, TpetraCore (see above) has required dependencies on the
KokkosCore, KokkosContainers, and KokkosAlgorithms subpackages of
Kokkos.
This means that Tpetra is now subject to Kokkos' build requirements.
C++11 support is still optional in this release, but future releases
will require C++11 support. Please refer to Kokkos' documentation for
more details.
- Deprecated variable-block-size classes (like VbrMatrix).
We have deprecated the following classes in the Tpetra namespace:
- BlockCrsGraph
- BlockMap
- BlockMultiVector (NOT Tpetra::Experimental::BlockMultiVector)
- VbrMatrix
These classes relate to "variable-block-size" vectors and matrices.
Tpetra::BlockMultiVector (NOT the same as
Tpetra::Experimental::BlockMultiVector) implements a
variable-block-size block analogue of MultiVector. Each row of a
MultiVector corresponds to a single degree of freedom; each block row
of a BlockMultiVector corresponds to any number of degrees of freedom.
"Variable block size" means that different block rows may have
different numbers of degrees of freedom. An instance of
Tpetra::BlockMap represents the block (row) Map of a BlockMultiVector.
Tpetra::VbrMatrix implements a variable-block-size block sparse matrix
that corresponds to BlockMultiVector. Each (block) entry of a
VbrMatrix is its own dense matrix. These dense matrices are not
distributed; they are locally stored and generally "small" (think
"fits in cache"). An instance of Tpetra::BlockCrsGraph represents the
block graph of a VbrMatrix.
Here are the reasons why we are deprecating these classes:
- Their interfaces as well as their implementations need a
significant redesign for MPI+X, e.g., for efficient use of
multiple levels of parallelism.
- They are poorly exercised, even in comparison to their Epetra
equivalents.
- They have poor test coverage, and have outstanding known bugs: see
e.g., Bug 6039.
- Most users don't need a fully general VBR [1].
- We would prefer to name the VBR classes consistently, both to
emphasize the V (variable) part and to distinguish them from the
new constant-block-size classes.
[1] Many users' block matrices have blocks which are all the same
size. They would get best performance by using the new
constant-block-size classes that currently live in the
Tpetra::Experimental namespace. Others usually only have a small
number of different block sizes per matrix (e.g., 3 degrees of
freedom per interior mesh point; 2 for boundary mesh points). The
latter users could get much better performance by a data structure
that represents the sparse matrix as a sum of constant-block-size
matrices.
Zoltan2
- The PartitioningSolution class's interface has changed.
- methods getPartList and getProcList have been renamed to
getPartListView and getProcListView to emphasize that a view, not a copy,
is being returned.
- method getPartListView now returns the part identifiers in the same order
that the local data was provided: the user's localData[i] is assigned
to getPartListView()[i]. Conversions from global identifiers
from PartitioningSolution::getIdList() to local identifiers are no longer
needed. (See the sketch after this list.)
- methods getIdList and getLocalNumberOfIds have been removed.
- method convertSolutionToImportList has been removed and replaced
by the helper function getImportList in Zoltan2_PartitioningHelpers.hpp.
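A minimal sketch of the new view-based access (solution and
localNumIds are illustrative names; part_t stands for the adapter's
part type):
  // getPartListView returns a view, not a copy; do not free it.
  // Entry i is the part assigned to the user's localData[i].
  const part_t *parts = solution.getPartListView ();
  for (size_t i = 0; i < localNumIds; ++i) {
    // ... send local item i to part parts[i] ...
  }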
- pointAssign and boxAssign methods have been added for some geometric
partitioners. Support is provided through MultiJagged (MJ) partitioning.
pointAssign returns a part number that contains a given geometric point.
boxAssign returns all parts that overlap a given geometric box.
- New graph coloring options:
- The parameter color_choice can be used to obtain a more balanced coloring.
Valid values are FirstFit, Random, RandomFast, and LeastUsed. (See the
sketch below.)
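A hedged sketch of requesting a coloring through the usual
ParameterList-driven setup (adapter and its type adapter_t are assumed):
  #include <Teuchos_ParameterList.hpp>
  #include <Zoltan2_ColoringProblem.hpp>

  Teuchos::ParameterList params;
  params.set ("color_choice", "LeastUsed"); // or FirstFit, Random, RandomFast
  Zoltan2::ColoringProblem<adapter_t> problem (&adapter, &params);
  problem.solve ();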
- New partitioning options:
- Scotch interface updated to Scotch v6 or later (tested against v6.0.3).
- Interface to ParMETIS v4 or later added (tested against v4.0.3).
- Miscellaneous:
- Parameter "rectilinear_blocks" has been renamed "rectilinear".
###############################################################################
# #
# Trilinos Release 11.12 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 11.12 general release contains 55 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Piro, Pliris, PyTrilinos, RTOp, Rythmos, Sacado, SEACAS,
Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos, ThreadPool,
Thyra, Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra, Zoltan,
Zoltan2.
Framework Release Notes:
- Changed minimum version of CMake from 2.7 to 2.8.11.
MueLu
Trilinos 11.12 is the initial release of MueLu.
MueLu is an extensible multigrid library that is part of the Trilinos project.
MueLu works with Epetra (32- and 64-bit versions) and Tpetra matrix types. The
library is written in C++ and allows for different ordinal (index) and scalar
types. MueLu is designed to be efficient on many different computer
architectures, from workstations to supercomputers. While it is MPI based,
MueLu relies on the "MPI+X" principle, where "X" can be threading or CUDA.
MueLu's software design allows for the rapid introduction of new multigrid
algorithms.
MueLu provides a number of different multigrid algorithms:
- smoothed aggregation algebraic multigrid (AMG), appropriate for
Poisson-like and elasticity problems
- Petrov-Galerkin aggregation AMG for convection-diffusion problems
- aggregation-based AMG for problems arising from the eddy current
formulation of Maxwell's equations
A PDF user's guide is located in muelu/doc/UsersGuide. To compile it, simply
run "make".
PyTrilinos
- General
* Got rid of the NestedEpetra paradigm and now use relative imports
to address the problem that NestedEpetra was supposed to solve.
* Changed build system for docstrings. The docstrings are no longer
stored in the repository. If the user wants docstrings, then
doxygen needs to be installed. Docstrings are built during the
configuration phase.
* Fixed warnings due to Epetra 32/64 bit handling
* Added mpi4py support. Specifically, the Epetra_MpiComm
constructors and Teuchos::MpiComm<> constructors can now take MPI
sub-communicators, provided by the mpi4py.MPI.Comm class. This is
ignored if mpi4py is not found.
* Updated Developers Guide
- Teuchos module
* Support for the Teuchos::DataAccess enumeration. This enables
certain Tpetra and Domi constructors.
* Added Teuchos::Arrays of int, long, float, and double as valid
types for PyTrilinos ParameterLists.
- DistArray Protocol
* Added support for the DistArray Protocol. This is preliminary
and, unfortunately, does not provide any functionality with this
release.
- LOCA module
* Re-instated the LOCA wrappers. These are still experimental and
require SWIG 3.0.0 and Python 2.5.
* Re-introduction of the LOCA interface required the introduction of
relative imports, which require Python 2.5 or higher.
- Isorropia module
* Refactor of Isorropia directory structure. This should not affect
users.
Tpetra
- Kokkos refactor version of Tpetra
The "Kokkos refactor" version of Tpetra is the new version of Tpetra,
based on the new Kokkos programming model in the KokkosCore
subpackage. It coexists with the "classic" version of Tpetra, which
is currently the default version. We plan to deprecate the "classic"
version of Tpetra in the 11.14 minor release in January, and to remove
it entirely in the 12.0 major release. Thus, the "Kokkos refactor"
version of Tpetra will become the /only/ version of Tpetra at that
time.
The implementation of the Kokkos refactor version of Tpetra currently
lives in src/kokkos_refactor. It works by partial specialization on
the Node template parameter. If you would like to enable this version
of Tpetra, here is a suggested set of CMake options:
# Enable OpenMP, and enable Kokkos' OpenMP backend
-D Trilinos_ENABLE_OpenMP:BOOL=ON
# Set Tpetra's default Node type to use new Kokkos with OpenMP.
# You could also use KokkosThreadsWrapperNode or even
# KokkosSerialWrapperNode here.
-D KokkosClassic_DefaultNode:STRING="Kokkos::Compat::KokkosOpenMPWrapperNode"
# Enable the Kokkos refactor version of Tpetra.
-D Tpetra_ENABLE_Kokkos_Refactor:BOOL=ON
In a debug build, you might like to enable Kokkos' run-time bounds
checking. Here's how you do that. These are _optional_ parameters
and their default values are both OFF (not enabled).
-D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON
-D Kokkos_ENABLE_DEBUG:BOOL=ON
The following options may reduce build times if ETI is enabled:
# Disable KokkosClassic::OpenMPNode
-D KokkosClassic_ENABLE_OpenMP:BOOL=OFF
# Disable KokkosClassic::TPINode
-D KokkosClassic_ENABLE_ThreadPool:BOOL=OFF
# Shut off Kokkos' Pthreads back-end in favor of OpenMP
-D Kokkos_ENABLE_Pthread:BOOL=OFF
You must also enable the following subpackages explicitly, since they
are not Primary Tested at the moment:
- KokkosCore
- KokkosCompat
- KokkosContainers
- KokkosLinAlg
- KokkosAlgorithms
- KokkosMpiComm
If Tpetra_ENABLE_Kokkos_Refactor is ON but any of those subpackages
are not enabled, CMake will stop with an error message that tells you
what subpackages to enable.
If you would like to build with the above subpackages enabled, but
would /not/ like to build Tpetra with any of the new Kokkos Nodes, you
may try setting the CMake option KokkosClassic_ENABLE_KokkosCompat to OFF.
This works for me as of 07 Oct 2014, but I do not recommend it, and it
is not supported.
Fun fact: there are three relevant combinations of (new Kokkos
enabled?, Kokkos refactor enabled?), and we test them all! You can
use the new Kokkos Node types with "classic" Tpetra, or you can use
them with "Kokkos refactor" Tpetra.
Most Tpetra tests exercise all enabled Node types, or just use the
default Node type. Ifpack2 tests only use the default Node type
currently. That's why the above build configuration changes the
default Node type. That way, all packages that depend on Tpetra will
use the Kokkos refactor version of Tpetra in /their/ tests by default.
- Full set of default values of template parameters
Usability improvement! Most Tpetra classes now come with a full set
of default values of template parameters. In many cases, you need no
longer specify _any_ template parameters' values, if you only intend
to use their defaults. For example, you may now write the following:
// All default template parameters!
Tpetra::Map<> map (...);
// No "typename" because Map<> is a concrete type.
typedef Tpetra::Map<>::local_ordinal_type LO;
typedef Tpetra::Map<>::global_ordinal_type GO;
for (LO i_lcl = map.getMinLocalIndex ();
i_lcl <= map.getMaxLocalIndex (); ++i_lcl) {
const GO i_gbl = map.getGlobalElement (i_lcl);
// ...
}
// All default template parameters!
// Scalar defaults to double.
// LocalOrdinal, GlobalOrdinal, and Node default
// to the same values as those of Map<> above.
Tpetra::MultiVector<> X (...);
Also, if you need to specify (say) GlobalOrdinal explicitly, you don't
have to specify Node explicitly. For example:
// Don't need to specify Node; it takes its default value.
Tpetra::Map<int, long long> map (...);
Tpetra::MultiVector<double, int, long long> X (...);
You may specify the default value of Node at Trilinos configure time
(that is, when running CMake). The current default is
KokkosClassic::SerialNode (no threads; MPI only). This will change,
but it will always have a reasonable value for conventional multicore
processors.
Please, _please_ prefer default values of template parameters! This
will make your code shorter, allow more flexibility at configure time,
and might even make builds a bit faster. All Tpetra classes come with
public typedefs, so you can pick up scalar_type (if applicable),
local_ordinal_type, global_ordinal_type, and node_type from Tpetra
directly, rather than specifying them explicitly.
- Removed the LocalMatOps template parameter
CrsGraph, CrsMatrix, VbrMatrix, and other classes used to have a
LocalMatOps template parameter. This was the fourth template
parameter of CrsGraph and the fifth template parameter of CrsMatrix.
It was always optional. Chris Baker intended it as an extension point
for users or third-party vendors to insert their own sparse
matrix-vector multiply or triangular solve routines. However, no one
ever used it for this purpose as far as we know. When it started to
hinder the Kokkos refactor effort (see release notes for Trilinos
11.10 below), we removed it. This should speed up compilation times.
Lesson: It's always easier to _add_ a template parameter (at the end,
if it's optional) than it is to remove one.
Getting rid of LocalMatOps does amount to a backwards incompatible
interface change. However, we deemed it a harmless change, for the
following reasons:
1. LocalMatOps has a reasonable default value.
2. As far as I know, no one other than Chris Baker and myself ever
wrote or used alternate implementations of LocalMatOps.
3. Trilinos packages or applications which bothered to specify
LocalMatOps never used anything other than the default value.
Thus, it never even crossed my mind that applications would bother to
specify this thing. Unfortunately, some applications may still specify
LocalMatOps explicitly. This template parameter is unnecessary. You do
not need to specify it. The default value was always perfectly fine and
has been for years.
###############################################################################
# #
# Trilinos Release 11.10 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 11.10 general release contains 54 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, NOX, Optika, OptiPack, Pamgen,
Phalanx, Piro, Pliris, PyTrilinos, RTOp, Rythmos, Sacado, SEACAS, Shards,
ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos, ThreadPool, Thyra,
Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra, Zoltan, Zoltan2.
Anasazi
- Improved examples and added more explanatory comments.
Thanks to Alicia Klinvex (Purdue) for review and suggestions.
Ifpack2
- Fixed bug in ReorderFilter (related to Bug 6117).
ReorderFilter was computing the number of entries for the wrong row,
which caused temporaries (either in Ifpack2::ReorderFilter or in
MueLu) to be allocated with the wrong size. This may be related to
Bug 6117.
- LocalFilter::apply now checks whether X aliases Y.
- Performance improvements to apply() of BlockRelaxation, LocalFilter,
and Relaxation (10-11 Apr 2014).