-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[flang] Add runtime interface for NUM_IMAGES #1076
base: fir-dev
Are you sure you want to change the base?
Conversation
…project into fir-dev Getting ready to merge changes for the CSHIFT intrinsic.
This change implements the lowering runtime portion of the CSHIFT intrinsic. The implementation of the runtime was done in revision D106292.
[flang] Implement the lowering portion of the CSHIFT intrinsic
…roduct Add lowering for the DOT_PRODUCT intrinsic I confirmed that it builds and runs without issue (regression tests pass and an end-to-end dot product test runs correctly too). Thanks @banach-space !
Initial implementation with assumption that rank will be always 2
In processing the UNPACK intrinsic, my implementation had an unneeded extra call to createBox(). This change removes it.
-> Handle case where rank of one the matrices is 1. -> Add test for the above case with logical type. -> Remove return value from the matmul call.
This change fixes a bug in the runtime portion of the CSHIFT intrinsic that happens when the value of the SHIFT argument is negative. Differential Revision: https://reviews.llvm.org/D106292
[flang] Remove an unneeded call to createBox
…soc-review-comments [NFC] address comments from flang-compiler#907
When instantiating variables, identify the ones that must be default initialized at runtime (locals and intent(out) dummies). Call the default initialization runtime routine to perform the initialization. Global variables that have no explicit initialization and have a default initialization are left TODO. The runtime cannot be used for them, the fir.global generation will need to be updated for these variables (it was generating an undefined undefined until now). For now, simply catch and add a hard TODO for these cases. For information, it is firing 13 times in NAG F95 tests. Runtime interface for finalization is added, but nothing is done with it yet (Final procedures are not F95).
…-type-init-2 Lower default initialization of local and dummy entities
[OpenMP][Flang]Add support for modifiers monotonic and non-monotonic
Also insert casts in fir.array_coor op (error revealed by the first fix).
…issue Fix fir.alloca codegen for array with dynamic type parameters
It does not make sense to embox a box. Instead we can convert the box's signature with a fir.convert, which is idempotent on the boxed value.
…faces Since BOZ literal arguments are typeless, we cannot know how to pass them as actual arguments to procedures with implicit interfaces. This change avoids the problem by emitting an error message in such situations. This change stemmed from the following issue -- flang-compiler#794 Differential Revision: https://reviews.llvm.org/D106831
According to C7109, "A boz-literal-constant shall appear only as a data-stmt-constant in a DATA statement, or where explicitly allowed in 16.9 as an actual argument of an intrinsic procedure." This change enforces that constraint for output list items. I also added a general interface to determine if an expression is a BOZ literal constant and changed all of the places I could find where it could be used. I also added a test. This change stemmed from the following issue -- https://gitlab-master.nvidia.com/fortran/f18-stage/issues/108 Differential Revision: https://reviews.llvm.org/D106893
When a WRITE overwrites an endfile record, we need to forget that there was an endfile record. When doing a BACKSPACE after an explicit ENDFILE statement, the position afterwards must be upon the endfile record. Attempts to join list-directed delimited character input across record boundaries was due to a bad reading of the standard and has been deleted, now that the requirements are better understood. This problem would cause a read attempt past EOF if a delimited character input value was at the end of a record. It turns out that delimited list-directed (and NAMELIST) character output is required to emit contiguous doubled instances of the delimiter character when it appears in the output value. When fixed-size records are being emitted, as is the case with internal output, this is not possible when the problematic character falls on the last position of a record. No two other Fortran compilers do the same thing in this situation so there is no good precedent to follow. Because it seems least wrong, with this patch we now emit one copy of the delimiter as the last character of the current record and another as the first character of the next record. (The second-least-wrong alternative might be to flag a runtime error, but that seems harsh since it's not an explicit error in the standard, and the output may not have to be usable later as input anyway.) Consequently, the output is not suitable for use as list-directed or NAMELIST input. If a later standard were to clarify this case, this behavior will of course change as needed to conform. Differential Revision: https://reviews.llvm.org/D106695
NAMELIST I/O formatting uses the runtime infrastructure for list-directed I/O. List-directed input processing has same state that requires reinitialization for each successive NAMELIST input item. This patch fixes bugs with "null" items and repetition counts on NAMELIST input items after the first in the group. Differential Revision: https://reviews.llvm.org/D106694
cherry pick runtime I/O fixes for namelist array input, backspace, rewind
Lowering for the Matmul Intrinsic
All tests from flang/test/Lower/intrinsic-procedures.f90 are moved to dedicated files (with the exception of `lge`, `lgt`, `lle` and `llt`). With this change, every filename clearly defines which intrinsic procedure is being tested. After the initial split, we discovered that some tests were invalid (e.g. `! CHECK` was used instead of `! CHECK:`). These tests are now fixed. Some tests are additionally hardened (e.g. some `CHECK-DAG` directives was replaced with `CHECK`). Some tests are updated with checks generated with mlir/utils/generate-test-checks.py. In other words, this patch _moves_ and _reformats_ the tests. The documentation is updated accordingly (with some extra clarifications, as suggested by the reviewers). Please see the following PRs for a complete discussion: * flang-compiler#927 * flang-compiler#914
…ate_tests [flang][nfc] Move tests from intrinsic-procedures.f90 to dedicated files
Fix the verify test.
…he builder elided those extents. Fixes flang-compiler#942.
Delete code that was redundantly removing the elided extents before t…
…r with a scalar result. See the following Phabricator review for more information: https://reviews.llvm.org/D106820
Fix evaluation semantics of FORALL constructs per 10.2.4.2.4.
The added pass works when mlir::LLVM::LLVMDialect, mlir::acc::OpenACCDialect, and mlir::omp::OpenMPDialect are present, which adds a dependency on those dialect libraries. https://github.com/flang-compiler/f18-llvm-project/blob/bb1ab876a5fec1eb3f2ab0603756f9124515a68a/flang/include/flang/Optimizer/Transforms/Passes.td#L189
Define a VectorSubscriptBox class that allow representing and working with a lowered Designator containing vector subscripts while ensuring all the subscripts expression are only lowered once. It's a bit of a super ExtendedValue (but is is not added as such because it is heavy and its use case is only restricted to IO input so far). The key point of this class is that it has members functions `loopOverElements` and `loopOverElementsWhile` that allow creating loops over each element of the designator, and call a provided callback with the address of the element. This class is used in input IO to create an IO runtime call for each element of the designator, since it is not possible to build a descriptor for it. The `loopOverElementsWhile` version is required when error recovery is enabled in IO. A hidden VectorSubscriptBoxBuilder is in charge of making a custom Designator<T> visit and create the VectorSubscriptBox. Once this is done, the VectorSubscriptBox does not equire any front-end data structures to work with. The motivation for creating such tool is that the current lowering array lowering infrastructure is centered around assignment (to a variable or temporary), and could not be used here to created the loops over the IO runtime calls without some non trivial modification to it. Adding complexity to the array expression lowering framework to cover a corner case did not appear a good idea. The option of creating a temp, passing it to the runtime, and copying it back was explored, but it was not possible to guarantee that the subscript would be evaluated only once given there was no way to "keep the lowered representation" of the designator between the temp creation and the copy back (the temp creation requires evaluating the susbcripts to compute the shape in general). However, note that with the added VectorSubscriptBox, it would actually now also be possible to deploy a temp + copy back mechanism. This can be done later if it appeared beneficial in real world program. The only cases left TODOs are the cases when one of the Components is a parent type reference (not yet handled properly in the general case), the coarray case, and the PDT case (not sure exactly how type params will be threaded in fir.field_index). All other cases are covered, and I tried to add exhaustive tests since I do not expect real world program to be very harsh on this utility (most will just do READ(*, *) x(y)).
…-io-2 Lower IO input with vector subscripts
Also fix some tests that were walls of CHECK-DAG checks.
Changes so that temporaries do not use uniq_name.
- Some file name size made it into transfer.f90 test in flang-compiler#1070 test update through the fir.char<?,90> type. This makes the test dependent on where LLVM directory is. Replace fir.char<?,90> by fir.char<?,{{.*}}> - vector-subscript-io.f90 was not updated with the uniq_name -> bindc_name change in flang-compiler#1070.
Use the array lowering framework to lower elemental subroutine. This required: - Splitting the part of genIterSpace that creates the implicit loops into a new genImplicitLoops function. genIterspace deals with the destination and other explicit contexts set-ups that do not matter in the subroutine case (there is no destination). Other than having no destination, a big difference is that there is no threaded innerArg. genImplicitLoops handles this variation. - Fixing a bug in the iteration shape determination. The code was assuming there would always be at least one array_load, which is not true if there are only RefTransparent operands. This is always true in the subroutine case, but was actually also an issue for things like `print *, elem_func(array)` with the previous code (compile crash). Add a new ArrayOperands structure to keep track of what matters to later deduce the shape of the array expression instead of keeping track of array_loads for that purpose. - Elemental subroutine with intent(out)/intent(inout) arguments must be applied "in array element order" (15.8.3). It turns out this is also the case for impure elemental functions (10.1.4 p5). Instead of always creating loops as "unordered=true" add an unordered field to the ArrayExprLowering class. It is set to true by default, but will be set to false if any impure elemental function is lowered, or in case of intent(out)/intent(inout) arguments in elemental subroutines. - Last, instead of using createFirExpr to lowering subroutine calls, use a new createSubroutineCall that deals with the array/scalar dispatching. A TODO is added for the user defined elemental assignment, because overlap analysis between RHS/LHS must be done, and this require somehow plugging this special and very restricted case of elemental calls into the array_load/array_update/array_merge framework or some ad-hoc lowering.
…ub-2 Lower elemental subroutine calls (outside of user assignment case)
This interface looks good to me, however it looks like it did not make it in LLVM as part of https://reviews.llvm.org/D109547 (I see the runtime discussed in the thread, but I cannot see the runtime change itself in the last diff or in https://github.com/flang-compiler/f18-llvm-project/tree/fir-dev/flang/include/flang/Runtime). Was this intended ? |
NUM_IMAGES takes optional arguments TEAM and TEAM_NUMBER. This patch provides an interface for only the optional TEAM_NUMBER dummy argument. It is likely that a different function will be required to support TEAM arguments. Reviewed By: PeteSteinfeld, klausler Differential Revision: https://reviews.llvm.org/D109547
I don't recall at this point why I thought I had to commit it to fir-dev. Probably because I have the implementation for NUM_IMAGES done and this will definitely have to go in fir-dev. However, I previously pushed
on Thu Sep 16 17:21:40 and this isn't if fir-dev yet. So perhaps I thought that by submitting a pull request to fir dev I could use the the file as soon as it was accepted rather that waiting around for the file to show up in fir-dev. I'm a bit confused at the moment how this all works. I have several intrinsic functions I've been working on that I'd like to get into fir-dev and I'm finding that the coding is the easiest part. Now that I have commit privileges on the llvm-project itself it is straight forward for me to push stuff there. I think I'll just go ahead and submit a pull request for the whole implementation of NUM_IMAGES since the interface has already been accepted. |
This commit provides a single-image implementation of NUM_IMAGES. It returns 1.
I've added the implementation for NUM_IMAGES. It will likely be easier to understand the interface for num_images() in coarray.h in context of the implementation. The implementation is for execution on a single image only, in which case the function returns 1. |
Hi Craig, At this time we are typically cherry picking changes from the main branch onto the fir-dev branch when and where it makes sense to do so. Lowering code is not upstreamed, so changes to that part of the compiler come in as PRs like this one. It looks like this PR could've cherry picked over a change or two, but don't worry about it at this point. Thanks for the help with coarrays. |
|
||
int RTNAME(NumImages)(int team, | ||
const char *sourceFile, int sourceLine) { | ||
std::fputs("Fortran NUM_IMAGES: single image value is 1", stderr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line should go away along with the include of C++ header file.
const char *sourceFile, int sourceLine) { | ||
std::fputs("Fortran NUM_IMAGES: single image value is 1", stderr); | ||
// return value for single image (default coarray runtime) | ||
return 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error checking needs to be added. I need to think through what values team can take on a single image.
This pull request should wait until https://reviews.llvm.org/D110356 is merged into the main branch. It makes several changes related to TEAM_TYPE in flang/lib/Evaluate/intrinsics.cpp that will be useful here. |
Hi, I have just rebased fir-dev on LLVM 85bf221 from last week. You can update the PR as follow (assuming your branch is called my-branch):
Sorry for the inconvenience, I could not wait for every PR to get in and solve all new conflicts on my side, the rebase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR requires a rebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
NUM_IMAGES takes optional arguments TEAM and TEAM_NUMBER. This
patch provides an interface for only the optional TEAM_NUMBER
dummy argument. It is likely that a different function will
be required to support TEAM arguments.
Reviewed By: PeteSteinfeld, klausler
Differential Revision: https://reviews.llvm.org/D109547