Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SUM Function is Partially Broken #2042

Closed
oleibman opened this issue Apr 29, 2021 · 3 comments
Closed

SUM Function is Partially Broken #2042

oleibman opened this issue Apr 29, 2021 · 3 comments

Comments

@oleibman
Copy link
Collaborator

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

In Excel, =SUM(5.7, "x", 30) returns #VALUE. However, if A1/A2/A3 contain 5.7, x, and 30, =SUM(A1:A3) returns 35.7. With the recent re-factoring of MathTrig, both formulas evaluate to #VALUE. @MarkBaker and @PowerKiKi, I have no idea how to distinguish the 2 situations from each other in the code. It would, of course, be easy to change so that both returned 35.7, which is probably better.

What is the current behavior?

All samples using template sampleSpreadsheet.php are affected. When writing those samples as Xlsx, Xls, and Ods, they nevertheless appear okay on open because Excel etc. re-evaluate the function. However, writing those samples to Csv, Html, or Pdf will show #VALUE! when a number is expected.

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

After running the sample suite, 16_Csv.csv and 17_Html.html demonstrate this problem.

Which versions of PhpSpreadsheet and PHP are affected?

Master.

@oleibman
Copy link
Collaborator Author

I may have a solution. It won't be ready for a few hours. For a preview, and to give you the opportunity to suggest why this might not work, here's what I'm thinking:

        foreach ($args as $arg) {
            if (!is_array($arg) && !is_numeric($arg)) {
                return Functions::VALUE();
            }
        }

Add that before the main loop, and eliminate return VALUE in the main loop.

@MarkBaker
Copy link
Member

MarkBaker commented Apr 29, 2021

We do actually have a mechanism to handle this, although it is a bit clumsy, and MS aren't very good at documenting which functions work this way and which don't.
In the case of SUM(): if the value "X" is a literal, then it is treated as an error in the argument list; if it is a cell value, then it is treated as valid, but with a zero value.

Rather than using the standard call to Functions::flattenArray(), we use Functions::flattenArrayIndexed() instead. The flattenArray() call returns a simple enumerated array of all the arguments; but flattenArrayIndexed() returns an associative indexed array, when the index is a reference to the worksheet/row/column. Then in the processing loop, we iterate using both key and value (foreach ($aArgs as $k => $arg))... and inside the loop we can then do a check using Functions::isCellValue($k) which will return true if the value is a cell values, and false if it is a literal; and the code logic inside the loop can then determine whether to return a VALUE! error, or process the string as a 0 value.

As an example of a function that does use this, take a look at COUNTA() in the Statistical Counts, or at the averageDeviations() method in Statistical Averages
There's even unit test cases forCOUNTA()`

@oleibman
Copy link
Collaborator Author

Thank you. I will use the COUNTA model, and expect to have have a PR, with tests, available tonight (North America west coast).

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Apr 30, 2021
As issue PHPOffice#2042 documents, SUM behaves differently with invalid strings depending on whether they come from a cell or are used as literals in the formula. SUM is not alone in this regard; COUNTA is another function within this behavior, and the solution to this one is modeled on COUNTA. New tests are added for SUM, and the resulting tests are duplicated to confirm correct behavior for both cells and literals.

Samples 16 (CSV), 17 (Html), and 21 (PDF) were adversely affected by this problem. 17 and 21 were immediately fixed, but 16 had another problem - Excel was not interpreting the UTF8 currency symbols correctly, even though the file was saved with a BOM. After some experimenting, it appears that the `sep=;` line generated by setExcelCompatibility(true) causes Excel to mis-handle the file. This seems like a bug - there is apparently no way to save a UTF-8 CSV with non-ASCII characters which specifies a non-standard separator which Excel will open correctly. I don't know if this is a recent change or if it is just the case that nobody noticed this problem till now. So, I changed Sample 16 to use setUseBom rather than setExcelCompatibility, which solved its problem. I then added new tests for setExcelCompatibility, with documentation of this problem.
MarkBaker pushed a commit that referenced this issue May 3, 2021
As issue #2042 documents, SUM behaves differently with invalid strings depending on whether they come from a cell or are used as literals in the formula. SUM is not alone in this regard; COUNTA is another function within this behavior, and the solution to this one is modeled on COUNTA. New tests are added for SUM, and the resulting tests are duplicated to confirm correct behavior for both cells and literals.

Samples 16 (CSV), 17 (Html), and 21 (PDF) were adversely affected by this problem. 17 and 21 were immediately fixed, but 16 had another problem - Excel was not interpreting the UTF8 currency symbols correctly, even though the file was saved with a BOM. After some experimenting, it appears that the `sep=;` line generated by setExcelCompatibility(true) causes Excel to mis-handle the file. This seems like a bug - there is apparently no way to save a UTF-8 CSV with non-ASCII characters which specifies a non-standard separator which Excel will open correctly. I don't know if this is a recent change or if it is just the case that nobody noticed this problem till now. So, I changed Sample 16 to use setUseBom rather than setExcelCompatibility, which solved its problem. I then added new tests for setExcelCompatibility, with documentation of this problem.
@oleibman oleibman closed this as completed May 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants