Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Named formula implementation, and improved handling of Defined Names generally #1535

Merged
merged 151 commits into from
Jul 26, 2020

Conversation

MarkBaker
Copy link
Member

@MarkBaker MarkBaker commented Jun 17, 2020

This is:

- [X] a bugfix
- [X] a new feature

Checklist:

Why this change is needed?

  • UTF-8 names (already extracted as a separate PR and merged)

  • Defined Names are now case-insensitive

  • Better support for ranges defined in the format sheetName!A1:sheetName!B2, as long as both parts reference the same worksheet (already extracted as a separate PR and merged for straight range references, support in named ranges is part of this PR)

  • distinction between named ranges and named formulae

  • correct handling of union and intersection operators in named ranges

  • correct evaluation of named range operators in calculations

  • fix resolution of relative named range values in the calculation engine; previously all named range values had been treated as absolute.

  • calculation support for named formulae

  • support for nested ranges and formulae (named ranges and formulae that reference other named ranges/formulae) in calculations

  • Introduction of a helper to convert address formats between R1C1 and A1 (and the reverse)

  • Proper support for both named ranges and named formulae in all appropriate Readers

    • Xlsx (Previously only simple named ranges were supported)
    • Xls (Previously only simple named ranges were supported)
    • Gnumeric (Previously neither named ranges nor formulae were supported)
    • Ods (Previously neither named ranges nor formulae were supported)
    • Xml (Previously neither named ranges nor formulae were supported)
  • Proper support for named ranges and named formulae in all appropriate Writers

    • Xlsx (Previously only simple named ranges were supported)
    • Xls (Previously neither named ranges nor formulae were supported) - Still not supported, but some parser issues resolved that previously failed to differentiate between a defined name and a function name
    • Ods (Previously neither named ranges nor formulae were supported)

IMPORTANT NOTE: This Introduces a BC break in the handling of named ranges. Previously, a named range cell reference of B2 would be treated identically to a named range cell reference of $B2 or B$2 or $B$2 because the calculation engine treated then all as absolute references. These changes "fix" that, so the calculation engine now handles relative references in named ranges correctly.
This change that resolves previously incorrect behaviour in the calculation may affect users who have dynamically defined named ranges using relative references when they should have used absolute references.

MarkBaker added 30 commits June 13, 2020 12:33
…y the calculation engine

This should provide better support for:
  - both union and intersection operators in composite named range values
  - MS Excel implementation of the union operator duplicating values
  - named formulae
  - named ranges and formulae that reference other named ranges and formulae
  - ranges and formulae that reference multiple ranges across multiple worksheets
…Named-Formula-Implementation

# Conflicts:
#	src/PhpSpreadsheet/Calculation/Calculation.php
…curately reflects all defined names, not simply ranges
…ae) correctly

 - UTF-8 names (already extracted as a separate PR and merged)
 - distinction between named ranges and named formulae
 - correct handling of union and intersection operators in named ranges
 - correct evaluation of named range operators in calculations
 - calculation support for named formulae
 - support for nested ranges and formulae (named ranges and formulae that reference other named ranges/formulae) in calculations
# Conflicts:
#	src/PhpSpreadsheet/Calculation/Calculation.php
#	tests/PhpSpreadsheetTests/Calculation/Engine/RangeTest.php
…ren't actually supported previously... this is still ongoing work
# Conflicts:
#	src/PhpSpreadsheet/Calculation/Calculation.php
…as the references are both to the same worksheet
…ulation engine. Catch and replace with `#NAME?`
@MarkBaker MarkBaker marked this pull request as ready for review July 13, 2020 19:48
MarkBaker added 3 commits July 14, 2020 10:58
Copy link
Member

@PowerKiKi PowerKiKi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually try to avoid adding binary files, such as NamedFormulae.xlsx, to the repo to avoid increasing its size too much. Do you think it would be reasonable to create that file from scratch, in-memory, instead of reading a pre-existing file ? or should keep the file as is ?

src/PhpSpreadsheet/DefinedName.php Outdated Show resolved Hide resolved
src/PhpSpreadsheet/DefinedName.php Outdated Show resolved Hide resolved
@MarkBaker
Copy link
Member Author

I'd tried to keep the binaries as small as possible, and with as many variations and permutations for testing as possible; and I do think they're essential for testing how the Reader handles them (it's only possible to cover so much with simulated files, or with partial xml). Although the tests themselves are in Calculation, I'd considered adding specific Reader tests using those same files, with an @depends (in the end, I decided that was more awkward,and that Reader failures would trigger in the existing tests that use those files anyway). If I'd followed that route, then I'd also have felt I needed to add binaries for xls, xml, ods and gnumeric as well (ods is particularly awkward with three levels of pointers spread across two files within the zip) to cover testing those Readers as well.

I do think we need some binaries to test the Readers. We can't always create/write then read what we've written, because that doesn't verify that we can read a genuine file, only one that we've created.
I know the existing binaries are filled with a myriad of different types and styles, so one file can be used to test may different features, but felt that new files were more appropriate here because of the complexities of defined names.

My feeling is to keep them (even still tempted to add xls, xml, ods and gnumeric files as well) but with some additional Reader specific unit tests running against the same files.

@MarkBaker
Copy link
Member Author

One of my big concerns was the potential BC break in the behaviour of relative ranges, if users define their named ranges as relative, expecting them to be treated as absolute... MS Excel would always have treated these ranges as relative in the generated file, so it only really affects results returned by the calculation engine itself for a getCalculatedValue() call, so I wasn't sure whether that should require a 2.0.0 release, or was minor and internal enough that it could just be part of the standard release cycle.

@PowerKiKi
Copy link
Member

IMHO what you describe as BC break actually is bug fixing. Previous behavior was incorrect, and was changed to conform to what Excel does. I don't think that require a major version. I'd be ok with releasing this in next minor version.

Copy link
Member

@PowerKiKi PowerKiKi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as long as we resolve conflict

MarkBaker added 5 commits July 26, 2020 09:50
# Conflicts:
#	CHANGELOG.md
#	src/PhpSpreadsheet/Calculation/Calculation.php
#	src/PhpSpreadsheet/Calculation/LookupRef.php
#	src/PhpSpreadsheet/Reader/Gnumeric.php
#	src/PhpSpreadsheet/Reader/Ods.php
#	src/PhpSpreadsheet/Reader/Xml.php
#	src/PhpSpreadsheet/Worksheet/Worksheet.php
#	src/PhpSpreadsheet/Writer/Xls/Parser.php
@MarkBaker MarkBaker merged commit 8b0aaf7 into master Jul 26, 2020
@oleibman oleibman mentioned this pull request Dec 3, 2020
@MarkBaker MarkBaker deleted the Named-Formula-Implementation branch April 26, 2022 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BREAKING CHANGE calculation engine reader/gnumeric Reader for Gnumeric spreadsheet files reader/ods Reader for Open/LibreOffice spreadsheet files (OASIS) reader/xls Reader for MS BIFF-format (xls) spreadsheet files reader/xlsx Reader for MS OfficeOpenXML-format (xlsx) spreadsheet files reader/xml Reader for MS SpreadsheetML spreadsheet files writer/ods Writer for Open/LibreOffice spreadsheet files (OASIS) writer/xls Writer for MS BIFF-format (xls) spreadsheet files writer/xlsx Writer for MS OfficeOpenXML-format (xlsx) spreadsheet files
Development

Successfully merging this pull request may close these issues.

2 participants