Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Removing column when next column value is null shifts cell value from deleted column into the next column #3933

Closed
1 of 8 tasks
orkhanahmadov opened this issue Mar 5, 2024 · 6 comments · Fixed by #3940

Comments

@orkhanahmadov
Copy link
Contributor

orkhanahmadov commented Mar 5, 2024

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

Removing the column when the next column value is null shifts the cell value from the deleted column into the next column.

What is the current behavior?

The removed column should not touch other columns' values.

What are the steps to reproduce?

<?php

require __DIR__ . '/vendor/autoload.php';

$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();

$sheet = $spreadsheet->getActiveSheet();
$sheet->fromArray([
    ['a1', 'b1', 'c1', 'd1'],
    ['a2', 'b2', null, 'd2'],
]);

$sheet->removeColumn('B');

$cells = $sheet->toArray();

$cells output right now will be:

a1, c1, d1
a2, b2, d2

As you can see, the second column's second row contains the value 'b2', which is the value from the deleted column.

$cells output should be:

a1, c1, d1
a2, null, d2

What features do you think are causing the issue

  • Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

All formats

Which versions of PhpSpreadsheet and PHP are affected?

PhpSpreadsheet: 2.0.0

@orkhanahmadov
Copy link
Contributor Author

Previously there was a bug similar bug that seemed to only affect the last column: #2535

But with this bug, seems like the same issue is relevant also for any columns not just the last one.

@orkhanahmadov
Copy link
Contributor Author

orkhanahmadov commented Mar 5, 2024

Looks like the issue is related to the fromArray() method's behavior when a null value is provided. It ignores and does not create a cell when the value for the cell is null. So with the above example source data like:

['a1', 'b1', 'c1', 'd1'],
['a2', 'b2', null, 'd2'],

Cell with coordinate C2 never gets created in the cell collection. This creates that mistaken column value shift column B gets deleted.

I'm still trying to understand what's the purpose of null checking and $strictNullComparison in the fromArray() method...
If I pass $strictNullComparison as true and then remove the if statement from this line, all test cases still seem to pass without failure and this bug also goes away. I'm happy to open a PR to remove that if or maybe better would be to have feedback on it and what purpose it serves. Maybe you can help here, @oleibman?

Currently, our workaround to this bug is to pass an "unrealistic" value as $nullValue and also enable $strictNullComparison. This way we make sure null cells always get created:

$sheet->fromArray(
    source: [
        ['a1', 'b1', 'c1', 'd1'],
        ['a2', 'b2', null, 'd2'],
    ],
    nullValue: 'THIS IS A NULL VALUE',
    strictNullComparison: true
);

@orkhanahmadov
Copy link
Contributor Author

Oh, also, removing that if statement or passing "fake/unrealistic" $nullValue also solves the previous issue related to #2535, making possible to remove these lines completely: https://github.com/PHPOffice/PhpSpreadsheet/blob/master/src/PhpSpreadsheet/ReferenceHelper.php#L404-L416

@orkhanahmadov orkhanahmadov changed the title Removing column when next column value is null shifts cell value from deleted column into the next column [Bug] Removing column when next column value is null shifts cell value from deleted column into the next column Mar 5, 2024
@oleibman
Copy link
Collaborator

oleibman commented Mar 6, 2024

I do not believe there is anything wrong with toArray. I think this problem might have something to do with the following code in ReferenceHelper:

        // Find missing coordinates. This is important when inserting column before the last column
        $cellCollection = $worksheet->getCellCollection();
        $missingCoordinates = array_filter(
            array_map(fn ($row): string => "{$highestDataColumn}{$row}", range(1, $highestDataRow)),
            fn ($coordinate): bool => $cellCollection->has($coordinate) === false
        );

        // Create missing cells with null values
        if (!empty($missingCoordinates)) {

I think B2 should be showing up in missingCoordinates, but isn't showing up there. I don't really understand how this logic is supposed to work, and I'm traveling so may not get to this for a while.

@orkhanahmadov
Copy link
Contributor Author

@oleibman Problem is not in toArray(), but in fromArray(). Because of null value checking, cells with null values never get created.

That $missingCoordinates does not solve the issue because it only compares against the highest data column, in my example, column D. If D2 was null then $missingCoordinates would catch it and add the D2 with a null value. But here I remove column B, and C2 after that column is null, $missingCoordinates does not get C2, because:

  • Deleted row - B in this case, is not the last column
  • $cellCollection does not contain C2 at all

The way I see it, there are 2 possible solutions:

  • Understanding why that null check on fromArray() is needed. Can we get rid of it, or maybe add fromArray() another parameter to allow creating cells with null values
  • Updating how $missingCoordinates is being calculated, instead of relying on $cellCollection (which is already missing cells with null values), finding an alternative to returning all coordinates on a worksheet

@oleibman
Copy link
Collaborator

oleibman commented Mar 8, 2024

Sorry for the typo in my original response - I meant there is nothing wrong with fromArray. The problem arises when the column is deleted. You can see the exact same symptom by using equivalent code which does not use fromArray:

$sheet = $spreadsheet->getActiveSheet();
$sheet->setCellValue('A1', 'a1');
$sheet->setCellValue('B1', 'b1');
$sheet->setCellValue('C1', 'c1');
$sheet->setCellValue('D1', 'd1');
$sheet->setCellValue('A2', 'a2');
$sheet->setCellValue('B2', 'b2');
$sheet->setCellValue('D2', 'd2');
$sheet->removeColumn('B');

The expected result should be a1, c1, d1, a2, null, d2; but the actual result is a1, c1, d1, a2, b2, d2. You are probably correct that the code which I suggested as erroneous probably isn't the culprit; I think the error has already happened before it gets there (but it might instead be an error in toArray).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants