Invalid signature check for XML files #1916

Closed
christof-b opened this issue Mar 11, 2021 · 4 comments

@christof-b
Contributor

This is:

- [ x ] a bug report
- [   ] a feature request
- [ x ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

The PhpOffice\PhpSpreadsheet\Reader\Xml class should read all valid SpreadsheetML (xml) files.

What is the current behavior?

  • It declines to read valid XML files that do not contain Microsoft product information in the form of <?mso-application progid="Excel.Sheet"?>, which is neither required nor part of the schema.
  • It accepts valid XML files that use a schema other than xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet". These fail later, because the current implementation works only for this namespace.

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

// Two SpreadsheetML documents: the first contains the mso-application
// processing instruction, the second omits it.
$testData = [
    'Test currently valid file'          => '<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <Worksheet ss:Name="Test"><Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1"
   x:FullRows="1">
   <Column ss:AutoFitWidth="0" ss:Width="110"/>
   <Row>
    <Cell><Data ss:Type="String">test</Data></Cell>
   </Row></Table></Worksheet></Workbook>',
    'Test valid file currently declined' => '<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <Worksheet ss:Name="Text"><Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1"
   x:FullRows="1">
   <Column ss:AutoFitWidth="0" ss:Width="110"/>
   <Row>
    <Cell><Data ss:Type="String">test</Data></Cell>
   </Row></Table></Worksheet></Workbook>',
];

$reader = new PhpOffice\PhpSpreadsheet\Reader\Xml();

foreach ($testData as $test => $data) {
    $tmpFileHandle = tmpfile();
    $tmpFilePath   = stream_get_meta_data($tmpFileHandle)['uri'];
    fwrite($tmpFileHandle, $data);
    rewind($tmpFileHandle);

    echo PHP_EOL . $test . PHP_EOL;
    $reader->load($tmpFilePath);
    echo PHP_EOL;
}

Which versions of PhpSpreadsheet and PHP are affected?

all

Further Information

The currently supported SpreadsheetML (XML) format does not require <?mso-application progid="Excel.Sheet"?> to be present, but it does require the namespace declaration xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet". The <?mso-application progid="Excel.Sheet"?> processing instruction is only added by Excel itself; exports from other tools will not contain it.

See https://docs.microsoft.com/en-us/previous-versions/office/developer/office-xp/aa140062(v=office.10)#excel-xml-schema-namespaces
and for an example: https://de.wikipedia.org/wiki/SpreadsheetML
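To illustrate the namespace requirement, here is a minimal sketch that checks the declaration with an XML parser instead of a substring match. It is not how PhpSpreadsheet detects the format; the helper name `declaresSpreadsheetMlNamespace` is hypothetical, and the sketch assumes the SimpleXML extension is available.

```php
<?php

// Hypothetical helper (not part of PhpSpreadsheet): returns true if the root
// element binds the "ss" prefix to the SpreadsheetML namespace.
function declaresSpreadsheetMlNamespace(string $xml): bool
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $root = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($root === false) {
        return false; // not well-formed XML
    }

    // prefix => URI map of the namespaces declared on the root element
    $namespaces = $root->getDocNamespaces() ?: [];

    return ($namespaces['ss'] ?? null) === 'urn:schemas-microsoft-com:office:spreadsheet';
}

// Export from a non-Excel tool: no mso-application line, correct namespace.
$withoutProgId = '<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Test"/></Workbook>';

// Well-formed XML that binds "ss" to an unrelated namespace.
$otherNamespace = '<?xml version="1.0"?><Workbook xmlns:ss="urn:example:other"/>';

var_dump(declaresSpreadsheetMlNamespace($withoutProgId));  // bool(true)
var_dump(declaresSpreadsheetMlNamespace($otherNamespace)); // bool(false)
```

A parser-based check is stricter than a substring signature, but it also has to parse the whole document, which is one reason a cheap string comparison may be preferable for format detection.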

Fix

Replace

$signature = [
    '<?xml version="1.0"',
    '<?mso-application progid="Excel.Sheet"?>',
];

with

$signature = [
    '<?xml version="1.0"',
    'xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet',
];

I will create a pull request for this.
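The effect of the two signatures can be compared with a small standalone sketch. It assumes the reader accepts a file only when every signature fragment occurs in its contents; the helper `matchesSignature` is hypothetical and is not the library's actual code.

```php
<?php

// Hypothetical helper: true only if every fragment occurs in the data.
function matchesSignature(string $data, array $signature): bool
{
    foreach ($signature as $fragment) {
        if (strpos($data, $fragment) === false) {
            return false;
        }
    }

    return true;
}

$oldSignature = [
    '<?xml version="1.0"',
    '<?mso-application progid="Excel.Sheet"?>',
];
$newSignature = [
    '<?xml version="1.0"',
    'xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet',
];

// SpreadsheetML as a non-Excel tool might export it (no mso-application line).
$withoutProgId = '<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Test"/></Workbook>';

// Carries the Excel marker but does not declare the "ss" namespace.
$wrongNamespace = '<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:example:other"/>';

var_dump(matchesSignature($withoutProgId, $oldSignature));  // bool(false)
var_dump(matchesSignature($withoutProgId, $newSignature));  // bool(true)
var_dump(matchesSignature($wrongNamespace, $oldSignature)); // bool(true)
var_dump(matchesSignature($wrongNamespace, $newSignature)); // bool(false)
```

Under this assumption, the old signature declines the valid export and accepts the file with the wrong namespace, while the proposed signature reverses both outcomes.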

christof-b pushed a commit to christof-b/PhpSpreadsheet that referenced this issue Mar 11, 2021
Replace the unrequired product signature by the
required namespace definition for XML Spreadsheet.
@christof-b
Contributor Author

First detected in issue #522

christof-b pushed a commit to christof-b/PhpSpreadsheet that referenced this issue Mar 11, 2021
MarkBaker pushed a commit that referenced this issue Mar 13, 2021
* Fix SpreadsheetML (xml) detection (#1916)

Replace the unrequired product signature by the
required namespace definition for XML Spreadsheet.

* Add summary to changelog (#1916)

Co-authored-by: Christof Bachmann <[email protected]>
@oleibman
Collaborator

oleibman commented Apr 8, 2021

Does the push mentioned above solve this problem, and is it okay to close this ticket now?

@MarkBaker
Member

That push should resolve the problem. You can test against the master branch to confirm.

@oleibman
Collaborator

Fixed by PR 1917.
