`saveWorkbook` produces invalid XML when manipulating LibreOffice-based XLSX files #403
The issue:
I am using `loadWorkbook()`, `writeData()` and `saveWorkbook()` to modify an existing XLSX file, originally created with LibreOffice on Linux.
The first time (manipulating the XLSX file once with `openxlsx`) works well; the second time the XLSX gets corrupted. `openxlsx::read.xlsx` is still able to read the data; however, LibreOffice only shows empty sheets, and `readxl::read_excel` refuses to read the data (`Error: unexpected end of data`).
I may have found the problem (an unclosed XML tag), as shown in the last part of this issue. Note that this error does not occur when I start with an XLSX file created (saved) by/via MS Excel.
System information:
- R 4.2.2 on Windows and Ubuntu Linux
- `openxlsx` version 4.2.5.2 (current CRAN release)
- `openxlsx` version 4.2.5.9000 (main branch of this repo)
Minimal example:
I have put together a small minimal example which uses the file test_libreoffice.xlsx,
a simple XLSX file with two sheets. This XLSX file was created using
LibreOffice 7.3.7.2 30(Build:2) on Ubuntu 22.04.1 LTS.
The R script below does the following:
- Loads "test_libreoffice.xlsx", stores modified version as "test1.xlsx"
- Tests readability of "test1.xlsx"
- Loads "test1.xlsx", stores modified version as "test2.xlsx"
- Tests readability of "test2.xlsx"
Testing readability (`test_read()`) tries to read the first sheet with both `openxlsx` and `readxl`. On both my Linux system and my Windows 10 machine it fails in (5), when reading "test2.xlsx", where `readxl::read_excel()` is no longer able to read the data. At this point, LibreOffice will also only show empty sheets (no data).
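
The round trip described above can be sketched roughly as follows. This is a hedged reconstruction, not the original script: the helper name `modify_and_save()`, the body of `test_read()`, and the data frame written with `writeData()` are assumptions for illustration; only the file names and the `openxlsx`/`readxl` functions come from the description.

```r
# Sketch only: helper names and the written data are assumptions.

# Try to read the first sheet with both packages; TRUE means the read succeeded.
test_read <- function(file) {
  readers <- list(
    openxlsx = function(f) openxlsx::read.xlsx(f, sheet = 1),
    readxl   = function(f) readxl::read_excel(f, sheet = 1)
  )
  vapply(readers, function(read) {
    !inherits(try(read(file), silent = TRUE), "try-error")
  }, logical(1))
}

# Load an existing workbook, write a small data frame to sheet 1, save a copy.
modify_and_save <- function(infile, outfile) {
  wb <- openxlsx::loadWorkbook(infile)
  openxlsx::writeData(wb, sheet = 1, x = data.frame(a = 1:3))  # hypothetical data
  openxlsx::saveWorkbook(wb, outfile, overwrite = TRUE)
  invisible(outfile)
}

# Only run the round trip if the LibreOffice-created input file is present.
if (file.exists("test_libreoffice.xlsx")) {
  modify_and_save("test_libreoffice.xlsx", "test1.xlsx")
  print(test_read("test1.xlsx"))
  modify_and_save("test1.xlsx", "test2.xlsx")
  print(test_read("test2.xlsx"))
}
```

Wrapping each reader in `try()` lets one failing package be reported without aborting the check for the other.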
Test files:
I am attaching test1.xlsx and test2.xlsx as created by the minimal example above on my machine.
Cause of the problem:
I have been looking into the XLSX files to see where the two files "test1.xlsx" (which works as expected) and "test2.xlsx" (which fails) differ, and found a problem in the XML definition of `xl/worksheets/sheet1.xml` (as well as `xl/worksheets/sheet2.xml`): a `<sheetPr>` tag that is not properly closed. Namely in this part:
I've manually corrected this XML (in "test2.xlsx"), which seems to resolve the problem.
I can't attach the XML files here, but they are located in `xl/worksheets/` in both "test1.xlsx" and "test2.xlsx" (simply unzip).
I don't think this issue is related to #133 (#74, #81); I hope opening this one is OK.
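
For anyone who wants to check the sheet XML without unzipping by hand, here is a rough base-R sketch. The function names `read_sheet_xml()` and `sheetpr_balanced()` are hypothetical, and counting tags with regular expressions is a crude heuristic, not a real XML parse; it only tests whether every non-self-closing `<sheetPr>` has a matching `</sheetPr>`.

```r
# Sketch only: hypothetical helpers, heuristic tag counting (no XML parser).

# Read one entry of the XLSX (a zip archive) into a single string.
read_sheet_xml <- function(xlsx, entry = "xl/worksheets/sheet1.xml") {
  con <- unz(xlsx, entry)
  on.exit(close(con))
  paste(readLines(con, warn = FALSE), collapse = "\n")
}

# TRUE if opening <sheetPr ...> tags (excluding self-closing <sheetPr .../>)
# are matched one-to-one by closing </sheetPr> tags.
sheetpr_balanced <- function(xml) {
  count <- function(pattern) {
    m <- gregexpr(pattern, xml, perl = TRUE)[[1]]
    sum(m > 0)  # gregexpr returns -1 when there is no match
  }
  opened <- count("<sheetPr(\\s[^>]*)?(?<!/)>")
  closed <- count("</sheetPr\\s*>")
  opened == closed
}

# Compare the two files from this issue if they are in the working directory.
for (f in c("test1.xlsx", "test2.xlsx")) {
  if (file.exists(f)) {
    cat(f, "sheet1 <sheetPr> balanced:", sheetpr_balanced(read_sheet_xml(f)), "\n")
  }
}
```

Passing `entry = "xl/worksheets/sheet2.xml"` to `read_sheet_xml()` checks the second sheet in the same way.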