Merged cells #5

mericano1 · 2024-04-17T11:26:25Z

Hi there,
This is a great little tool, thanks for sharing. I have tested it on some sheets and noticed that the table boundaries are not correct when two cells are merged. That is probably because the merge information is lost when the sheet is converted to a dataframe?

What do you think would be the best way to handle this case?

ChrisPappalardo · 2024-04-18T15:00:21Z

Hello,

Thanks for opening this and for your interest in eparse. I'm traveling at the moment, but I'll look at this when I return. Can you provide a specific example of what you mean by the table boundaries not being correct when a cell is merged? The short answer is that we take whatever pandas gives by default when converting xlsx to a dataframe. In my experience this is only an issue on header rows, but I'd like to see your specific use case.

ChrisPappalardo · 2024-05-22T13:27:03Z

Hi. I'm at the PyCon sprints following up on this issue, which appears to be a limitation of pandas:

The proposed workaround in pandas does not work for all cells (we would probably not want to fill NA values across the entire sub-table), but there may be a workaround when handling table headers in df_find_tables and/or df_parse_table.
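A rough sketch of what a header-only workaround could look like, assuming merged header cells come through from pandas as a value in the left-most cell followed by NaN in the rest of the merged span. The helper name `fill_merged_header` and its `header_rows` parameter are hypothetical and are not part of eparse; how this would hook into df_find_tables or df_parse_table is left open.

```python
import numpy as np
import pandas as pd


def fill_merged_header(df: pd.DataFrame, header_rows: int = 1) -> pd.DataFrame:
    """Hypothetical helper: fill NaN gaps left to right in the header rows only.

    Assumes a merged header cell shows up as a value followed by NaN
    cells to its right. Rows below `header_rows` are left untouched.
    """
    # Work on an object-typed copy so strings can land in numeric columns.
    values = df.astype(object).to_numpy(copy=True)
    for r in range(min(header_rows, len(values))):
        last = None
        for c in range(values.shape[1]):
            if pd.isna(values[r, c]):
                if last is not None:
                    values[r, c] = last  # repeat the label of the merged cell
            else:
                last = values[r, c]
    return pd.DataFrame(values, index=df.index, columns=df.columns)


# Simulated sub-table: "Sales" and "Costs" each span two merged columns.
raw = pd.DataFrame(
    [
        ["Sales", np.nan, "Costs", np.nan],
        [1.0, np.nan, 3.0, 4.0],
        [5.0, 6.0, np.nan, 8.0],
    ]
)
result = fill_merged_header(raw)
```

Only the first row is filled here, so genuinely missing values in the body of the sub-table stay NaN, which is the concern with filling NA values across the whole table.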
