Hi there,
this is a great little tool, thanks for sharing. I have tested it on some sheets and noticed that the table boundaries are not correct when two cells are merged. Is that because the merge information is lost when the sheet is converted to a dataframe?
What do you think would be the best way to handle this case?
Thanks for opening this and for your interest in eparse. I'm traveling at the moment, but I'll look at this when I return. Can you provide a specific example of what you mean by the table boundaries not being correct when a cell is merged? The short answer is that we take whatever pandas gives by default when converting xlsx to a dataframe. In my experience this is only an issue on header rows, but I'd like to see your specific use case.
Hi. I'm at the PyCon sprints following up on this issue, which appears to be a limitation of pandas:
The proposed workaround in pandas does not work for all cells (we would probably not want to fill NA values across the entire sub-table), but there may be a workaround when handling table headers in df_find_tables and/or df_parse_table.
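To make the header-only idea concrete, here is a minimal sketch. It assumes a merged header cell arrives in the dataframe as a value in its left-most column followed by NaN in the columns it spans, and that the header occupies the first row(s) of the sub-table. `ffill_header_rows` and `n_header_rows` are hypothetical names for illustration; they are not part of eparse or pandas, and this is not how `df_find_tables` or `df_parse_table` currently behave.

```python
import numpy as np
import pandas as pd


def ffill_header_rows(df: pd.DataFrame, n_header_rows: int = 1) -> pd.DataFrame:
    """Hypothetical helper: fill NaN left-to-right in the header rows only.

    Data rows are left untouched, so genuinely empty cells in the body
    of the table stay NaN.
    """
    # Work on an object-dtype copy so strings can be written into any column.
    out = df.astype(object)
    for r in range(min(n_header_rows, len(out))):
        last = None
        for c in range(out.shape[1]):
            val = out.iat[r, c]
            if pd.isna(val):
                # Assumed to be the tail of a merged cell: reuse the last label.
                if last is not None:
                    out.iat[r, c] = last
            else:
                last = val
    return out


# Simulated sub-table: "Sales" and "Costs" each span two merged columns.
raw = pd.DataFrame(
    [
        ["Region", "Sales", np.nan, "Costs", np.nan],
        [np.nan, "Q1", "Q2", "Q1", "Q2"],
        ["North", 10, 12, 7, 8],
        ["South", np.nan, 9, 5, np.nan],
    ]
)

fixed = ffill_header_rows(raw, n_header_rows=1)
print(fixed)
```

Restricting the fill to the header rows avoids the problem noted above: missing values inside the body of the sub-table are not overwritten. Vertically merged cells (a label spanning several rows) are not handled by this sketch.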