Problem with "carriage return" inside a cell #192

Closed
aborruso opened this issue Nov 3, 2018 · 2 comments

aborruso commented Nov 3, 2018

Hi,
first of all thank you for this great tool.

I have a PDF with some "carriage return" characters inside cells (I'm attaching it).

If I run camelot -f csv -o output.csv lattice input.pdf, the cell output does not contain any "carriage return", so the output is partially unusable: for example, I get [email protected]@fondazionelavoro.it instead of [email protected]\[email protected], and then it's difficult to find a way to split the cell content.

I'm using camelot-py-0.3.1. Is there a parameter to pass to solve this kind of problem?

Thank you

@vinayak-mehta
Contributor

Hey @aborruso, thanks for the report. Currently, each text line is stripped of newlines before it is assigned to a cell, which is undesirable in cases like the one above. I'm labeling the current behavior as a bug, since everything should be extracted from a PDF as is. Expect a fix soon!
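
To illustrate the behavior described above, here is a minimal sketch in plain Python. It is not camelot's actual code: the helper `assign_cell_text` and the sample addresses are hypothetical, chosen only to show how stripping newlines from each text line fuses a cell's contents, while keeping them preserves a separator to split on.

```python
def assign_cell_text(text_lines, strip_newlines=True):
    """Join the text lines that fall inside one cell (hypothetical helper)."""
    if strip_newlines:
        # Behavior reported in this issue: line breaks are lost.
        return "".join(line.strip("\n") for line in text_lines)
    # Keep the line breaks between lines, dropping only the trailing one.
    return "".join(text_lines).rstrip("\n")


# Hypothetical cell containing two e-mail addresses on separate lines.
lines = ["first@example.com\n", "second@example.com\n"]

fused = assign_cell_text(lines)                       # addresses run together
kept = assign_cell_text(lines, strip_newlines=False)  # addresses stay separable
```

With the newline kept, `kept.split("\n")` recovers the two addresses; with the fused string there is no reliable place to split.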

@vinayak-mehta
Contributor

@aborruso This is fixed on master now. After #229, this will be more configurable.
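
Assuming the fixed version leaves a `\n` inside the cell text, the CSV output can still be parsed safely, because the standard `csv` module quotes fields that contain line breaks. The sketch below uses only the Python standard library and hypothetical sample data (it does not call camelot) to show a round trip and the split that was not possible before.

```python
import csv
import io

# Hypothetical cell content with an embedded line break.
cell = "first@example.com\nsecond@example.com"

# Write one row to an in-memory CSV; the writer quotes the multi-line field.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(["contact", cell])

# Read it back: the embedded newline survives inside the quoted field.
buffer.seek(0)
row = next(csv.reader(buffer))

# Split the cell into its individual addresses.
addresses = row[1].split("\n")
```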
