ENH: Option for reading files with a variable number of comment lines at start #2685
Comments
@wesm: I'm the author of the stackoverflow question.
If you can point me to where this is implemented, I can take a look, see how much I understand, and whether I can contribute. |
I have spent a number of hours over the last few days trying to figure out which function calls which (and it is not an easy task).
The error comes from within the If this makes sense, I'll implement it asap. It looks to me that the functions in At last. When reading a file of the kind
with Thanks for the time reading (and hopefully replying) to this long post |
I would very much like to see this feature implemented as I have to deal with workarounds for it often. Is there a reason that the above implementation would not work? It seems simple enough. I have never contributed to pandas before and would like some feedback before proceeding to implement the feature. |
the c-parser is the primary parser; this change would have to be made there (actually in src/parser.pyx). it's not that hard, but a bit non-trivial |
I'll give a look at |
this is an already existing feature to skip comments at the end of the line http://pandas.pydata.org/pandas-docs/dev/io.html#comments IIUC you want to skip a line if there is a comment at the beginning? e.g.
if you specified |
the last one would be tough. FYI, the tokenizer only takes a single character ATM |
@montefra Whichever of us gets to it first then. The race is on. ;-) @jreback That is correct, I want to skip lines with a comment character at the beginning. I would imagine the last case could not be handled in the event the file is space delimited. Basically, comments probably need to either be declared at the very beginning or end of the line. |
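For readers following along, here is a minimal sketch of the behavior being requested: dropping any line whose first non-blank character is the comment character before the rows are parsed. It uses only the Python standard library and is not the pandas implementation; the function name `read_skipping_comments` and the sample data are hypothetical, chosen for illustration only.

```python
import csv
import io


def read_skipping_comments(handle, comment="#", delimiter=","):
    """Return parsed rows, dropping lines whose first non-blank character is `comment`.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Filter lazily so the whole file is never held in memory as raw text.
    data_lines = (
        line for line in handle if not line.lstrip().startswith(comment)
    )
    return list(csv.reader(data_lines, delimiter=delimiter))


# Made-up sample: a variable number of comment lines before the header,
# plus one indented comment line in the middle of the data.
sample = io.StringIO(
    "# produced by some instrument\n"
    "# units: arbitrary\n"
    "a,b,c\n"
    "1,2,3\n"
    "  # an indented comment line\n"
    "4,5,6\n"
)
rows = read_skipping_comments(sample)
print(rows)
```

This only handles whole-line comments with a single-character (or fixed-string) marker at the start of the line. Trailing comments after data, as discussed above, would need separate handling.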
I have implemented comment skipping and also fixed a related problem with CSV format sniffing. I did so by modifying A similar problem is whether pandas should ignore empty lines by default. It would be very easy to implement as a slight extension of ignoring comments. I will open a related issue on this to get some feedback. |
gre8! you can put your changes up as a PR, be sure to enable travis, lmk if you need help |
Only one build failed in Travis CI, but it does not appear to have anything to do with my changes. Is this expected, or did I break something related to pytables somehow? |
@holocronweaver rebase on master again. I just pushed some code to 'fix' that failure (although net net the changes didn't actually do anything)... but it seemed to fix it to force travis to rebuild
this resets the last commit to a new hash forcing a rebuild |
I did as you suggested and the latest build (3) is all good. I will go ahead and submit a pull request. |
@montefra doing a PR for this? |
@jreback: ehm. |
ok...thanks |
@holocronweaver @montefra can either of you update the PR for this? |
@jreback Will do so the moment I have a chance. Long time coming, I know. =) High on my priority list. |
http://stackoverflow.com/questions/14276661/python-pandas-read-file-skipping-commented