Encoding problem #69
Comments
Do you have an example?
Update: I found a workaround.
The problem was with the import function. Not specifying the encoding while importing solves the issue, though. However, there are still some weird characters under "full text", I think because of emojis. These are Unicode emojis and I'm OK with that. But what about this? iOS emoji?
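One possible explanation for the leftover "weird characters" (an assumption, not something confirmed in this thread) is that UTF-8 emoji bytes are being interpreted with a single-byte encoding such as Windows-1252. The sketch below is in Python rather than R, purely to illustrate the effect; the emoji chosen is arbitrary.

```python
# Illustration only: how a UTF-8 emoji turns into "weird characters"
# when its bytes are decoded with the wrong encoding (cp1252 assumed here).
emoji = "\U0001F600"                   # a single emoji code point
utf8_bytes = emoji.encode("utf-8")     # four bytes: f0 9f 98 80

# Decoding those bytes as Windows-1252 yields four unrelated characters.
garbled = utf8_bytes.decode("cp1252")

# If nothing was lost, the damage is reversible by undoing the wrong step.
restored = garbled.encode("cp1252").decode("utf-8")

print(utf8_bytes.hex(" "))
print(len(garbled), "characters after the wrong decode")
print(restored == emoji)
```

If the garbled text in your output follows this pattern, the file itself is probably valid UTF-8 and only the reader's assumed encoding is wrong.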
If you can give me a tweet ID, that will help me test.
Hello, I have a fairly large dataset (> 2 TB) split across six files.
I assumed that UTF-8 was the text encoding of the jsonl files. However, some characters are apparently not valid UTF-8, and this causes R to fail when I specify the encoding.
Not specifying the encoding results in a messy full_text output.
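One way to narrow this down is to read the file as raw bytes and check each line against strict UTF-8, so the offending lines can be located instead of failing the whole import. This is a minimal sketch in Python (not R, and not the library's own reader); the function name `read_jsonl_lenient` is hypothetical, and the `full_text` field name is taken from the report above.

```python
import json
from typing import Any, Iterator, Tuple


def read_jsonl_lenient(path: str) -> Iterator[Tuple[int, Any, bool]]:
    """Yield (line_number, record, was_clean) for each non-empty line.

    was_clean is False when the line contained bytes that are not valid
    UTF-8; those bytes are replaced with U+FFFD so parsing can continue.
    """
    with open(path, "rb") as fh:            # binary mode: no implicit decoding
        for line_number, raw in enumerate(fh, start=1):
            if not raw.strip():
                continue                    # skip blank lines
            try:
                text = raw.decode("utf-8")  # strict decode first
                was_clean = True
            except UnicodeDecodeError:
                text = raw.decode("utf-8", errors="replace")
                was_clean = False
            yield line_number, json.loads(text), was_clean
```

Because the file is streamed line by line, this approach should also work on very large files. Collecting the line numbers where `was_clean` is False gives concrete examples to share with the maintainers.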