Running twarc hydrate took a long time (there were a lot of files) and generated no output, because twarc expects tweet id files to just contain, well, tweet ids. The problem is that twarc doesn't check whether a line actually contains an ID and sends it to Twitter's API anyway. The log shows no error, only messages like this:
2020-05-05 02:58:07,685 INFO loading None profile from config /rigel/home/inh2102/.twarc
2020-05-05 02:58:07,689 INFO creating http session
2020-05-05 02:58:07,690 INFO getting ('https://api.twitter.com/1.1/account/verify_credentials.json',) {'params': {'tweet_mode': 'extended'}}
2020-05-05 02:58:08,120 INFO hydrating 100 ids
2020-05-05 02:58:08,120 INFO posting ('https://api.twitter.com/1.1/statuses/lookup.json',) {'data': {'id': '"1123436826570760192","1"\t1123436821189419008,"2"\t1123436818471489536,"3"\t1123436801736134656,"4"\t1123436800796712960,"5"\t1123436798468816896,"6"\t1123436791913242624,"7"\t1123436791795728384,
---
(etc.)
Ideally I think twarc should:
inspect each line and, if it doesn't appear to contain a tweet id, report it to the log and move on
never send anything that doesn't look like an ID to the Twitter API for hydration
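The two points above could be sketched roughly as follows. This is not twarc's actual code, just a minimal illustration: the function name `valid_tweet_ids`, the logger name, and the rule that a tweet id is a line consisting only of digits are all assumptions made for the example, and the malformed sample line is made up to resemble the request in the log above.

```python
import logging
import re

log = logging.getLogger("hydrate_check")  # hypothetical logger name

# Assumption: a tweet id is a line made up only of ASCII digits.
TWEET_ID = re.compile(r"[0-9]+")


def valid_tweet_ids(lines):
    """Yield lines that look like tweet ids; log and skip everything else."""
    for line_number, line in enumerate(lines, start=1):
        candidate = line.strip()
        if not candidate:
            continue  # ignore blank lines without logging
        if TWEET_ID.fullmatch(candidate):
            yield candidate
        else:
            # Report the bad line and move on instead of sending it to the API.
            log.warning("line %d does not look like a tweet id, skipping: %r",
                        line_number, candidate)


# Made-up sample: two clean ids, one malformed line, one blank line.
sample = [
    "1123436826570760192\n",
    '"1123436821189419008","2"\n',
    "\n",
    "1123436818471489536\n",
]
print(list(valid_tweet_ids(sample)))
```

With a filter like this in front of the hydration step, only the two clean ids would be batched for lookup, and the malformed line would show up as a warning in the log.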
For context: I received an email from a researcher who had to split a large tweet id dataset into multiple chunks, and ended up with files whose lines were not bare tweet ids (see the logged request above).