Returned 0 result #54

Open
junwenhu opened this issue Dec 25, 2022 · 19 comments

Comments

@junwenhu

Using pmaw I got 0 submissions returned today. It worked before. Don't know why. I followed the Medium post Matt wrote exactly. I checked the status of the Pushshift server and it says it is fine. Any idea what has happened?

@chengren

Did you upgrade to the latest version?

@architectdrone

I think this has to do with some kind of update that pushshift is doing. Sending parameters through the URL doesn't seem to work anymore - only sending parameters in the body of the GET request works, it seems.

@mattpodolak
Owner

mattpodolak commented Dec 28, 2022

> Using pmaw I got 0 submissions returned today. It worked before. Don't know why. I followed the Medium post Matt wrote exactly. I checked the status of the Pushshift server and it says it is fine. Any idea what has happened?

Hey @junwenhu, I have to update the Medium post. The issue you encountered is likely due to changes to the API parameters: before -> until and after -> since.

Can you try these new parameters? If the problem persists, can you share the code that returns 0 results and the version of PMAW you are using?
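
For anyone updating older scripts, the rename described above can be sketched as follows. This is only an illustration: rename_time_params is a hypothetical helper (not part of PMAW), and the "after" value in the example is an arbitrary epoch chosen for the demo.

```python
# Hypothetical helper, not part of PMAW: rename the legacy time-window
# keyword arguments (before/after) to the names mentioned above (until/since).
LEGACY_TO_CURRENT = {"before": "until", "after": "since"}


def rename_time_params(params):
    """Return a copy of params with before/after renamed to until/since."""
    renamed = {}
    for key, value in params.items():
        new_key = LEGACY_TO_CURRENT.get(key, key)
        if new_key in renamed:
            # Both the legacy and the current name were supplied; refuse to guess.
            raise ValueError(f"both legacy and current forms of {new_key!r} were given")
        renamed[new_key] = value
    return renamed


# Illustrative arguments only; the "after" epoch is an arbitrary example value.
old_kwargs = {"subreddit": "science", "limit": 100, "before": 1613234822, "after": 1606780800}
new_kwargs = rename_time_params(old_kwargs)
print(new_kwargs)
```

The result could then be passed on as api.search_submissions(**new_kwargs), matching the keyword style used elsewhere in this thread.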

@junwenhu
Author

junwenhu commented Jan 1, 2023

That is interesting. I have not tried that yet.

@junwenhu
Author

junwenhu commented Jan 1, 2023

> > Using pmaw I got 0 submissions returned today. It worked before. Don't know why. I followed the Medium post Matt wrote exactly. I checked the status of the Pushshift server and it says it is fine. Any idea what has happened?
>
> Hey @junwenhu, I have to update the Medium post. The issue you encountered is likely due to changes to the API parameters: before -> until and after -> since.
>
> Can you try these new parameters? If the problem persists, can you share the code that returns 0 results and the version of PMAW you are using?

Oh! I kept using before and after and didn't know there had been updates! Thank you for telling me. I will try when I have time and get back here soon. Happy new year.

@MackBlackburn

MackBlackburn commented Jan 4, 2023

I am also seeing this issue: search_comments returns lots of results, while search_submissions returns 0 with the exact same inputs. search_submissions returns results if I do not specify before/after/until/since, but when I pass epoch values with before/after or until/since, it returns 0. The same epochs work for search_comments.

@hienvantran

I encountered the same issue: I changed before/after to until/since but still got 0 results. Are there any updates?

@MackBlackburn

MackBlackburn commented Jan 5, 2023

I actually think this is an issue with PushShift rather than PMAW, since I also get 0 results directly querying PushShift. This should get a year of data matching the query "science" but it returns 0 results. Replacing "submission" with "comment" returns plenty. Seems like there must be a problem. https://api.pushshift.io/reddit/search/submission?q=science&since=1619236800&until=1650772800&limit=10
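
To make the comparison above easy to repeat, here is a small sketch that builds the same query URL for both endpoints. build_search_url is a hypothetical helper; the base URL and parameters are taken from the link above, and the "comment" path follows the substitution described in the comment. The sketch only builds the URLs and does not send any request.

```python
from urllib.parse import urlencode

# Base URL taken from the link above.
BASE_URL = "https://api.pushshift.io/reddit/search"


def build_search_url(kind, **params):
    """Build a search URL for kind 'submission' or 'comment' (hypothetical helper)."""
    if kind not in ("submission", "comment"):
        raise ValueError(f"unexpected kind: {kind!r}")
    return f"{BASE_URL}/{kind}?{urlencode(params)}"


# Same query window as in the link above.
window = {"q": "science", "since": 1619236800, "until": 1650772800, "limit": 10}
submission_url = build_search_url("submission", **window)
comment_url = build_search_url("comment", **window)
print(submission_url)
print(comment_url)
```

Opening both URLs side by side shows whether the empty result is specific to the submission endpoint, independent of PMAW.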

@junwenhu
Author

junwenhu commented Jan 6, 2023

> > Using pmaw I got 0 submissions returned today. It worked before. Don't know why. I followed the Medium post Matt wrote exactly. I checked the status of the Pushshift server and it says it is fine. Any idea what has happened?
>
> Hey @junwenhu, I have to update the Medium post. The issue you encountered is likely due to changes to the API parameters: before -> until and after -> since.
>
> Can you try these new parameters? If the problem persists, can you share the code that returns 0 results and the version of PMAW you are using?

Hi, I tried until and since, and it doesn't work either. It returns 0 submissions.

import pandas as pd

from pmaw import PushshiftAPI
api = PushshiftAPI()

import datetime as dt
until = int(dt.datetime(2021,2,1,0,0).timestamp())
since = int(dt.datetime(2020,12,1,0,0).timestamp())

subreddit='science'
limit=1000000
submissions = api.search_submissions(subreddit=subreddit, limit=limit, until=until, since=since)
print(f'Retrieved {len(submissions)} submissions from Pushshift')

I also tried the code you shared on your GitHub. It doesn't work anymore. For example, this returns 0 submissions.

from pmaw import PushshiftAPI

api = PushshiftAPI()
posts = api.search_submissions(subreddit="science", limit=700000, until=1613234822, safe_exit=True)
print(f'{len(posts)} posts retrieved from Pushshift')

As @MackBlackburn said, could it be a problem with PushShift? If so, this problem has lasted a long time (at least two weeks). Is there anything we can do about it if we'd still like to get some data from Reddit? Thank you!
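
A side note on the first snippet above: calling timestamp() on a datetime without a timezone interprets it in the machine's local timezone, so the same code can produce different epochs on different machines. This is probably not the cause of the empty results, but a sketch that pins the window to UTC looks like this (utc_epoch is a hypothetical helper name):

```python
import datetime as dt


def utc_epoch(year, month, day):
    """Epoch seconds for midnight UTC on the given date (hypothetical helper)."""
    return int(dt.datetime(year, month, day, tzinfo=dt.timezone.utc).timestamp())


# Same dates as in the snippet above, but independent of the local timezone.
since = utc_epoch(2020, 12, 1)
until = utc_epoch(2021, 2, 1)
print(since, until)  # 1606780800 1612137600
```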

@junwenhu
Author

junwenhu commented Jan 6, 2023

> Did you upgrade to the latest version?

Yes, I install pmaw every time I use it.

@junwenhu
Author

junwenhu commented Jan 6, 2023

> I actually think this is an issue with PushShift rather than PMAW, since I also get 0 results directly querying PushShift. This should get a year of data matching the query "science" but it returns 0 results. Replacing "submission" with "comment" returns plenty. Seems like there must be a problem. https://api.pushshift.io/reddit/search/submission?q=science&since=1619236800&until=1650772800&limit=10

It's weird. I can get results from requests (provided that size is smaller than 1000).
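
If direct requests only succeed with a size below 1000, one way to cover a larger window is to page through it. The following is a minimal sketch under stated assumptions, not how PMAW or Pushshift is known to work: fetch_page is a caller-supplied function (hypothetical here) that returns at most size items, newest first, with until treated as exclusive. A fake data source stands in for the API so the sketch runs offline.

```python
def fetch_window(fetch_page, since, until, page_size=1000):
    """Collect all items in [since, until) by walking `until` backwards.

    fetch_page(since, until, size) is a caller-supplied function (hypothetical)
    returning up to `size` items, newest first, each a dict with 'created_utc'.
    Caveat: with an exclusive `until`, items sharing the page's oldest timestamp
    but left off that page would be skipped.
    """
    results = []
    while True:
        page = fetch_page(since, until, page_size)
        if not page:
            break
        results.extend(page)
        oldest = min(item["created_utc"] for item in page)
        if oldest >= until:  # no progress; stop rather than loop forever
            break
        until = oldest
    return results


# Stand-in data source for demonstration: 2500 items, one per second.
FAKE_ITEMS = [{"id": i, "created_utc": 1_000_000 + i} for i in range(2500)]


def fake_fetch_page(since, until, size):
    matching = [it for it in FAKE_ITEMS if since <= it["created_utc"] < until]
    matching.sort(key=lambda it: it["created_utc"], reverse=True)
    return matching[:size]


items = fetch_window(fake_fetch_page, since=1_000_000, until=1_002_500)
print(len(items))  # 2500
```

In real use, fetch_page would wrap whatever request call is being made, keeping each request's size under the limit that works.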

@chengren

chengren commented Jan 6, 2023

Please refer to this post: https://www.reddit.com/r/pushshift/comments/zuclhb/psa_pmaw_has_been_updated_to_handle_the_api/
"Submissions earlier than November 3rd still have not been loaded so any searches for submissions earlier than that will fail."
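
A quick way to tell whether a query window is affected by the limitation quoted above is to compare it against the cutoff date. This sketch assumes that "November 3rd" means November 3, 2022, which the quoted post does not state; window_predates_cutoff is a hypothetical helper.

```python
import datetime as dt

# Assumption: "November 3rd" refers to 2022; the quoted post does not give the year.
ASSUMED_CUTOFF = dt.datetime(2022, 11, 3, tzinfo=dt.timezone.utc)


def window_predates_cutoff(until_epoch, cutoff=ASSUMED_CUTOFF):
    """True if the query window ends before the cutoff (hypothetical helper)."""
    return dt.datetime.fromtimestamp(until_epoch, tz=dt.timezone.utc) < cutoff


print(window_predates_cutoff(1613234822))  # `until` value used earlier in this thread
print(window_predates_cutoff(1672963200))  # 2023-01-06 00:00 UTC
```

Under that assumption, the submission queries posted earlier in this thread fall entirely before the cutoff, which would be consistent with them returning 0 results.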

@shanktt

shanktt commented Jan 6, 2023

Is anyone still having trouble pulling submissions and comments posted after November 3rd when using PMAW 3.0.0? Here's my request that is still returning zero results:

from pmaw import PushshiftAPI
import datetime as dt
import pandas as pd
import numpy as np

start_epoch = int(dt.datetime(2023, 1, 1).timestamp())
end_epoch = int(dt.datetime(2023, 1, 6).timestamp())

api = PushshiftAPI()
gen1 = api.search_submissions(subreddit="science", since=start_epoch, until=end_epoch)
gen2 = api.search_comments(subreddit="science", since=start_epoch, until=end_epoch)

@eddvrs
Contributor

eddvrs commented Jan 6, 2023

@AshankKumar, your code works fine for me here (with a few print-outs added):

import pmaw
import datetime as dt
#import pandas as pd
#import numpy as np

print(pmaw.__version__)

start_epoch = int(dt.datetime(2023, 1, 1).timestamp())
end_epoch = int(dt.datetime(2023, 1, 6).timestamp())

api = pmaw.PushshiftAPI()
gen1 = api.search_submissions(subreddit="science", since=start_epoch, until=end_epoch)

print("gen1:", len(gen1))

gen2 = api.search_comments(subreddit="science", since=start_epoch, until=end_epoch, limit=100)

print("gen2:", len(gen2))

Yields:

3.0.0
gen1: 330
gen2: 100

And occasionally this:

3.0.0
Not all PushShift shards are active. Query results may be incomplete.
gen1: 0
gen2: 100

The Pushshift API has been patchy recently; look at all this red:
Daily status: Submissions
Daily status: Comments

Do you definitely have the latest version of PMAW installed? Additionally, is your project definitely referencing the latest version?
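
Since the same query sometimes returns results and sometimes returns 0, a small retry wrapper may help tell an intermittent failure from a truly empty result. This is a generic sketch, not a PMAW feature: retry_if_empty is a hypothetical helper, and a fake flaky query stands in for the real call so it runs offline.

```python
import time


def retry_if_empty(query, attempts=3, delay_seconds=0.0):
    """Call query() until it returns a non-empty result or attempts run out."""
    result = []
    for attempt in range(attempts):
        result = query()
        if len(result) > 0:
            break
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # pause before the next try
    return result


# Stand-in for a flaky API call: empty on the first two calls, then two results.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    return [] if calls["count"] < 3 else ["post-1", "post-2"]


posts = retry_if_empty(flaky_query, attempts=5)
print(len(posts), calls["count"])  # 2 3
```

With PMAW, query could be something like lambda: api.search_submissions(subreddit="science", since=start_epoch, until=end_epoch). An empty result can also be genuine, so the number of attempts is capped.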

@shanktt

shanktt commented Jan 6, 2023

Ah geez, thank you for the sanity check. Of course, I forgot to start up the virtual environment.

@ibnahmadbello

Hello,
I am still facing the same issue. I get a response if my limit <= 1000, but once I change limit to 1001, I get 0 results.
@junwenhu, how were you able to get past this issue?

@junwenhu
Author

junwenhu commented Feb 5, 2023

> Hello, I am still facing the same issue. I get a response if my limit <= 1000, but once I change limit to 1001, I get 0 results. @junwenhu, how were you able to get past this issue?

I didn't. Sorry.

@ibnahmadbello

> > Hello, I am still facing the same issue. I get a response if my limit <= 1000, but once I change limit to 1001, I get 0 results. @junwenhu, how were you able to get past this issue?
>
> I didn't. Sorry.

Okay. Thanks

@junwenhu
Author

junwenhu commented Feb 6, 2023

Update: I found the problem. pip install kept installing what was already on the computer, so even though I kept trying to install the newest version, it would not update.

I uninstalled and reinstalled Anaconda and the problem was solved.
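
Both resolutions in this thread (the forgotten virtual environment and the stale install) come down to which interpreter is running and which package version it sees. A quick diagnostic sketch using only the standard library is shown below; describe_install is a hypothetical helper name.

```python
import sys
from importlib import metadata


def describe_install(package):
    """Report the running interpreter and the installed version of a package."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # package is not installed for this interpreter
    return {"interpreter": sys.executable, "package": package, "version": version}


info = describe_install("pmaw")
print(info)
```

If the interpreter path is not the environment you expected, or the version is older than the one you installed, running python -m pip install --upgrade pmaw with that same interpreter targets the environment your script actually uses.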
