Getting older data - is there a usage limit? #151

Open
EnigmaNZ opened this issue Feb 20, 2023 · 14 comments
Labels
help wanted Extra attention is needed

Comments

@EnigmaNZ

Describe the bug
I apologise if this has already been covered but I couldn't find a title that appeared to match:

I am grabbing a bit of data using the different functions listed below (in one script). The first function returns up-to-date information, but the later functions return older values (compared with the live values on Yahoo Finance). I've listed below the functions used and whether each field is up to date or not:
Within financial_data:
currentPrice = Up to Date
totalCash = Up to Date
totalDebt = Up to Date
Within index_trend:
+5y growth estimates = Out of Date
Within key_stats:
priceToBook = Out of Date
enterpriseValue = Out of Date
Within summary_detail:
trailingPE = Out of Date
priceToSalesTrailing12Months = Out of Date
marketCap = Out of Date
Within cash_flow:
FreeCashFlow = Out of Date

To Reproduce
Attached a text file with the for loop that pulls the data for 15 stocks

Expected behavior
Up-to-date information

Desktop (please complete the following information):

  • OS: Ubuntu 22.04.1 LTS
  • Browser: Firefox
  • Version: 110.0
@EnigmaNZ
Author

YahooQueryIssue.txt

I was unable to attach the text file to the main post, so I have attached it here.

@dpguthrie
Owner

Do you have the symbol or symbols where we can view this behavior? And, what you’re expecting vs what’s being returned.

@EnigmaNZ
Author

Yep, the list of symbols is: aapl, amzn, axp, baba, bac, bcs, cepu, goog, icl, jnj, jpm, msft, oxy, ozk, pnc, rf, wfc, zim

I tried to upload the Database to the

Main stock I have been comparing with is 'aapl' (first in the list).

For the functions above (just tested 20 minutes ago - aapl):
Within financial_data:
currentPrice = 152.55 (matches yahoo finance)
totalCash = 5.1355B (yahoo finance = 51.36B - matches)
totalDebt = 111.109997B (yahoo finance = 111.1B - matches)
Within index_trend:
+5y growth estimates = 8.284% (yahoo finance = 8.13%)
Within key_stats:
priceToBook = 42.5998 (yahoo finance = 42.87)
enterpriseValue = 2.4734B (yahoo finance = 2.49T)
Within summary_detail:
trailingPE = 25.8998 (yahoo finance = 26.1)
priceToSalesTrailing12Months = 6.228 (yahoo finance = 6.42)
marketCap = 2.414T (yahoo finance = 2.43T)
Within cash_flow:
FreeCashFlow = 111,443,000 (yahoo finance = 111,443,000 - matches) sorry this one was also correct

Attached is a screenshot of all the data that has been produced
image
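
To put the mismatches listed above on a common scale, here is a minimal sketch that computes the signed relative difference between each fetched value and the website value. The helper name `relative_diff` and the dictionary layout are illustrative assumptions; the numbers are the aapl figures quoted in this comment, limited to the four price-derived fields.

```python
def relative_diff(fetched, reference):
    """Signed difference of `fetched` from `reference`, as a fraction of `reference`."""
    return (fetched - reference) / reference


# (fetched value, Yahoo Finance website value) for aapl, copied from the comment above.
aapl_checks = {
    "priceToBook": (42.5998, 42.87),
    "trailingPE": (25.8998, 26.1),
    "priceToSalesTrailing12Months": (6.228, 6.42),
    "marketCap": (2.414e12, 2.43e12),
}


def report(checks):
    """Map each field name to its relative difference (hypothetical helper)."""
    return {name: relative_diff(f, r) for name, (f, r) in checks.items()}


for name, diff in report(aapl_checks).items():
    print(f"{name}: {diff:+.2%}")
```

For these four fields the fetched value is lower than the website value in every case, by roughly 0.6% to 3%.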

@EnigmaNZ
Author

It seems there may be a fix if you wait longer between calls. Is this something that would be worth trying?

@dpguthrie
Owner

Ya, my guess is that YF is providing bad data when it detects programmatic requests with little time between. Might be a good idea to put in some time.sleep() in between requests. Probably more important to get accurate data than it is to get it super fast. This should most likely be a class level property that you set during instantiation. Something like:

class Ticker:
    def __init__(self, symbols, **kwargs):
        ...  # all the other stuff
        # Store the delay on the instance, not on the time module
        self.sleep_between = kwargs.get('sleep_between', SLEEP_BETWEEN_DEFAULT)

Then the internal requests methods would need to be refactored slightly to account for that.
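
As a rough illustration of the "sleep between requests" idea, here is a minimal stdlib-only sketch of a throttle that enforces a minimum gap between consecutive calls. The `Throttle` class and its `wait` method are hypothetical and not part of yahooquery; the injectable `clock` and `sleep` arguments exist only so the behaviour can be checked without real waiting.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive calls (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds required between calls
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # clock reading at the previous call, if any

    def wait(self):
        """Block until at least min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

In a script, one would create `throttle = Throttle(5)` once and call `throttle.wait()` immediately before each request, so the first call goes through at once and later calls are spaced at least five seconds apart.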

dpguthrie added the "help wanted" label Feb 20, 2023
@ValueRaider

> YF is providing bad data when it detects programmatic requests with little time between

Odd data issues do arise when spamming Yahoo, but cannot say if intentional e.g. might be different data sources?

IMO best way to rate-limit is a specialised module like pyrate-limiter, or requests_ratelimiter if want to combine with requests-cache (example on yfinance README). Then just pass the session object to yq.

@dpguthrie
Owner

> IMO best way to rate-limit is a specialised module like pyrate-limiter, or requests_ratelimiter if want to combine with requests-cache (example on yfinance README). Then just pass the session object to yq.

Much better way than I described above. Thanks @ValueRaider

@EnigmaNZ
Author

Awesome, cheers @dpguthrie and @ValueRaider. I have tried time.sleep at 1 sec and 5 sec and it didn't work for me. Someone else in the discussions said that 1 sec (time.sleep(1)) worked for them with only financial_data.

Planning to give the rate-limiter approach a trial now, thank you!

@EnigmaNZ
Author

EnigmaNZ commented Feb 25, 2023

Hi guys, not sure if I'm doing something wrong, but I trialed the following based on the suggestions from @ValueRaider and @dpguthrie:

import requests
import requests_cache
from ratelimiter import RateLimiter

requests_cache.install_cache('yahoo_api_cache', expire_after=3600)

rate_limiter = RateLimiter(max_calls=1, period=5)
session = requests_cache.CachedSession()
session.headers.update({'User-Agent': 'Mozilla/5.0'})

for ticker in Stocks:
    temp_T = Ticker(ticker, session=session)
    with rate_limiter:
        tempFinData = temp_T.financial_data
        tempIndexTrend = temp_T.index_trend
        tempKeyStats = temp_T.key_stats
        tempSummDet = temp_T.summary_detail
        tempSummProf = temp_T.summary_profile
        temp_cashFlow = temp_T.cash_flow(trailing=False)

MyStocks.txt

@EnigmaNZ
Author

Mistakenly closed

@EnigmaNZ EnigmaNZ reopened this Feb 25, 2023
@ValueRaider

@EnigmaNZ Your code looks like what ChatGPT would generate - nonsense. Just copy-paste the example in yfinance README.

@EnigmaNZ
Author

@ValueRaider Very true about ChatGPT. I pulled the logic from ChatGPT (ended up being nonsense as you said) but I could understand the logic behind what was produced by it so could attempt to fix in comparison to yfinance example.

Prior to the mess around with ChatGPT, I attempted to copy-paste the example from the yfinance README but ended up with the errors in the attached screenshot (unfortunately for me, this is where I turned to ChatGPT to see if I could combine them and understand a bit more about how this function failed, which didn't work). I also trialed it with "yfinance.cache" in case I needed to copy-paste the example exactly, and got the same error. I'm pretty confident there is going to be a really simple fix for this, but I have no idea where, sorry.
image

Also attached two versions of the code (with and without the commented out code for simplicity if it helps to not have the mess)
MyStocks_withoutMess.txt
MyStocks.txt

@ValueRaider

ValueRaider commented Feb 25, 2023

Ah, with yahooquery you also need to set user-agent. Look inside yfinance source code data.py

@dpguthrie Might be worth yq automatically setting session user-agent if missing.
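
A minimal sketch of what "set the user-agent if missing" could look like. The function `ensure_user_agent` is a hypothetical helper, not something yahooquery provides, and treating the stock `python-requests/...` value as "missing" is an assumption. It works on any mutable mapping of headers, so it can be applied to `session.headers` or tried out on a plain dict.

```python
def ensure_user_agent(headers, default="Mozilla/5.0"):
    """Set a browser-like User-Agent unless the caller already chose one.

    `headers` is any mutable mapping, e.g. `session.headers` on a requests session.
    A value starting with "python-requests" is treated as not chosen by the caller.
    """
    current = headers.get("User-Agent", "")
    if not current or current.startswith("python-requests"):
        headers["User-Agent"] = default
    return headers
```

Called as `ensure_user_agent(session.headers)` before passing the session on, it leaves an explicitly set user-agent untouched.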

@EnigmaNZ
Author

That definitely solved this issue for me. I ended up just using session.headers.update({'User-Agent': 'Mozilla/5.0'}) just below the yfinance example code, as per the screenshot.
image

Unfortunately I have got no further though. It makes me think that Yahoo Finance is messing with this data somehow. As you can see, I updated the request rate to 1 request per minute and it is still receiving bad data. I will look through some of the other functions and see if I can pull accurate data in other ways.

For example, AAPL current P/B = 41.53 on the website, while the P/B read by the script = 40.969.
