Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disruption of equities data :: pandas_datareader dependency on Yahoo and Google Finance API #7

Open
rsvp opened this issue Sep 7, 2017 · 36 comments

Comments

@rsvp
Copy link
Owner

rsvp commented Sep 7, 2017

Description of specific issue

We are expecting major disruption in getting data on
equities, mutual funds, and ETFs via pandas_datareader
due its dependency on API of both Yahoo and Google Finance.

  • [/] Fail quote retrieval via our module lib/yi_stock.py, if no fix upstream.
  • [/] Enhancement, if fecon235 takes independent action in its yi_stock module.

Observed behavior

Yahoo employee has confirmed that the free End-Of-Day data has been terminated, 2017
presumably due to acquisition by Verizon.

/r/algotrading on Google JSON termination
presumably due to cost-cutting by new product manager.
Our yi_stock module may appear to be working, but
please tail() your dataframe to verify whether quote retrieval
goes beyond 2017-09-05 [silent fail to get() current stock quotes].

Alternatives to enhance behavior

  1. Switch over to Quandl, using our lib/yi_quandl.py module

  2. Alpha Vantage: https://www.alphavantage.co/documentation - Requires user to get free API key

  3. tiingo: https://api.tiingo.com/docs/general/overview - Requires user to get free API key

  4. $$$ EODhistoricaldata: https://eodhistoricaldata.com

  5. Scrape Google Finance pages: https://github.com/CNuge/general_use_functions/blob/master/international_stock_scraper.py (currently also works for international stocks) -- which is not a robust solution (relative to API code) since the page layout is subject to UI changes: International market data - functionality addition request pydata/pandas-datareader#408

Why would the improvement be useful to most users?

Information from the equities markets is vital for financial economics.

Before releasing our own independent solution, we expect to
make a pull request to the pandas_datareader repository.

Please kindly propose alternative solutions
below, or at https://gitter.im/rsvp/fecon235
... and look into pandas_datareader issues

Check for revisions upstream

It is possible that a solution has been merged into pandas_datareader
and all that is necessary is an update of the package, see its CHANGELOG:
https://pandas-datareader.readthedocs.io/en/latest/whatsnew.html
and make sure the update is compatible with the latest fecon235.

Note: Anaconda distribution uses hyphen, not underscore:

$ conda update pandas-datareader

Additional helpful details for bugs

  • [/] Problem started recently

  • [/] Problem can be reliably reproduced

  • fecon235 version: v5.17.0603

  • pandas version: 0.19.2

  • pandas_datareader version: 0.2.1

@rsvp
Copy link
Owner Author

rsvp commented Sep 16, 2017

@paintdog
Copy link

What means:

Our yi_stock module may appear to be working, but
please tail() your dataframe to verify whether quote retrieval
goes beyond 2017-09-05 [silent fail to get() current stock quotes].

Will there be no new quotes after 2017-09-05 or no data before 2017-09-05? I don't understand it.

I want to mention, that the new sources are not suitable for all users and do not cover all fields of interest :(

@rsvp
Copy link
Owner Author

rsvp commented Sep 17, 2017

@paintdog tail() retrieves the tail end of a dataframe.
Our yi_stock module currently has a dependency on pandas_reader
to get() equities data, but it may be re-route requests by other means
(and to other vendors besides Google Finance).

All our sources for suitable for all users in the field of financial economics.

@rsvp
Copy link
Owner Author

rsvp commented Sep 22, 2017

StackOverflow: DataReader google finance date not working points to change in URL at Google as cause for data disruption.

But unfortunately, for the PR to modify said URL, pydata/pandas-datareader@eac67a4 ,
the required Travis CI build has failed.

@rsvp
Copy link
Owner Author

rsvp commented Sep 27, 2017

Google Finance website currently displays a Yellow Warning Banner which reads as follows:

Google Finance is under renovation. As a part of this process, the Portfolios feature won't be available after mid-November 2017.

Their notice at https://support.google.com/finance indicates:

  • Portfolios will be deprecated
  • To download available data, follow these steps:

Why? Their reasoning: "to make Google Finance more accessible and user-friendly for a wider audience." Yeah, right 👎 See also Quora: Why is the Google Finance portfolio feature disabled?

Thus any tentative fixes now for data retrieval may go to waste after renovation is completed:

  • Wait to see the renovated source API for upstream repo.

Noteworthy is the secure https redirect from the "www" to "finance" subdomain for google.com.

@rsvp
Copy link
Owner Author

rsvp commented Oct 8, 2017

Upstream fixes not passing the tests in Travis CI build has been resolved: pydata/pandas-datareader#404

This is especially pertains to:
pydata/pandas-datareader#394
pydata/pandas-datareader#395
pydata/pandas-datareader#402

Noteworthy snippet for Google Finance URL fix:
pytest.skip("Google quote api is offline as of Oct 1, 2017")
and for Yahoo Finance, drop NaN missing values in pandas DataFrame:
return df.sort_index().dropna(how='all')

@paintdog
Copy link

paintdog commented Oct 10, 2017

What does

pytest.skip("Google quote api is offline as of Oct 1, 2017")

mean? At the moment I can still retrieve data from Google!

@rsvp
Copy link
Owner Author

rsvp commented Oct 10, 2017

@paintdog That line is used to avoid testing designated parts of the
Travis CI build upstream at pandas-datareader. In the interim,
suppressing the bug catchers is not a good idea...

As you have observed, the current code is still operational,
so the use of pytest.skip for the Google Finance test fix is indeed
too definitive in the sense that Google may reconsider its API changes
during the reconstruction period. Tests could use pytest.xfail or even more
explicitly @pytest.mark.xfail() with condition and reason attached.
See https://docs.pytest.org/en/latest/skipping.html for details.

That way upstream, we can later run: pytest --runxfail
to see if Google has reverted without much hassle
and just remove the decorators,
or rewrite tests to conform to the new API.
(Historically, Google has reversed its pronouncements on quote
retrieval via Sheets, so there's hope if there's enough public outcry :-).

@rsvp
Copy link
Owner Author

rsvp commented Oct 16, 2017

Alternative: Morningstar

Some work has begun: pydata/pandas-datareader#411
which includes quotes and data from financial statement.
But the Travis CI build has failed.
Here's the key URL which was used by @dtemkin :
http://globalquote.morningstar.com/globalcomponent/RealtimeHistoricalStockData.ashx
Addendum, 2017-10-27: tests and docs are in progress.

At fecon235, we may integrate partial code directly from:
https://gist.github.com/hahnicity/45323026693cdde6a116
which is far more informative than anything from the
origin https://www.moringstar.com -- thanks to @hahnicity
(checked his repos, but nothing Morningstar specific, so just that gist).


On 2018-07-31, pydata/pandas-datareader#557 (comment) @hubbins wrote:

I have confirmation from Morningstar that this service has been discontinued, it is not a temporary outage.

  • MORNINGSTAR API

@rsvp
Copy link
Owner Author

rsvp commented Oct 22, 2017

  • Test pandas-datareader: Change from skip to mark.xfail in test_google

See next to previous comment for details
and pydata/pandas-datareader#413

@rsvp
Copy link
Owner Author

rsvp commented Oct 23, 2017

Alternative: Tiingo

Tiingo has a REST and Real-Time Data API includes support for
Stock Market Ticker Closing Prices + Metadata.

At fecon235, we may integrate partial code directly from:
https://github.com/hydrosquall/tiingo-python
which outputs CSV or JSON, so conversion to DataFrame
format is the next step. Thanks to @hydrosquall

From our Gitter, https://gitter.im/rsvp/fecon235 on 2017-10-24:

Hi all - I'm Rishi founder of tiingo.com. I was told by a few members of the community some were looking for data replacements. The api is available at https://api.tiingo.com. Will take a look at the repo tomorrow and see where I can help to provide optimizations and also make the API play nicely. In the interim, looking forward to learning about everyone here. If anybody has some algo questions let me know. Was the first employee at AlphaParity, traded exotics, and treasuries, and built trading systems that realized sharpe of 2 on about 150mm before leaving that world. Will try to help where I can.

The free API permits up to 20,000 requests per day on over 56,000 securities globally.
Sounds quite generous for a database going back 30 years,
see https://api.tiingo.com/pricing -- Thanks, Rishi !

On 2017-10-25, Rishi responds:

Ah - its limited to 500 tickers of your choice which is the only limit on that. My goal is to make it disruptive but also a sustainable business. So as time continues, we can keep offering more but keep prices as constant as possible. [I] want the API to exist a long time and be reliable but also be absurdly cheap

@rsvp
Copy link
Owner Author

rsvp commented Oct 25, 2017

Alternative: Alpha Vantage

  • Alpha Vantage API

Documentation https://www.alphavantage.co/documentation
Requires user to get free API key

@RomelTorres commented:

I don't think AlphaVantage itself will release a python wrapper for its API call.

Hence his Python module to get stock data and cryptocurrencies:
https://github.com/RomelTorres/alpha_vantage

2018 development at pandas-datareader:
pydata/pandas-datareader#490
which includes AVTimeSeriesReader and AVForexReader.

We would like to invite more discussion comparing the data vendors
at Gitter https://gitter.im/rsvp/fecon235

@rsvp
Copy link
Owner Author

rsvp commented Oct 29, 2017

Alternative, not supported: Bloomberg API

Documentation https://www.bloomberg.com/professional/support/api-library/
Requires user to have Bloomberg Terminal access,
which will cost $$$ (thus does not share core fecon235 objectives).

For Python pandas interface, we refer to the repo by @matthewgilbert
https://github.com/matthewgilbert/pdblp

@rsvp
Copy link
Owner Author

rsvp commented Nov 7, 2017

pandas-datareader RemoteDataError

For 0.5.0, failures reported for both get_quote_yahoo and get_quote_google,
see pydata/pandas-datareader#418 as of 2017-11-07.

Alternative: Barchart API

Documentation: https://www.barchart.com/ondemand/api
Requires registration: https://www.barchart.com/ondemand/free-market-data-api
but includes stock, futures, FX, and cryptocurrencies
(presumably the CME Bitcoin contract upon regulatory approval).
Free histories up to 2 years, where End-of-day available for:

  • Equities: AMEX, LSE, NASDAQ, NYSE
  • Futures: CBOT, CME, COMEX, NYMEX
  • Forex: Global Currencies

  • Barchart API, CSV -> Python pandas DataFrame
  • Barchart API, JSON -> Python pandas DataFrame
#  JSON gist
import json
import requests
quote = json.loads(requests.get('http://marketdata.websol.barchart.com\
    /getQuote.json?apikey=<api_key>&symbols={}'.format(tic)).text)\
    ['results'][0]
#  Replace <api_key> with your own key.

Thanks to @liuyigh !


@BlackArbsCEO has provided a working gist:
https://gist.github.com/BlackArbsCEO/2394808dcb1f7c1bdd4e
with commentary on the code here:
http://www.blackarbs.com/blog/how-to-get-free-intraday-stock-data-with-python-and-barcharts-ondemand-api/9/22/2015


@femtotrader a pandas-datareader contributor,
informed by the last gist developed:
Python client for Barchart OnDemand
https://github.com/femtotrader/barchart-ondemand-client-python
Then apparently @lanshark modified it into a PyPI package


@paintdog
Copy link

paintdog commented Nov 8, 2017

For 0.5.0, failures reported for both get_quote_yahoo and get_quote_google,
see pydata/pandas-datareader#418 as of 2017-11-07.

This is too bad - no hope that Google will deliver data again in the soon future???

@rsvp
Copy link
Owner Author

rsvp commented Nov 9, 2017

Re: Thanks for all the fish

Walt, November 2, 2017 at 21:41, On a Yahoo forum an administrator commenting on the 999 error message initially stated it was a technical issue and would be fixed. Two hours ago he updated his message to say that the use of downloads.finance.yahoo.com (which has been going on for 2 decades without a yahoo whimper) violates the ToS and would not be restored.

[F]or ETFs they [Quandl] want 50 bucks per month.

Scraping by URL

The cookie field is not for preventing robots, but for disabling all API access by software. You can indeed step around the cookie method with some tricks, even automated, but that’s not really a professional solution. // That “crumb” in your URL is a code connected to a cookie on your PC. No cookie, no data.

  • Yahoo changed their URL scheme to require an established session with cookies. Potentially you could use ‘curl’ to establish a connection to the main site, get a cookie, and then use that to get a crumb. With that crumb, now you can form your download URL. Because the Crumb is part of the cookie. Do note as well that times changed to POSIX timestamps. You’ll need to convert your dates.

Notes on Alpha Vantage

Alpha Vantage does not support bulk download the way Yahoo did, i.e. stock=A,AA,AAP,AAPL,… in the URL. Downloading each stock in separate URL is too slow.

Alpha Vantage does not adjust for splits and dividends. Their reply: "We have just beta-launched the adjusted close price feature for our daily prices API." // Alpha Vantage has a dedicated JSON API service for split/dividend-adjusted EOD historical data. For example: https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=MSFT&outputsize=full&apikey=demo

Jenn: While Alpha Vantage has great data and a well-documented API, it’s pretty sketchy that they don’t describe their business model or have an “About Us” page that gives a company address and information about the actual human beings who work there. I’d love to use it, but it’s hard to trust it at the moment.

Comments last updated: 2017-11-07

🔢 REQUEST Emoji on the "Alternatives" above to express your reactions,

or please kindly write out your full opinion here regarding your preferences.
Thank you.

@rsvp
Copy link
Owner Author

rsvp commented Nov 12, 2017

Yahoo cookie / crumb : other Python solutions

Crumb is just part of the cookie, and here are some Python URL scraping solutions:

fix-yahoo-finance by Ran Aroussi

@ranaroussi claims his fix also works independently of pandas_datareader:

fix-yahoo-finance offers a temporary fix to the problem by scraping the data from Yahoo! finance using and return a Pandas DataFrame/Panel in the same format as pandas_datareader's get_data_yahoo(). [I]t can also be used as a stand-alone library (without pandas_datareader).

For details, see https://github.com/ranaroussi/fix-yahoo-finance (> 130 stars)
or https://pypi.python.org/pypi/fix-yahoo-finance
which requires Python >=3.4 and multitasking >= 0.0.3.


The fragility of the using a non-API solution is illustrated by:

  • Andrea Galeazzi's gist
  • Jack Threadfin's gist where a single character change by Yahoo in the cookie syntax can break a "solution."

Corey Goldberg @cgoldberg started ystockquote five years ago:
https://github.com/cgoldberg/ystockquote (over 400 stars)
2018-03-18 Update: still experiencing difficulties over 2017 issues,
see cgoldberg/ystockquote#52 (comment)

Thanks all ! And be sure to PR upstream.

@rsvp
Copy link
Owner Author

rsvp commented Nov 12, 2017

Alternatives: Misc. ETC.

@wilsonfreitas provides an extensive listing of data sources:
https://github.com/wilsonfreitas/awesome-quant#data-sources
(which is also a great resource for quants and traders).

Any preferences therein which are reliable for equities data?

URL shortcut to awesome-quant page: https://git.io/eqdata

@rsvp
Copy link
Owner Author

rsvp commented Nov 21, 2017

Alternative: IEX API

  • No key registration.
  • Free data for redistribution.
  • Tickers are a superset of Nasdaq/NYSE/AMEX.
  • ETFs and most mutual funds supported.
  • No retrieval limits / throttling, but limit large request to 1/second.
  • Supports websockets, i.e. pushes data.

Main page: https://iextrading.com/developer
API: https://iextrading.com/developer/docs/#getting-started

Thanks to @iexg and @lockefox
Inquiries to [email protected]

Thanks to @addisonlynch who notes historical datasets are available
through IEX chart endpoint: https://iextrading.com/developer/docs/#chart
but data must be within a 5-year historical period
(a limitation if migrating from Yahoo or Google finance).

@rsvp
Copy link
Owner Author

rsvp commented Nov 25, 2017

Some interim Google functionality: November 2017

import pandas_datareader.data as web
import datetime

start = datetime.datetime(2017, 1, 1)
end = datetime.date.today()

google = False

if google:
    f = web.DataReader("ETR:SIE", 'google', start, end)
else:
    f = web.DataReader("SIE.DE", 'yahoo', start, end)

print(f.Close)

@paintdog
Copy link

I hope that the info from @VicTangg will be used to repair pandas datareader. It seems that Google is still delivering data in an acceptable quality.

@bsolomon1124
Copy link

bsolomon1124 commented Dec 1, 2017

A new issue has surfaced in the last few days regarding google pulls with pandas-datareader. (See 424.)

However, the API itself still seems intact:

import datetime
import requests
from io import StringIO
# This is just a wrapper importing the compatible version of
#     urllib's urlencode--see pandas docs
from pandas.io.common import urlencode
import pandas as pd

BASE = 'http://finance.google.com/finance/historical'


# There seems to be confusion over whether the date api has changed.
# https://github.com/pydata/pandas-datareader/pull/425
# Both formats seem to work, but I'll use the "newer" one here to be safe
def get_params(symbol, start, end):
    params = {
        'q': symbol,
        'startdate': start.strftime('%Y/%m/%d'),
        'enddate': end.strftime('%Y/%m/%d'),
        'output': "csv"
    }
    return params


def build_url(symbol, start, end):
    params = get_params(symbol, start, end)
    return BASE + '?' + urlencode(params)


start = datetime.datetime(2010, 1, 1)
end = datetime.datetime.today()   # made around 10:30 am EST
sym = 'SPY'
url = build_url(sym, start, end)

data = requests.get(url).text
data = pd.read_csv(StringIO(data), index_col='Date', parse_dates=True)

print(data.head())
#               Open    High     Low   Close     Volume
# Date
# 2017-11-30  263.76  266.05  263.67  265.01  127894389
# 2017-11-29  263.02  263.63  262.20  262.71   77512102
# 2017-11-28  260.76  262.90  260.66  262.87   98971719
# 2017-11-27  260.41  260.75  260.00  260.23   52274922
# 2017-11-24  260.32  260.48  260.16  260.36   27856514

@rsvp
Copy link
Owner Author

rsvp commented Dec 1, 2017

RE: UnicodeDecodeError: 'utf-8' codec can't decode byte: invalid start byte:
@bsolomon1124 hi Brad, I would be inclined to think that
Google intends UTF-8, not bytes, at the end of the wire,
so thanks for your modification from requests.get(url).content
to requests.get(url).text

Decoding issue was also reported at Reddit

In regards to your proposals for Google URL date format
in pydata/pandas-datareader#347 and pydata/pandas-datareader#425
%2F within the URL is just the longer machine version
of - which is human readable, e.g.:
http://finance.google.com/finance/historical?q=SPY&startdate=2010%2F01%2F01&enddate=2017%2F12%2F01&output=csv

Brad, your fixes are very much appreciated. Thank you!

@bsolomon1124
Copy link

bsolomon1124 commented Dec 5, 2017

@rsvp I'm still not sure if bytes v. string is the issue here, though, the more I look into it. Reading in bytes is explicitly covered/addressed. Instead, I'm starting to think it's just that the GitHub code is not reflected in PyPI, despite both ostensibly being version 0.5.0.

I.e.:

import datetime

from pandas.compat import StringIO, bytes_to_str
from pandas.io.common import urlencode
import requests


BASE = 'http://finance.google.com/finance/historical'


def _get_params(symbol, start, end):
    params = {
        'q': symbol,
        'startdate': start.strftime('%Y/%m/%d'),
        'enddate': end.strftime('%Y/%m/%d'),
        'output': "csv"
    }
    return params


def build_url(symbol, start, end, form='new'):
    params = _get_params(symbol, start, end)
    return BASE + '?' + urlencode(params)


sym = 'AAPL'
start = date(2010, 1, 1)
end = date.today()

url = build_url(sym, start, end)
# http://finance.google.com/finance/historical?q=AAPL&startdate=Jan+01%2C+2010&enddate=Dec+05%2C+2017&output=csv


session = requests.Session()
byts = session.get(url).content
out = StringIO()
out.write(bytes_to_str(byts))
out.seek(0)
data = pd.read_csv(out, index_col=0, parse_dates=True).sort_index()

data 
              Open    High     Low   Close     Volume
# Date                                                 
# 2010-01-04   30.49   30.64   30.34   30.57  123432050
# 2010-01-05   30.66   30.80   30.46   30.63  150476004
# 2010-01-06   30.63   30.75   30.11   30.14  138039594
# 2010-01-07   30.25   30.29   29.86   30.08  119282324
# 2010-01-08   30.04   30.29   29.87   30.28  111969081
# 2010-01-11   30.40   30.43   29.78   30.02  115557365
# 2010-01-12   29.88   29.97   29.49   29.67  148614774
# 2010-01-13   29.70   30.13   29.16   30.09  151472335
# 2010-01-14   30.02   30.07   29.86   29.92  108288411
# 2010-01-15   30.13   30.23   29.41   29.42  148584065
# 2010-01-19   29.76   30.74   29.61   30.72  182501620
# 2010-01-20   30.70   30.79   29.93   30.25  153037892
#            ...     ...     ...     ...        ...
# 2017-11-16  171.18  171.87  170.30  171.10   23637484
# 2017-11-17  171.04  171.39  169.64  170.15   21899544
# 2017-11-20  170.29  170.56  169.56  169.98   16262447
# 2017-11-21  170.78  173.70  170.78  173.14   25131295
# 2017-11-22  173.36  175.00  173.05  174.96   25588925
# 2017-11-24  175.10  175.50  174.65  174.97   14026673
# 2017-11-27  175.05  175.08  173.34  174.09   20716802
# 2017-11-28  174.30  174.87  171.86  173.07   26428802
# 2017-11-29  172.63  172.92  167.16  169.48   41666364
# 2017-11-30  170.43  172.14  168.44  171.85   41527218
# 2017-12-01  169.95  171.67  168.50  171.05   39759288
# 2017-12-04  172.48  172.62  169.63  169.80   32542385

@rsvp
Copy link
Owner Author

rsvp commented Dec 6, 2017

@bsolomon1124 hi Brad, interesting detective work there with PyPI.

The question of code replication would then extend to those who have
installed pandas_datareader via Anaconda's conda install command.
By further extension, it would apply to those using Anaconda's Docker image
to create containers where such an auto-install is scripted.

@bsolomon1124
Copy link

bsolomon1124 commented Dec 6, 2017

@rsvp Not just conda, but pip also. Take a look in the tar from PyPI and you'll notice the old link in place. (pandas-datareader/google/daily/_GoogleDailyReader.url)

@rsvp
Copy link
Owner Author

rsvp commented Dec 13, 2017

Some fixes on forthcoming pandas_datareader 0.6.0

Thanks to @davidastephens

Yahoo: cryptocurrency quotes

When people are taking out mortgages on their homes to make bets...

Bitcoin futures

CME BTC quotes and charts: http://www.cmegroup.com/trading/equity-index/us-index/bitcoin.html

Cboe XBT quotes: http://cfe.cboe.com/cfe-products/xbt-cboe-bitcoin-futures

The notional value of a CME contract is 5 times greater than a Cboe contract.

@rsvp
Copy link
Owner Author

rsvp commented Jan 16, 2018

UPGRADE to development version of pandas_datareader

Is there a way to mitigate interim disruptions with Yahoo
and Google, and get the latest documentation, without waiting
around for a major release? Often a fix simply involves
revising an URL address, but the fix arrives downstream
months later.

Anaconda distribution

It is possible to conda install the HEAD of the master branch:
https://conda.io/docs/commands/build/conda-develop.html
Using: conda develop /path/to/pandas_datareader
installs a Python package in "development mode."
This works by creating a conda.pth file in site-packages.
Even --uninstall is provided to back out of development.
Thanks to @ralexx

@bashtage notes:

Conda-forge can be triggered by users though, and since this package doesn't have any binary requirements, PyPi is as good as anything else.

Direct from source using git

#  To get master HEAD:
git clone https://github.com/pydata/pandas-datareader
cd pandas-datareader
python setup.py install

It is an open question whether setup.py will properly
install with respect to both the system's and Anaconda's
versions of Python.

Installation using pip

pip install git+https://github.com/pydata/pandas-datareader

@rsvp
Copy link
Owner Author

rsvp commented Jan 18, 2018

pandas_reader Deprecations

As of 2018-01-18, leading up to their 0.6.0 release,
contributor @bashtage has deprecated:

Fallback vendor(s) for reliable equities data has not been clarified
at https://github.com/pydata/pandas-datareader thus fecon235
may have to develop the alternatives directly and independently.

@rsvp
Copy link
Owner Author

rsvp commented Jan 23, 2018

Alternative: Robinhood

Robinhood Markets Inc. is a commission-free, online securities brokerage.
Documentation https://github.com/sanko/Robinhood
No API key required. Thanks to @sanko and /r/Robinhood

Snippet for current quotes:
key https://api.robinhood.com/quotes/?symbols=MSFT,FB,TSLA

Snippet for historical data:
https://api.robinhood.com/quotes/historicals/?symbols=MSFT,FB,TSLA&interval=day

@rsvp
Copy link
Owner Author

rsvp commented Jan 23, 2018

@rsvp
Copy link
Owner Author

rsvp commented Jan 23, 2018

Alternative: Interactive Brokers

Official: https://interactivebrokers.github.io

The official Interactive Brokers Python API has a few design choices which make it run slowly. Specifically: excessive debug logging, and an overly cautious lock on the socket connection. ibapi-grease provides monkey patches that eliminate these bottlenecks by turning off the logging and removing the locks. -- @quantrocket-llc

IB's Zipline only supports backtesting, while IB's Catalyst supports backtesting and live trading. Zipline and Catalyst support a separate data ingestion step which downloads the data once:

http://www.zipline.io/bundles.html
https://github.com/quantopian/zipline/blob/master/zipline/data/
https://github.com/enigmampc/catalyst/tree/master/catalyst/data

Above, thanks to @westurner

@rsvp
Copy link
Owner Author

rsvp commented Jan 28, 2018

pandas_datareader v0.6.0 Release

@bashtage released this 3 hours ago -- tremendous work!

Warning: Yahoo!, Google Options, Google Quotes and EDGAR have been immediately deprecated.

But Google finance is still functioning for historical price data, although there are frequent reports of failures. Google failure is frequently encountered when bulk downloading historical price data.

Highlights include:

  • Immediate deprecation of Yahoo!, Google Options and Quotes and EDGAR. The end points behind these APIs have radically changed and the existing readers require complete rewrites. In the case of most Yahoo! data the endpoints have been removed. PDR would like to restore these features, and pull requests are welcome.

  • A new connector for Tiingo was introduced. Tiingo provides historical end-of-day data for a large set of equities, ETFs and mutual funds. Free registration is required to get an API key.

  • A new connector for Robinhood was introduced. This provides up to 1 year of historical end-of-day data. It also provides near real-time quotes.

  • A new connector for Morningstar Open, High, Low, Close and Volume was introduced.

  • A new connector for IEX daily price data was introduced. The majority of the IEX API was introduced.

  • A new data connector for stock index data provided by Stooq was introduced.

  • A new data connector for data provided by the Bank of Canada was introduced.

Documentation: https://pandas-datareader.readthedocs.io/en/latest/remote_data.html

Example, given pandas_datareader v0.6.0

Let's suppose we want quotes for the S&P500 ETF called "SPY"
using fecon235:

>>> spy = get("s4spy")
/home/yaya/net/anaconda/lib/python2.7/site-packages/pandas_datareader/google/daily.py:40: 
UnstableAPIWarning: The Google Finance API has not been stable since late 2017. 
Requests seem to fail at random. Failure is especially common when bulk downloading. 
warnings.warn(UNSTABLE_WARNING, UnstableAPIWarning)
 ::  Retrieved from Google Finance: SPY

We successfully got a pandas DataFrame in a variable called spy,
but with a warning which we can ignore since fecon235 does not
bulk download (instead, individual series are ensembled
together into a DataFrame by a function called paste).

[pandas_datareader v0.7.0 Release]

But wait, Yahoo Finance to be reintegrated in 0.7.0, see
pydata/pandas-datareader#495
Bonus: FX currency data to be accessible with YahooFXReader.

@rsvp
Copy link
Owner Author

rsvp commented Mar 20, 2018

2018-03-20 @jfunction in pydata/pandas-datareader#502 (comment)
reports Google Finance has removed historical data.
In trying: http://finance.google.com/finance/historical?q=GOOG
the following message appears on that webpage:

We're sorry... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now. See Google Help for more information.

That specific Google Help page: https://support.google.com/websearch/answer/86640
in its relevant portion reads as follows:

"Unusual traffic from your computer network"

If devices on your network seem to be sending automated traffic to Google, you
might see "Our systems have detected unusual traffic from your computer network."

What Google considers automated traffic:
- Sending searches from a robot, computer program, automated service, or search scraper
- Using software that sends searches to see how a website or webpage ranks on Google

In other words, it seems that Google Finance is now only intended for human eyes,
not automated scripts. Our guess: for their advertising revenue.

@rsvp
Copy link
Owner Author

rsvp commented Jun 25, 2018

CHANGELOG 2018-06-23 (tag: v6.18.0623)

Major version change for fecon235 from v5 to v6
signaling our integration with fecon236
which was spun-off from our source code.

Henceforth, fecon235 becomes a repository solely of Jupyter notebooks.
The old Python source code at fecon235 will remain for archival purposes,
while new code development shifts over to fecon236.
Please see https://git.io/econ for details.

Revise docs/fecon235-00-README.ipynb to introduce fecon236.
Function names have been retained, but under fecon236
the call routing is expected to be more explicit than casual,
i.e. modules names are more significant.
Please see https://git.io/fecon-intro

NOTICE of MOVE

This issue and its remedies has moved to: MathSci/fecon236#2

Your review and feedback there would be greatly appreciated. Thank you!

@rsvp
Copy link
Owner Author

rsvp commented Dec 8, 2018

Possibility of using QUANDL is likely nill for freely accessible data

... since Nasdaq acquired Quandl on or about December 4, 2018,
and appears to have turned proprietary.

For full details, see follow-up on "Disruption of Quandl data, esp. futures"
MathSci/fecon236#7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants