Stock Data Master (or SDM) is a library to provide utility functions for stock analysis, including the following:
- Load stock data from an API data provider via RESTful API requests
- Clean up the response data and save it to either a SQLite database or a CSV file
- Load saved data into a unified format and validate on different levels
- Simulate the market with the loaded data and evaluate the performance of a strategy you define
- Provide definitions of some popular candlestick patterns and other common calculations such as RSI
- Python: >= 3.5
- pandas
- plotly
- matplotlib
- numpy
- pytz
Simply run:
pip install sdmaster
This library simplifies the process of calling a stock data provider's API endpoints to get the stock data you want. It manages the API endpoint, authentication, date format, response parsing, etc., so that a single call downloads the stock data from any supported provider in a unified format for subsequent tasks.
Currently SDM only supports three markets: NASDAQ, NYSE, and TSX (availability depends on the API data provider).
To use this function, first create an object for the specific API provider with your personal token, the market you want to operate on, and optionally your response format. Currently only the JSON format is supported.
Example:
from sdm.api.iex_cloud import IEXCloudAPI
iex_caller = IEXCloudAPI('YOUR_TOKEN', 'nasdaq')
msft_raw_data = iex_caller.get_daily_historical_per_symbol(symbol='MSFT')
nasdaq_raw_data = iex_caller.get_daily_historical_all()
The following methods of requesting data are supported:
- get_daily_historical_per_symbol(symbol): return the daily historical data for a symbol
- get_daily_historical_all(): return the daily historical data for all symbols within the market
- get_symbol_list(): return the list of all symbols
- get_previous_day_closing(): return the previous closing price in the market
- get_realtime_quote_per_symbol(symbol): return the real-time quote for the specified symbol
- get_realtime_quote_all(): return the real-time quote for all symbols in the market
The following API providers are supported:
IEX Cloud official documentation: https://iexcloud.io/docs/api/#symbols
Limitations:
- No support for TSX
Financial Modeling Prep official documentation: https://financialmodelingprep.com/developer/docs/
Limitations:
- None
This function lets you save the data downloaded from API providers to a file on disk. Currently only SQLite and CSV formats are supported. If no file type is specified, SQLite is used by default.
To save historical or real-time quote data to a CSV file, call the save_data() method of StockDataMaster, as in the sample code below.
from sdm.api.fmp import FMPAPI
from sdm.master import StockDataMaster
fmp = FMPAPI("YOUR_TOKEN", "nasdaq")
msft_data = fmp.get_daily_historical_per_symbol("MSFT")
sdm = StockDataMaster(file_path="/usr/local/data/sdm", file_type="csv")
sdm.save_data(data=msft_data, file_name="nasdaq_historical.csv", data_type="historical")
all_nasdaq_data = fmp.get_daily_historical_all()
sdm.save_data(data=all_nasdaq_data, file_name="nasdaq_historical.csv", data_type="historical")
Similarly, you can load the data from a csv file saved by SDM previously:
historical_data = sdm.load_data(file_name="nasdaq_historical.csv", data_type="historical")
realtime_data = sdm.load_data(file_name="nyse_realtime_quote.csv", data_type="realtime")
Note that you can also load data from your own CSV files, even if they were not generated by SDM, as long as they contain a column named symbol holding the stock symbol and a column named datetime holding the datetime of each record. You can then load the file into SDM with your own datetime format, as below:
own_data = sdm.load_data(file_name="my_own_data.csv", data_type="historical", csv_datetime_format="%Y-%m-%d %H:%M:%S")
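As a rough illustration of what the csv_datetime_format argument describes, the sketch below parses a small CSV using only the standard library. It is not SDM's loader: the helper name parse_rows and the sample rows are made up for this example, and only the symbol and datetime column names come from the text above.

```python
import csv
import io
from datetime import datetime

# Sample content standing in for my_own_data.csv (values are invented for illustration).
SAMPLE_CSV = """symbol,datetime,open,close
MSFT,2020-01-02 16:00:00,158.78,160.62
AAPL,2020-01-02 16:00:00,74.06,75.09
"""


def parse_rows(text, datetime_format):
    """Hypothetical helper: read CSV text and parse the datetime column."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # strptime raises ValueError if the value does not match the format.
        row["datetime"] = datetime.strptime(row["datetime"], datetime_format)
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE_CSV, "%Y-%m-%d %H:%M:%S")
print(rows[0]["symbol"], rows[0]["datetime"].isoformat())
```

If the format string does not match the values in the file, strptime raises a ValueError, so a mismatch shows up immediately rather than producing wrong dates.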
Benefits:
- Most widely used data format by data providers
- Can immediately see what is inside the file
- Allows you to load data from CSV files from other sources
- Works well with many other Python frameworks such as pandas
Limitations:
- Must be loaded as a whole. Not ideal when you only need a portion of the data in the file
- Not able to mix data with different data columns in one file
Best Use Case:
- If you need to share the file with others
- If you only need to save the file and use other frameworks for further analysis
- If you need to load historical data from another source in CSV format
The data is saved in a SQLite database file with three columns: symbol, timestamp of the record, and a JSON string representing the data record. Saving and loading with SQLite is very similar to CSV, with only file_type changed. Note that you can mix historical and real-time data in one SQLite db file, but there cannot be two records with the same symbol and timestamp, since their combination is used as the primary key.
fmp = FMPAPI("YOUR_TOKEN", "nasdaq")
sdm = StockDataMaster(file_path="/usr/local/data/sdm", file_type="sql")
msft_data = fmp.get_daily_historical_per_symbol("MSFT")
sdm.save_data(data=msft_data, file_name="nasdaq_data.db", data_type="historical")
all_nasdaq_quote_data = fmp.get_realtime_quote_all()
sdm.save_data(data=all_nasdaq_quote_data, file_name="nasdaq_data.db", data_type="realtime")
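To make the three-column layout and the composite primary key more concrete, here is a minimal sketch using the standard sqlite3 module. The table name, column names, and the use of an integer epoch timestamp are assumptions for illustration; the actual schema SDM creates may differ.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk

# Assumed layout: symbol, timestamp, and a JSON string, keyed on (symbol, timestamp).
conn.execute(
    """
    CREATE TABLE records (
        symbol    TEXT NOT NULL,
        timestamp INTEGER NOT NULL,
        data      TEXT NOT NULL,
        PRIMARY KEY (symbol, timestamp)
    )
    """
)

# Sample values are invented for illustration.
row = ("MSFT", 1577984400, json.dumps({"open": 158.78, "close": 160.62}))
conn.execute("INSERT INTO records VALUES (?, ?, ?)", row)

# A second record with the same symbol and timestamp violates the primary key.
try:
    conn.execute("INSERT INTO records VALUES (?, ?, ?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print("duplicate rejected:", duplicate_rejected)
```

Because each record is stored as a JSON string, records with different fields (for example historical and real-time quotes) can share one table in a layout like this.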
Benefits:
- Able to quickly load only a subset of symbols and date range from a huge database.
- Able to mix different data record formats in one file
Limitations:
- Need to use an online tool or write queries to peek at the contents of the SQLite file
- Can only load db files generated by SDM
Best Use Cases:
- If you are going to use SDM for loading/saving/simulating duties
- If you would like to dump all the historical data into one big file and later load subsets of it for analysis, such as the data from Jan 1, 2010 to Dec 31, 2020, or all the data for AAPL, etc.
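The subset-loading use case can be sketched with a plain SQL query over a symbol, timestamp, and JSON layout like the one described earlier. This is an illustration only: the table and column names, the epoch-second timestamps, and the sample prices are all assumptions, not SDM's internals.

```python
import json
import sqlite3
from datetime import datetime, timezone


def to_epoch(year, month, day):
    """Convert a UTC calendar date to epoch seconds."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (symbol TEXT, timestamp INTEGER, data TEXT, "
    "PRIMARY KEY (symbol, timestamp))"
)

# Invented sample records spanning several years and two symbols.
sample = [
    ("AAPL", to_epoch(2009, 12, 31), {"close": 7.5}),
    ("AAPL", to_epoch(2010, 1, 4), {"close": 7.6}),
    ("MSFT", to_epoch(2010, 1, 4), {"close": 30.9}),
    ("AAPL", to_epoch(2021, 1, 4), {"close": 129.4}),
]
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(symbol, ts, json.dumps(data)) for symbol, ts, data in sample],
)

# Only rows for one symbol inside the date range are read and decoded.
cursor = conn.execute(
    "SELECT timestamp, data FROM records "
    "WHERE symbol = ? AND timestamp BETWEEN ? AND ? ORDER BY timestamp",
    ("AAPL", to_epoch(2010, 1, 1), to_epoch(2020, 12, 31)),
)
subset = [(ts, json.loads(data)) for ts, data in cursor]
print(len(subset), "matching record(s)")
```

With a CSV file the whole file would have to be read before filtering, which is the trade-off the two lists above describe.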
When loading data from files, you can provide a validation level (level 2 is used if none is provided). The levels are explained below:
- Level 0: no validation rules will be applied
- Level 1: all the basic columns (open, close, high, low, volume) must have a positive float/integer number as the value
- Level 2: all level 1 rules, plus a check that the minimum (low) is lower than or equal to both open and close, and that the maximum (high) is higher than or equal to both open and close
- Level 3: all level 2 rules, plus a check for gaps in the stock data of any symbol (taking market open days into account) and for data records on days when the market is supposed to be closed
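The rules for levels 1 and 2 can be sketched as two small checks on a single record. This is only an interpretation of the rules listed above, not SDM's validation code; the function names are hypothetical, and it assumes that "min" and "max" refer to the low and high columns.

```python
import numbers

BASIC_COLUMNS = ("open", "close", "high", "low", "volume")


def passes_level_1(record):
    """Level 1: every basic column holds a positive int or float."""
    for column in BASIC_COLUMNS:
        value = record.get(column)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, numbers.Real):
            return False
        if value <= 0:
            return False
    return True


def passes_level_2(record):
    """Level 2: level 1, plus low <= open/close and high >= open/close."""
    if not passes_level_1(record):
        return False
    return (record["low"] <= min(record["open"], record["close"])
            and record["high"] >= max(record["open"], record["close"]))


# Invented sample records: in `bad`, high is below close.
good = {"open": 10.0, "close": 10.5, "high": 11.0, "low": 9.8, "volume": 1200}
bad = {"open": 10.0, "close": 10.5, "high": 10.2, "low": 9.8, "volume": 1200}
print(passes_level_2(good), passes_level_2(bad))
```

Level 3 is left out of the sketch because it needs a market calendar to know which days the market was open.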
You can run a simulation using historical data and your own algorithm to see how well your strategy would perform. Example:
from sdm.master import StockDataMaster
from sdm.simulation.market import Market
from sdm.simulation.trader import Trader
from sdm.simulation.transaction import Transaction
from sdm.simulation.performance import get_trader_performance_metrics
import datetime as dt
# First we need to load some historical data to begin with. You need to have your own dataset file here
start_date = dt.datetime(2010, 1, 5)
end_date = dt.datetime(2013, 12, 31)
sdm = StockDataMaster(file_path="/usr/local", file_type="sql", data_type="historical")
tsx_data = sdm.load_data(file_name="tsx_data.db", start_date=start_date, end_date=end_date)
sdm.validate_data(tsx_data, market="tsx")
# Create the market object which will save all the historical data
tsx_market = Market(tsx_data, "tsx", start_date, end_date)
# We need to define our strategy algorithm here in a function. It is a generator yielding a transaction
def my_algorithm(market_data_cumulative, current_day, position, cash, transaction_history, real_time_price, clf=None):
    # This generator function should be using the cumulative stock market data, your positions, your transaction
    # history, and the realtime market price to determine whether you want to buy or sell, and if yes yield a
    # transaction object, which will be processed by the market. An example yield can be found below
    # yield Transaction(-1, "AAPL", 10, 100.00, dt.datetime(2011, 5, 20))
    return []
# Create a trader with the initial funds and strategy function and add to the market. Note that you can have multiple
# traders added to the same market, with different strategies or funds
small_trader = Trader(strategy_function=my_algorithm,
                      init_fund=10000,
                      start_date=start_date,
                      end_date=end_date,
                      name="Small Trader with less funds")
tsx_market.add_trader(small_trader)
big_trader = Trader(strategy_function=my_algorithm,
                    init_fund=1000000,
                    start_date=start_date,
                    end_date=end_date,
                    name="Big Trader with more funds")
tsx_market.add_trader(big_trader)
# Start the simulation
while not tsx_market.is_the_end():
    tsx_market.trade_and_forward()
# Evaluate the performance of the trader
small_trader.log_assets()
big_trader.log_assets()
get_trader_performance_metrics(small_trader)  # this line will raise an exception because no transactions were made
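The example strategy above returns an empty list, so no trades ever happen. As a rough idea of what a strategy written as a generator could look like, here is a standalone sketch that does not import SDM. The Order tuple is a hypothetical stand-in for Transaction; its field names, the meaning of the first field (1 = buy, -1 = sell), the simplified argument list, and the last_close input are all assumptions made for this illustration.

```python
import datetime as dt
from collections import namedtuple

# Hypothetical stand-in for sdm.simulation.transaction.Transaction.
# Field names and the meaning of `action` (1 = buy, -1 = sell) are assumptions.
Order = namedtuple("Order", ["action", "symbol", "quantity", "price", "datetime"])


def buy_the_dip(current_day, cash, real_time_price, last_close, threshold=0.05):
    """Yield a buy order for each symbol trading `threshold` below its last close."""
    for symbol, price in sorted(real_time_price.items()):
        previous = last_close.get(symbol)
        if previous is None or price > previous * (1 - threshold):
            continue  # no previous close, or the drop is too small
        # Spend at most 10% of the cash passed in on each symbol.
        quantity = int(cash * 0.1 // price)
        if quantity > 0:
            yield Order(1, symbol, quantity, price, current_day)


# Invented prices: AAA dropped 10% from its last close, BBB dropped about 1%.
orders = list(buy_the_dip(
    current_day=dt.datetime(2011, 5, 20),
    cash=10000,
    real_time_price={"AAA": 90.0, "BBB": 50.0},
    last_close={"AAA": 100.0, "BBB": 50.5},
))
print(orders)
```

Writing the strategy as a generator lets it emit zero, one, or many orders per trading day without building a list first. A real strategy passed to Trader would need to keep the full argument list shown in my_algorithm.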