In this repository, I analyze the price response functions of the NASDAQ TAQ financial market data for the year 2008.
A research paper based on the code in this repository was published in The European Physical Journal B and can be found here. A preprint version of the paper can be found here. To cite the papers or the code, you can use the following BibTeX suggestions.
In the taq_responses_physical folder, I reproduce sections 3.1 and 3.2 of the paper Cross-response in correlated financial markets: individual stocks to obtain the midpoint prices, trade signs, self-responses, cross-responses, trade sign self-correlators and trade sign cross-correlators for different stocks.
Based on these values, I analyze the price response functions in trade and physical time scales (taq_responses_trade, taq_responses_physical) and the influence of the number of trades per second on the response functions (taq_responses_activity). I also analyze the influence of the time shift between trade signs and midpoint prices (taq_physical_shift, taq_responses_physical_shift, taq_trade_shift and taq_responses_trade_shift), the influence of the time lag (taq_responses_physical_short_long) on the response functions in physical time scale, and the impact of the spread (taq_avg_responses_physical) on the strength of the response functions in physical time scale.
Detailed documentation of the code can be found here.
The main code is implemented in Python. As we use the TAQ data format, it is necessary to extract the data to a readable format. A C++ module is used for this, but the whole process is automated with Python.
If you are part of the AG Guhr and you are interested in testing the code, you can write to me and ask for some example data files, so I can share them with you. Unfortunately, due to copyright, I cannot share the data files with people outside the research group.
For Python, all the packages needed to run the analysis are in the requirements.txt file.
To compile the C++ module I used the g++ compiler. It is necessary to install the Boost date_time library (linked with -lboost_date_time) and the armadillo-3.920.3 library.
The first step is to clone the repository
$ git clone https://github.com/juanhenao21/financial_response_spread_year.git
To install all the needed Python packages, I recommend creating a virtual environment and installing them from the requirements.txt file. To install the packages from the terminal, you can use
$ virtualenv -p python3 env
$ source env/bin/activate
$ pip install -r requirements.txt
After you clone the repository, you need to create two folders inside the financial_response_spread_year/project folder: one with the name taq_data and another with the name taq_plot. The results of the analysis will be saved in these folders.
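The folder setup described above can also be scripted. The following is a minimal sketch, not part of the repository: the function name create_result_folders is hypothetical, and the project path is passed in by the caller.

```python
from pathlib import Path


def create_result_folders(project_dir):
    """Create the taq_data and taq_plot folders inside the project folder.

    The function name is hypothetical; only the two folder names come
    from this README.
    """
    project = Path(project_dir)
    created = []
    for name in ("taq_data", "taq_plot"):
        folder = project / name
        # exist_ok=True makes the call safe to repeat.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

From the directory that contains the cloned repository, it could be called as create_result_folders("financial_response_spread_year/project").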
To run the code from scratch and reproduce the results in sections 2.3 and 2.4 of the paper, you need to copy the folder decompress_original_data_2008 to the folder financial_response_spread_year/project/taq_data.
Then you need to create a folder with the name original_year_data_2008 inside financial_response_spread_year/project/taq_data and move the .quotes and .trades files of the tickers you want to analyze into it. Make sure you move a copy of the files and not the originals, because the code deletes these files when it runs to free space.
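Because the code deletes the input files, it may help to copy them with a small script instead of moving them by hand. The sketch below is only an illustration: the function name copy_taq_files and the source folder are hypothetical, and it assumes the files can be recognized by their .quotes and .trades extensions.

```python
from pathlib import Path
import shutil


def copy_taq_files(source_dir, target_dir, extensions=(".quotes", ".trades")):
    """Copy (not move) the TAQ files so the originals are kept.

    source_dir is a hypothetical folder holding the original files;
    target_dir would be the original_year_data_2008 folder.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.iterdir()):
        # Only files with the expected extensions are copied.
        if path.is_file() and path.suffix in extensions:
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied
```

The returned list of file names can be compared with the tickers list before running the analysis.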
Then, you need to move (cd) to the folder financial_response_spread_year/project/taq_responses_physical/taq_algorithms/ and, in the main() function of the module taq_data_main_responses_physical.py, edit the tickers list with the stocks you want to analyze (in this case, the symbols of the files of the tickers you copied in the previous step).
tickers = ['AAPL', 'MSFT']
Finally, you need to run the module. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_responses_physical.py
The program will obtain and plot the data for the corresponding stocks.
If you have the CSV data files, you need to create a folder with the name csv_year_data_2008 inside financial_response_spread_year/project/taq_data, and move the CSV files inside. Make sure you move a copy of the files and not the originals, because the code deletes these files when it runs to free space. Then go to the financial_response_spread_year/project/taq_responses_physical/taq_algorithms/taq_data_main_responses_physical.py file and comment out the following line in the main function
# taq_build_from_scratch(tickers, year)
Edit the tickers list with the stocks you want to analyze (in this case, the symbols of the files of the tickers you copied in the previous step).
tickers = ['AAPL', 'MSFT']
Finally, you need to run the module. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_responses_physical.py
The program will obtain and plot the data for the corresponding stocks.
All the following analyses depend directly on the results of this section. If you want to run them, you need to run this section first.
To run this part of the code, you need to move (cd) to the folder financial_response_spread_year/project/taq_responses_trade/taq_algorithms/ and edit the tickers list with the stocks you want to analyze (in this case, the symbols of the files of the tickers you used in the previous section).
tickers = ['AAPL', 'MSFT']
Then you need to run the module taq_data_main_responses_trade.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_responses_trade.py
This part of the code is the slowest because of an inefficient implementation. I do not recommend analyzing several stocks in this time scale.
To run this part of the code, you need to move (cd) to the folder financial_response_spread_year/project/taq_responses_activity/taq_algorithms/ and edit the tickers list with the stocks you want to analyze (in this case, the symbols of the files of the tickers you used in the TAQ Responses Physical section).
tickers = ['AAPL', 'MSFT']
Then you need to run the module taq_data_main_responses_activity.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_responses_activity.py
The TAQ time shift analysis is divided into two time scales, each with two modules. The modules have to be executed in the order in which they appear in the explanation.
In both cases, you need to edit the tickers list with the stocks you want to analyze (in this case, the symbols of the files of the tickers you used in the previous sections).
tickers = ['AAPL', 'MSFT']
To run this part of the code, you need to move (cd) to the folder financial_response_spread_year/project/taq_physical_shift/taq_algorithms/ and run the module taq_data_main_physical_shift.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_physical_shift.py
After you run the taq_data_main_physical_shift.py module, you can move (cd) to the folder financial_response_spread_year/project/taq_responses_physical_shift/taq_algorithms/ and run the module taq_data_main_responses_physical_shift.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_responses_physical_shift.py
To run this part of the code, you need to move (cd) to the folder financial_response_spread_year/project/taq_trade_shift/taq_algorithms/ and run the module taq_data_main_trade_shift.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_trade_shift.py
After you run the taq_data_main_trade_shift.py module, you can move (cd) to the folder financial_response_spread_year/project/taq_responses_trade_shift/taq_algorithms/ and run the module taq_data_main_responses_trade_shift.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_responses_trade_shift.py
To run this part of the code, you need to move (cd) to the folder financial_response_spread_year/project/taq_responses_physical_short_long/taq_algorithms/ and edit the tickers list with the stocks you want to analyze (in this case, the symbols of the files of the tickers you used in the previous sections).
tickers = ['AAPL', 'MSFT']
Then you need to run the module taq_data_main_responses_physical_short_long.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_responses_physical_short_long.py
To run this part of the code, you need to move (cd) to the folder financial_response_spread_year/project/taq_avg_spread/taq_algorithms/ and edit the tickers list with the stocks you want to analyze (in this case, the symbols of the files of the tickers you used in the previous sections).
tickers = ['AAPL', 'MSFT']
Then you need to run the module taq_data_main_avg_spread.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_avg_spread.py
I recommend running this analysis with several stocks. The key point is that the self-response function analysis of the first part (TAQ Responses Physical) must already have been run for all the stocks used.
After you run the taq_data_main_avg_spread.py module, you can move (cd) to the folder financial_response_spread_year/project/taq_avg_responses_physical/taq_algorithms/ and run the module taq_data_main_avg_responses_physical.py. In Linux, using the terminal, the command looks like
$ python3 taq_data_main_avg_responses_physical.py
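The run order described in the sections above can be collected in a small driver script. This is only a sketch and not part of the repository: the names PIPELINE, build_commands and run_pipeline are hypothetical, while the folder and module names are the ones listed in this README. It assumes the tickers list has already been edited in every module.

```python
import subprocess
import sys
from pathlib import Path

# (folder, module) pairs in the order given in this README.
# taq_responses_physical must run first; each *_shift module runs before
# its taq_responses_*_shift module, and taq_avg_spread runs before
# taq_avg_responses_physical.
PIPELINE = [
    ("taq_responses_physical", "taq_data_main_responses_physical.py"),
    ("taq_responses_trade", "taq_data_main_responses_trade.py"),
    ("taq_responses_activity", "taq_data_main_responses_activity.py"),
    ("taq_physical_shift", "taq_data_main_physical_shift.py"),
    ("taq_responses_physical_shift",
     "taq_data_main_responses_physical_shift.py"),
    ("taq_trade_shift", "taq_data_main_trade_shift.py"),
    ("taq_responses_trade_shift", "taq_data_main_responses_trade_shift.py"),
    ("taq_responses_physical_short_long",
     "taq_data_main_responses_physical_short_long.py"),
    ("taq_avg_spread", "taq_data_main_avg_spread.py"),
    ("taq_avg_responses_physical", "taq_data_main_avg_responses_physical.py"),
]


def build_commands(project_dir, python=sys.executable):
    """Return (command, working directory) pairs, one per module."""
    project = Path(project_dir)
    return [([python, script], project / folder / "taq_algorithms")
            for folder, script in PIPELINE]


def run_pipeline(project_dir):
    """Run every module from inside its own taq_algorithms folder."""
    for command, cwd in build_commands(project_dir):
        # check=True stops the pipeline at the first failing module.
        subprocess.run(command, cwd=cwd, check=True)
```

Calling run_pipeline("financial_response_spread_year/project") would then be equivalent to running the commands above one after another. Since the trade time scale analysis is slow, its entry can be removed from the list if it is not needed.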
A complete explanation of this work can be found in this paper. In general, the response functions are expected to increase to a maximum and then decrease slowly.
In the time shift analysis, a change in the relative position between returns and trade signs can make the response function signal vanish.
Dividing the time lag used in the returns, we obtain a short and a long response function, where the short component has a larger impact than the long component.
Finally, the spread directly impacts the strength of the price response functions. Liquid stocks have smaller price responses.
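As a rough illustration of what a self-response function measures, the following sketch assumes the definition commonly used in the price response literature, R(tau) = <r(t-1, tau) * eps(t)>, where r(t, tau) = (m(t + tau) - m(t)) / m(t) is the return of the midpoint price m and eps is the trade sign. The function name, the input arrays, and the convention that a trade sign of 0 marks a time step without trades are assumptions made for this example. It is not the implementation used in the repository.

```python
import numpy as np


def self_response(midpoint, trade_signs, tau):
    """Average of r(t-1, tau) * eps(t) over time steps with trades.

    Hypothetical helper: midpoint and trade_signs are equally long
    arrays on the same time grid, and tau >= 1 is the time lag.
    """
    midpoint = np.asarray(midpoint, dtype=float)
    trade_signs = np.asarray(trade_signs, dtype=float)
    n = len(midpoint)
    m_start = midpoint[:n - tau]           # m(t - 1) for t = 1 .. n - tau
    m_end = midpoint[tau:]                 # m(t - 1 + tau)
    returns = (m_end - m_start) / m_start  # r(t - 1, tau)
    signs = trade_signs[1:n - tau + 1]     # eps(t)
    # Assumption: a sign of 0 means no trade, so that step is skipped.
    mask = signs != 0
    if not mask.any():
        return float("nan")
    return float(np.mean(returns[mask] * signs[mask]))
```

Evaluating the function for a range of tau values gives one response curve per stock, which is the kind of curve whose rise and slow decay is described above.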
- Juan Camilo Henao Londono - Initial work, repository, paper - Website
- Sebastian M. Krause - Paper
- Thomas Guhr - Paper
- Research Group Guhr - Website
- DAAD Research Grants - Doctoral Programmes in Germany