Merge branch 'main' into issue-804-thread-multisim
calina-c committed Apr 24, 2024
2 parents 8cbb6a9 + eb083e8 commit 848a8d7
Showing 41 changed files with 1,260 additions and 948 deletions.
23 changes: 13 additions & 10 deletions READMEs/predictoor.md
@@ -59,9 +59,6 @@ codesign --force --deep --sign - venv/sapphirepy_bin/sapphirewrapper-arm64.dylib

## 2. Simulate Modeling and Trading

> [!WARNING]
> Simulation has been temporarily disabled as of version v0.3.3
Simulation lets us quickly build intuition and assess the performance of the data / predicting / trading strategy (backtest).

Copy [`ppss.yaml`](../ppss.yaml) into your own file `my_ppss.yaml` and change parameters as you see fit.
@@ -70,22 +67,29 @@ Copy [`ppss.yaml`](../ppss.yaml) into your own file `my_ppss.yaml` and change pa
cp ppss.yaml my_ppss.yaml
```

Let's simulate! In console:

Let's run the simulation engine. In console:
```console
pdr sim my_ppss.yaml
```

What it does:

What the engine does:
1. Set simulation parameters.
1. Grab historical price data from exchanges and store it in the `parquet_data/` dir. It re-uses any previously saved data.
1. Run through many 5min epochs. At each epoch:
- Build a model
- Predict
- Trade
- Plot profit versus time, more
- Log to console and `logs/out_<time>.txt`
- For plots, output state to `sim_state/`
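The per-epoch loop above can be sketched in a few lines of Python. This is an illustrative toy only, not the actual pdr-backend engine: the "model" is naive momentum, the "trade" scores one unit of profit per correct call, and all names (`EpochResult`, `run_sim`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EpochResult:
    epoch: int
    pred_up: bool   # did we predict the next price move up?
    profit: float   # toy profit from acting on that prediction


def run_sim(prices: List[float]) -> List[EpochResult]:
    """Loop over epochs: 'build a model', predict, 'trade', record profit."""
    results = []
    for t in range(1, len(prices) - 1):
        # "model" + predict: naive momentum - expect the last move to continue
        pred_up = prices[t] > prices[t - 1]
        # "trade": unit profit if the call was right, unit loss otherwise
        actual_up = prices[t + 1] > prices[t]
        profit = 1.0 if pred_up == actual_up else -1.0
        results.append(EpochResult(t, pred_up, profit))
    return results
```

In the real engine each epoch also persists state to `sim_state/` for the plotting app; this sketch only shows the control flow.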

Let's visualize results. Open a separate console, and:
```console
cd ~/code/pdr-backend # or wherever your pdr-backend dir is
source venv/bin/activate

# display real-time plots of the simulation
streamlit run sim_plots.py
```

"Predict" actions are _two-sided_: it does one "up" prediction tx, and one "down" tx, with more stake to the higher-confidence direction. Two-sided is more profitable than one-sided prediction.

@@ -97,10 +101,9 @@ To see simulation CLI options: `pdr sim -h`.

Simulation uses Python [logging](https://docs.python.org/3/howto/logging.html) framework. Configure it via [`logging.yaml`](../logging.yaml). [Here's](https://medium.com/@cyberdud3/a-step-by-step-guide-to-configuring-python-logging-with-yaml-files-914baea5a0e5) a tutorial on yaml settings.

To plot profit versus time and more, use `streamlit run sim_plots.py` to display real-time plots while the simulation is running. After the final iteration, the app settles into an overview of the final state.

By default, streamlit plots the latest sim (even if it is still running). The sim engine also assigns a unique id to each run, so you can plot a specific run instead, e.g. if you used multisim or manually triggered different simulations: find the run's unique id in the `sim_state` folder, then run `streamlit run sim_plots.py <unique_id>`, e.g. `streamlit run sim_plots.py 97f9633c-a78c-4865-9cc6-b5152c9500a3`.

You can run many instances of streamlit at once, with different URLs.

## 3. Run Predictoor Bot on Sapphire Testnet
19 changes: 13 additions & 6 deletions READMEs/trader.md
@@ -59,21 +59,29 @@ Copy [`ppss.yaml`](../ppss.yaml) into your own file `my_ppss.yaml` and change pa
cp ppss.yaml my_ppss.yaml
```

Let's simulate! In console:

Let's run the simulation engine. In console:
```console
pdr sim my_ppss.yaml
```

What it does:

What the engine does:
1. Set simulation parameters.
1. Grab historical price data from exchanges and store it in the `parquet_data/` dir. It re-uses any previously saved data.
1. Run through many 5min epochs. At each epoch:
- Build a model
- Predict
- Trade
- Log to console and `logs/out_<time>.txt`
- For plots, output state to `sim_state/`

Let's visualize results. Open a separate console, and:
```console
cd ~/code/pdr-backend # or wherever your pdr-backend dir is
source venv/bin/activate

# display real-time plots of the simulation
streamlit run sim_plots.py
```

"Predict" actions are _two-sided_: it does one "up" prediction tx, and one "down" tx, with more stake to the higher-confidence direction. Two-sided is more profitable than one-sided prediction.

@@ -85,10 +93,9 @@ To see simulation CLI options: `pdr sim -h`.

Simulation uses Python [logging](https://docs.python.org/3/howto/logging.html) framework. Configure it via [`logging.yaml`](../logging.yaml). [Here's](https://medium.com/@cyberdud3/a-step-by-step-guide-to-configuring-python-logging-with-yaml-files-914baea5a0e5) a tutorial on yaml settings.

To plot profit versus time and more, use `streamlit run sim_plots.py` to display real-time plots while the simulation is running. After the final iteration, the app settles into an overview of the final state.

By default, streamlit plots the latest sim (even if it is still running). The sim engine also assigns a unique id to each run, so you can plot a specific run instead, e.g. if you used multisim or manually triggered different simulations: find the run's unique id in the `sim_state` folder, then run `streamlit run sim_plots.py <unique_id>`, e.g. `streamlit run sim_plots.py 97f9633c-a78c-4865-9cc6-b5152c9500a3`.

You can run many instances of streamlit at once, with different URLs.

## Run Trader Bot on Sapphire Testnet
13 changes: 11 additions & 2 deletions READMEs/vps.md
@@ -242,10 +242,19 @@ In `my_ppss.yaml` file, in `web3_pp` -> `development` section:

### Run pdr bot

Then, run a bot with modeling-on-the-fly (approach 3). In console:
Then, run a bot with modeling-on-the-fly (approach 2). In console:

```console
pdr predictoor 3 my_ppss.yaml development
pdr predictoor 2 my_ppss.yaml development
```

Or, to be fancier: (a) add `nohup` so the run keeps going if the ssh session closes, (b) redirect output to `out.txt`, and (c) observe the output:
```console
# start bot
nohup pdr predictoor 2 my_ppss.yaml development 1>out.txt 2>&1 &

# observe output
tail -f out.txt
```

Your bot is running, congrats! Sit back and watch it in action. It will loop continuously.
35 changes: 20 additions & 15 deletions pdr_backend/aimodel/aimodel_data_factory.py
@@ -1,6 +1,6 @@
import logging
import sys
from typing import Optional, Tuple
from typing import List, Optional, Tuple

import numpy as np
import pandas as pd
@@ -68,8 +68,8 @@ def create_xy(
self,
mergedohlcv_df: pl.DataFrame,
testshift: int,
feed: ArgFeed,
feeds: Optional[ArgFeeds] = None,
predict_feed: ArgFeed,
train_feeds: Optional[ArgFeeds] = None,
do_fill_nans: bool = True,
) -> Tuple[np.ndarray, np.ndarray, pd.DataFrame, np.ndarray]:
"""
@@ -80,6 +80,8 @@ def create_xy(
@arguments
mergedohlcv_df -- *polars* DataFrame. See class docstring
testshift -- to simulate across historical test data
predict_feed -- feed to predict
train_feeds -- feeds to use for model inputs. If None, use predict_feed
do_fill_nans -- if any values are nan, fill them? (Via interpolation)
If you turn this off and mergedohlcv_df has nans, then X/y/etc gets nans
@@ -94,27 +96,30 @@ def create_xy(
assert "timestamp" in mergedohlcv_df.columns
assert "datetime" not in mergedohlcv_df.columns

# every column should be ordered with oldest first, youngest last.
# let's verify! The timestamps should be in ascending order
# condition mergedohlcv_df
# - every column should be ordered with oldest first, youngest last.
# let's verify! The timestamps should be in ascending order
uts = mergedohlcv_df["timestamp"].to_list()
assert uts == sorted(uts, reverse=False)

# condition inputs
if do_fill_nans and has_nan(mergedohlcv_df):
mergedohlcv_df = fill_nans(mergedohlcv_df)
ss = self.ss.aimodel_ss
x_dim_len = 0
if not feeds:
x_dim_len = ss.n
feeds = ss.feeds

# condition other inputs
train_feeds_list: List[ArgFeed]
if train_feeds:
train_feeds_list = train_feeds
else:
x_dim_len = len(feeds) * ss.autoregressive_n
train_feeds_list = [predict_feed]
ss = self.ss.aimodel_ss
x_dim_len = len(train_feeds_list) * ss.autoregressive_n

# main work
x_df = pd.DataFrame() # build this up
xrecent_df = pd.DataFrame() # ""

target_hist_cols = [
f"{feed.exchange}:{feed.pair}:{feed.signal}" for feed in feeds
f"{train_feed.exchange}:{train_feed.pair}:{train_feed.signal}"
for train_feed in train_feeds_list
]
for hist_col in target_hist_cols:
assert hist_col in mergedohlcv_df.columns, f"missing data col: {hist_col}"
@@ -146,7 +151,7 @@ def create_xy(

# y is set from yval_{exch_str, signal_str, pair_str}
# eg y = [BinEthC_-1, BinEthC_-2, ..., BinEthC_-450, BinEthC_-451]
hist_col = f"{feed.exchange}:{feed.pair}:{feed.signal}"
hist_col = f"{predict_feed.exchange}:{predict_feed.pair}:{predict_feed.signal}"
z = mergedohlcv_df[hist_col].to_list()
y = np.array(_slice(z, -testshift - N_train - 1, -testshift))
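The core change in this hunk, renaming `feeds` to `train_feeds` and defaulting the training feeds to the predict feed, can be summarized in a standalone sketch. The `Feed` dataclass and `resolve_train_feeds` helper here are illustrative stand-ins, not the actual pdr-backend API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Feed:
    exchange: str
    pair: str
    signal: str


def resolve_train_feeds(
    predict_feed: Feed,
    train_feeds: Optional[List[Feed]],
    autoregressive_n: int,
) -> Tuple[List[Feed], int, List[str]]:
    """If no train_feeds are given, the predict feed doubles as the single
    training feed. Model input width = n_feeds * autoregressive_n, and each
    feed maps to an 'exchange:pair:signal' history column."""
    train_feeds_list = train_feeds if train_feeds else [predict_feed]
    x_dim_len = len(train_feeds_list) * autoregressive_n
    cols = [f"{f.exchange}:{f.pair}:{f.signal}" for f in train_feeds_list]
    return train_feeds_list, x_dim_len, cols
```

Note that `y` is always built from `predict_feed`'s column, even when `train_feeds` lists other feeds for the model inputs.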

