#701 - Predictoor Agent with PredSubmitterManager (#877)

* #701 - Update PredictoorSS to include prediction feeds and the data sources for training these feeds (#828)
  * Trimming apart old implementation
  * Make predict_feed an array
  * Rename to predict_feeds
  * MultiFeedMixin
  * Update key to feeds
  * Update structure
  * Update verify_feed_dependencies for multiple feeds
  * Create PredictFeedMixin
  * PredictFeedMixin test
  * parse_feed_obj func and fixes
  * test_parse_feed_obj and multi predict feeds
  * pred_submitter_mgr property
  * Convert PredictoorSS to PredictFeedMixin
  * Formatting
  * Update predictoor_ss_test_dict
  * Add prepare_stakes function
  * Break on cutoff
  * Update check balances to check contracts
  * Load pred_submitter_mgr_addr
  * Update take_step and remove unused funcs
  * Add PredictFeed and PredictFeeds classes
  * add feeds property
  * Update predictoor_ss_test_dict
  * Update predictoor ss tests
  * convert to str
  * Add feeds_str and fixes
  * Update keys
  * Black
  * Fixes
  * Black
  * Enforce typing and fixes
  * Type fixes
  * Type fixes
  * Fix & update ppss tests
  * Formatting
  * Feeds structure
  * Remove removed functions tests
  * enable assert attributes
  * Use list
  * Set submitter addr
  * Update sim engine
  * Update sim engine test
  * Update feed list
  * Update AI model factory tests
  * Use PredictFeeds
  * Return PredictFeeds
  * Use argfeed
  * BLack
  * PredictFeed update to have 1 -> N
  * Fix test
  * feeds_list func for comparison
  * Single predict feed
  * Enforce types
  * Usel ist
  * Set self.feeds and remove extra properties
  * Call to_list
  * Fix tests for single predict feed
  * Black
  * Update tests for single pred feed and structure
  * Add feeds property back
  * Update tests to support new structure and single pred feed
  * Fix check network tests
  * Fix mock args
  * Black
  * Replace all, arg fix
  * Fix args
  * Fix tests
  * Update dict
  * Update dict
  * Black
  * create_xy on given feeds
  * Add filter_feeds_from_candidates to mixin
  * Fallback to ss.feeds
  * Formatting
  * Linter
  * Linter
* Revert "#701 - Update PredictoorSS to include prediction feeds and the data s…" (#878)
  This reverts commit ac6899b.
* #701 - Multi Feed Predictions and PredSubmitter in Predictoor Agent (#855)
  * Trimming apart old implementation
  * Make predict_feed an array
  * Rename to predict_feeds
  * MultiFeedMixin
  * Update key to feeds
  * Update structure
  * Update verify_feed_dependencies for multiple feeds
  * Create PredictFeedMixin
  * PredictFeedMixin test
  * parse_feed_obj func and fixes
  * test_parse_feed_obj and multi predict feeds
  * pred_submitter_mgr property
  * Convert PredictoorSS to PredictFeedMixin
  * Formatting
  * Update predictoor_ss_test_dict
  * Add prepare_stakes function
  * Break on cutoff
  * Update check balances to check contracts
  * Load pred_submitter_mgr_addr
  * Update take_step and remove unused funcs
  * Add PredictFeed and PredictFeeds classes
  * add feeds property
  * Update predictoor_ss_test_dict
  * Update predictoor ss tests
  * convert to str
  * Add feeds_str and fixes
  * Update keys
  * Black
  * Fixes
  * Black
  * Enforce typing and fixes
  * Type fixes
  * Type fixes
  * Fix & update ppss tests
  * Formatting
  * Feeds structure
  * Remove removed functions tests
  * enable assert attributes
  * Use list
  * Set submitter addr
  * Update sim engine
  * Update sim engine test
  * Update feed list
  * Update AI model factory tests
  * Use PredictFeeds
  * Return PredictFeeds
  * Use argfeed
  * BLack
  * PredictFeed update to have 1 -> N
  * Fix test
  * feeds_list func for comparison
  * Single predict feed
  * Enforce types
  * Usel ist
  * Set self.feeds and remove extra properties
  * Call to_list
  * Fix tests for single predict feed
  * Black
  * Update tests for single pred feed and structure
  * Add feeds property back
  * Update tests to support new structure and single pred feed
  * Fix check network tests
  * Fix mock args
  * Black
  * Replace all, arg fix
  * Fix args
  * Fix tests
  * Update dict
  * Update dict
  * Black
  * create_xy on given feeds
  * Add filter_feeds_from_candidates to mixin
  * Fallback to ss.feeds
  * Formatting
  * Linter
  * Linter
  * Add missing import
  * Add get_predict_feed func
  * Add PredictionSlotsData class
  * Multi slot support
  * Fixes, ready to predict
  * Argpair attributes
  * Payout function and todos
  * Todo sim engine
  * Add minimum_timeframe_seconds func
  * Add get_min_epoch_s_left to predictoor agent
  * Fix balance check test
  * Fix empty init test
  * Add mock functions
  * Fix test_predictoor_agent_calc_stakes2_1feed test
  * Move pred_submitter_mgr to ganache conftest
  * Pass in pred_submitter_mgr
  * Pass in pred_submitter_mgr between mocks
  * pred_submitter_mgr test dict
  * Fix test_predictoor_agent_calc_stakes2_2feeds test
  * Make optional
  * Fix tests
  * Chain id attr
  * Comment out sanity checks
  * Update wait_for_transaction_receipt mock
  * Black formatting
  * Add pair_str
  * Update sim engine for single predict feed
  * Update sim engine tests
  * Update sim engine to include feed in MultisimEngine initialization
  * Formatting
  * Update sim engine to include feed in MultisimEngine initialization
  * Formatting
  * Resolve linter issues
  * Implementing unique epoch
  * Add min_epoch_seconds func
  * Remove removed vars
  * Black Formatting
  * Linter
  * Final touchups
  * Rename to get
  * Todo
  * Fix system test predictoor
  * Formatting
  * fix mypy errors
  * Transfer automatically
  * Compile
  * Approve tokens
  * Fix token transfer issue in PredSubmitterMgr.sol
  * Compiled contract
  * Update test
  * Black
  * Add info
  * Fix .exchange
  * Fix order
  * make it more explicit
  * Update predictoor docs
  * Update cli help
  * Update
  * Update docs
  * Remove unused variable
  * Use logger.info
  * Test PredictFeed and parse_feed_obj
  * Add tests for PredictFeeds
  * Formatting
  * Remove cutoff seconds
  * Remove cut_off
  * Update mock func name
  * Formatting
  * Resolve linter issues and bugs
  * Update todo comment
  * Linter
  * Remove types
  * Formatting
  * Improve docs
  * Quotes
  * Remove 1s
  * Add log message
  * Fix typo
  * Fix assert
  * Fix typo
  * Refactor aimodel_data_factory.py to handle feeds and x_dim_len dynamically
oceanprotocol · Apr 16, 2024 · 96c0da8
1 parent dbb8cca
Showing 43 changed files with 1,079 additions and 492 deletions.
diff --git a/READMEs/predictoor.md b/READMEs/predictoor.md
@@ -48,7 +48,7 @@ You need a local copy of Ocean contract addresses [`address.json`](https://githu
 mkdir -p ~/.ocean; mkdir -p ~/.ocean/ocean-contracts; mkdir -p ~/.ocean/ocean-contracts/artifacts/
 
 # copy from github to local directory. Or, use wget if Linux. Or, download via browser.
-curl https://github.com/oceanprotocol/contracts/blob/main/addresses/address.json -o ~/.ocean/ocean-contracts/artifacts/address.json
+curl https://raw.githubusercontent.com/oceanprotocol/contracts/main/addresses/address.json -o ~/.ocean/ocean-contracts/artifacts/address.json
 ```
 
 If you're running MacOS, then in console:
@@ -59,6 +59,9 @@ codesign --force --deep --sign - venv/sapphirepy_bin/sapphirewrapper-arm64.dylib
 
 ## 2. Simulate Modeling and Trading
 
+> [!WARNING]  
+> Simulation has been temporarily disabled as of version v0.3.3
+
 Simulation allows us to quickly build intuition, and assess the performance of the data / predicting / trading strategy (backtest).
 
 Copy [`ppss.yaml`](../ppss.yaml) into your own file `my_ppss.yaml` and change parameters as you see fit.
@@ -106,18 +109,44 @@ Predictoor contracts run on [Oasis Sapphire](https://docs.oasis.io/dapp/sapphire
 
 Let's get our predictoor bot running on testnet first.
 
-The bot does two-sided predictions, like in simulation. This also means it needs two Ethereum accounts, with keys PRIVATE_KEY and PRIVATE_KEY2.
+The bot does two-sided predictions, like in simulation.
 
 First, tokens! You need (fake) ROSE to pay for gas, and (fake) OCEAN to stake and earn, for both accounts. [Get them here](testnet-faucet.md).
 
 Then, copy & paste your private keys as envvars. In console:
 
 ```console
-export PRIVATE_KEY=<YOUR_PRIVATE_KEY 1>
-export PRIVATE_KEY2=<YOUR_PRIVATE_KEY 2>
+export PRIVATE_KEY=<YOUR_PRIVATE_KEY>
+```
+
+### Deploy the Prediction Submitter Manager
+
+The Prediction Submitter Manager is a smart contract that can submit predictions for multiple pairs and both sides in a single transaction. The Predictoor agent submits its predictions through this contract, so it must be deployed first. To deploy the contract, run:
+
 ```
+pdr deploy_pred_submitter_mgr my_ppss.yaml sapphire-testnet
+```
+
+Copy [`ppss.yaml`](../ppss.yaml) into your own file `my_ppss.yaml`.
+
+```console
+cp ppss.yaml my_ppss.yaml
+```
+
+#### Update YAML config with the contract address
+
+Next, update `my_ppss.yaml` and input the contract address in place of `predictoor_ss.pred_submitter_mgr`:
+
+```
+predictoor_ss:
+  ...
+  pred_submitter_mgr: "CONTRACT_ADDRESS"
+  ...
+```
+
+Update the rest of the config as desired.
 
-Next, update `my_ppss.yaml` as desired.
+### Running the bot
 
 Then, run a bot with modeling-on-the fly (approach 2). In console:
 
@@ -150,11 +179,10 @@ First, real tokens! Get [ROSE via this guide](get-rose-on-sapphire.md) and [OCEA
 Then, copy & paste your private keys as envvars. (You can skip this if keys are same as testnet.) In console:
 
 ```console
-export PRIVATE_KEY=<YOUR_PRIVATE_KEY 1>
-export PRIVATE_KEY2=<YOUR_PRIVATE_KEY 2>
+export PRIVATE_KEY=<YOUR_PRIVATE_KEY>
 ```
 
-Update `my_ppss.yaml` as desired.
+Follow the same steps as in [Deploy the Prediction Submitter Manager](#deploy-the-prediction-submitter-manager), and make sure to update `pred_submitter_mgr` in `my_ppss.yaml`. Update the rest of the config as desired.
 
 Then, run the bot. In console:
 

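Review note on the README change above: the bot now depends on a deployed contract address under `predictoor_ss.pred_submitter_mgr`, and the README's placeholder value is the literal string `"CONTRACT_ADDRESS"`. A minimal pre-flight check might look like the sketch below. It assumes the config has already been loaded into a plain dict and that the address is a 0x-prefixed, 20-byte hex string, as on EVM-compatible chains. `check_pred_submitter_mgr` is a hypothetical helper, not a function in `pdr_backend`.

```python
import re

# Assumption: EVM-style address, "0x" followed by 40 hex characters
_ADDR_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")


def check_pred_submitter_mgr(ppss: dict) -> str:
    """Return the configured address, or raise ValueError if missing or malformed."""
    addr = ppss.get("predictoor_ss", {}).get("pred_submitter_mgr")
    if not isinstance(addr, str) or not _ADDR_RE.match(addr):
        raise ValueError(
            f"predictoor_ss.pred_submitter_mgr looks unset or malformed: {addr!r}"
        )
    return addr


# Hypothetical config fragment, shaped like my_ppss.yaml after loading
ppss = {"predictoor_ss": {"pred_submitter_mgr": "0x" + "ab" * 20}}
print(check_pred_submitter_mgr(ppss))
```

A check like this would catch the placeholder being left in place before any transaction is attempted.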
diff --git a/pdr_backend/aimodel/aimodel_data_factory.py b/pdr_backend/aimodel/aimodel_data_factory.py
@@ -1,12 +1,15 @@
 import logging
 import sys
-from typing import Tuple
+from typing import Optional, Tuple
 
 import numpy as np
 import pandas as pd
 import polars as pl
+
 from enforce_typing import enforce_types
 
+from pdr_backend.cli.arg_feed import ArgFeed
+from pdr_backend.cli.arg_feeds import ArgFeeds
 from pdr_backend.ppss.predictoor_ss import PredictoorSS
 from pdr_backend.util.mathutil import fill_nans, has_nan
 
@@ -65,6 +68,8 @@ def create_xy(
         self,
         mergedohlcv_df: pl.DataFrame,
         testshift: int,
+        feed: ArgFeed,
+        feeds: Optional[ArgFeeds] = None,
         do_fill_nans: bool = True,
     ) -> Tuple[np.ndarray, np.ndarray, pd.DataFrame, np.ndarray]:
         """
@@ -98,15 +103,19 @@ def create_xy(
         if do_fill_nans and has_nan(mergedohlcv_df):
             mergedohlcv_df = fill_nans(mergedohlcv_df)
         ss = self.ss.aimodel_ss
-
+        x_dim_len = 0
+        if not feeds:
+            x_dim_len = ss.n
+            feeds = ss.feeds
+        else:
+            x_dim_len = len(feeds) * ss.autoregressive_n
         # main work
         x_df = pd.DataFrame()  # build this up
         xrecent_df = pd.DataFrame()  # ""
 
         target_hist_cols = [
-            f"{feed.exchange}:{feed.pair}:{feed.signal}" for feed in ss.feeds
+            f"{feed.exchange}:{feed.pair}:{feed.signal}" for feed in feeds
         ]
-
         for hist_col in target_hist_cols:
             assert hist_col in mergedohlcv_df.columns, f"missing data col: {hist_col}"
             z = mergedohlcv_df[hist_col].to_list()  # [..., z(t-2), z(t-1)]
@@ -137,15 +146,14 @@ def create_xy(
 
         # y is set from yval_{exch_str, signal_str, pair_str}
         # eg y = [BinEthC_-1, BinEthC_-2, ..., BinEthC_-450, BinEthC_-451]
-        ref_ss = self.ss
-        hist_col = f"{ref_ss.exchange_str}:{ref_ss.pair_str}:{ref_ss.signal_str}"
+        hist_col = f"{feed.exchange}:{feed.pair}:{feed.signal}"
         z = mergedohlcv_df[hist_col].to_list()
         y = np.array(_slice(z, -testshift - N_train - 1, -testshift))
 
         # postconditions
         assert X.shape[0] == y.shape[0]
         assert X.shape[0] <= (ss.max_n_train + 1)
-        assert X.shape[1] == ss.n
+        assert X.shape[1] == x_dim_len
         assert isinstance(x_df, pd.DataFrame)
 
         assert "timestamp" not in x_df.columns
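Review note on the `create_xy()` change above: the new `feeds` argument changes both which columns are read and which width the postcondition expects. The fallback can be isolated as in the sketch below. This is a simplified stand-in that takes `ss.feeds`, `ss.n`, and `ss.autoregressive_n` as plain arguments; `resolve_feeds_and_x_dim` is a hypothetical name, and feeds are represented as column strings rather than `ArgFeed` objects.

```python
from typing import List, Optional, Sequence, Tuple


def resolve_feeds_and_x_dim(
    ss_feeds: Sequence[str],
    ss_n: int,
    autoregressive_n: int,
    feeds: Optional[Sequence[str]] = None,
) -> Tuple[List[str], int]:
    """Mirror the fallback in create_xy(): use the override if given, else the ss defaults."""
    if not feeds:
        # No override (None or empty): keep the configured feeds and width ss.n
        return list(ss_feeds), ss_n
    # Override: one block of autoregressive_n lagged columns per feed
    return list(feeds), len(feeds) * autoregressive_n


# Hypothetical values for illustration only
ss_feeds = ["binanceus:BTC/USDT:high", "binanceus:ETH/USDT:high"]
default = resolve_feeds_and_x_dim(ss_feeds, ss_n=6, autoregressive_n=3)
override = resolve_feeds_and_x_dim(ss_feeds, 6, 3, feeds=["binanceus:ETH/USDT:high"])
print(default[1], override[1])  # 6 3
```

Because the diff tests `if not feeds`, an empty override behaves the same as `None` and silently falls back to `ss.feeds`. That may be intended, but it is worth keeping in mind when a caller filters a feed list down to nothing.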

diff --git a/pdr_backend/aimodel/test/test_aimodel_data_factory.py b/pdr_backend/aimodel/test/test_aimodel_data_factory.py
@@ -1,10 +1,12 @@
-import numpy as np
-from numpy.testing import assert_array_equal
 import pandas as pd
 import polars as pl
 import pytest
+import numpy as np
+
+from numpy.testing import assert_array_equal
 from enforce_typing import enforce_types
 
+from pdr_backend.cli.predict_feeds import PredictFeeds
 from pdr_backend.aimodel.aimodel_data_factory import AimodelDataFactory
 from pdr_backend.lake.merge_df import merge_rawohlcv_dfs
 from pdr_backend.lake.test.resources import (
@@ -33,8 +35,11 @@ def test_ycont_to_ytrue():
 
 @enforce_types
 def test_create_xy__0():
+    predict_feeds = [
+        {"predict": "binanceus ETH/USDT c 5m", "train_on": "binanceus ETH/USDT c 5m"}
+    ]
     d = predictoor_ss_test_dict(
-        predict_feed="binanceus ETH/USDT c 5m",
+        predict_feeds=predict_feeds,
         input_feeds=["binanceus ETH/USDT oc"],
     )
     d["aimodel_ss"]["max_n_train"] = 4
@@ -73,7 +78,9 @@ def test_create_xy__0():
     factory = AimodelDataFactory(predictoor_ss)
 
     target_y = np.array([5.3, 6.4, 7.5, 8.6, 9.7])  # oldest to newest
-    X, y, x_df, xrecent = factory.create_xy(mergedohlcv_df, testshift=0)
+    X, y, x_df, xrecent = factory.create_xy(
+        mergedohlcv_df, testshift=0, feed=predictoor_ss.feeds.feeds[0]
+    )
     _assert_pd_df_shape(predictoor_ss.aimodel_ss, X, y, x_df)
     assert_array_equal(X, target_X)
     assert_array_equal(y, target_y)
@@ -83,9 +90,12 @@ def test_create_xy__0():
 
 @enforce_types
 def test_create_xy_reg__1exchange_1coin_1signal():
-    d = predictoor_ss_test_dict("binanceus ETH/USDT h 5m")
-    ss = PredictoorSS(d)
-    aimodel_data_factory = AimodelDataFactory(ss)
+    predict_feeds = [
+        {"predict": "binanceus ETH/USDT h 5m", "train_on": "binanceus ETH/USDT h 5m"}
+    ]
+    d = predictoor_ss_test_dict(predict_feeds)
+    predictoor_ss = PredictoorSS(d)
+    aimodel_data_factory = AimodelDataFactory(predictoor_ss)
     mergedohlcv_df = merge_rawohlcv_dfs(ETHUSDT_RAWOHLCV_DFS)
 
     # =========== have testshift = 0
@@ -123,9 +133,11 @@ def test_create_xy_reg__1exchange_1coin_1signal():
     )
     target_xrecent = np.array([3.0, 2.0, 1.0])
 
-    X, y, x_df, xrecent = aimodel_data_factory.create_xy(mergedohlcv_df, testshift=0)
+    X, y, x_df, xrecent = aimodel_data_factory.create_xy(
+        mergedohlcv_df, testshift=0, feed=predictoor_ss.feeds.feeds[0]
+    )
 
-    _assert_pd_df_shape(ss.aimodel_ss, X, y, x_df)
+    _assert_pd_df_shape(predictoor_ss.aimodel_ss, X, y, x_df)
     assert_array_equal(X, target_X)
     assert_array_equal(y, target_y)
     assert x_df.equals(target_x_df)
@@ -165,9 +177,11 @@ def test_create_xy_reg__1exchange_1coin_1signal():
     )
     target_xrecent = np.array([4.0, 3.0, 2.0])
 
-    X, y, x_df, xrecent = aimodel_data_factory.create_xy(mergedohlcv_df, testshift=1)
+    X, y, x_df, xrecent = aimodel_data_factory.create_xy(
+        mergedohlcv_df, testshift=1, feed=predictoor_ss.feeds.feeds[0]
+    )
 
-    _assert_pd_df_shape(ss.aimodel_ss, X, y, x_df)
+    _assert_pd_df_shape(predictoor_ss.aimodel_ss, X, y, x_df)
     assert_array_equal(X, target_X)
     assert_array_equal(y, target_y)
     assert x_df.equals(target_x_df)
@@ -193,12 +207,14 @@ def test_create_xy_reg__1exchange_1coin_1signal():
         }
     )
 
-    assert "max_n_train" in ss.aimodel_ss.d
-    ss.aimodel_ss.d["max_n_train"] = 5
+    assert "max_n_train" in predictoor_ss.aimodel_ss.d
+    predictoor_ss.aimodel_ss.d["max_n_train"] = 5
 
-    X, y, x_df, _ = aimodel_data_factory.create_xy(mergedohlcv_df, testshift=0)
+    X, y, x_df, _ = aimodel_data_factory.create_xy(
+        mergedohlcv_df, testshift=0, feed=predictoor_ss.feeds.feeds[0]
+    )
 
-    _assert_pd_df_shape(ss.aimodel_ss, X, y, x_df)
+    _assert_pd_df_shape(predictoor_ss.aimodel_ss, X, y, x_df)
     assert_array_equal(X, target_X)
     assert_array_equal(y, target_y)
     assert x_df.equals(target_x_df)
@@ -218,9 +234,16 @@ def test_create_xy_reg__2exchanges_2coins_2signals():
     }
 
     d = predictoor_ss_test_dict()
-    assert "predict_feed" in d
+    assert "feeds" in d
     assert "input_feeds" in d["aimodel_ss"]
-    d["predict_feed"] = "binanceus ETH/USDT h 5m"
+    d["predict_feed"] = PredictFeeds.from_array(
+        [
+            {
+                "predict": "binanceus ETH/USDT h 5m",
+                "train_on": "binanceus ETH/USDT h 5m",
+            }
+        ]
+    )
     d["aimodel_ss"]["input_feeds"] = [
         "binanceus BTC/USDT,ETH/USDT hl",
         "kraken BTC/USDT,ETH/USDT hl",
@@ -233,7 +256,9 @@ def test_create_xy_reg__2exchanges_2coins_2signals():
     mergedohlcv_df = merge_rawohlcv_dfs(rawohlcv_dfs)
 
     aimodel_data_factory = AimodelDataFactory(ss)
-    X, y, x_df, _ = aimodel_data_factory.create_xy(mergedohlcv_df, testshift=0)
+    X, y, x_df, _ = aimodel_data_factory.create_xy(
+        mergedohlcv_df, testshift=0, feed=d["predict_feed"].feeds[0]
+    )
 
     _assert_pd_df_shape(ss.aimodel_ss, X, y, x_df)
     found_cols = x_df.columns.tolist()
@@ -272,7 +297,7 @@ def test_create_xy_reg__2exchanges_2coins_2signals():
         "binanceus:ETH/USDT:high:t-2",
     ]
     Xa = X[:, 3:6]
-    assert Xa[-1, :].tolist() == [4, 3, 2] and y[-1] == 1
+    assert Xa[-1, :].tolist() == [4.0, 3.0, 2.0] and y[-1] == 1
     assert Xa[-2, :].tolist() == [5, 4, 3] and y[-2] == 2
     assert Xa[0, :].tolist() == [11, 10, 9] and y[0] == 8
 
@@ -302,13 +327,14 @@ def test_create_xy_reg__check_timestamp_order():
     assert uts == sorted(uts, reverse=False)
 
     # happy path
-    factory.create_xy(mergedohlcv_df, testshift=0)
+    feed = factory.ss.feeds[0]
+    factory.create_xy(mergedohlcv_df, testshift=0, feed=feed.predict)
 
     # failure path
     bad_uts = sorted(uts, reverse=True)  # bad order
     bad_mergedohlcv_df = mergedohlcv_df.with_columns(pl.Series("timestamp", bad_uts))
     with pytest.raises(AssertionError):
-        factory.create_xy(bad_mergedohlcv_df, testshift=0)
+        factory.create_xy(bad_mergedohlcv_df, testshift=0, feed=feed.predict)
 
 
 @enforce_types
@@ -319,19 +345,25 @@ def test_create_xy_reg__input_type():
     assert isinstance(aimodel_data_factory, AimodelDataFactory)
 
     # create_xy() input should be pl
-    aimodel_data_factory.create_xy(mergedohlcv_df, testshift=0)
+    feed = aimodel_data_factory.ss.feeds[0]
+    aimodel_data_factory.create_xy(mergedohlcv_df, testshift=0, feed=feed.predict)
 
     # create_xy() inputs shouldn't be pd
     with pytest.raises(AssertionError):
-        aimodel_data_factory.create_xy(mergedohlcv_df.to_pandas(), testshift=0)
+        aimodel_data_factory.create_xy(
+            mergedohlcv_df.to_pandas(), testshift=0, feed=feed.predict
+        )
 
 
 @enforce_types
 def test_create_xy_reg__handle_nan():
     # create mergedohlcv_df
-    d = predictoor_ss_test_dict("binanceus ETH/USDT h 5m")
-    ss = PredictoorSS(d)
-    aimodel_data_factory = AimodelDataFactory(ss)
+    predict_feeds = [
+        {"predict": "binanceus ETH/USDT h 5m", "train_on": "binanceus ETH/USDT h 5m"}
+    ]
+    d = predictoor_ss_test_dict(predict_feeds)
+    predictoor_ss = PredictoorSS(d)
+    aimodel_data_factory = AimodelDataFactory(predictoor_ss)
     mergedohlcv_df = merge_rawohlcv_dfs(ETHUSDT_RAWOHLCV_DFS)
 
     # initial mergedohlcv_df should be ok
@@ -353,7 +385,10 @@ def test_create_xy_reg__handle_nan():
     # run create_xy() and force the nans to stick around
     # -> we want to ensure that we're building X/y with risk of nan
     X, y, x_df, _ = aimodel_data_factory.create_xy(
-        mergedohlcv_df, testshift=0, do_fill_nans=False
+        mergedohlcv_df,
+        testshift=0,
+        do_fill_nans=False,
+        feed=predictoor_ss.feeds.feeds[0],
     )
     assert has_nan(X) and has_nan(y) and has_nan(x_df)
 
@@ -363,12 +398,17 @@ def test_create_xy_reg__handle_nan():
 
     # nan approach 2: explicitly tell create_xy to fill nans
     X, y, x_df, _ = aimodel_data_factory.create_xy(
-        mergedohlcv_df, testshift=0, do_fill_nans=True
+        mergedohlcv_df,
+        testshift=0,
+        do_fill_nans=True,
+        feed=predictoor_ss.feeds.feeds[0],
     )
     assert not has_nan(X) and not has_nan(y) and not has_nan(x_df)
 
     # nan approach 3: create_xy fills nans by default (best)
-    X, y, x_df, _ = aimodel_data_factory.create_xy(mergedohlcv_df, testshift=0)
+    X, y, x_df, _ = aimodel_data_factory.create_xy(
+        mergedohlcv_df, testshift=0, feed=predictoor_ss.feeds.feeds[0]
+    )
     assert not has_nan(X) and not has_nan(y) and not has_nan(x_df)
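Review note on the test changes above: every test now builds its config from a list of `{"predict": ..., "train_on": ...}` dicts instead of a single `predict_feed` string. The sketch below shows how such an entry breaks down, assuming the simplest case of one exchange, one pair, one signal letter, and one timeframe separated by whitespace. `ParsedFeed` and `parse_feed_str` are hypothetical names for illustration; the real parsing lives in `PredictFeeds` and `parse_feed_obj`, and multi-pair or multi-signal strings such as `"binanceus BTC/USDT,ETH/USDT hl"` are not handled here.

```python
from typing import Dict, List, NamedTuple


class ParsedFeed(NamedTuple):
    exchange: str
    pair: str
    signal: str
    timeframe: str


def parse_feed_str(s: str) -> ParsedFeed:
    """Split 'exchange PAIR signal timeframe' on whitespace (simplified: one pair, one signal)."""
    exchange, pair, signal, timeframe = s.split()
    return ParsedFeed(exchange, pair, signal, timeframe)


# Same shape as the predict_feeds lists used in the updated tests
predict_feeds: List[Dict[str, str]] = [
    {"predict": "binanceus ETH/USDT h 5m", "train_on": "binanceus ETH/USDT h 5m"}
]

for entry in predict_feeds:
    target = parse_feed_str(entry["predict"])
    train = parse_feed_str(entry["train_on"])
    print(target.pair, target.timeframe, "trained on", train.exchange, train.pair)
```

Keeping `predict` and `train_on` as separate keys is what lets the target feed differ from its training inputs, even though every test in this file sets them to the same string.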