From d3527b295260b48f708346103dfe0e23becd71c2 Mon Sep 17 00:00:00 2001 From: Chris Sellers Date: Sun, 15 Oct 2023 15:01:50 +1100 Subject: [PATCH] Update docs --- docs/concepts/advanced/advanced_orders.md | 6 +- docs/concepts/advanced/custom_data.md | 8 +- docs/concepts/advanced/emulated_orders.md | 2 +- .../advanced/synthetic_instruments.md | 4 +- docs/concepts/architecture.md | 2 +- docs/concepts/backtesting.md | 2 +- docs/concepts/data.md | 120 ++++++++++++++++-- docs/concepts/execution.md | 29 ++++- docs/concepts/overview.md | 2 +- nautilus_trader/system/kernel.py | 2 +- 10 files changed, 149 insertions(+), 28 deletions(-) diff --git a/docs/concepts/advanced/advanced_orders.md b/docs/concepts/advanced/advanced_orders.md index 23b35e02bd35..ba4238497d45 100644 --- a/docs/concepts/advanced/advanced_orders.md +++ b/docs/concepts/advanced/advanced_orders.md @@ -19,19 +19,19 @@ specific exchange they are being routed to. These contingency types relate to ContingencyType FIX tag <1385> https://www.onixs.biz/fix-dictionary/5.0.sp2/tagnum_1385.html. ``` -### One Triggers the Other (OTO) +### *'One Triggers the Other'* (OTO) An OTO orders involves two orders—a parent order and a child order. The parent order is a live marketplace order. The child order, held in a separate order file, is not. If the parent order executes in full, the child order is released to the marketplace and becomes live. An OTO order can be made up of stock orders, option orders, or a combination of both. -### One Cancels the Other (OCO) +### *'One Cancels the Other'* (OCO) An OCO order is an order whose execution results in the immediate cancellation of another order linked to it. Cancellation of the Contingent Order happens on a best efforts basis. In an OCO order, both orders are live in the marketplace at the same time. The execution of either order triggers an attempt to cancel the other unexecuted order. Partial executions will also trigger an attempt to cancel the other order. -### One Updates the Other (OUO) +### *'One Updates the Other'* (OUO) An OUO order is an order whose execution results in the immediate reduction of quantity in another order linked to it. The quantity reduction happens on a best effort basis. In an OUO order both orders are live in the marketplace at the same time. The execution of either order triggers an diff --git a/docs/concepts/advanced/custom_data.md b/docs/concepts/advanced/custom_data.md index ff735420c528..83b5ab75351c 100644 --- a/docs/concepts/advanced/custom_data.md +++ b/docs/concepts/advanced/custom_data.md @@ -6,6 +6,10 @@ guide covers some possible use cases for this functionality. It's possible to create custom data types within the Nautilus system. First you will need to define your data by subclassing from `Data`. +```{note} +As `Data` holds no state, it is not strictly necessary to call `super().__init__()`. +``` + ```python from nautilus_trader.core.data import Data @@ -67,10 +71,6 @@ The recommended approach to satisfy the contract is to assign `ts_event` and `ts to backing fields, and then implement the `@property` for each as shown above (for completeness, the docstrings are copied from the `Data` base class). -```{note} -As `Data` holds no state, it is not strictly necessary to call `super().__init__()`. -``` - ```{note} These timestamps are what allow Nautilus to correctly order data streams for backtests by monotonically increasing `ts_init` UNIX nanoseconds. 
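For completeness, the following is a minimal sketch of a user-defined data type following the contract described in the custom data guide above: backing fields assigned in `__init__`, with `ts_event` and `ts_init` exposed as read-only properties. The `MyDataPoint` name and its `value` field are purely illustrative.

```python
from nautilus_trader.core.data import Data


class MyDataPoint(Data):
    """
    An example user-defined data type (illustrative only).
    """

    def __init__(
        self,
        value: float,
        ts_event: int,
        ts_init: int,
    ) -> None:
        self.value = value
        self._ts_event = ts_event
        self._ts_init = ts_init

    @property
    def ts_event(self) -> int:
        """
        The UNIX timestamp (nanoseconds) when the data event occurred.
        """
        return self._ts_event

    @property
    def ts_init(self) -> int:
        """
        The UNIX timestamp (nanoseconds) when the object was initialized.
        """
        return self._ts_init
```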
diff --git a/docs/concepts/advanced/emulated_orders.md b/docs/concepts/advanced/emulated_orders.md index 44acc37d1935..29af99176163 100644 --- a/docs/concepts/advanced/emulated_orders.md +++ b/docs/concepts/advanced/emulated_orders.md @@ -2,7 +2,7 @@ The platform makes it possible to emulate most order types locally, regardless of whether the type is supported on a trading venue. The logic and code paths for -order emulation are exactly the same for all environment contexts (backtest, sandbox, live), +order emulation are exactly the same for all environment contexts (`backtest`, `sandbox`, `live`) and utilize a common `OrderEmulator` component. ```{note} diff --git a/docs/concepts/advanced/synthetic_instruments.md b/docs/concepts/advanced/synthetic_instruments.md index e9b1ae0d2540..1e78b7df247e 100644 --- a/docs/concepts/advanced/synthetic_instruments.md +++ b/docs/concepts/advanced/synthetic_instruments.md @@ -3,7 +3,7 @@ The platform supports the definition of customized synthetic instruments. These instruments can generate synthetic quote and trade ticks, which are beneficial for: -- Allowing actors (and strategies) to subscribe to quote or trade feeds (for any purpose) +- Allowing `Actor` (and `Strategy`) components to subscribe to quote or trade feeds (for any purpose) - Facilitating the triggering of emulated orders - Constructing bars from synthetic quotes or trades @@ -67,7 +67,7 @@ self.subscribe_quote_ticks(self._synthetic_id) ``` ```{note} -The `instrument_id` for the synthetic instrument in the above example will be structured as `{symbol}.{SYNTH}`, resulting in 'BTC-ETH:BINANCE.SYNTH'. +The `instrument_id` for the synthetic instrument in the above example will be structured as `{symbol}.{SYNTH}`, resulting in `'BTC-ETH:BINANCE.SYNTH'`. ``` ## Updating formulas diff --git a/docs/concepts/architecture.md b/docs/concepts/architecture.md index 6e95e6529aaa..f35bc026c4f9 100644 --- a/docs/concepts/architecture.md +++ b/docs/concepts/architecture.md @@ -107,7 +107,7 @@ for each of these subpackages from the left nav menu. ### System implementations - `backtest` - backtesting componentry as well as a backtest engine and node implementations - `live` - live engine and client implementations as well as a node for live trading -- `system` - the core system kernel common between backtest, sandbox and live contexts +- `system` - the core system kernel common between `backtest`, `sandbox`, `live` contexts ## Code structure The foundation of the codebase is the `nautilus_core` directory, containing a collection of core Rust libraries including a C API interface generated by `cbindgen`. diff --git a/docs/concepts/backtesting.md b/docs/concepts/backtesting.md index 2beb13d62019..edbc28fdb046 100644 --- a/docs/concepts/backtesting.md +++ b/docs/concepts/backtesting.md @@ -2,7 +2,7 @@ Backtesting with NautilusTrader is a methodical simulation process that replicates trading activities using a specific system implementation. This system is composed of various components -including [Actors](), [Strategies](/docs/concepts/strategies.md), [Execution Algorithms](/docs/concepts/execution.md), +including [Actors](advanced/actors.md), [Strategies](strategies.md), [Execution Algorithms](execution.md), and other user-defined modules. The entire trading simulation is predicated on a stream of historical data processed by a `BacktestEngine`. Once this data stream is exhausted, the engine concludes its operation, producing detailed results and performance metrics for in-depth analysis. 
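To make the backtesting description above concrete, here is a hedged sketch of the low-level lifecycle: build a `BacktestEngine`, register a simulated venue and instrument, add data and strategies, run over the data stream, then dispose. The trader ID, venue, balances, and test instrument provider are illustrative placeholders, and exact parameters can vary between versions.

```python
from nautilus_trader.backtest.engine import BacktestEngine
from nautilus_trader.config import BacktestEngineConfig
from nautilus_trader.model.currencies import USD
from nautilus_trader.model.enums import AccountType
from nautilus_trader.model.enums import OmsType
from nautilus_trader.model.identifiers import Venue
from nautilus_trader.model.objects import Money
from nautilus_trader.test_kit.providers import TestInstrumentProvider

# Build the engine (the trader ID is illustrative)
engine = BacktestEngine(config=BacktestEngineConfig(trader_id="BACKTESTER-001"))

# Register a simulated venue with a starting account balance
SIM = Venue("SIM")
engine.add_venue(
    venue=SIM,
    oms_type=OmsType.NETTING,
    account_type=AccountType.MARGIN,
    base_currency=USD,
    starting_balances=[Money(1_000_000, USD)],
)

# Register an instrument for the venue
instrument = TestInstrumentProvider.default_fx_ccy("EUR/USD", SIM)
engine.add_instrument(instrument)

# Data and strategies would be added here before running, e.g.:
# engine.add_data(quote_ticks)     # a list[QuoteTick] produced by a data wrangler
# engine.add_strategy(strategy)    # a user-defined Strategy instance

# Run over the data stream, then release resources
engine.run()
engine.dispose()
```

Once the data stream is exhausted, reports can be generated from `engine.trader` for analysis before calling `dispose()`.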
diff --git a/docs/concepts/data.md b/docs/concepts/data.md index afdd51953f25..df17dde8d539 100644 --- a/docs/concepts/data.md +++ b/docs/concepts/data.md @@ -7,7 +7,7 @@ a trading domain: - `OrderBookDeltas` (L1/L2/L3) - Bundles multiple order book deltas - `QuoteTick` - Top-of-book best bid and ask prices and sizes - `TradeTick` - A single trade/match event between counterparties -- `Bar` - OHLCV data aggregated using a specific method +- `Bar` - OHLCV 'bar' data, aggregated using a specific *method* - `Ticker` - General base class for a symbol ticker - `Instrument` - General base class for a tradable instrument - `VenueStatus` - A venue level status event @@ -18,28 +18,71 @@ Each of these data types inherits from `Data`, which defines two fields: - `ts_event` - The UNIX timestamp (nanoseconds) when the data event occurred - `ts_init` - The UNIX timestamp (nanoseconds) when the object was initialized -This inheritance ensures chronological data ordering, vital for backtesting, while also enhancing analytics. +This inheritance ensures chronological data ordering (vital for backtesting), while also enhancing analytics. -Consistency is key; data flows through the platform in exactly the same way between all system contexts (backtest, sandbox and live), +Consistency is key; data flows through the platform in exactly the same way for all system contexts (`backtest`, `sandbox`, `live`) primarily through the `MessageBus` to the `DataEngine` and onto subscribed or registered handlers. -For those seeking customization, the platform supports user-defined data types. Refer to the [advanced custom guide](/docs/concepts/advanced/custom_data.md) for more details. +For those seeking customization, the platform supports user-defined data types. Refer to the advanced [Custom/Generic data guide](advanced/custom_data.md) for more details. ## Loading data NautilusTrader facilitates data loading and conversion for three main use cases: -- Populating the `BacktestEngine` directly -- Persisting the Nautilus-specific Parquet format via `ParquetDataCatalog.write_data(...)` to be used with a `BacktestNode` -- Research purposes +- Populating the `BacktestEngine` directly to run backtests +- Persisting the Nautilus-specific Parquet format for the data catalog via `ParquetDataCatalog.write_data(...)` to be later used with a `BacktestNode` +- For research purposes (to ensure data is consistent between research and backtesting) Regardless of the destination, the process remains the same: converting diverse external data formats into Nautilus data structures. -To achieve this two components are necessary: -- A data loader which can read the data and return a `pd.DataFrame` with the correct schema for the desired Nautilus object -- A data wrangler which takes this `pd.DataFrame` and returns a `list[Data]` of Nautilus objects -`raw data (e.g. CSV)` -> `*DataLoader` -> `pd.DataFrame` -> `*DataWrangler` -> Nautilus `list[Data]` +To achieve this, two main components are necessary: +- A type of DataLoader (normally specific per raw source/format) which can read the data and return a `pd.DataFrame` with the correct schema for the desired Nautilus object +- A type of DataWrangler (specific per data type) which takes this `pd.DataFrame` and returns a `list[Data]` of Nautilus objects -Conceretely, this would involve for example: +### Data loaders + +Data loader components are typically specific for the raw source/format and per integration. 
For instance, Binance order book data is stored in its raw CSV file form with +an entirely different format to [Databento Binary Encoding (DBN)](https://docs.databento.com/knowledge-base/new-users/dbn-encoding/getting-started-with-dbn) files. + +### Data wranglers + +Data wranglers are implemented per specific Nautilus data type, and can be found in the `nautilus_trader.persistence.wranglers` modules. +Currently there exists: +- `OrderBookDeltaDataWrangler` +- `QuoteTickDataWrangler` +- `TradeTickDataWrangler` +- `BarDataWrangler` + +```{warning} +At the risk of causing confusion, there are also a growing number of DataWrangler v2 components, which will take a `pd.DataFrame` typically +with a different fixed width Nautilus arrow v2 schema, and output pyo3 Nautilus objects which are only compatible with the new version +of the Nautilus core, currently in development. + +**These pyo3 provided data objects are not compatible where the legacy Cython objects are currently used (adding directly to a `BacktestEngine` etc).** +``` + +### Transformation pipeline + +**Process flow:** +1. Raw data (e.g., CSV) is input into the pipeline +2. DataLoader processes the raw data and converts it into a `pd.DataFrame` +3. DataWrangler further processes the `pd.DataFrame` to generate a list of Nautilus objects +4. The Nautilus `list[Data]` is the output of the data loading process + +``` + ┌──────────┐ ┌──────────────────────┐ ┌──────────────────────┐ + │ │ │ │ │ │ + │ │ │ │ │ │ + │ Raw data │ │ │ `pd.DataFrame` │ │ + │ (CSV) ├───►│ DataLoader ├─────────────────►│ DataWrangler ├───► Nautilus `list[Data]` + │ │ │ │ │ │ + │ │ │ │ │ │ + │ │ │ │ │ │ + └──────────┘ └──────────────────────┘ └──────────────────────┘ + +- This diagram illustrates how raw data is transformed into Nautilus data structures. +``` + +Conceretely, this would involve: - `BinanceOrderBookDeltaDataLoader.load(...)` which reads CSV files provided by Binance from disk, and returns a `pd.DataFrame` - `OrderBookDeltaDataWrangler.process(...)` which takes the `pd.DataFrame` and returns `list[OrderBookDelta]` @@ -81,4 +124,55 @@ from the `/serialization/arrow/schema.py` module. 2023-10-14: The current plan is to eventually phase out the Python schemas module, so that all schemas are single sourced in the Rust core. ``` -**This doc is an evolving work in progress and will continue to describe the data catalog more fully...** +### Initializing +The data catalog can be initialized from a `NAUTILUS_PATH` environment variable, or by explicitly passing in a path like object. + +The following example shows how to initialize a data catalog where there is pre-existing data already written to disk at the given path. + +```python +CATALOG_PATH = os.getcwd() + "/catalog" + +# Create a new catalog instance +catalog = ParquetDataCatalog(CATALOG_PATH) +``` + +### Writing data +New data can be stored in the catalog, which is effectively writing the given data to disk in the Nautilus-specific Parquet format. +All Nautilus built-in `Data` objects are supported, and any data which inherits from `Data` can be written. + +The following example shows the above list of Binance `OrderBookDelta` objects being written. 
+```python +catalog.write_data(deltas) +``` + +Rust Arrow schema implementations and available for the follow data types (enhanced performance): +- `OrderBookDelta` +- `QuoteTick` +- `TradeTick` +- `Bar` + +### Reading data +Any stored data can then we read back into memory: +```python +start = dt_to_unix_nanos(pd.Timestamp("2020-01-03", tz=pytz.utc)) +end = dt_to_unix_nanos(pd.Timestamp("2020-01-04", tz=pytz.utc)) + +deltas = catalog.order_book_deltas(instrument_ids=[instrument.id.value], start=start, end=end) +``` + +### Streaming data +When running backtests in streaming mode with a `BacktestNode`, the data catalog can be used to stream the data in batches. + +The following example shows how to achieve this by initializing a `BacktestDataConfig` configuration object: +```python +data_config = BacktestDataConfig( + catalog_path=str(catalog.path), + data_cls=OrderBookDelta, + instrument_id=instrument.id.value, + start_time=start, + end_time=end, +) +``` + +This configuration object then be passed into a `BacktestRunConfig` and then in turn passed into a `BacktestNode` as part of a run. +See the [Backtest (high-level API)](../tutorials/backtest_high_level.md) tutorial for more details. diff --git a/docs/concepts/execution.md b/docs/concepts/execution.md index e7d8489d06e7..1f3156a3399e 100644 --- a/docs/concepts/execution.md +++ b/docs/concepts/execution.md @@ -37,6 +37,33 @@ The general execution flow looks like the following (each arrow indicates moveme The `OrderEmulator` and `ExecAlgorithm`(s) components are optional in the flow, depending on individual order parameters (as explained below). +``` + ┌───────────────────┐ + │ │ + │ │ + ┌───────► ├────────────┐ + │ │ OrderEmulator │ │ + │ │ │ │ + ┌─────────┴──┐ │ │ │ + │ │ │ │ ┌───────▼────────┐ ┌─────────────────────┐ ┌─────────────────────┐ + │ │ └───────┬───▲───────┘ │ │ │ │ │ │ + │ │ │ │ │ ├───► ├───► │ + │ Strategy ◄────────────┼───┼────────────┤ │ │ │ │ │ + │ │ │ │ │ RiskEngine │ │ ExecutionEngine │ │ ExecutionClient │ + │ │ │ │ │ ◄───┤ ◄───┤ │ + │ │ ┌───────▼───┴───────┐ │ │ │ │ │ │ + │ │ │ │ │ │ │ │ │ │ + └─────────┬──┘ │ │ └────────▲───────┘ └─────────────────────┘ └─────────────────────┘ + │ │ │ │ + │ │ ExecAlgorithm ├─────────────┘ + │ │ │ + └───────► │ + │ │ + └───────────────────┘ + +- This diagram illustrates message flow (commands and events) across the Nautilus execution components. +``` + ## Execution algorithms The platform supports customized execution algorithm components and provides some built-in @@ -190,7 +217,7 @@ or confusion with the "parent" and "child" contingency orders terminology (an ex The `Cache` provides several methods to aid in managing (keeping track of) the activity of an execution algorithm: -```python +```cython cpdef list orders_for_exec_algorithm( self, diff --git a/docs/concepts/overview.md b/docs/concepts/overview.md index bcb3f63a6ac2..a7d70cbba1c8 100644 --- a/docs/concepts/overview.md +++ b/docs/concepts/overview.md @@ -74,7 +74,7 @@ The platform is designed to be easily integrated into a larger distributed syste To facilitate this, nearly all configuration and domain objects can be serialized using JSON, MessagePack or Apache Arrow (Feather) for communication over the network. ## Common core -The common system core is utilized by both the backtest, sandbox, and live trading nodes. +The common system core is utilized by all node contexts `backtest`, `sandbox`, and `live`. User-defined Actor, Strategy and ExecAlgorithm components are managed consistently across these environment contexts. 
## Backtesting diff --git a/nautilus_trader/system/kernel.py b/nautilus_trader/system/kernel.py index b14a6cb47adb..9240dfd8ed38 100644 --- a/nautilus_trader/system/kernel.py +++ b/nautilus_trader/system/kernel.py @@ -93,7 +93,7 @@ class NautilusKernel: """ Provides the core Nautilus system kernel. - The kernel is common between backtest, sandbox and live environment context types. + The kernel is common between ``backtest``, ``sandbox`` and ``live`` environment context types. Parameters ----------
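Rounding out the streaming example from the data guide above, the sketch below shows the `BacktestDataConfig` (`data_config`) being wired into a `BacktestRunConfig` and executed by a `BacktestNode`. The venue parameters and the empty engine configuration are illustrative assumptions only, and `data_config` refers to the object built in the streaming data example.

```python
from nautilus_trader.backtest.node import BacktestNode
from nautilus_trader.config import BacktestEngineConfig
from nautilus_trader.config import BacktestRunConfig
from nautilus_trader.config import BacktestVenueConfig

# A simulated venue for the run (name and balances are illustrative)
venue_config = BacktestVenueConfig(
    name="BINANCE",
    oms_type="NETTING",
    account_type="CASH",
    starting_balances=["100_000 USDT"],
)

# Reuses the `data_config` built in the streaming data example above;
# strategies would normally be added to the engine config as well
run_config = BacktestRunConfig(
    engine=BacktestEngineConfig(),
    data=[data_config],
    venues=[venue_config],
)

# The node streams catalog data in batches and orchestrates the run
node = BacktestNode(configs=[run_config])
results = node.run()
```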