This document presents the general design of the ffs package of powergate.
Disclaimer: This is ongoing work, so the design will continue to change.
The following picture presents principal packages and interfaces that are part of the design:
The picture shows an advanced scenario where different API instances are wired to different Scheduler instances. Component names prefixed with * don't exist yet but are mentioned as possible implementations of existing interfaces.
The central idea of the design is that an API defines the desired storing state for a Cid using a StorageConfig struct. This struct describes the desired storage configuration in the Hot and Cold storages.
When a new or updated StorageConfig is pushed to an API, the API delegates this work to the Scheduler. The Scheduler will execute whatever work is necessary to comply with the new or updated Cid storage configuration.
From the Scheduler's point of view, this work is a Job created by the API. The Job refers to doing the necessary work to enforce the new StorageConfig. The API can watch this Job's state changes to see whether the task of pushing a new StorageConfig is queued, executing, finished successfully, failed, or canceled. The Scheduler also provides a human-friendly log stream of the work being done for a Cid.
The Scheduler also executes proactive actions for previously pushed StorageConfigs that have the renew or repair feature enabled. Finally, the Scheduler is designed to resume any kind of interrupted Job execution.
The following sections give a more detailed description of each component and interface in the diagram.
The Manager is responsible for creating API instances. When a new API instance is created, an auth-token for this instance is also created. The client uses this auth-token in each request to the API so that the Manager can redirect the action to its corresponding API instance, while also performing some minimal access-control validation.
The mapping between auth-tokens and API instances is controlled by an Auth component. Further features such as token invalidation, finer-grained access control per action, or multiple auth-token support will live in this module.
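The token-to-instance mapping described above can be sketched as a small in-memory registry. This is only an illustration: the names Auth, InstanceID, Generate, and Resolve are assumptions made for this example and are not taken from the actual package, which may persist tokens and structure the component differently.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// InstanceID identifies an API instance (hypothetical type for this sketch).
type InstanceID string

// ErrUnknownToken is returned when a token doesn't map to any instance.
var ErrUnknownToken = errors.New("auth token not found")

// Auth keeps the auth-token -> API instance mapping in memory.
type Auth struct {
	mu     sync.RWMutex
	tokens map[string]InstanceID
}

// NewAuth returns an empty registry.
func NewAuth() *Auth {
	return &Auth{tokens: make(map[string]InstanceID)}
}

// Generate creates a random token for the instance and stores the mapping.
func (a *Auth) Generate(id InstanceID) (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	token := hex.EncodeToString(buf)
	a.mu.Lock()
	defer a.mu.Unlock()
	a.tokens[token] = id
	return token, nil
}

// Resolve returns the instance that owns the token, which is what a
// manager needs in order to redirect a request to the right instance.
func (a *Auth) Resolve(token string) (InstanceID, error) {
	a.mu.RLock()
	defer a.mu.RUnlock()
	id, ok := a.tokens[token]
	if !ok {
		return "", ErrUnknownToken
	}
	return id, nil
}

func main() {
	auth := NewAuth()
	token, err := auth.Generate("instance-1")
	if err != nil {
		panic(err)
	}
	id, err := auth.Resolve(token)
	fmt.Println(id, err)
}
```

Keeping the lookup behind a dedicated type leaves room for the features mentioned above, such as token invalidation, without touching callers.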
Since API instances might store data in the Filecoin network, each one is assigned a newly created Filecoin address, which is controlled by the underlying Filecoin client used in the ColdStorage. The process of creating and assigning this new wallet account is done automatically by the Manager, using a WalletManager subcomponent.
The Manager can optionally be configured to auto-fund newly created wallet addresses, so newly created API instances have funds to execute actions in the Filecoin network. If this feature is enabled, a masterAddress and initialFunds are configured, which indicate the Filecoin client wallet address that funds are sent from and the amount of the transfer.
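As a rough sketch of the address creation and auto-funding flow, consider the following. The WalletManager method set, the FundingConfig struct, and the in-memory memWallet are all assumptions made for illustration; only the masterAddress and initialFunds concepts come from the text. Amounts are plain int64 values in an unspecified unit to keep the example short, whereas a real implementation would likely need big integers.

```go
package main

import (
	"errors"
	"fmt"
)

// WalletManager is a hypothetical stand-in for the wallet subcomponent.
type WalletManager interface {
	NewAddress() (string, error)
	SendFIL(from, to string, amount int64) error
}

// FundingConfig mirrors the masterAddress and initialFunds settings.
type FundingConfig struct {
	Enabled       bool
	MasterAddress string
	InitialFunds  int64
}

// createInstanceAddress creates the address for a new API instance and,
// if auto-funding is enabled, transfers the initial funds to it.
func createInstanceAddress(wm WalletManager, cfg FundingConfig) (string, error) {
	addr, err := wm.NewAddress()
	if err != nil {
		return "", fmt.Errorf("creating address: %w", err)
	}
	if !cfg.Enabled {
		return addr, nil
	}
	if err := wm.SendFIL(cfg.MasterAddress, addr, cfg.InitialFunds); err != nil {
		return "", fmt.Errorf("funding %s: %w", addr, err)
	}
	return addr, nil
}

// memWallet is an in-memory fake used only to exercise the sketch.
type memWallet struct {
	next     int
	balances map[string]int64
}

func newMemWallet(master string, funds int64) *memWallet {
	return &memWallet{balances: map[string]int64{master: funds}}
}

func (w *memWallet) NewAddress() (string, error) {
	w.next++
	addr := fmt.Sprintf("addr-%d", w.next)
	w.balances[addr] = 0
	return addr, nil
}

func (w *memWallet) SendFIL(from, to string, amount int64) error {
	if w.balances[from] < amount {
		return errors.New("insufficient funds")
	}
	if _, ok := w.balances[to]; !ok {
		return errors.New("unknown destination address")
	}
	w.balances[from] -= amount
	w.balances[to] += amount
	return nil
}

func main() {
	w := newMemWallet("master", 1000)
	cfg := FundingConfig{Enabled: true, MasterAddress: "master", InitialFunds: 250}
	addr, err := createInstanceAddress(w, cfg)
	fmt.Println(addr, w.balances[addr], w.balances["master"], err)
}
```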
API is a concrete instance of FFS to be used by a client. It owns the following information:
- At least one Filecoin address. Later, the client can opt to create more addresses and indicate which one to use when executing an action.
- StorageConfigs describing the desired state for Cids to be stored in Hot and Cold storage.
- A default StorageConfig to be used unless an explicit StorageConfig is given.
The instance provides APIs to:
- Get and Set the default StorageConfig used to store new data.
- Get summary information about all the Cids stored in this instance.
- Manage Filecoin wallet addresses under its control.
- Send FIL transactions from owned Filecoin wallet addresses.
- Create, replace, and remove StorageConfigs, which indicate which Cids to store in the instance.
- Provide detailed information about a particular stored Cid.
- Get information about the status of executing Jobs corresponding to the FFS instance.
- Stream human-friendly logs about events happening for a Cid: storage, renewals, repairs, and anything else related to actions being done for it.
In a nutshell, the Scheduler is the component responsible for orchestrating the Hot and Cold storage to enforce the StorageConfigs indicated by connected API instances.
Refer to the Go docs to see its exported API.
When a new StorageConfig is pushed by an API, the Scheduler is responsible for orchestrating whatever actions are necessary to enforce it with the Hot and Cold storage.
Every new StorageConfig, whether it is the first or a newer version for a Cid, is encapsulated in a Job. A Job is the unit of work that the Scheduler executes. A Job has one of the following statuses: Queued, Executing, Done, Failed, or Canceled.
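The Job statuses listed above can be modeled as a small enumeration. The following is a minimal sketch: the JobStatus type and its helper functions are hypothetical, and the transition rules in canTransition are an assumption inferred from the status names, not something the design states.

```go
package main

import "fmt"

// JobStatus is a hypothetical enumeration of the statuses named in the text.
type JobStatus int

const (
	Queued JobStatus = iota
	Executing
	Done
	Failed
	Canceled
)

// String returns the human-readable name of the status.
func (s JobStatus) String() string {
	switch s {
	case Queued:
		return "Queued"
	case Executing:
		return "Executing"
	case Done:
		return "Done"
	case Failed:
		return "Failed"
	case Canceled:
		return "Canceled"
	default:
		return "Unknown"
	}
}

// IsFinal reports whether the Job has stopped for good.
func (s JobStatus) IsFinal() bool {
	return s == Done || s == Failed || s == Canceled
}

// canTransition encodes an ASSUMED lifecycle: a queued Job can start or be
// canceled, an executing Job can finish, fail, or be canceled, and final
// statuses never change.
func canTransition(from, to JobStatus) bool {
	switch from {
	case Queued:
		return to == Executing || to == Canceled
	case Executing:
		return to == Done || to == Failed || to == Canceled
	default:
		return false
	}
}

func main() {
	fmt.Println(Queued, "->", Executing, canTransition(Queued, Executing))
	fmt.Println(Done, "->", Executing, canTransition(Done, Executing))
}
```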
Apart from executing Jobs, the Scheduler has background processes that keep enforcing configuration features that require tracking. For example, if a StorageConfig has renewal or repair enabled, background tasks in the Scheduler monitor the deals and do the necessary renewal or repair work.
In summary, APIs delegate the desired state for a Cid, and the Scheduler is responsible for ensuring that state holds by orchestrating the Hot and Cold storage.
The Scheduler interacts with abstractions for the Hot and Cold storage. Refer to the Go docs of the HotStorage and ColdStorage to understand their APIs.
The ColdStorage interface is noticeably biased towards an implementation backed by a Filecoin client, but it also allows other tiered cold storages to be included if deal creation or retrieval is wanted. Refer to the diagram at the top of this document to understand possible configurations.
The ColdStorage relies on a MinerSelector interface to query the universe of available miners to make new deals. Refer to the Go doc to understand its API.
Powergate has the Reputation Module, which leverages indexes built from miner data to provide a universe of available miners sorted by a chosen criterion. In a full run of FFS, the ColdStorage is connected to a MinerSelector with the Reputation Module implementation. However, for integration tests, a FixedMiners miner selector is used to bound the universe of available miners for deals to desired values.
The MinerSelector API already provides enough filtering configuration to force using or excluding particular miners. In general, implementations other than the default one should be used only if the universe of available miners needs to be completely controlled by design, rather than determined by the miners available on the connected Filecoin network.
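To illustrate the idea of a selector with a bounded universe of miners, here is a minimal sketch in the spirit of the FixedMiners selector. The MinerSelector interface shape, the Filter struct, and the SelectMiners method are assumptions made for this example; the real interface is described in the Go docs and may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Filter holds hypothetical selection constraints.
type Filter struct {
	ExcludedMiners []string
}

// MinerSelector is an ASSUMED shape for the interface, not the real one.
type MinerSelector interface {
	SelectMiners(n int, f Filter) ([]string, error)
}

// fixedMiners always selects from a fixed, ordered list of miner addresses.
type fixedMiners struct {
	miners []string
}

// SelectMiners returns the first n miners from the fixed list that are not
// excluded by the filter, or an error if fewer than n qualify.
func (fm fixedMiners) SelectMiners(n int, f Filter) ([]string, error) {
	if n <= 0 {
		return nil, errors.New("n must be positive")
	}
	excluded := make(map[string]bool, len(f.ExcludedMiners))
	for _, m := range f.ExcludedMiners {
		excluded[m] = true
	}
	var res []string
	for _, m := range fm.miners {
		if excluded[m] {
			continue
		}
		res = append(res, m)
		if len(res) == n {
			return res, nil
		}
	}
	return nil, errors.New("not enough miners satisfy the filter")
}

func main() {
	var sel MinerSelector = fixedMiners{miners: []string{"f01", "f02", "f03"}}
	miners, err := sel.SelectMiners(2, Filter{ExcludedMiners: []string{"f01"}})
	fmt.Println(miners, err)
}
```

Because the cold storage depends only on the interface, a test can swap in a fixed list like this one while a full run uses a reputation-based implementation.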
In this document we've referred to StorageConfigs as a central concept in the FFS module. A StorageConfig indicates the desired storing state of a Cid scoped to an API. Refer to the Go docs to understand its rich configuration.
One important point is that Get operations in API can only retrieve data from hot storage (via GetCidFromHot in the Scheduler).
There are two scenarios:
- If the data is stored in hot storage, it is fetched from there.
- If hot storage wasn't enabled for the data (HotConfig.Enabled: false), the operation errors, indicating that hot storage isn't enabled.
The last point indicates that the API client should explicitly set HotConfig.Enabled: true to be able to retrieve the data. Hot Storage enabling is done in two steps:
1. It tries to fetch the data from the IPFS network, using AddTimeout as a time bound.
2. If the previous step failed:
   2.a) If HotConfig.AllowUnfreeze: false, it fails since it couldn't fetch the data from the single allowed source (IPFS network).
   2.b) If HotConfig.AllowUnfreeze: true, it checks whether the data is available in Cold Storage. If that's the case, it unfreezes the data and saves it to Hot Storage. This allows a Get operation afterward.
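The two-step enabling logic above can be sketched as follows. This is a simplified illustration, not the Scheduler's actual code: the HotConfig struct here only carries the two fields relevant to the flow, and enableHot, fetchFn, and simulate are hypothetical names. The function is assumed to be called only when HotConfig.Enabled is true.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// HotConfig carries only the fields relevant to this sketch.
type HotConfig struct {
	AllowUnfreeze bool
	AddTimeout    time.Duration
}

// fetchFn is a hypothetical hook that brings a Cid into hot storage.
type fetchFn func(ctx context.Context, cid string) error

// enableHot follows the two steps: try the IPFS network bounded by
// AddTimeout, then fall back to unfreezing from cold storage if allowed.
func enableHot(ctx context.Context, cid string, cfg HotConfig, fromIPFS, unfreeze fetchFn) error {
	// Step 1: fetch from the IPFS network with AddTimeout as the time bound.
	ipfsCtx, cancel := context.WithTimeout(ctx, cfg.AddTimeout)
	defer cancel()
	err := fromIPFS(ipfsCtx, cid)
	if err == nil {
		return nil
	}
	// Step 2.a: the IPFS network was the single allowed source.
	if !cfg.AllowUnfreeze {
		return fmt.Errorf("fetching %s from the IPFS network: %w", cid, err)
	}
	// Step 2.b: unfreeze from cold storage and save into hot storage.
	if err := unfreeze(ctx, cid); err != nil {
		return fmt.Errorf("unfreezing %s from cold storage: %w", cid, err)
	}
	return nil
}

// simulate runs enableHot with a simulated IPFS fetch that never finds the
// data and a simulated unfreeze that always succeeds.
func simulate(cfg HotConfig) error {
	notOnIPFS := func(ctx context.Context, cid string) error {
		<-ctx.Done() // block until AddTimeout elapses
		return ctx.Err()
	}
	unfreezeOK := func(ctx context.Context, cid string) error { return nil }
	return enableHot(context.Background(), "QmExampleCid", cfg, notOnIPFS, unfreezeOK)
}

func main() {
	cfg := HotConfig{AllowUnfreeze: false, AddTimeout: 10 * time.Millisecond}
	fmt.Println("unfreeze not allowed:", simulate(cfg))
	cfg.AllowUnfreeze = true
	fmt.Println("unfreeze allowed:", simulate(cfg))
}
```

Note that the timeout applies only to the IPFS attempt; the unfreeze step uses the caller's context, which is a design choice of this sketch.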
The rationale behind asking the client to enable hot storage with allow-unfreeze is that retrieving data from Filecoin incurs an economic cost that will be paid by the API address. Retrieving data from the IPFS network is considered free (disregarding unavoidable bandwidth costs, etc.).
The Scheduler always checks the current state of Cid storage before executing actions for an updated StorageConfig.
StorageConfig changes regarding Hot Storage are always applied since Hot Storage is, in general, malleable. In particular, enabling or disabling Hot Storage is most likely easy to execute and thus has a predictable end state.
StorageConfig changes regarding Cold Storage have a more subtle meaning. For example, if the RepFactor is increased, the Scheduler is aware of the current RepFactor and only makes enough new deals to reach the new value: if RepFactor was 1 and the updated StorageConfig has RepFactor 2, it will only make one new deal. As another example, if RepFactor was 2 and is decreased to 1, the Scheduler won't execute any actual work since one of the two current active deals will eventually expire.
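The RepFactor arithmetic in the examples above reduces to a simple difference that never goes negative. The helper below is a hypothetical illustration of that rule; the name newDealsNeeded is made up for this sketch.

```go
package main

import "fmt"

// newDealsNeeded returns how many new deals must be made so that the number
// of active deals reaches repFactor. When the factor is decreased, it returns
// 0: surplus deals are simply left to expire.
func newDealsNeeded(activeDeals, repFactor int) int {
	if repFactor <= activeDeals {
		return 0
	}
	return repFactor - activeDeals
}

func main() {
	fmt.Println(newDealsNeeded(1, 2)) // RepFactor raised from 1 to 2
	fmt.Println(newDealsNeeded(2, 1)) // RepFactor lowered from 2 to 1
}
```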
The RepFactor configuration is also considered if the Cid has automatic deal renewal enabled. In particular, if the RepFactor was decreased from 3 to 1, the renewal logic will wait until the last deal is close to expiring and renew only that one. In other words, the renewal logic doesn't blindly renew expiring deals; it is RepFactor-aware, as expected.
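One way to express RepFactor-aware renewal is to renew an expiring deal only when the deals that are not close to expiring no longer cover the RepFactor. The sketch below assumes a hypothetical Deal type, expiry measured in epochs, and a renewal threshold; the tie-breaking rule (prefer the deals that expire last) is also an assumption of this example, not something stated by the design.

```go
package main

import (
	"fmt"
	"sort"
)

// Deal is a hypothetical view of an active deal.
type Deal struct {
	ID          string
	ExpiryEpoch int64
}

// dealsToRenew returns the expiring deals that must be renewed so that
// repFactor deals stay alive. A deal counts as expiring when it has at most
// threshold epochs left at epoch now.
func dealsToRenew(active []Deal, repFactor int, now, threshold int64) []Deal {
	var expiring []Deal
	healthy := 0
	for _, d := range active {
		if d.ExpiryEpoch-now <= threshold {
			expiring = append(expiring, d)
		} else {
			healthy++
		}
	}
	needed := repFactor - healthy
	if needed <= 0 {
		return nil // enough deals remain; let the expiring ones lapse
	}
	// Assumed tie-break: prefer renewing the deals that expire last.
	sort.Slice(expiring, func(i, j int) bool {
		return expiring[i].ExpiryEpoch > expiring[j].ExpiryEpoch
	})
	if needed > len(expiring) {
		needed = len(expiring)
	}
	return expiring[:needed]
}

func main() {
	deals := []Deal{
		{ID: "a", ExpiryEpoch: 100},
		{ID: "b", ExpiryEpoch: 200},
		{ID: "c", ExpiryEpoch: 300},
	}
	// RepFactor lowered to 1: deal "a" is expiring but two others remain.
	fmt.Println(dealsToRenew(deals, 1, 90, 20))
	// Later only "c" is left and close to expiring, so it gets renewed.
	fmt.Println(dealsToRenew(deals[2:], 1, 290, 20))
}
```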
Other Cold Storage configuration changes related to miner selection, such as country filtering or excluded miners, are taken into account every time a new deal is made. Existing active deals that were created under other configuration conditions can't be canceled or reverted. Put differently, the new miner-related configuration applies only to future deals, i.e., renewed deals, deals made for an increased RepFactor, and repairs.