POLIXIR REVIVE is released subject to the POLIXIR Commercial License, unless otherwise indicated. Please read and agree to the accompanying license file carefully before downloading, installing or using the Polixir software or any accompanying files.
Unless otherwise noted, all files Copyright © 2021 - 2023 Polixir Technologies, Co., Ltd.
REVIVE is a general platform that aims to bring automatic decision-making to real-world scenarios. The platform operates in a pipeline of two steps:
- Venv Training: A virtual-environment model is trained from any offline data to mimic each agent's policy along with the transition between states (also known as nature's policy).
- Policy Training: Treat one of the agents as the active agent and freeze others as its environment. Train the active agent with reinforcement learning to derive a better policy for the agent.
The tutorials and API documentation are hosted on https://revive.cn/help/polixir-revive-sdk/index.html.
- Linux x86_64
- Python: v3.7.0+ / v3.8.0+ / v3.9.0+
- CUDA Toolkit (if an NVIDIA CUDA GPU device is available)
You can install the latest version of the REVIVE SDK from a cloned Git repository:
$ git clone https://agit.ai/Polixir/revive
$ cd revive
$ pip install -e .
You can also view and download all versions of the code package through the releases page.
Releases Page : https://agit.ai/Polixir/revive/releases
Or pull the latest image, which contains the SDK and its runtime environment, from Docker Hub:
$ docker pull polixir/revive-sdk
The REVIVE SDK library is developed by Polixir. We encrypt and protect a few algorithm modules, which have their own intellectual property rights, and you can register an account to use the features of the full algorithm package.
The process can be divided into the following two steps:
Step 1. Visit the REVIVE Website to register an account.
REVIVE Website : https://www.revive.cn
Step 2. Configure registered account information.
- Open the config.yaml file in your home directory (for example, /home/your-user-name/.revive/config.yaml) in a text editor. After installing REVIVE SDK, the config.yaml file will be generated automatically.
- Get the accesskey from the user center of the REVIVE Website and fill it into config.yaml.
accesskey: xxxxxxxxx
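As a quick sanity check after Step 2, the sketch below looks for a non-empty accesskey entry in the generated file. It is not part of the SDK: the helper name read_accesskey is hypothetical, and it uses a simple line scan rather than a full YAML parser, assuming the key appears on its own top-level line as shown above.

```python
from pathlib import Path
from typing import Optional


def read_accesskey(config_text: str) -> Optional[str]:
    """Return the accesskey value found in config.yaml text, or None if absent or empty.

    Hypothetical helper: a plain line scan, not a real YAML parser.
    """
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # ignore trailing comments
        if line.startswith("accesskey:"):
            value = line[len("accesskey:"):].strip().strip("'\"")
            return value or None
    return None


if __name__ == "__main__":
    # Path taken from the example above: ~/.revive/config.yaml
    config_path = Path.home() / ".revive" / "config.yaml"
    if not config_path.exists():
        print(f"{config_path} not found; it is generated after the SDK is installed.")
    elif read_accesskey(config_path.read_text()) is None:
        print("accesskey is missing or empty.")
    else:
        print("accesskey is set.")
```

The check only confirms that a value is present; whether the key is valid is decided by the REVIVE service.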
All data should be placed in the data folder. See the instructions for building your data at https://revive.cn/help/polixir-revive-sdk/index.html.
To start training, run python train.py. By default, the script starts a local ray service and launches both venv training and policy training in tuning mode. You can change this behavior from the command line; run python train.py -h for more information. For general configurations, see data/config.json. For algorithm-specific configurations, see the algorithm's own file. You can also modify the JSON file and pass its path to the training script, e.g. python train.py ... -rcf config.json, which will overwrite the configurable parameters.
To run training across multiple nodes, you need to set up the ray cluster manually. Start the cluster on the head node with ray start --head --port 6379 on the command line, or join an existing cluster with ray start --address=xxx --redis-password=xxx. Pass --address auto when training.
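Before launching a multi-node run, it can help to confirm that every node has joined the cluster. The sketch below is one possible check and is not part of the REVIVE SDK. It connects with ray.init(address="auto") and summarizes the output of ray.nodes(); the helper names summarize_nodes and report_cluster are hypothetical.

```python
from typing import Any, Dict, List


def summarize_nodes(nodes: List[Dict[str, Any]]) -> Dict[str, float]:
    """Count alive nodes and their CPUs/GPUs from a ray.nodes()-style list."""
    alive = [node for node in nodes if node.get("Alive")]
    return {
        "alive_nodes": len(alive),
        "cpus": sum(node.get("Resources", {}).get("CPU", 0) for node in alive),
        "gpus": sum(node.get("Resources", {}).get("GPU", 0) for node in alive),
    }


def report_cluster() -> Dict[str, float]:
    """Connect to the already running cluster and return a resource summary."""
    import ray  # imported lazily so summarize_nodes works without Ray installed

    ray.init(address="auto")  # attach to the cluster started with `ray start`
    try:
        return summarize_nodes(ray.nodes())
    finally:
        ray.shutdown()
```

Calling report_cluster() on any node of the cluster should show the expected node count before training is started with --address auto.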
revive
├── data # data folder
│ ├── config.json
│ ├── test.npz
│ ├── test_reward.py
│ └── test.yaml
├── examples # example code
│ ├── basic
│ ├── custom_node
│ ├── expert_function
│ ├── model_inference
│ ├── multiple_transition
│ └── parameter_tuning
├── README.md
├── revive # source code folder
│ ├── algo
│ ├── computation
│ ├── conf
│ ├── data
│ ├── dist
│ ├── __init__.py
│ ├── server.py
│ ├── utils
│ └── version.py
├── setup.py # installation script
├── tests # test scripts
│ ├── test_dists.py
│ └── test_processor.py
└── train.py # main start script
To add a new algorithm, you only need to inherit the base class and implement model_creator, optimizer_creator, train_batch and validate_batch; then the code is ready to run. You can use the existing implementations as references.
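The shape of such a subclass can be illustrated with a self-contained toy. Everything here is an assumption made for illustration: SketchOperator is a hypothetical stand-in for the SDK base class, and the method signatures, the config dictionary and the batch format are invented, so consult the existing implementations for the real interface. Only the four method names come from the text above.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple

Batch = List[Tuple[float, float]]  # (input, target) pairs; an assumed format


class SketchOperator(ABC):
    """Hypothetical stand-in for the SDK base class, NOT the real REVIVE API."""

    def __init__(self, config: Dict[str, float]) -> None:
        self.model = self.model_creator(config)
        self.optimizer = self.optimizer_creator(self.model, config)

    @abstractmethod
    def model_creator(self, config: Dict[str, float]) -> Dict[str, float]:
        """Build and return the model."""

    @abstractmethod
    def optimizer_creator(self, model: Dict[str, float],
                          config: Dict[str, float]) -> Callable:
        """Build and return the optimizer for the model."""

    @abstractmethod
    def train_batch(self, batch: Batch) -> Dict[str, float]:
        """Run one update step and return training metrics."""

    @abstractmethod
    def validate_batch(self, batch: Batch) -> Dict[str, float]:
        """Evaluate without updating and return validation metrics."""


class LinearRegressionOperator(SketchOperator):
    """Toy algorithm: fit y = w * x + b by gradient descent on squared error."""

    def model_creator(self, config):
        return {"w": 0.0, "b": 0.0}

    def optimizer_creator(self, model, config):
        learning_rate = config["lr"]

        def step(grads: Dict[str, float]) -> None:
            for name, grad in grads.items():
                model[name] -= learning_rate * grad

        return step

    def _mse(self, batch: Batch) -> float:
        w, b = self.model["w"], self.model["b"]
        return sum((w * x + b - y) ** 2 for x, y in batch) / len(batch)

    def train_batch(self, batch):
        w, b = self.model["w"], self.model["b"]
        errors = [(w * x + b - y, x) for x, y in batch]
        grads = {
            "w": sum(2 * err * x for err, x in errors) / len(batch),
            "b": sum(2 * err for err, _ in errors) / len(batch),
        }
        self.optimizer(grads)
        return {"loss": self._mse(batch)}  # loss after the update

    def validate_batch(self, batch):
        return {"loss": self._mse(batch)}
```

In this toy, a driver loop would call train_batch repeatedly and validate_batch periodically. In the SDK that loop is provided by the framework, which is why only the four methods need to be written.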
- Behavior Cloning: a supervised learning method that uses neural networks.
- Revive: derived from Shi et al., Virtual-Taobao: Virtualizing Real-World Online Retail Environment for Reinforcement Learning.