A gymnasium environment for a noisy master-slave system. This environment can be used to train an RL-based stationary Kalman filter.
- hat_x_1: The estimated angle.
- hat_x_2: The estimated frequency.
- x_1: Actual angle.
- x_2: Actual frequency.
- u1: First action coming from the RL Kalman filter.
- u2: Second action coming from the RL Kalman filter.
An episode terminates when the maximum step limit is reached or when the step cost exceeds 100.
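The termination rule can be sketched as a small helper. This is a minimal illustration, not the environment's actual code: the function name `episode_done` and the `max_steps` argument are assumptions made for this example, and only the cost threshold of 100 comes from the description.

```python
COST_THRESHOLD = 100.0  # from the description: the episode ends when the step cost exceeds 100


def episode_done(step_count: int, step_cost: float, max_steps: int) -> bool:
    """Return True when the episode should end (hypothetical helper)."""
    reached_step_limit = step_count >= max_steps  # maximum step limit reached
    cost_too_high = step_cost > COST_THRESHOLD    # strictly greater than 100
    return reached_step_limit or cost_too_high
```

Note that a step cost of exactly 100 does not end the episode under this reading of "exceeds".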
The agent's goal in the Ex3EKF environment is to choose actions so that the estimator tracks the state of the original noisy system as closely as possible. In doing so, the agent serves as an RL-based stationary Kalman filter.
The Ex3EKF environment uses the following cost function:
In addition to the observations, the environment returns an info dictionary containing the current reference and the error whenever a step is taken. A step therefore returns the following array:
[hat_x_1, hat_x_2, x_1, x_2, info_dict]
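Assuming the layout shown above, the returned array could be unpacked as in the sketch below. The numeric values and the empty `info_dict` are placeholders invented for this illustration, and `split_step_output` is a hypothetical helper, not part of Stable Gym.

```python
def split_step_output(step_output):
    """Split [hat_x_1, hat_x_2, x_1, x_2, info_dict] into estimates, true states, and info."""
    hat_x_1, hat_x_2, x_1, x_2, info_dict = step_output
    estimates = (hat_x_1, hat_x_2)  # estimated angle and frequency
    true_states = (x_1, x_2)        # actual angle and frequency
    return estimates, true_states, info_dict


# Placeholder values, for illustration only.
example_output = [0.5, 2.0, 0.25, 1.5, {}]
estimates, true_states, info = split_step_output(example_output)

# Per-state estimation error (estimate minus actual value).
errors = tuple(est - true for est, true in zip(estimates, true_states))
print(errors)  # (0.25, 0.5)
```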
This environment is part of the Stable Gym package. It is therefore registered as a gymnasium environment when you import the Stable Gym package. If you want to use the environment in stand-alone mode, you can register it yourself.
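A stand-alone registration might look like the following sketch. It relies on gymnasium's `register` and `make` functions. The environment id and the entry-point string are assumptions made for this example, since neither is given above. Replace them with the values that match your installation.

```python
# Assumed values: neither the id nor the entry point is stated in the text above.
ENV_ID = "Ex3EKF-v1"  # hypothetical environment id
ENTRY_POINT = "my_package.ex3_ekf:Ex3EKF"  # hypothetical "module.path:ClassName" string


def register_ex3ekf():
    """Register the environment with gymnasium and return an instance."""
    # Imported here so the constants above can be inspected without gymnasium installed.
    import gymnasium as gym

    gym.register(id=ENV_ID, entry_point=ENTRY_POINT)
    return gym.make(ENV_ID)
```

Calling `register_ex3ekf()` once at program start should be enough. After that, `gymnasium.make` can create further instances by id.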