In current versions of the Julia-based guard bot policies (e.g. LBG1_LG3, LBG1_LG4, LBG1_LG5), the iLQGames solver is only run after the environment is initiated. This can cause very long delays before the first execution of the guard policy, especially if the computer running the scenario is also running many other processes (e.g. a computationally expensive bandit policy). In turn, the effective behavior of the guard varies widely between systems.
Instead, the guard policy should be pre-run prior to environment initialization. This may look like:
1. initialize and reset environment
2. wait until first execution of guard policy completes
3. reset environment again
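The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `StubGuardEnv`, `guard_ready`, and `prerun_reset` are hypothetical names invented for the example, not part of the existing environments, and the stub only simulates a solver with a timer. The sketch adds a bounded wait so a stalled solver raises instead of hanging forever; it does not model whether the solver's warm state survives the second reset.

```python
import time


class StubGuardEnv:
    """Hypothetical stand-in for a guard environment (not the real API)."""

    def __init__(self, solve_time=0.05):
        self.solve_time = solve_time  # simulated duration of the first solve
        self.reset_count = 0
        self._ready_at = None

    def reset(self):
        self.reset_count += 1
        if self._ready_at is None:
            # The first reset kicks off the simulated first solver execution.
            self._ready_at = time.monotonic() + self.solve_time
        return {"reset_count": self.reset_count}

    def guard_ready(self):
        # True once the simulated first guard-policy execution has finished.
        return self._ready_at is not None and time.monotonic() >= self._ready_at


def prerun_reset(env, timeout=5.0, poll=0.01):
    """Reset, wait (bounded) for the guard's first execution, then reset again."""
    env.reset()  # step 1: initialize and reset
    deadline = time.monotonic() + timeout
    while not env.guard_ready():  # step 2: wait for first execution
        if time.monotonic() > deadline:
            raise TimeoutError("guard policy did not finish its first execution")
        time.sleep(poll)
    return env.reset()  # step 3: reset again
```

Calling `prerun_reset(StubGuardEnv())` returns the observation from the second reset; the `timeout` argument is the only guard against a solver that never reports ready.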
This will likely require changes to `ksp_interface.py`, which calls the `env.reset()` function. If the changes are made at this fundamental level and affect all environments, then all environments would need a version increment (e.g. V1 -> V2).
Did some work on this, but I think it would need a deeper dive to implement correctly. The fundamental problem with a "quick fix" as described above is that iLQGames runs in a separate process (i.e. `burn_scheduler_loop`, defined in `LBG1_LG5_ParentEnv`). If that process is started and then stopped during the reset (which is what the quick fix does), I think the solved trajectory is dropped from memory, so it can't be used to speed up the first execution once the full reset completes and the episode starts. That defeats the purpose of the pre-run.
Here is a hack attempt at `LBG1_LG5_ParentEnv.reset` that doesn't really work:
```python
import time  # needed at module level for time.sleep below

def reset(self):
    """Enable iLQGames first execution during episode reset process"""
    # first perform a normal reset that should call KSPDGBaseEnv.reset
    self.logger.info("PRE-RUNNING ENVIRONMENT FOR GUARD INSTANTIATION")
    super().reset()
    # wait for iLQGames first execution to complete
    while self.burn_sched is None:
        time.sleep(1.0)
        self.logger.info("Waiting for Guard initialization to complete...")
    # close environment and do a fresh restart
    self.logger.info("CLOSING PRE-RUN ENVIRONMENT")
    self.close()
    self.logger.info("RESETTING ENVIRONMENT TO START EPISODE")
    return super().reset()
```
Beyond the fundamental problem described above, this reset function has several other issues with the current architecture. It gets called by `ksp_interface.ksp_interface_loop`, but it blocks that thread until the full reset completes. This prevents the `observation_handshake` from executing in a timely fashion, which causes `runner.policy_loop` in the main thread to time out. Thus the main thread has already unraveled before the reset process completes.
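The blocking behavior can be reproduced in isolation with a toy model. This is a minimal sketch, not the real code path: `interface_loop` and `run_policy` are hypothetical stand-ins for the interface thread and the main-thread policy loop, and a `threading.Event` stands in for the observation handshake. The point is only that a handshake served by the same thread that runs the reset cannot be answered until the reset returns.

```python
import threading
import time


def interface_loop(reset_fn, handshake, stop):
    """Hypothetical interface thread: run the reset, then answer the handshake."""
    reset_fn()        # blocks this thread until the reset returns
    handshake.set()   # the handshake can only be answered after the reset
    stop.wait()       # stay alive until the main thread is done


def run_policy(reset_duration, handshake_timeout):
    """Return True if the main thread gets the handshake before timing out."""
    handshake = threading.Event()
    stop = threading.Event()
    worker = threading.Thread(
        target=interface_loop,
        # time.sleep simulates a reset that takes reset_duration seconds
        args=(lambda: time.sleep(reset_duration), handshake, stop),
        daemon=True,
    )
    worker.start()
    received = handshake.wait(timeout=handshake_timeout)
    stop.set()
    worker.join()
    return received
```

With a short simulated reset, `run_policy(0.01, 0.5)` receives the handshake; with a reset longer than the timeout, such as `run_policy(0.3, 0.05)`, the main thread gives up first, which mirrors the failure described above.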