Pre-run Julia-based Guard policies before environment reset #26

rallen10 · 2025-01-05T19:55:39Z

In current versions of the Julia-based guard bot policies (e.g. LBG1_LG3, LBG1_LG4, LBG1_LG5), the Julia-based iLQGames solver is only run after the environment is initialized. This can cause very long delays before the first execution of the guard policy, especially if the computer running the scenario is also running many other processes (e.g. a computationally expensive bandit policy). In turn, the effective behavior of the guard varies widely between systems.

Instead, the guard policy should be pre-run prior to environment initialization. This may look like:

initialize and reset environment
wait until first execution of guard policy completes
reset environment again
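
The three steps above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: StubGuardEnv, first_solve_done, and prerun_then_reset are hypothetical names that do not exist in the actual environments or in ksp_interface.py, and the slow first solver call is faked with a short sleep in a background thread.

```python
import threading
import time


class StubGuardEnv:
    """Hypothetical stand-in for a guard environment (not the real API)."""

    def __init__(self, solve_time=0.05):
        self.solve_time = solve_time
        self.first_solve_done = threading.Event()  # assumed readiness flag
        self.reset_count = 0

    def _solve(self):
        # Pretend this is the slow first execution of the solver.
        time.sleep(self.solve_time)
        self.first_solve_done.set()

    def reset(self):
        self.reset_count += 1
        self.first_solve_done.clear()
        threading.Thread(target=self._solve, daemon=True).start()
        return {"reset_count": self.reset_count}


def prerun_then_reset(env, timeout=5.0):
    """Reset, wait for the first guard solve, then reset again."""
    env.reset()                                  # 1. initialize and reset
    if not env.first_solve_done.wait(timeout):   # 2. wait for first execution
        raise TimeoutError("guard policy did not finish its first execution")
    return env.reset()                           # 3. reset again


if __name__ == "__main__":
    print(prerun_then_reset(StubGuardEnv()))
```

The wait is bounded by a timeout so that a solver that never finishes raises an error instead of hanging the caller indefinitely.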

This will likely require changes to ksp_interface.py, which calls the env.reset() function. If the changes are made at this fundamental level, where they would affect all environments, then all environments would need a version increment (e.g. V1 -> V2).


rallen10 · 2025-01-06T17:09:09Z

Did some work on this, but I think it would need a deeper dive to implement correctly. I think the fundamental problem with a "quick fix" as described above is that iLQGames runs in a separate process (i.e. burn_scheduler_loop, defined in LBG1_LG5_ParentEnv). If that process is started and then stopped during the reset process, which is the case in the quick fix, then I think the solved trajectory is dropped from memory. It therefore can't be used to speed up first execution once the full reset function is complete and the episode has started, which defeats the purpose of the pre-run.

Here is a hack attempt at LBG1_LG5_ParentEnv.reset that doesn't really work:

import time  # needed for time.sleep in the wait loop below

def reset(self):
    """Enable iLQGames first execution during episode reset process"""

    # first perform a normal reset that should call KSPDGBaseEnv.reset
    self.logger.info("PRE-RUNNING ENVIRONMENT FOR GUARD INSTANTIATION")
    super().reset()

    # wait for iLQGames first execution to complete
    # (note: this loop has no timeout and blocks the calling thread)
    while self.burn_sched is None:
        time.sleep(1.0)
        self.logger.info("Waiting for Guard initialization to complete...")

    # close environment and do a fresh restart
    self.logger.info("CLOSING PRE-RUN ENVIRONMENT")
    self.close()
    self.logger.info("RESETTING ENVIRONMENT TO START EPISODE")
    return super().reset()

Beyond the fundamental problem described above, this reset function has several other issues with the current architecture. It is called by ksp_interface.ksp_interface_loop, but it blocks that thread until the full reset completes. This prevents the observation_handshake from executing in a timely fashion, causing runner.policy_loop in the main thread to time out. Thus the main thread has unraveled even before the reset process completes.
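
One possible way around the blocked thread is to run the slow reset in a worker thread so that the interface loop can keep answering handshake requests. The sketch below shows that general pattern only; it is not the project's implementation. responsive_interface_loop, the request and reply queues, and the "resetting"/"ready" statuses are all hypothetical, and the slow reset is faked with a sleep.

```python
import queue
import threading
import time


def responsive_interface_loop(slow_reset, requests, replies, stop_event):
    """Serve handshake requests while slow_reset runs in a worker thread."""
    reset_done = threading.Event()

    def run_reset():
        slow_reset()      # stand-in for the long pre-run plus reset
        reset_done.set()

    threading.Thread(target=run_reset, daemon=True).start()

    while not stop_event.is_set():
        try:
            request = requests.get(timeout=0.01)
        except queue.Empty:
            continue
        # Reply immediately, even if the reset has not finished yet.
        status = "ready" if reset_done.is_set() else "resetting"
        replies.put((request, status))


def demo():
    requests, replies = queue.Queue(), queue.Queue()
    stop_event = threading.Event()
    loop = threading.Thread(
        target=responsive_interface_loop,
        args=(lambda: time.sleep(0.5), requests, replies, stop_event),
        daemon=True,
    )
    loop.start()

    requests.put("obs-1")            # arrives while the reset is running
    first = replies.get(timeout=2.0)

    time.sleep(1.0)                  # give the fake reset time to finish
    requests.put("obs-2")
    second = replies.get(timeout=2.0)

    stop_event.set()
    loop.join(timeout=2.0)
    return first, second


if __name__ == "__main__":
    print(demo())
```

With this structure the caller gets an explicit "resetting" reply instead of silence, so the policy loop could choose to keep waiting rather than time out.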
