DVSA-Simulator Technical Tips
-
State-Dependent Action Set Extraction
The system extracts the set of valid actions dynamically based on the current state. This is essential for DVSA-RL research: in the DVSA environment, the overall action space is massive, but only a small portion of it is valid at any given moment. Our aim is to ensure agents do not waste computation or learning capacity on impossible actions.
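As a minimal illustration, a state-dependent action filter could look like the Rust sketch below; the `State` and `Action` types here are hypothetical, not the simulator's actual API.

```rust
// Hypothetical sketch: these types are illustrative, not the real API.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Action {
    Move,
    Attack,
    Defend,
    Rest,
}

struct State {
    enemy_in_range: bool,
    stamina: u32,
}

/// Return only the actions that are legal in the current state,
/// so the agent never samples an impossible one.
fn valid_actions(state: &State) -> Vec<Action> {
    let mut actions = vec![Action::Move, Action::Rest];
    if state.enemy_in_range {
        actions.push(Action::Attack);
        actions.push(Action::Defend);
    }
    // An exhausted agent can only rest.
    if state.stamina == 0 {
        actions.retain(|a| *a == Action::Rest);
    }
    actions
}
```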
-
Designed For Reinforcement Learning
The simulator is specifically tailored for reinforcement learning use cases, exposing a Gym-style interface. From its modular architecture to its efficient state management, every component is designed to support fast simulations.
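In Rust, a Gym-style interface typically reduces to a `reset`/`step` trait. The following is a sketch under assumed names, not the simulator's published API.

```rust
// Minimal sketch of a Gym-style environment trait; names are illustrative.
trait Env {
    type Obs;
    type Action;

    /// Reset the environment and return the initial observation.
    fn reset(&mut self) -> Self::Obs;

    /// Advance one step and return (observation, reward, done).
    fn step(&mut self, action: Self::Action) -> (Self::Obs, f32, bool);
}
```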
-
Super Memory Efficiency
Written entirely in Rust, the system takes full advantage of zero-cost abstractions and strict ownership rules to minimize memory usage. References are used in place of cloning wherever possible, which yields exceptional memory efficiency and is especially critical for large-scale or parallel environments.
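For example, an evaluation function can borrow a large observation buffer instead of taking ownership, so nothing is cloned on the hot path. The types below are illustrative only.

```rust
// Illustrative only: pass observations by reference so large state
// buffers are never cloned on the hot path.
struct Observation {
    grid: Vec<f32>, // potentially large
}

/// Borrow the observation instead of taking ownership; the caller's
/// buffer is reused across steps with zero copies.
fn evaluate(obs: &Observation) -> f32 {
    obs.grid.iter().sum()
}

fn main() {
    let obs = Observation { grid: vec![0.5; 1_000_000] };
    // No clone: `obs` is only borrowed and can be reused next step.
    let value = evaluate(&obs);
    println!("value = {value}");
}
```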
-
Asynchronous Processing
The architecture supports asynchronous execution of core tasks such as agent decision-making, rendering, and logging. This keeps the simulation responsive even when a large model is in the loop. Asynchronous processing is implemented with event queues.
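The event-queue pattern can be sketched with standard-library channels, as below; the event variants are illustrative, not the simulator's real event set.

```rust
use std::sync::mpsc;
use std::thread;

// Minimal sketch of an event queue; the event names are illustrative.
enum Event {
    Decision(u32), // an agent's chosen action id
    Log(String),
}

fn main() {
    let (tx, rx) = mpsc::channel::<Event>();

    // The simulation thread pushes events without waiting on consumers.
    let producer = thread::spawn(move || {
        tx.send(Event::Decision(3)).unwrap();
        tx.send(Event::Log("step 1 complete".into())).unwrap();
    });

    // Logging/rendering drains the queue on its own schedule;
    // the loop ends once the sender is dropped.
    for event in rx {
        match event {
            Event::Decision(id) => println!("decided action {id}"),
            Event::Log(msg) => println!("log: {msg}"),
        }
    }
    producer.join().unwrap();
}
```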
-
Well-Modularized System
The codebase is cleanly split into well-defined modules such as `agent_system`, `game_system`, and `combat_system`. Because the code is divided into small modules, developers can easily change or extend the game system.
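The split might look roughly like the following; the function names are placeholders rather than the real internals.

```rust
// Illustrative module layout; the crate's actual internals may differ.
mod agent_system {
    pub fn act() { /* agent decision logic */ }
}

mod game_system {
    pub fn tick() { /* world update logic */ }
}

mod combat_system {
    pub fn resolve() { /* combat resolution logic */ }
}

fn main() {
    // Each system can be swapped or extended independently.
    agent_system::act();
    game_system::tick();
    combat_system::resolve();
}
```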
-
Well-Isolated Agent-Environment Architecture
Agents and environments are strictly separated. The environment provides only observation and action interfaces, while agents operate independently, treating the environment as a black box. This enforces clear boundaries and encourages clean, reusable agent design.
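Concretely, an agent can be modeled as a trait that consumes observations and emits actions, with no access to environment internals. This is a sketch under assumed names.

```rust
// Hedged sketch: the agent sees only observations and produces actions,
// treating the environment as a black box.
trait Agent<Obs, Action> {
    fn act(&mut self, obs: &Obs) -> Action;
}

struct RandomAgent;

impl Agent<Vec<f32>, u32> for RandomAgent {
    fn act(&mut self, obs: &Vec<f32>) -> u32 {
        // A trivial placeholder policy: derive an action id
        // from the observation length.
        (obs.len() % 4) as u32
    }
}
```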
-
Minimal Dependencies for Maximum Portability
Dependencies are kept to a bare minimum to maximize portability. This allows the simulator to run in constrained environments, compile quickly, and avoid compatibility issues.
-
Rendering for Testing
A built-in test renderer allows visualization of simulation states. It is modular and optional, meaning it can be toggled for debugging or analysis without impacting core performance. This is helpful for inspecting agent behavior during development or showcasing learning progress.
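One way to make rendering toggleable is to hold the renderer as an `Option`, so headless training pays no rendering cost. This is a sketch of the idea, not the actual implementation.

```rust
// Illustrative: the renderer is optional, so disabling it removes
// all rendering work from the simulation loop.
struct TextRenderer;

impl TextRenderer {
    fn draw(&self, step: u64) {
        println!("--- step {step} ---");
    }
}

struct Simulator {
    renderer: Option<TextRenderer>, // None in headless training runs
    step: u64,
}

impl Simulator {
    fn tick(&mut self) {
        self.step += 1;
        // Rendering only happens when explicitly enabled.
        if let Some(r) = &self.renderer {
            r.draw(self.step);
        }
    }
}
```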
-
Snapshot (Ongoing)
The simulator supports full state snapshotting — serialization and restoration of environment and agent states. This feature enables rollback, branching, debugging, and advanced RL techniques like curriculum learning and replay buffer augmentation.
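A snapshot facility like this is commonly built on `serde`; the sketch below assumes `serde` (with the `derive` feature) and `serde_json` as dependencies, with hypothetical field names.

```rust
// Sketch of the snapshot idea, assuming serde + serde_json;
// the actual serialization format and fields may differ.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct Snapshot {
    step: u64,
    agent_positions: Vec<(i32, i32)>,
    rng_seed: u64, // needed to make restoration deterministic
}

/// Serialize the full simulator state for later rollback or branching.
fn save(snap: &Snapshot) -> serde_json::Result<String> {
    serde_json::to_string(snap)
}

/// Restore a previously saved state.
fn restore(data: &str) -> serde_json::Result<Snapshot> {
    serde_json::from_str(data)
}
```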