This is the fifth in a multi-part QOMPLX Intelligence series that examines how QOMPLX uses the real-time strategy (RTS) game StarCraft as a training ground for research to support advanced reinforcement learning and effective decision making using machine learning and artificial intelligence. Download a PDF copy of Part 5, StarCraft: Integrating the Grid System & Pathfinding Routines. We have also provided links to prior installments of this series below.
Like other researchers, QOMPLX's team is using Blizzard’s StarCraft real-time strategy (RTS) game for experimentation and concept validation in areas such as reinforcement learning, multi-agent reinforcement learning, and selected applied research that supports better decision making.
In recent posts we've been digging deep into our team's ongoing research and development in machine learning and artificial intelligence, leveraging various aspects of StarCraft. For example, in our last installment we discussed grid-based map extraction and pathfinding in partially observable environments like StarCraft's "fog of war."
Pathfinding from an Obstacle
As we have discussed in previous posts, QOMPLX has developed a grid system to learn StarCraft map features effectively and to facilitate pathfinding in environments with partial observability. An immediate challenge we faced was that the grid system could cause pathfinding routines to crash. Specifically, the fog of war can cause pathfinding routines to direct a unit into an area occupied by an enemy unit or structure. In that scenario, the grid system immediately marks the unit's current position as an "obstacle" to discourage future visits.
That sounds reasonable from the grid system's perspective. However, marking the node this way jams the pathfinding algorithms, because none of them can operate properly when the starting node is itself an obstacle.
To resolve this conundrum, we relied on a simple principle: a unit that encounters an enemy unit should get out of the danger zone as soon as possible. To achieve this, we implemented a spiral search around the obstacle node to find the closest "safe" node. Intuitively, this is much more efficient than sweeping through the entire map, which is what motivated us to adopt this approach.
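To make the idea concrete, here is a minimal Python sketch of a spiral search on a grid. It is an illustration under stated assumptions, not QOMPLX's actual implementation: the grid encoding (0 for safe, 1 for obstacle), the names `spiral_offsets` and `find_nearest_safe_node`, and the `max_radius` parameter are all hypothetical. The spiral visits nodes ring by ring, so "closest" here means closest in ring (Chebyshev) distance, and the sketch does not check whether the returned node is actually reachable.

```python
from typing import Iterator, List, Optional, Tuple

Node = Tuple[int, int]  # (row, col); a hypothetical grid coordinate type

SAFE, OBSTACLE = 0, 1   # assumed cell encoding, not taken from the article


def spiral_offsets(max_radius: int) -> Iterator[Node]:
    """Yield (d_row, d_col) offsets along an outward square spiral from (0, 0)."""
    row, col = 0, 0
    d_row, d_col = 0, 1          # first leg heads "right"
    step_len = 1
    yield (0, 0)
    while True:
        for _ in range(2):       # two consecutive legs share each length
            for _ in range(step_len):
                row += d_row
                col += d_col
                if max(abs(row), abs(col)) > max_radius:
                    return       # every ring up to max_radius has been visited
                yield (row, col)
            d_row, d_col = d_col, -d_row   # turn 90 degrees
        step_len += 1


def find_nearest_safe_node(
    grid: List[List[int]], start: Node, max_radius: Optional[int] = None
) -> Optional[Node]:
    """Return the first safe node met while spiraling out from start, or None."""
    rows, cols = len(grid), len(grid[0])
    if max_radius is None:
        max_radius = max(rows, cols)   # large enough to cover the whole grid
    for d_row, d_col in spiral_offsets(max_radius):
        r, c = start[0] + d_row, start[1] + d_col
        if 0 <= r < rows and 0 <= c < cols and grid[r][c] == SAFE:
            return (r, c)
    return None


if __name__ == "__main__":
    grid = [
        [0, 0, 0],
        [0, 1, 0],   # the unit stands on the node just marked as an obstacle
        [0, 0, 0],
    ]
    print(find_nearest_safe_node(grid, (1, 1)))  # (1, 2)
```

In this sketch the returned node could stand in for the blocked starting node, so that a standard pathfinding routine has a valid start again. Because the search stops at the first safe node, its cost grows with the size of the blocked area around the unit rather than with the size of the map.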
Testing our Approach
How did this work? To test our hypothesis, we ran 10 instances of this scenario, and used the average remaining total combined hit point percentage of the air units as a criterion for the comparison.
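As a rough illustration of the comparison criterion, the sketch below computes an average remaining combined hit point percentage across runs. It rests on one reading of the metric, which is an assumption: within each run, the remaining hit points of all units are summed and divided by their summed maximum hit points, and the per-run percentages are then averaged. The function names and the `(remaining_hp, max_hp)` pair format are hypothetical, and the numbers in the usage example are placeholders, not results from the experiment.

```python
from statistics import mean
from typing import Sequence, Tuple

UnitHP = Tuple[float, float]  # (remaining_hp, max_hp); hypothetical record format


def remaining_hp_percentage(units: Sequence[UnitHP]) -> float:
    """Combined remaining hit points of a group as a percentage of its maximum."""
    total_max = sum(max_hp for _, max_hp in units)
    if total_max <= 0:
        raise ValueError("units must have a positive combined maximum hit point total")
    # A destroyed unit contributes zero remaining hit points.
    total_remaining = sum(max(0.0, hp) for hp, _ in units)
    return 100.0 * total_remaining / total_max


def average_remaining_hp_percentage(runs: Sequence[Sequence[UnitHP]]) -> float:
    """Average the per-run combined percentage over all runs."""
    return mean(remaining_hp_percentage(units) for units in runs)


if __name__ == "__main__":
    # Placeholder values for two runs of two units each; not experimental data.
    runs = [
        [(50.0, 100.0), (100.0, 100.0)],
        [(0.0, 100.0), (100.0, 100.0)],
    ]
    print(average_remaining_hp_percentage(runs))  # 62.5
```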
To find out whether pathfinding affected a unit's hit points, use the download button below or the following link to read our latest installment: StarCraft: Integrating the Grid System & Pathfinding Routines.