Learning from StarCraft: Reinforcement Learning in Risk

Download "Build Order Selection In StarCraft," the first in a series exploring how QOMPLX uses StarCraft as a training ground for research to support advanced reinforcement learning and effective decision making using machine learning and artificial intelligence.

QOMPLX uses machine learning and advanced simulations to help companies solve the toughest challenges in cybersecurity, insurance underwriting and finance. Our business value hinges on our ability to continually develop and improve our analytic and computational tools. That's why research and development has been and continues to be an important part of the work we do.

Lately, our staff has been engaged in advanced reinforcement learning (RL) research, developing technology and techniques that make decision-making under uncertainty both more effective and more efficient, computationally and operationally. Our efforts are informed by, and focused on, real-world use cases in risk management, in areas like information ("cyber") security and insurance. These are serious pursuits where the stakes are, quite literally, life and limb.

Interestingly, this very serious work has benefited substantially from our interaction with leading researchers who work on RL-based approaches in gaming, specifically in StarCraft.

StarCraft and Reinforcement Learning

StarCraft is a popular online real-time strategy (RTS) game that has been around for more than two decades. The game has a military science fiction theme that posits a race for survival between three warring factions representing different species:

Protoss (P) - a technologically advanced humanoid species with psionic abilities;
Terran (T) - humans exiled from Earth who excel at adapting to any situation;
Zerg (Z) - a race of insectoid aliens obsessed with assimilating other races in pursuit of genetic perfection.

Researchers in artificial intelligence and machine learning have taken to StarCraft in recent years as they move beyond traditional strategy games like chess and Go to tackle even more complex puzzles. StarCraft offers an enormously complex and rich canvas for experimentation and concept validation. Notably, unlike chess or Go, where both players see the entire board, StarCraft players must contend with so-called "partial observability": most of the map, and whatever the opponent is doing on it, is hidden from them, at least initially, by what's termed "the fog of war."
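To make partial observability concrete, here is a minimal Python sketch of a fog-of-war filter on a toy grid map. It is an illustration only, not how StarCraft itself computes vision: the square sight range and the names `visible_cells` and `observe` are assumptions made up for this example.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]  # (x, y) position on a toy grid map (hypothetical)


def visible_cells(width: int, height: int,
                  own_units: List[Cell], sight_range: int) -> Set[Cell]:
    """Return every map cell within sight_range of at least one own unit.

    Assumption: vision is a square around each unit, clipped to the map.
    """
    visible: Set[Cell] = set()
    for ux, uy in own_units:
        for x in range(max(0, ux - sight_range), min(width, ux + sight_range + 1)):
            for y in range(max(0, uy - sight_range), min(height, uy + sight_range + 1)):
                visible.add((x, y))
    return visible


def observe(width: int, height: int, own_units: List[Cell],
            enemy_units: List[Cell], sight_range: int) -> List[Cell]:
    """Return only the enemy units the player can currently see."""
    visible = visible_cells(width, height, own_units, sight_range)
    return [enemy for enemy in enemy_units if enemy in visible]


# One scout at (1, 1) with sight range 2 on an 8x8 map.
# The enemy at (2, 3) is inside the scout's vision; the one at (6, 6) is in the fog.
seen = observe(8, 8, own_units=[(1, 1)], enemy_units=[(2, 3), (6, 6)], sight_range=2)
print(seen)
```

An agent has to choose its actions from `seen` rather than from the full enemy list, which is what separates this setting from a fully visible board like chess or Go.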

In recent years, researchers at Google's DeepMind division have built StarCraft agents such as AlphaStar that are capable of playing at the Grandmaster level with all three species and of defeating some of the game's best human players.

To explain how StarCraft assists QOMPLX's researchers, we are publishing a series of short papers this month that describe our work using StarCraft to improve reinforcement learning and multi-agent reinforcement learning techniques. The papers describe selected experiences from our applied research using StarCraft to support improved decision-making.

In the first report, "Build Order Selection in StarCraft," we introduce some novel work on optimizing build orders to improve competitiveness.