• Corporate
  • Nov 12, 2020
  • By QOMPLX

QOMPLX Intelligence: Opponent Strategy Identification Using StarCraft

This is the second in a multi-part QOMPLX Intelligence series that examines how QOMPLX uses StarCraft as a training ground for research to support advanced reinforcement learning and effective decision making using machine learning and artificial intelligence. You can download PDF copies of Part 2: Opponent Strategy Identification Using StarCraft, or of our first installment, Build Order Selection in StarCraft.


As we've noted in a prior blog post, QOMPLX's business value hinges on our ability to continually develop and improve our analytic and computational tools. Ongoing research and development is an important part of the work we do.

To that end, our engineers have found Blizzard’s StarCraft Real Time Strategy (RTS) game to be a rich canvas for experimentation and concept validation. Like other cutting-edge companies and research organizations working on machine learning and artificial intelligence, QOMPLX engineers use StarCraft for reinforcement learning, multi-agent reinforcement learning techniques, and selected applied research in support of better decision-making.

Opponent Strategy Identification: A Common Objective

In our latest installment in this series, we are taking a look at how we're using StarCraft to help hone our ability to identify and learn an opponent's strategy.

Identifying and learning an opponent's strategies (and the way such strategies are chosen) is an important problem in Reinforcement Learning (RL). While identifying strategies in games like Chess or Go is relatively straightforward, it is far more difficult in complex RTS games like StarCraft, in which the play space is much larger and players must contend with so-called "partial observability": large parts of the game board are obscured to them, at least initially.

Peering into the Fog

In our latest research, we have categorized different types of opponents based on how they choose their strategy (or "build order," in StarCraft terminology) by means of a scoring mechanism. We use this to inform our own decision making and to determine a counter-strategy that will anticipate the actions of the opponent.

The four types of opponents are:

  • 1BO - those who use a single build order (BO) all the time
  • RP - those who randomly pick a strategy from a pool of build orders in each game
  • UCB1 - those who select a strategy with UCB1, a popular multi-armed bandit algorithm
  • CP - those who always use the build order that succeeded in their last game, or the one that maximizes the chance of success against the build order their opponent last used
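
The four opponent types above can be read as four build-order selection policies. The following is a minimal Python sketch under stated assumptions, not QOMPLX's implementation: the build-order names, the `choose`/`update` interface, and the `win_prob` table are all hypothetical, and the `CP` class encodes one possible reading of the description (repeat a winning build order, otherwise counter the rival's last one).

```python
import math
import random

# Hypothetical pool of build orders (BOs); the names are placeholders.
BUILD_ORDERS = ["rush", "macro", "tech"]


class OneBO:
    """1BO: always plays the same build order."""

    def __init__(self, bo):
        self.bo = bo

    def choose(self):
        return self.bo

    def update(self, bo, won, rival_bo):
        pass  # never adapts


class RP:
    """RP: picks uniformly at random from the pool in each game."""

    def __init__(self, pool, rng):
        self.pool, self.rng = pool, rng

    def choose(self):
        return self.rng.choice(self.pool)

    def update(self, bo, won, rival_bo):
        pass  # ignores outcomes


class UCB1:
    """UCB1: tries each BO once, then maximizes win rate plus an exploration bonus."""

    def __init__(self, pool):
        self.pool = pool
        self.plays = {bo: 0 for bo in pool}
        self.wins = {bo: 0 for bo in pool}

    def choose(self):
        for bo in self.pool:
            if self.plays[bo] == 0:
                return bo  # play every arm once first
        total = sum(self.plays.values())
        return max(
            self.pool,
            key=lambda bo: self.wins[bo] / self.plays[bo]
            + math.sqrt(2 * math.log(total) / self.plays[bo]),
        )

    def update(self, bo, won, rival_bo):
        self.plays[bo] += 1
        self.wins[bo] += int(won)


class CP:
    """CP (one reading): repeat a winning BO; after a loss, counter the rival's last BO."""

    def __init__(self, pool, win_prob, rng):
        # win_prob[(own_bo, rival_bo)] is an assumed win-probability table.
        self.pool, self.win_prob, self.rng = pool, win_prob, rng
        self.next_bo = None

    def choose(self):
        if self.next_bo is None:
            return self.rng.choice(self.pool)  # no history yet
        return self.next_bo

    def update(self, bo, won, rival_bo):
        if won:
            self.next_bo = bo
        else:
            self.next_bo = max(self.pool, key=lambda b: self.win_prob[(b, rival_bo)])
```

Each policy exposes the same two methods, so a simulated game loop can call `choose()` before a game and `update()` after it without knowing which opponent type it is facing.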

To examine the effectiveness of the scoring mechanisms for each type of opponent, we simulated an opponent that chose BOs in the way dictated by its opponent type. Our test comprised four experiments of 1,000 StarCraft games each. The scoring mechanisms for all four types of opponent were then applied. Our downloadable report discusses each experiment and its findings.
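
To give a feel for the shape of such an experiment, here is a small Python sketch that simulates two of the four opponent types over 1,000 games and applies two scores to each history. The scoring functions are illustrative stand-ins chosen for this sketch, not the mechanisms from the report, and the build-order names, function names, and seed are assumptions.

```python
import random
from collections import Counter


def score_one_bo(history):
    """Share of games (after the first) in which the opponent played
    its most frequent build order so far. Near 1.0 suggests a 1BO type."""
    hits = 0
    counts = Counter()
    for i, bo in enumerate(history):
        if i > 0 and bo == counts.most_common(1)[0][0]:
            hits += 1
        counts[bo] += 1
    return hits / max(len(history) - 1, 1)


def score_rp(history, pool):
    """1 minus the total variation distance between the observed build-order
    frequencies and a uniform distribution. Near 1.0 suggests an RP type."""
    n = len(history)
    counts = Counter(history)
    tv = 0.5 * sum(abs(counts[bo] / n - 1 / len(pool)) for bo in pool)
    return 1 - tv


# Hypothetical build-order pool and a fixed seed for repeatability.
pool = ["rush", "macro", "tech"]
rng = random.Random(42)
games = 1000

# Simulated opponents: one that never changes, one that picks at random.
one_bo_history = ["rush"] * games
rp_history = [rng.choice(pool) for _ in range(games)]

for name, history in [("1BO", one_bo_history), ("RP", rp_history)]:
    print(
        name,
        "one_bo score:", round(score_one_bo(history), 3),
        "rp score:", round(score_rp(history, pool), 3),
    )
```

In this toy setup each score should be highest on the history produced by its matching opponent type, which is the property one would want from a scoring mechanism used to identify the opponent.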

Applications in Cyber Security

Our research into build orders and dynamically assessing the likely strategy of an opponent is highly relevant for adversarial domains like information security, which involve sentient and learning actors. IT security teams commonly square off against adversaries about whom they have imperfect knowledge: they may not know an adversary's identity, intentions, or objectives, or even its location within a compromised environment.

Our research using the StarCraft RTS is an important part of QOMPLX’s broader research into optimal decision-making under uncertainty and how different techniques can improve risk management outcomes.

Additional Reading

A Reasoning-based Defogger for Opponent Army Composition Inference under Partial Observability (Hao Pan, 16th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment)

Presentation: A Reasoning-based Defogger for Opponent Army Composition Inference under Partial Observability

Build Order Selection in StarCraft Utilizing a Customized Bayesian Multi-Armed Bandit Algorithm (Hao Pan, 34th AAAI Conference on Artificial Intelligence)

Poster: Build Order Selection in StarCraft Utilizing a Customized Bayesian Multi-Armed Bandit Algorithm

Part 1: Build Order Selection in StarCraft
