Google's DeepMind and Blizzard are helping developers train their AI agents to become the best StarCraft players. Reuters/Kai Pfaffenbach

Google's DeepMind and Blizzard Entertainment unveiled a new set of "powerful" tools on Wednesday (9 August) to help train artificial intelligence systems using the real-time strategy game StarCraft 2. The two firms announced the partnership in November last year, saying the sci-fi game is an interesting training environment for AI that provides "a useful bridge to the messiness of the real-world."

"DeepMind's scientific mission is to push the boundaries of AI by developing systems that can learn to solve complex problems," the firm said. "Testing our agents in games that are not specifically designed for AI research, and where humans play well, is crucial to benchmark agent performance."

The tool set includes a machine learning API developed by Blizzard that will allow AI agents to play the game like a human would and provide researchers with feedback.

"We've done a lot of work to allow this API to run at scale in cloud infrastructure," Blizzard said. "We are releasing the result of this work in the form of a fully functioning Linux package designed to run in the cloud for research purposes. This is a standalone Linux build optimized to only work with the API."

The kit includes a data set of about 65,000 anonymised 1v1 game replays from professional matches, which will soon be bumped up to more than half a million, as well as an open-source version of DeepMind's toolset, PySC2. It also features a series of simple mini-games that break the game down into "manageable chunks" that developers can use to test their AI's performance on specific tasks such as collecting minerals or selecting units.

A joint white paper that outlines the research environment is also included.

Blizzard said its image-based API "exposes a sandbox for the community to experiment with, using both learning based AI and scripted AI to build new tools that can benefit the StarCraft II and AI communities."

DeepMind researchers said StarCraft's complex gameplay makes it an ideal environment for AI research. Besides the primary goal of beating one's opponent, players must also juggle multiple subtasks such as building units, gathering resources and moving around the map.

"In addition, a game can take from a few minutes to one hour to complete, meaning actions taken early in the game may not pay-off for a long time," researchers explain. "Finally, the map is only partially observed, meaning agents must use a combination of memory and planning to succeed.

"The game also has other qualities that appeal to researchers, such as the large pool of avid players that compete online every day. This ensures that there is a large quantity of replay data to learn from - as well as a large quantity of extremely talented opponents for AI agents."

Compared with the roughly 10 basic actions available in old Atari games, such as up, down and left, StarCraft's array of more than 300 hierarchical, custom actions and functions also poses an interesting challenge for AI.
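To make that contrast concrete, here is a minimal Python sketch of a flat action space next to a hierarchical one. It does not use PySC2 or Blizzard's API: the function names, argument types and the 84x84 screen grid are illustrative assumptions, not the game's real action list.

```python
import random
from dataclasses import dataclass

# Flat, Atari-style action space: the agent picks one of a few fixed buttons.
ATARI_ACTIONS = ["noop", "up", "down", "left", "right", "fire"]

# Hierarchical action space: the agent first picks a function, then fills in
# every argument that function requires. All names and sizes are hypothetical.
ARG_SIZES = {
    "screen": 84 * 84,  # a point on an assumed 84x84 screen grid
    "queued": 2,        # act now, or append to the unit's order queue
}


@dataclass(frozen=True)
class Function:
    name: str
    arg_types: tuple


FUNCTIONS = [
    Function("no_op", ()),
    Function("select_point", ("screen",)),
    Function("move_screen", ("queued", "screen")),
]


def count_choices(fn):
    """Number of distinct, fully specified actions for one function."""
    total = 1
    for arg in fn.arg_types:
        total *= ARG_SIZES[arg]
    return total


def sample_action(rng):
    """Pick a function, then sample one value for each of its arguments."""
    fn = rng.choice(FUNCTIONS)
    args = tuple(rng.randrange(ARG_SIZES[a]) for a in fn.arg_types)
    return fn.name, args


flat_total = len(ATARI_ACTIONS)
hier_total = sum(count_choices(fn) for fn in FUNCTIONS)
print("flat choices:", flat_total)
print("hierarchical choices:", hier_total)
```

Even with only three made-up functions, the number of fully specified actions runs into the tens of thousands, because each function multiplies out over its arguments. With hundreds of functions, an agent cannot simply try every option the way it might with a handful of joystick moves.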

While AI agents seem to be performing well in the mini-games for now, researchers said strong baseline agents such as A3C have not been able to beat even the easiest built-in AI in the full game.

"One technique that we know allows our agents to learn stronger policies is imitation learning," DeepMind said, noting that Blizzard's ongoing release of thousands of replays will help with this form of AI training.

"These will not only allow researchers to train supervised agents to play the game, but also opens up other interesting areas of research such as sequence prediction and long-term memory," they said. "Our hope is that the release of these new tools will build on the work that the AI community has already done in StarCraft, encouraging more DeepRL research and making it easier for researchers to focus on the frontiers of our field."
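For readers unfamiliar with imitation learning, the idea in its simplest form, often called behavioural cloning, is to copy what human players did in similar situations. The Python sketch below is an illustration only, not DeepMind's method: the observation and action labels are made-up placeholders, and real replays contain far richer data.

```python
from collections import Counter, defaultdict

# Hypothetical "replay" data: (observation, action) pairs logged from human
# games. The string labels are placeholders invented for this example.
REPLAY_STEPS = [
    ("minerals_low", "gather_minerals"),
    ("minerals_low", "gather_minerals"),
    ("minerals_low", "build_worker"),
    ("enemy_visible", "attack"),
    ("enemy_visible", "retreat"),
    ("enemy_visible", "attack"),
]


def fit_policy(steps):
    """Count how often each action follows each observation."""
    counts = defaultdict(Counter)
    for obs, action in steps:
        counts[obs][action] += 1
    return counts


def act(policy, obs, default="no_op"):
    """Imitate the demonstrators: return the most frequent action for obs."""
    if obs not in policy:
        return default  # never seen in the replays, so fall back
    return policy[obs].most_common(1)[0][0]


policy = fit_policy(REPLAY_STEPS)
print(act(policy, "minerals_low"))
print(act(policy, "enemy_visible"))
```

A lookup table like this cannot generalise to situations it has never seen, which is why researchers would train a neural network on the replay data instead. The principle is the same, though: the more human games available, the better the starting policy.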