Cartpole as a gym environment#
Gym is an interface for creating reinforcement learning environments.
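For readers new to the API, here is a minimal sketch of the gym interaction loop, using gym's built-in CartPole-v1 rather than the Unreal environment we build in this tutorial:
import gym

# Classic gym loop: reset, then step until the episode ends
env = gym.make("CartPole-v1")
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()          # sample a random action
    obs, reward, done, info = env.step(action)  # (observation, reward, done, info)
env.close()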
Python#
Note
Using an NVIDIA GPU (Pascal+, i.e. 10 series or newer) is recommended for this tutorial.
As of 10/2021, the only supported consumer AMD GPUs are the Radeon VII and Vega 64; RDNA 1-2 are not yet supported.
See the ROCm website for additional details.
Warning
unreal.mladapter
requires msgpack-rpc-python,
which is badly out of date.
The required tornado version is particularly old (tornado >= 3, < 5).
This might break your environment, so you should build a virtualenv just for UE.
pip install -e UnrealEngine/Engine/Plugins/AI/MLAdapter/Source/python/
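A dedicated virtual environment could be set up along these lines (a sketch; the environment name is a placeholder):
# Create and activate a virtualenv just for UE (Windows shown)
python -m venv ue-mladapter
ue-mladapter\Scripts\activate
pip install -e UnrealEngine/Engine/Plugins/AI/MLAdapter/Source/python/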
Enable the MLAdapter plugin (C++/UE4 plugin) in your project.
Create a new environment#
MLAdapter
already implements the basic building blocks for us.
To create a new environment we simply need to inherit from unreal.mladapter.UnrealEnv.
We need to specify which map (i.e. level) to load for our training.
Additionally, we need to set our agent configuration, which mainly consists of the observation and action space.
from unreal.mladapter import UnrealEnv


class CartPole(UnrealEnv):
    MAP = '/Game/CartPole/CartPole.umap'
    PROJECT_NAME = None

    def __init__(self, path, ue4params=None, **kwargs):
        CartPole.PROJECT_NAME = path
        if ue4params is not None:
            ue4params.set_default_map_name(CartPole.MAP)
        super().__init__(ue4params=ue4params, **kwargs)
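As a quick sanity check, the class can then be instantiated like any other gym environment (a sketch; the project path is a placeholder):
from unreal.mladapter.runner import UE4Params

# Passing UE4Params lets the environment launch the game itself
env = CartPole("E:/cartpole/Cartpole.uproject", ue4params=UE4Params())
obs = env.reset()
env.close()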
Define the action space#
The action space is defined by adding a U4MLActuator to the agent configuration.
Available actuators:
InputKey: uses the input mappings defined by the project
Camera: used to control the camera (AddPitchInput & AddYawInput)
class CartPole(UnrealEnv):
    ...

    @staticmethod
    def default_agent_config():
        # Create a new agent config
        agent_config = AgentConfig()
        # Set the pawn class that is being controlled
        agent_config.avatarClassName = "Cart_Pawn_C"
        # Define the action space by adding actuators
        # In our case the entire action space is defined with InputKey
        agent_config.add_actuator("InputKey")
        return agent_config
For our example this results in an action space of Discrete(2), since there is only one axis input, which varies between -1 and 1.
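To make this concrete, here is a hypothetical illustration of how the two discrete actions could map back to the axis value (the 0/1 to -1/+1 correspondence is an assumption based on the axis range above):
action = env.action_space.sample()  # Discrete(2): either 0 or 1
axis_value = 2 * action - 1         # assumed mapping: 0 -> -1, 1 -> +1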
Define the observation space#
The observation space is defined by adding a U4MLSensor to the agent configuration.
Available sensors:
AIPerception: hooks into the AI Perception system of UE4 (note that this is a Game-AI (Behavior Trees) system, not an ML-AI system). See AI Perception, which includes Hearing, Sight, Team (proximity of allies), and Touch. Currently only Sight is supported; it is created by the MLAdapter system and added to the player controller.
Attribute: listens to attribute changes if you are using UAttributeSet for your character.
Camera: makes a camera capture of the scene.
Input: captures the inputs.
Movement: captures the character's movement & acceleration. In our configuration the space is:
Box([-1. -1. -1. -1. -1. -1.], [1. 1. 1. 1. 1. 1.], (6,), float32)
3 floats for the position and another 3 for the velocity, matching the keys requested below.
class CartPole(UnrealEnv):
    ...

    @staticmethod
    def default_agent_config():
        # Create a new agent config
        agent_config = AgentConfig()
        # Set the pawn class that is being controlled
        agent_config.avatarClassName = "Cart_Pawn_C"
        # Actuators
        ...
        # Define the observation space by adding sensors
        # Add our pawn movement (i.e. cart movement)
        agent_config.add_sensor(
            "Movement",
            {
                "location": "absolute",
                "velocity": "absolute",
            }
        )
        # Add sight so we can see the pole
        agent_config.add_sensor(
            "AIPerception",
            {
                "count": "1",             # number of actors it can see
                "sort": "distance",       # how seen actors are sorted: `distance` or `in_front`
                "peripheral_angle": 360,  # sight cone, in degrees
                "mode": "vector",         # `vector` (heading vector) or `rotator`
                # max_age
            }
        )
        return agent_config
# Observation space
Tuple(
    # AIPerception
    Box([-1. -1. -1. -1. -1.], [1. 1. 1. 1. 1.], (5,), float32),
    # Movement
    Box([-1. -1. -1. -1. -1. -1.], [1. 1. 1. 1. 1. 1.], (6,), float32)
)

# Observation
(
    array([ 9.8459434e-41, 3.9260104e+02, 9.6790361e-01, -2.3592940e-01, -8.6601958e-02], dtype=float32),
    array([ 240. , 90.84363 , 242.00069 , 0. , -77.921715, 0. ], dtype=float32)
)
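For reference, a single observation from this Tuple space can be unpacked as follows (a sketch; the 3+3 split follows the Movement sensor configuration above, and the field meanings are assumptions):
perception, movement = obs                       # order follows the Tuple space above
location, velocity = movement[:3], movement[3:]  # assumed: absolute location and velocity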
Warning
The sight sensor has an affiliation property that can filter between friendlies, hostiles, and neutrals.
If the AIPerception observation is never populated, this is the most likely reason.
The affiliation is resolved with ETeamAttitude based on the FGenericTeamId.
The team id is returned by FGenericTeamId FGenericTeamId::GetTeamIdentifier(const AActor* TeamMember).
The AActor must implement the IGenericTeamAgentInterface interface (if it does not, FGenericTeamId::NoTeam is used).
The ACPActor we defined earlier in C++ handles this, so you should not have any issues if you used it for your pole.
Run the environment#
To test our environment we can write a simple script that will launch the environment and execute some random actions.
from cartpole.env import CartPole
from unreal.mladapter.utils import random_action, ArgumentParser
from unreal.mladapter.runner import UE4Params

parser = ArgumentParser()
parser.add_argument("--project", type=str, required=True, help="Path to the uproject")
parser.add_argument("--iter", type=int, default=3, help="Number of games to play")
args = parser.parse_args()

# If True, python will launch the game;
# if False, it will attach to a game that is already running
launch = True

env = CartPole(
    args.project,
    UE4Params() if launch else None,
    server_port=15151,
)

print('Starting Environment')
for i in range(args.iter):
    obs = env.reset()
    reward = 0
    done = False
    print('Environment initialized')

    while not env.game_over:
        a = random_action(env)
        # print(obs, a, reward, done)
        print(reward)
        obs, reward, done, _ = env.step(a)

    print("{}: Score: {}".format(i, reward))

env.close()
You can now run the environment from Python. You need to specify where the UnrealEngine editor is located and the path to your cartpole uproject.
# Add UE-DevBinaries=E:/Engine/Binaries/Win64 to your environment variables
python Source/python/cartpole/run.py --project E:/cartpole/Cartpole.uproject

# If UE-DevBinaries is not set you can point to the editor executable manually as below
python Source/python/cartpole/run.py --project E:/cartpole/Cartpole.uproject --exec E:/UnrealEngine/Engine/Binaries/Win64/UE4Editor.exe
Note
To test the code above, simply install the cartpole Python package by running pip install -e .
at the root of the cartpole repository.
Note
When running a Python script from the command line, do not forget to use the Anaconda command-line shortcut that was added when you installed Anaconda.
Note
If the Python script launches the game but actions are not executed, try running the environment from the editor first. MLAdapter has a known issue where it can fail to connect to the RPC server when launching UE4.
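If you started the game from the editor yourself, the run script above can attach to it instead of launching a new instance by flipping the launch flag (a sketch based on the script above):
launch = False  # attach to the already-running editor session
env = CartPole(args.project, UE4Params() if launch else None, server_port=15151)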
Note
Only Windows has been tested.