Mario AI Competition 2009 – Getting Started

Programming is fun. That’s why I decided it would be a good career for me. I love solving puzzles, and I love the concept of making computers, these mysterious boxes of my childhood, do things. However, when you finally get to working as a professional programmer, the types of projects you get to work on are often not as interesting as you’d like. This is why I am interested in AI. AI presents new and unique challenges, and techniques of approaching these challenges. I said in a previous post that AI was more commonplace than you’d think, but didn’t really follow it up with suggestions of where you can work on AI related programs.

I’ve recently stumbled on an interesting AI competition (before it got big with the video – I’ve been writing this post for about a month), based off of an open source Java project called Inifine Mario, called Mario AI Competition 2009. The competition is to create an AI agent that can navigate randomly generated Mario levels, pitting each agent against each other. The first round of this competition has already closed, however there is another round that closes September 3.

I won’t be attending the conferences nor will I be submitting agents to the competition. I will however work on a few agents, as this is probably the most fun application of AI that I’ve seen in a while, and it’s a small enough environment to enable me to get very familiar with it very quickly, and hence get into making solutions very quickly.

On the surface, this seems a trivial, niche problem. An agent that can navigate a video game level doesn’t really have a lot of diverse applications. However, it’s a great way to begin practicing some fundamental concepts of AI. The parameters of the competition don’t limit the techniques you can use to create the agent. In the video below, Robin Baumgarten is using A* path finding techniques.

However, A* is just the start, and is a fairly basic approach (though it works pretty well by the demos), and there are many more options, including machine learning and neural networks. Unfortunately, I won’t be covering any of that yet. The goal of this post is to get myself, and anyone else interested, up to speed on the environment, and the basics of creating an agent. I will be creating a very simple reactive agent that has no planning, and is quite flawed. However, you should be able to walk away with a firm grasp of how the environment works and what tools you have at your disposal.

Before I go on, a disclaimer is necessary: I haven’t worked with Java since my university days, so about 6 years or so. I will post some code, and that code will probably suck. You have been warned.

The Game Environment

The details on the code available from the competition website are fairly vague, so I’m posting this to perhaps help some people who are interested in getting into it, and also for my reference to help while working on it. Firstly, obviously, you’re going to have to download the code available on the competition website. The code is relatively stable at this point, however it was constantly being updated with fixes for bugs posted on the Google Group for the competition. These changes would be posted on the group, so I’d check it regularly, or grab it from Google Code with svn, and just update your repo regularly. Once you’ve downloaded it, open it in your IDE/editor of choice, and roll up your sleeves.

Firstly, I’m going to reference classes by their fully qualified class names; ie: ch.idsia.scenarios.Play, which references Play.java located in the src/ch/idsia/scenarios folder. This way you can play along at home.

According to the online documentation, you can fire up a run by typing

java ch.idsia.scenarios.Play

from within the “classes” directory. This will start a copy of the Infinite Mario game with a human player. To start with a pre written agent, type

java ch.idsia.scenarios.Play ch.idsia.ai.agents.ai.ForwardJumpingAgent

to start with an agent that will run forward and jump as soon as it can.

The ch.idsia.scenarios.Play class is a simple class that sets up the agent, whether it’s human controlled or automated, and then it sets some play environment variables and runs the application. If an argument is supplied, it will attempt to instantiate an instance of the class and pass it through to the play environment.

I won’t be covering the the details of the simulation code, as it’s all pretty logically separated and you don’t really need to know how to do that. Information is passed through to the agent through an implementation of the interface ch.idsia.mario.environments.Environment. This argument to the getAction function provides the agent with a serialised view of the terrain and the location of enemies, along with information about Mario. Basically, any information you can get by looking at the scene.

This information is provided in the form of a byte matrix, through several functions. observation.getLevelSceneObservation() will return a matrix with details of the terrain. It will also tell you what the type of the terrain is, such as a pipe or a one way platform. This detail is found in ch.idsia.mario.engine.LevelScene. You can see the number codes for the different types of terrain. It’s broken down into gaps, hard borders, half borders and flower pots.

The observation.getEnemiesObservation() is very similar to observation.getLevelSceneObservation(), except it addresses enemy locations. Again, you’ll also get the type of the enemy from the matrix in the form of an integer code, which you can find in ch.idsia.mario.engine.sprites.Sprite.

I’ve written a debugging function that will output both the enemy and terrain matricies to give you a code view of what’s going on. It’s interesting and enlightening to watch the code view of the simulation next to the simulation itself, and it will help us write code to fit our design. The function is as follows:

byte[][] levelState = observation.getLevelSceneObservation();
byte[][] enemyState = observation.getEnemiesObservation();

//Debug information
for (int y = 0; y < 22; y++)
{
    for (int x = 0; x < 22; x++)
    {
        //Check for enemies first
        if (enemyState[y][x] > 0)
        {
            String enemyValue = Integer.toString(enemyState[y][x]);
            int padding = 3 - enemyValue.length();
            if (enemyState[y][x] == 1)
                System.out.print("mmm");
            else if (padding == 2)
                System.out.print(" " + enemyValue + " ");
            else if (padding == 1)
                System.out.print(" " + enemyValue);
            else
                System.out.print(enemyValue);
        }
        else
        {
            //Check for terrain next
            String terrainValue = Integer.toString(levelState[y][x]);
            int padding = 3 - terrainValue.length();
            if (padding == 2)
                System.out.print(" " + terrainValue + " ");
            else if (padding == 1)
                System.out.print(" " + terrainValue);
            else
                System.out.print(terrainValue);
        }
        System.out.print("|");
    }
    System.out.print("\n");
}
System.out.print("---------------\n");

I put the above code into a function in the agent itself, and called the function from the getAction function of the agent, purely because I know this is going to get called once per tick. You can put it wherever you want to put it.

Creating a Dumb Agent

As I mentioned at the start of this post, I’m just going to start with creating a very simple, dumb agent that will attempt to traverse the level, and even then not get it right all the time for reasons I will cover. On the surface, Mario’s tasks for navigating a stage without enemies is pretty straightforward. All he needs to do is to get himself around the level and jump over pits.

However, sometimes Mario can find himself falling into a gap, or over-jumping his mark and jumping into a hole. So basically, we need to check if there’s something in Mario’s path that needs to be jumped over, or if there’s a gap that we need to jump over. But, we can’t just jump – we also need to check if jumping is going to put is into a gap and kill Mario, and if Mario runs off of where ever he is, he might fall into a hole.

This would be fairly straight forward if we had a robust method of predicting where Mario’s jumps will take him, but unfortunately we don’t officially get this information.
There is a movement algorithm in ch.idsia.mario.engine.sprites.Mario, which starts at the move method, however, not only is this not in the spirit of the competition, but more importantly, it’s very hard to figure out, for me at least. I’ve not been able to reliably modify the movement code to act as a predictor of Mario’s path if you plug-in how long to hold each key for.

Hence, at this stage, we’ll just be making Mario go right and try and jump over obstacles. Seems fairly simple.

You’ll see in the provided agents that the function getAction is the heart of the agent, and an instance of the Environment interface discussed before. This function is the work horse of your agent, as it will tell the simulation what the agent is going to do.

Although the observation.getLevelSceneObservation() and observation.getEnemiesObservation() methods return the full view of the level, all the features are jumbled together. Later on, we’ll need to know what’s a cannon and what’s a pipe, and what’s raised ground, and maybe even what’s a block we can punch. We’ll need to know this because cannons shoot Bullet Bills and enemy flowers live in pipes, and these can appear at any time, and if Mario is near by, he’s going to get hurt.

I’ve created the following function to categorize the environment into pipes, cannons and pits:

private Boolean[] pits;
private Vector cannons;
private Vector pipes;
private void processTerrain(byte[][] state)
{
    //Init features
    pits = new boolean[22];
    cannons = new Vector();
    pipes = new Vector();

    //Init pits
    for (int i = 0; i < 22; i++)
    {
        pits[i] = true;
    }

    for (int y = 21; y >= 0; y--)
    {
        for (int x = 0; x < 22; x++)
        {
            //Look for gaps
            pits[x] = pits[x] && state[y][x] == 0;

            //Look for cannons and pipes
            if (state[y][x] == 20)
            {
                boolean pipeFound = false;
                //Check if x,y is in pipes
                int pipeLength = pipes.size();
                for (int pi = 0; pi < pipeLength; pi++)
                {
                    int[] coords = (int[])pipes.elementAt(pi);
                    if (coords != null && coords[0] == x && coords[1] == y)
                    {
                        //Pipe found, skip it
                        pipeFound = true;
                    }
                }
                if (!pipeFound)
                {
                    //Peek ahead to determine if it's a pipe or a cannon
                    if(x < 21 && state[y][x + 1] == 20)
                    {
                        //It's a pipe, add it and the next one to the pipe array
                        pipes.add(new int[]{x + 1, y});
                    }
                    else
                    {
                        //It's a cannon
                        cannons.add(new int[]{x, y});
                    }
                }
            }
        }
    }
}

The three variables are defined as private members of the agent, and calling this function will populate them with tile locations in the case of pipes and cannons, and tile columns in the case of pits. You don’t have to do this with such a simple agent – you can simply use similar techniques to parse over the whole level state.

I use this information to determine how far away obstacles are. At first I was going to make the distance between obstacles affect the jumping distance and height, however without an accurate way to determine where jumps are going to land, this can actually cause more harm than good. So I ignore it, and just use a fairly blunt approach.

Firstly, another helper function:

private int setDeltaForObstacles(Vector tiles)
{
    int length = tiles.size();
    int buffer = 0;
    int localDeltaX = 0;
    //Loop through every tile
    for (int counter = 0; counter < length; counter++)
    {
        //Get coords of tile
        int[] coords = (int[])tiles.elementAt(counter);
        //Calculate the delta, then check to see if it's good enough
        localDeltaX = coords[0] - 11;
        if (action[Mario.KEY_RIGHT] && coords[0] > 11)
        {
            //If the tile is after mario, and we're heading right, delta is good
            break;
        }
        else if (action[Mario.KEY_LEFT] && coords[0] < 11)
        {
            //If the tile is before mario and we're heading left, add to buffer
            if (coords[0] > buffer)
                buffer = coords[0];
        }
        else if (action[Mario.KEY_LEFT])
        {
            //If the tile is after mario, but we're heading left, get the last tile loc from the buffer
            localDeltaX = buffer - 11;
            break;
        }
    }
    return localDeltaX;
}

This function takes a Vector and will return the closes tile to Mario, depending on his direction. It’s a very coarse way to determine the closes obstacle to Mario. Bear in mind that it doesn’t recognise when obstacles are put close together, thus doesn’t optimise Mario’s jumping.

The goal of the agent is to jump over the closest obstacle. That’s it. Considering our obstacles are now in three different datastructures (cannons in one, pipes in another, and pits and raised terrain in the original byte array), we need to store the delta (distance between Mario and the closes obstacle) and update it only when we’ve found a smaller delta. Also, I want to keep the delta of any pits separate, as I’m going to have different rules for pits than other obstacles (because pits are the only thing that can kill Mario at this stage).

To this end, I create two integer variables at the top of the getAction method. They are intitialized to 11 because this is where Mario is:

int deltaX = 11;
int deltaJump = 11;

Here is the code to extract the delta of the nearest obstacle, and the delta of the nearest pit. Note that levelState is the results from a call to observation.getLevelSceneObservation:

//Analyse pipes
int pipeDelta = setDeltaForObstacles(pipes);
if (pipeDelta < deltaX && pipeDelta > 0)
    deltaX = pipeDelta;
//Analyse cannons
int cannonDelta = setDeltaForObstacles(cannons);
if (cannonDelta < deltaX && cannonDelta > 0)
    deltaX = cannonDelta;

//Analyse the gaps
boolean gap = false;
for(int y = 0; y < 22; y++)
{
    if (pits[y])
    {
        if (action[Mario.KEY_RIGHT] && y > 11)
        {
            int localDelta = y - 11;
            if (localDelta < deltaJump)
                deltaJump = localDelta;
        }
        else if (action[Mario.KEY_LEFT] && y < 11)
        {
            int localDelta = 11 - y;
            if (localDelta < deltaJump)
                deltaJump = localDelta;
        }
        gap = true;
    }
}

//Analyse ground above Mario
boolean raisedGround = false;
if (action[Mario.KEY_RIGHT])
{
    for (int x = 12; x < 22; x++)
    {
        for (int y = 12; y >= 0; y--)
        {
            if (levelState[y][x] == -10)
            {
                int localDelta = x - 11;
                if (localDelta < deltaX)
                    deltaX = localDelta;
                raisedGround = true;
                break;
            }
        }
        if (raisedGround)
            break;
    }
}

Once we have the deltas, we can use this information to decide what to do. I mentioned above treating obstacle deltas different from jump deltas. This is an attempt to make Mario take smaller jumps when he can see a pit is after an obstacle. This is controlled by two instance members that keep track of jump height and jump length, called jumpHeight and jumpLength respectively. This is very hit and miss. Notice that Mario does not respond to anything more than 4 tiles away, and tiles that are behind him. This is to prevent him from going jump crazy:

jumpHeight = 7;
jumpLength = 14;// * (deltaX / 11);
if (((pipes.size() > 0 || cannons.size() > 0 || raisedGround) && deltaX >= 0 && deltaX < 5))
{
    if (observation.mayMarioJump() && !action[Mario.KEY_JUMP])
        action[Mario.KEY_JUMP] = true;
}

//If mario is jumping, there must be an obstacle in the way
if (action[Mario.KEY_JUMP])
{
    if (gap && deltaJump >= 3)
    {
        jumpHeight = 7;
        jumpLength = 1;
    }
}
//No obstacles, but there is a gap
else if (gap && deltaJump >= 0 && deltaJump <= 3)
{
    if (observation.mayMarioJump() && !action[Mario.KEY_JUMP])
        action[Mario.KEY_JUMP] = true;
}
maintainJump();

Finally, the maintainJump method is to ensure we’re holding down the jump key for the optimal amount of time. It is also coded to provide rudimentary control over jump height and length, however this isn’t being taken advantage of.

private void maintainJump()
{
    if(action[Mario.KEY_JUMP] && (jumpCount > jumpHeight))
    {
        action[Mario.KEY_JUMP] = false;
        jumpCount = 0;
    }
    // otherwise you're in the middle of jump, increment jumpCount
    else if (action[Mario.KEY_JUMP])
    {
        jumpCount++;
    }
    if (jumping)
    {
        if (action[Mario.KEY_RIGHT] && (rightCount > jumpLength))
        {
            action[Mario.KEY_RIGHT] = false;
            rightCount = 0;
        }
        else if (action[Mario.KEY_RIGHT])
        }
    }
}

The complete agent is at the bottom of this post for you to download and include in your project. Simply put it in the path src/timgittos, configure the ch.idsia.scenarios.Play class to have no enemies, with options.setPauseWorld(true), and run any level you want (passable result with difficulty 3). Then build the project and run it with java ch.idsia.scenarios.Play timgittos.BlogAgent.

As you will observe, the agent will make Mario go towards the right as fast as possible, jumping over obstacles as it meets them. For the most part, it works well enough. However, there are quite a few fatal flaws, and I will explain why it fails in this way.

Most obviously, the agent frequently falls down gaps. It doesn’t fall down all gaps – which would indicate that it’s not a failure in the gap detection, but rather something else. In fact, what’s going on is that the gaps that Mario tends to fall down are those that enter his field of consideration (4 tiles ahead of him) while he’s in the air. The agent makes no effort to check whether it’s jump will land Mario in a pit (since it can’t predict where Mario will land), and my previous efforts at trying to make the agent force Mario to the left when it detects he’s going to go down a pit have failed.

Mario will also fall down pits if there are multiple pits in a row. Again, this is because the agent doesn’t exert much control over the size and length of Mario’s jumps ,and so can’t pin point that it wants to land in the middle of the two pits, so it tends to overshoot the jump and fall down the second pit.

Most annoyingly, the agent will sometimes get stuck. If it reaches an obstacle, such as a cannon or a bit of wall or a pipe and jumps at the wrong angle, it will enter a weird wall jump loop, where it jumps, gets stuck on the obstacle, jumps off, jumps again, and gets stuck again. This will loop until there is no time. This is a side effect of the environment allowing wall jumps, which is cool when you can get the agent to make Mario jump out of a pit he’s falling into, but annoying when he gets stuck.

In any case, this agent is far from complete, and just serves to demonstrate how the simulation environment works, and how to interact with it. Once you’re clear how to work with the simulation, you can concentrate on building a useful agent, either by starting from scratch (like you would if you were going to use a machine learning approach), or by building on top of this base (for a determinate approach like A*).

Download: Mario AI Blog agent (.java)

Tags: , , ,

  • Ilker
    Great blogpost:) I completely agree with your thought at the beginning of article ["Programming is fun. That’s why.....................NOT as interesting as you’d like"].
  • I really enjoyed reading this blogpost. I saw the Mario AI competition a month or so ago. I've been playing around with Neural Networks, Backpropagation, TD-Learning, and Genetic Algorithms, but I have a lot farther to go before I can thoroughly appreciate these techniques.
  • Tim
    I'm actually about to start implementing a neural network based agent soon. I'd appreciate hearing your thoughts when I finish writing it up.
  • Excellent =) I'd love to see it. Feel free to send me an e-mail or leave me a message on my YouTube page. I did my Honors Thesis at Hofstra on an AI Agent playing Othello. I've forgotten a bit (it's so much more involved than just a neural network, which is hard enough), but I'd love to chat about it!
blog comments powered by Disqus