Posts Tagged ‘Artificial Intelligence’

Mario AI Competition 2009 – Getting Started

Wednesday, August 26th, 2009

Programming is fun. That’s why I decided it would be a good career for me. I love solving puzzles, and I love the concept of making computers, these mysterious boxes of my childhood, do things. However, when you finally get to working as a professional programmer, the types of projects you get to work on are often not as interesting as you’d like. This is why I am interested in AI. AI presents new and unique challenges, and techniques of approaching these challenges. I said in a previous post that AI was more commonplace than you’d think, but didn’t really follow it up with suggestions of where you can work on AI related programs.

I’ve recently stumbled on an interesting AI competition (before it got big with the video – I’ve been writing this post for about a month), based off of an open source Java project called Inifine Mario, called Mario AI Competition 2009. The competition is to create an AI agent that can navigate randomly generated Mario levels, pitting each agent against each other. The first round of this competition has already closed, however there is another round that closes September 3.

I won’t be attending the conferences nor will I be submitting agents to the competition. I will however work on a few agents, as this is probably the most fun application of AI that I’ve seen in a while, and it’s a small enough environment to enable me to get very familiar with it very quickly, and hence get into making solutions very quickly.

On the surface, this seems a trivial, niche problem. An agent that can navigate a video game level doesn’t really have a lot of diverse applications. However, it’s a great way to begin practicing some fundamental concepts of AI. The parameters of the competition don’t limit the techniques you can use to create the agent. In the video below, Robin Baumgarten is using A* path finding techniques.

However, A* is just the start, and is a fairly basic approach (though it works pretty well by the demos), and there are many more options, including machine learning and neural networks. Unfortunately, I won’t be covering any of that yet. The goal of this post is to get myself, and anyone else interested, up to speed on the environment, and the basics of creating an agent. I will be creating a very simple reactive agent that has no planning, and is quite flawed. However, you should be able to walk away with a firm grasp of how the environment works and what tools you have at your disposal.

Before I go on, a disclaimer is necessary: I haven’t worked with Java since my university days, so about 6 years or so. I will post some code, and that code will probably suck. You have been warned.

The Game Environment

The details on the code available from the competition website are fairly vague, so I’m posting this to perhaps help some people who are interested in getting into it, and also for my reference to help while working on it. Firstly, obviously, you’re going to have to download the code available on the competition website. The code is relatively stable at this point, however it was constantly being updated with fixes for bugs posted on the Google Group for the competition. These changes would be posted on the group, so I’d check it regularly, or grab it from Google Code with svn, and just update your repo regularly. Once you’ve downloaded it, open it in your IDE/editor of choice, and roll up your sleeves.

Firstly, I’m going to reference classes by their fully qualified class names; ie: ch.idsia.scenarios.Play, which references Play.java located in the src/ch/idsia/scenarios folder. This way you can play along at home.

According to the online documentation, you can fire up a run by typing

java ch.idsia.scenarios.Play

from within the “classes” directory. This will start a copy of the Infinite Mario game with a human player. To start with a pre written agent, type

java ch.idsia.scenarios.Play ch.idsia.ai.agents.ai.ForwardJumpingAgent

to start with an agent that will run forward and jump as soon as it can.

The ch.idsia.scenarios.Play class is a simple class that sets up the agent, whether it’s human controlled or automated, and then it sets some play environment variables and runs the application. If an argument is supplied, it will attempt to instantiate an instance of the class and pass it through to the play environment.

I won’t be covering the the details of the simulation code, as it’s all pretty logically separated and you don’t really need to know how to do that. Information is passed through to the agent through an implementation of the interface ch.idsia.mario.environments.Environment. This argument to the getAction function provides the agent with a serialised view of the terrain and the location of enemies, along with information about Mario. Basically, any information you can get by looking at the scene.

This information is provided in the form of a byte matrix, through several functions. observation.getLevelSceneObservation() will return a matrix with details of the terrain. It will also tell you what the type of the terrain is, such as a pipe or a one way platform. This detail is found in ch.idsia.mario.engine.LevelScene. You can see the number codes for the different types of terrain. It’s broken down into gaps, hard borders, half borders and flower pots.

The observation.getEnemiesObservation() is very similar to observation.getLevelSceneObservation(), except it addresses enemy locations. Again, you’ll also get the type of the enemy from the matrix in the form of an integer code, which you can find in ch.idsia.mario.engine.sprites.Sprite.

I’ve written a debugging function that will output both the enemy and terrain matricies to give you a code view of what’s going on. It’s interesting and enlightening to watch the code view of the simulation next to the simulation itself, and it will help us write code to fit our design. The function is as follows:

byte[][] levelState = observation.getLevelSceneObservation();
byte[][] enemyState = observation.getEnemiesObservation();

//Debug information
for (int y = 0; y < 22; y++)
{
    for (int x = 0; x < 22; x++)
    {
        //Check for enemies first
        if (enemyState[y][x] > 0)
        {
            String enemyValue = Integer.toString(enemyState[y][x]);
            int padding = 3 - enemyValue.length();
            if (enemyState[y][x] == 1)
                System.out.print("mmm");
            else if (padding == 2)
                System.out.print(" " + enemyValue + " ");
            else if (padding == 1)
                System.out.print(" " + enemyValue);
            else
                System.out.print(enemyValue);
        }
        else
        {
            //Check for terrain next
            String terrainValue = Integer.toString(levelState[y][x]);
            int padding = 3 - terrainValue.length();
            if (padding == 2)
                System.out.print(" " + terrainValue + " ");
            else if (padding == 1)
                System.out.print(" " + terrainValue);
            else
                System.out.print(terrainValue);
        }
        System.out.print("|");
    }
    System.out.print("\n");
}
System.out.print("---------------\n");

I put the above code into a function in the agent itself, and called the function from the getAction function of the agent, purely because I know this is going to get called once per tick. You can put it wherever you want to put it.

Creating a Dumb Agent

As I mentioned at the start of this post, I’m just going to start with creating a very simple, dumb agent that will attempt to traverse the level, and even then not get it right all the time for reasons I will cover. On the surface, Mario’s tasks for navigating a stage without enemies is pretty straightforward. All he needs to do is to get himself around the level and jump over pits.

However, sometimes Mario can find himself falling into a gap, or over-jumping his mark and jumping into a hole. So basically, we need to check if there’s something in Mario’s path that needs to be jumped over, or if there’s a gap that we need to jump over. But, we can’t just jump – we also need to check if jumping is going to put is into a gap and kill Mario, and if Mario runs off of where ever he is, he might fall into a hole.

This would be fairly straight forward if we had a robust method of predicting where Mario’s jumps will take him, but unfortunately we don’t officially get this information.
There is a movement algorithm in ch.idsia.mario.engine.sprites.Mario, which starts at the move method, however, not only is this not in the spirit of the competition, but more importantly, it’s very hard to figure out, for me at least. I’ve not been able to reliably modify the movement code to act as a predictor of Mario’s path if you plug-in how long to hold each key for.

Hence, at this stage, we’ll just be making Mario go right and try and jump over obstacles. Seems fairly simple.

You’ll see in the provided agents that the function getAction is the heart of the agent, and an instance of the Environment interface discussed before. This function is the work horse of your agent, as it will tell the simulation what the agent is going to do.

Although the observation.getLevelSceneObservation() and observation.getEnemiesObservation() methods return the full view of the level, all the features are jumbled together. Later on, we’ll need to know what’s a cannon and what’s a pipe, and what’s raised ground, and maybe even what’s a block we can punch. We’ll need to know this because cannons shoot Bullet Bills and enemy flowers live in pipes, and these can appear at any time, and if Mario is near by, he’s going to get hurt.

I’ve created the following function to categorize the environment into pipes, cannons and pits:

private Boolean[] pits;
private Vector cannons;
private Vector pipes;
private void processTerrain(byte[][] state)
{
    //Init features
    pits = new boolean[22];
    cannons = new Vector();
    pipes = new Vector();

    //Init pits
    for (int i = 0; i < 22; i++)
    {
        pits[i] = true;
    }

    for (int y = 21; y >= 0; y--)
    {
        for (int x = 0; x < 22; x++)
        {
            //Look for gaps
            pits[x] = pits[x] && state[y][x] == 0;

            //Look for cannons and pipes
            if (state[y][x] == 20)
            {
                boolean pipeFound = false;
                //Check if x,y is in pipes
                int pipeLength = pipes.size();
                for (int pi = 0; pi < pipeLength; pi++)
                {
                    int[] coords = (int[])pipes.elementAt(pi);
                    if (coords != null && coords[0] == x && coords[1] == y)
                    {
                        //Pipe found, skip it
                        pipeFound = true;
                    }
                }
                if (!pipeFound)
                {
                    //Peek ahead to determine if it's a pipe or a cannon
                    if(x < 21 && state[y][x + 1] == 20)
                    {
                        //It's a pipe, add it and the next one to the pipe array
                        pipes.add(new int[]{x + 1, y});
                    }
                    else
                    {
                        //It's a cannon
                        cannons.add(new int[]{x, y});
                    }
                }
            }
        }
    }
}

The three variables are defined as private members of the agent, and calling this function will populate them with tile locations in the case of pipes and cannons, and tile columns in the case of pits. You don’t have to do this with such a simple agent – you can simply use similar techniques to parse over the whole level state.

I use this information to determine how far away obstacles are. At first I was going to make the distance between obstacles affect the jumping distance and height, however without an accurate way to determine where jumps are going to land, this can actually cause more harm than good. So I ignore it, and just use a fairly blunt approach.

Firstly, another helper function:

private int setDeltaForObstacles(Vector tiles)
{
    int length = tiles.size();
    int buffer = 0;
    int localDeltaX = 0;
    //Loop through every tile
    for (int counter = 0; counter < length; counter++)
    {
        //Get coords of tile
        int[] coords = (int[])tiles.elementAt(counter);
        //Calculate the delta, then check to see if it's good enough
        localDeltaX = coords[0] - 11;
        if (action[Mario.KEY_RIGHT] && coords[0] > 11)
        {
            //If the tile is after mario, and we're heading right, delta is good
            break;
        }
        else if (action[Mario.KEY_LEFT] && coords[0] < 11)
        {
            //If the tile is before mario and we're heading left, add to buffer
            if (coords[0] > buffer)
                buffer = coords[0];
        }
        else if (action[Mario.KEY_LEFT])
        {
            //If the tile is after mario, but we're heading left, get the last tile loc from the buffer
            localDeltaX = buffer - 11;
            break;
        }
    }
    return localDeltaX;
}

This function takes a Vector and will return the closes tile to Mario, depending on his direction. It’s a very coarse way to determine the closes obstacle to Mario. Bear in mind that it doesn’t recognise when obstacles are put close together, thus doesn’t optimise Mario’s jumping.

The goal of the agent is to jump over the closest obstacle. That’s it. Considering our obstacles are now in three different datastructures (cannons in one, pipes in another, and pits and raised terrain in the original byte array), we need to store the delta (distance between Mario and the closes obstacle) and update it only when we’ve found a smaller delta. Also, I want to keep the delta of any pits separate, as I’m going to have different rules for pits than other obstacles (because pits are the only thing that can kill Mario at this stage).

To this end, I create two integer variables at the top of the getAction method. They are intitialized to 11 because this is where Mario is:

int deltaX = 11;
int deltaJump = 11;

Here is the code to extract the delta of the nearest obstacle, and the delta of the nearest pit. Note that levelState is the results from a call to observation.getLevelSceneObservation:

//Analyse pipes
int pipeDelta = setDeltaForObstacles(pipes);
if (pipeDelta < deltaX && pipeDelta > 0)
    deltaX = pipeDelta;
//Analyse cannons
int cannonDelta = setDeltaForObstacles(cannons);
if (cannonDelta < deltaX && cannonDelta > 0)
    deltaX = cannonDelta;

//Analyse the gaps
boolean gap = false;
for(int y = 0; y < 22; y++)
{
    if (pits[y])
    {
        if (action[Mario.KEY_RIGHT] && y > 11)
        {
            int localDelta = y - 11;
            if (localDelta < deltaJump)
                deltaJump = localDelta;
        }
        else if (action[Mario.KEY_LEFT] && y < 11)
        {
            int localDelta = 11 - y;
            if (localDelta < deltaJump)
                deltaJump = localDelta;
        }
        gap = true;
    }
}

//Analyse ground above Mario
boolean raisedGround = false;
if (action[Mario.KEY_RIGHT])
{
    for (int x = 12; x < 22; x++)
    {
        for (int y = 12; y >= 0; y--)
        {
            if (levelState[y][x] == -10)
            {
                int localDelta = x - 11;
                if (localDelta < deltaX)
                    deltaX = localDelta;
                raisedGround = true;
                break;
            }
        }
        if (raisedGround)
            break;
    }
}

Once we have the deltas, we can use this information to decide what to do. I mentioned above treating obstacle deltas different from jump deltas. This is an attempt to make Mario take smaller jumps when he can see a pit is after an obstacle. This is controlled by two instance members that keep track of jump height and jump length, called jumpHeight and jumpLength respectively. This is very hit and miss. Notice that Mario does not respond to anything more than 4 tiles away, and tiles that are behind him. This is to prevent him from going jump crazy:

jumpHeight = 7;
jumpLength = 14;// * (deltaX / 11);
if (((pipes.size() > 0 || cannons.size() > 0 || raisedGround) && deltaX >= 0 && deltaX < 5))
{
    if (observation.mayMarioJump() && !action[Mario.KEY_JUMP])
        action[Mario.KEY_JUMP] = true;
}

//If mario is jumping, there must be an obstacle in the way
if (action[Mario.KEY_JUMP])
{
    if (gap && deltaJump >= 3)
    {
        jumpHeight = 7;
        jumpLength = 1;
    }
}
//No obstacles, but there is a gap
else if (gap && deltaJump >= 0 && deltaJump <= 3)
{
    if (observation.mayMarioJump() && !action[Mario.KEY_JUMP])
        action[Mario.KEY_JUMP] = true;
}
maintainJump();

Finally, the maintainJump method is to ensure we’re holding down the jump key for the optimal amount of time. It is also coded to provide rudimentary control over jump height and length, however this isn’t being taken advantage of.

private void maintainJump()
{
    if(action[Mario.KEY_JUMP] && (jumpCount > jumpHeight))
    {
        action[Mario.KEY_JUMP] = false;
        jumpCount = 0;
    }
    // otherwise you're in the middle of jump, increment jumpCount
    else if (action[Mario.KEY_JUMP])
    {
        jumpCount++;
    }
    if (jumping)
    {
        if (action[Mario.KEY_RIGHT] && (rightCount > jumpLength))
        {
            action[Mario.KEY_RIGHT] = false;
            rightCount = 0;
        }
        else if (action[Mario.KEY_RIGHT])
        }
    }
}

The complete agent is at the bottom of this post for you to download and include in your project. Simply put it in the path src/timgittos, configure the ch.idsia.scenarios.Play class to have no enemies, with options.setPauseWorld(true), and run any level you want (passable result with difficulty 3). Then build the project and run it with java ch.idsia.scenarios.Play timgittos.BlogAgent.

As you will observe, the agent will make Mario go towards the right as fast as possible, jumping over obstacles as it meets them. For the most part, it works well enough. However, there are quite a few fatal flaws, and I will explain why it fails in this way.

Most obviously, the agent frequently falls down gaps. It doesn’t fall down all gaps – which would indicate that it’s not a failure in the gap detection, but rather something else. In fact, what’s going on is that the gaps that Mario tends to fall down are those that enter his field of consideration (4 tiles ahead of him) while he’s in the air. The agent makes no effort to check whether it’s jump will land Mario in a pit (since it can’t predict where Mario will land), and my previous efforts at trying to make the agent force Mario to the left when it detects he’s going to go down a pit have failed.

Mario will also fall down pits if there are multiple pits in a row. Again, this is because the agent doesn’t exert much control over the size and length of Mario’s jumps ,and so can’t pin point that it wants to land in the middle of the two pits, so it tends to overshoot the jump and fall down the second pit.

Most annoyingly, the agent will sometimes get stuck. If it reaches an obstacle, such as a cannon or a bit of wall or a pipe and jumps at the wrong angle, it will enter a weird wall jump loop, where it jumps, gets stuck on the obstacle, jumps off, jumps again, and gets stuck again. This will loop until there is no time. This is a side effect of the environment allowing wall jumps, which is cool when you can get the agent to make Mario jump out of a pit he’s falling into, but annoying when he gets stuck.

In any case, this agent is far from complete, and just serves to demonstrate how the simulation environment works, and how to interact with it. Once you’re clear how to work with the simulation, you can concentrate on building a useful agent, either by starting from scratch (like you would if you were going to use a machine learning approach), or by building on top of this base (for a determinate approach like A*).

Download: Mario AI Blog agent (.java)

My Best Guesses About Wolfram Alpha

Wednesday, May 20th, 2009

Following the “soft launch” of Wolfram Alpha a few days ago, I’ve been wondering exactly how Wolfram Alpha works. From the description of all the hype leading up to it, it seemed like a question answering application that is specialized in answering questions, as opposed to say Google, which is more into finding and presenting data, leaving the interpretation to humans. Now that it’s here, we can have a play around and see what it can do.

Firstly, I tried to ask Wolfram Alpha how it worked, and it just gave me a puzzled look. So I can only guess, and guess I shall.

The Wolfram Alpha website lays it all out on the table, with regards to what it’s all about. Wolfram Alpha’s goal “is to accept completely free-form input, and to serve as a knowledge engine” to answer search queries entered. From the very limited playing I’ve done, it seems pretty cool.

Following my move from Perth to Texas, I typed in “Austin to Perth” into Wolfram Alpha just to see what it told me. I got back map, with a line drawn between the cities. I got the distance in miles between the two and the flight time. I got a side by side comparison of the local times, populations and approximate elevations.

Google however, delivered me what I was actually looking for, sort of. Maybe that’s just a difference in expectations and thinking.1

It’s clear that Wolfram Alpha is pretty cool, and will do some awesome things with the data before it returns it back to you, but it doesn’t really help me figure out how it works. My thoughts are that Wolfram Alpha combines a large database of indexed knowledge in the form of a knowledge base and a powerful logical language, the language being Stephen Wolfram’s own Mathematica, naturally. As he describes it, Wolfram Alpha is the “killer-app” for Mathematica.

First, lets discuss the vast store of data available to Wolfram Alpha, most likely in the form of a knowledge base. A knowledge base is an information representation scheme designed to allow information to be operated on using logic. Information is stored in “sentences” in a knowledge representation language. The knowledge base allows logical agents, in this case the Wolfram Alpha application, to receive queries, and then search and analyse the knowledge base, performing logical inferences and information compounding, then collating it and returning the results. The logical agent does all the heavy lifting, but the very structure of the knowledge base helps the agent.

The heart of the logical agent is it’s knowledge representation language. It allows the agent to “reason” through sentences and make inferences in order to derive new facts. Proposition logic and First-order logic are both knowledge representation languages. Knowledge representation languages use syntax and semantics to define knowledge, where syntax outlines the structure of a language, and the semantics are the meaning.

Logic is a very mathematical concept, and mathematics can be seen as a kind of logic that operates on numbers. Mathematics has a syntax, that defines the infix notation of logical operators on operands, and the semantics are what those operators do. 1 + 1 = 2 is an example of a complex logical language, where the syntax defines that the + operator acts on each 1, and the = operator defines the result of the previous operation. The semantics define that the value of 1 + 1 is 2.

Mathematica, the heart of Wolfram Alpha, is a computational language, according to it’s Wikipedia page. It’s built on top of the Symbolic Manipulation Program, designed by both Chris A. Cole and Stephen Wolfram. The Symbolic Manipulation Program is a “computer algebra system”, which facilitates symbolic computation, allowing computers to operate on symbols and manipulate expressions instead of operating on their values, which sounds a heck of a lot like something that could be used to create a logic that can operate on a knowledge base.

Going back to logical agents and knowledge bases, there are a number of ways logical agents can extract and infer information from knowledge bases. The topic of knowledge based logical agents is a pretty vast one, and I couldn’t possibly explain it completely in a single post, even if I knew more than just a fraction of it, which I don’t. But I can summarize what I do understand, and how it applies to Wolfram Alpha.

As referred to often in the post so far, inference or entailment plays a large part of how a logical agent can extract “new” information from a knowledge base. Inference is deeply rooted in formal logic, and techniques have been developed to allow automated software processes to apply it to a knowledge base.

Logical entailment is the idea that a sentence “logically follows” another sentence. If a given sentence a is true, and the truth of sentence b depends on the truth of a, then b must be true, and also that if b is true, then logically a must also be true. This is useful if the knowledge base doesn’t contain the fact that one of a or b is true, but it does contain definition that the truth of b depends on the truth of a.

The knowledge base is designed to be a model of reality, and holds sentences about the world intended to describe it to a level of detail. Sentences can be as atomic as facts, such as “the sky is blue”, or can be logical sentences such as “if the sun is in the sky, it is not night”, which of course has exceptions (such as in certain seasons in the polar regions), which also need to be modeled in the knowledge base.

This is relevant because, with a finite model represented by the knowledge base, there are a finite number of entailment and inferences that can be made on the model, which means that a logical agent such as Wolfram Alpha can reduce the world to a finite search space, and can run a search over this space looking for information that follows on from the set of facts in the knowledge base.

Search techniques for searching the knowledge base are similar as search techniques in other AI based problems, such as constraint satisfaction problems and touring problems. These searching algorithms include backtracking and local search methods, such as hill climbing.

When searching this knowledge base, the logical agent needs to abide by a series of rules, known as inference rules, in order to guarantee the the knowledge it entails and infers is valid.

Firstly, the logical agent needs to be aware of the equivalence of two logical sentences. If two sentences are both true under the same model, such as the Wolfram Alpha knowledge base, then they are logically equivalent, and can be used to format facts in a different way to aid inference.

Secondly, the logical agent needs to be aware of the concept of validity. A sentence is valid if it is true in every model, which essentially means it’s a tautology and is always true. These tautologies are useful because they aid in validating inferences, where a infers b if a implies b is a tautology, where implication is such that b is true if and only if a is true. This can get a little confusing, and if you can, I’d suggest you read Chapter 7 in AI:AMA
.

Thirdly, it needs to ensure the sentence is satisfiable, which means it’s true in at least one model, which is basically just searching for inferences and determining that, for a given search space state, the inference being suggested is true.

The logical agent and inference/entailment from a knowledge base can be observed in Wolfram Alpha with a fairly simple query: “red + yellow“. In this query, the logical agent would probably hit the knowledge base and tell it that both red and yellow are true, meaning that we want to find information about those. It would then search for all sentences that are entailed or logical structures that mention the fact red or yellow, such as a sentence “orange is a mix of red and yellow”. It would then tell the knowledge base that this fact had been entailed, and tell it that orange is true. Then, when it finds the sentence “blue is complementary to orange”, it can add that to the record set. Obviously there will be other sentences in the knowledge base, and the agent will have to use a filtering algorithm to determine which sentences and facts are relevant to the query, otherwise it might return results that the user is not interested in.

To successfully parse the search query, it would also need a measure of natural language processing, so that it can convert a human-readable query into a machine readable query. This, combined with a large, extensive knowledge base, a fast logical language such as Mathematica and AI techniques for inference and entailment of knowledge mean that Wolfram Alpha can answer a wide range of questions that a simple indexed and categorised search engine couldn’t.

However that’s just my best guess. And keep in mind who I am: I’m just a young web developer with an interest in AI. I don’t actively participate in AI research, and I have no formal training. My guesses and musings are probably widely off the mark, and if anyone has any other ideas or corrections, clarifications or discussions, I’d love to hear them in the comments.


  1. If you mistype a search query into Wolfram Alpha, curiously it won’t suggest a correction. Considering how relatively easy this task is (just use it’s Levenshtein Distance), I find it odd that it doesn’t support this.

Correcting 5 Misconceptions About AI

Wednesday, April 8th, 2009

For the longest time, I have been interested in artificial intelligence. The idea of computers that could make decisions and think independently fascinated me. Of course, I knew that AI wasn’t quite that advanced, but the idea still captivated me. However, the barrier to learning AI was too high. It was too complex and too academic. Just looking at the mathematical notation involved made my head spin.

I eventually took the plunge and convinced my family to buy me Artificial Intelligence: A Modern Approach for Christmas, and 12 months later, I actually started reading it. I’m still reading it, and will be the first to admit I don’t understand everything that is discussed, however there are a few things that I’m realising about AI.

  1. It’s All About Searching
    Searching is a constant in AI techniques. Problems are defined and solution spaces are created. Problems can be represented in a number of ways (graphs, trees, logic knowledge bases), however in the end, it always comes back to searching.

    This is important because searching is easy. Searching is something that a lot of programmers already know, even if it’s only at a most basic, brute force level. We do searching all the time. We loop through arrays searching for values, we use regular expressions to match string patterns, we retrieve records from databases. We search.

    At it’s simplest, you can search using brute force searching, iterating over all combinations and permutations of solutions looking for one that satisfies the problem, but beyond that, you can involve tricky heuristics to make optimal decisions about how to search. You can have local searches, which will pick a solution space and search it for local maximums, such as hill climbing, and searches that will find global minimums, such as simulated annealing. But it’s still search. It’s still something you can do.

  2. It’s More Common Than You Think
    AI is a hell of a lot more prevalent than most people realise. I didn’t think AI had much commercial application before I started learning about it, but luckily that didn’t dim my interest. For those who are interested, but holding back because they don’t see how it would benefit them, here’s some good news. AI techniques are used everywhere.

    That international 4 city flight you booked used AI techniques. A constraint satisfaction problem solver took all the constraints about needing to be at this city by that time, flying on this airline for that much, and creating a plan for you. When Amazon recommends products you might be interested in, it calculates this with Bayesian networks and classifiers, a method of probabilistically linking a set of variables. Circuit design, product manufacturing, supply chain optimisation, all of these things use techniques that AI use. They’re not the sole domain of AI, but learning AI will cause you to learn these too.

  3. You Can Use It Today
    Whatever you’re working on, you can probably use AI techniques in it. Even some of the more exotic sounding techniques like neural networks can be of use to you. Self healing databases? Hell yeah. Even Bayesian classifiers to catalog and categorise products, heuristic searches to mine data in databases, hierarchical task planners to plan that holiday or manage that Gantt chart.

    Can you use it anywhere? No. Your simple CRUD app probably won’t benefit from a wizz-bang heuristic search. But if you’re doing anything that involves large amounts of data, interacting with people, predicting trends and recognising patterns, you can use it.

  4. It’s In Demand
    You may have heard of the Netflix Prize. Guess what? That’s AI. Google is the biggest search engine around, and index billions upon billions of pages on the internet, and can get you relevant results to a question in a matter of seconds. That’s one hell of a big knowledge base, and one smart search algorithm. Amazon sell products all over the world, and aggressively upsell and cross sell. I get emails about related products I might like based on my wishlist and purchase history, and they’re actually pretty accurate.

    Also, computer games. Enough said.

    AI skills are in demand. Not huge demand, but probably more that you would have guessed. These skills are hugely profitable in the right hands, and big companies want to extract every single little morsel of useful information about your browsing, shopping, eating, travelling, viewing and reading habits in order to market to you more effectively. Now that sounds a little creepy to me, but if that doesn’t bug you, more power.

  5. There’s a Lot of Information Out There
    AI isn’t some weirdo niche science topic. There’s actually quite a lot of information out there, once you start going down the rabbit hole. “AI: A Modern Approach” cites hundreds of papers and books. There’s thousands of websites out there on the subject. There are many academic papers that are made available for free online. There are communities, like AIGameDev, dedicated to spreading that delicious knowledge.

    I think the hardest part about finding information is getting the terminology. It’s pretty dense when you first get into it, especially when talking about acyclic directed graphs, and your idea of a graph was like mine was about a year ago, namely a few bars on a 2D Cartesian axis. But once you’ve got a foot in the door of the lingo, it can become pretty accessible, and information starts becoming more bountiful. That foot in the door can be either a good, basic website, or in my case, a university level textbook designed to introduce people to AI.

AI is a big field, full of fascinating and interesting concepts and techniques, and it’s a young field that’s still full of potential. It’s not as complex or confusing as film and television would have you think. That’s not to say it’s a walk in the park, as I stated earlier, I’m probably running a 70% rate of understanding what I’m reading, but I’m managing. And if I can manage, so can you. So if you’re interested, there’s no better time to start than now.

AI Application Programming – Book Review

Monday, January 19th, 2009

I’m spending a lot of my time lately reading about AI, both game related, business application related and academia related. Having recently completed “AI Application Programming” by M. Tim Jones (Charles River Media, 2003, First Edition), I have decided to review it in the context of my current situation.

I am new to the field of AI, and my previous reading has been limited to sporadic Wikipedia research on concepts as I come across them. This book covers a lot of material that I haven’t read about before, but it also covers some material I have. This allows me to evaluate the book as both someone with prior knowledge, and someone with no knowledge.

Firstly, the book was a fairly broad overview on integrating aspects of artificial intelligence into application development. Therefore it was no surprise that a majority of the AI concepts discussed were optimization oriented. The book provided an overview of the history of AI, then it covered simulated annealing, adaptive resonance theory, ant algorithms/ant colony optimization, neural networks, genetic algorithms, a chapter on using some of the techniques described to run artificial life simulations, rules based systems, fuzzy logic, bigrams/Markov chains, agent based software and a wrap up the future of AI.

Each chapter is fairly formulaic. It starts with an overview of the theory of the subject for that chapter, and the presentation of any relevant formula. Then a basic example is covered, without code, in order to show how the theory being covered can be applied. Then a specific example with C source is covered, with a detailed explanation of what’s going on. I largely glossed over the source, as I don’t really have much familiarity with C, so cannot comment on the quality of the source.

Overall I have mixed feelings about the book. A lot of the topics have domain specific language, such as referring to parts of algorithms as genes and candidate algorithms as chromosomes in genetic algorithms, and I feel the author sticks to these to the detriment of comprehension of the topic. For example, the author was discussing “energy” and “temperature” with relation to simulated annealing, without outlining that it’s goal was to narrow the search space for a solution to a  given problem.

Another detracting factor was the numerous errors in diagrams, equations and the text. These errors can be attributed to a variety of sources, from poor copying jobs on some of the diagrams to bad editing. One particularly annoying error was the diagram showing how a NOT logic gate could be constructed using a single neuron neural network – the diagram showed an input bias of 1, with a weight of 1 on the input, claiming that the output will always be the inverse of the input, by the equation output = input * input weight + bias.
An input of 0 into this equation will indeed output a 1, however an input 1 will yield 2, which is incorrect. The correct diagram should have a input weight of -1, which will yield a correct answer with both the 0 and 1 inputs. These errors hindered the understanding of the topic.

However, it’s not all bad news. From my personal reading, I had always struggled with how the back propagation algorithm in neural networks worked, and this book managed to successfully explain it. A lot of the other topics, such as fuzzy logic and the rules based system were also well presented and relatively straight forward.

From reading this book, I feel I have a solid enough understanding of the basics of the topics covered that I could successfully go out and perform my own, more in depth research and succeed fairly well, as well as integrate a few of the more rudimentary features of these approaches into an application.