Flappy Bird Model Predictive Control

Phil came up with this idea. Play the 2013 smash-hit game Flappy Bird using a model predictive control approach where the model is phrased as a mixed integer program. You can see the result in the above video. The red line shows our controller's planned route, which changes as new obstacles come into view. I'll break this all down a little bit.

Model predictive control is a method of control where you model the system out to a finite horizon and optimize the trajectory with respect to the series of inputs. Then you act according to the first step in your optimal input and repeat the process at the next time step with a little bit of new information.

We expressed the model as a mixed integer linear program, an optimization problem with linear constraints and objective functions where all or some of the variables can be integers or booleans. In this case, the constraints impose both the physics of the game—ballistic motion with discrete impulses from flaps—and the objective of avoiding the pipes, floor, and ceiling. The input in this case are a series of booleans describing whether or not the bird jumps in each time step. We implemented the model in CVXPY, a Python package and domain-specific modeling language for convex optimization problems.

We forked Sourabh Verma's Pygame implementation of Flappy Bird and hacked it up a bit. At each time step, Flappy Bird calls our function for input. It passes the current state to our controller, which updates the initial conditions and pipe positions, solves for a trajectory, and returns the first action from its optimal input.

This technique works pretty well. It doesn't quite run in real time with the lookahead set to a distance that allows it to succeed. We used a neat trick to improve the speed and look ahead distance. The model's time step increases with look ahead time. In other words, the model is precise for its first few time steps, and gets less careful later in its prediction. The thinking is that this allows it to make approximate long term plans about jump timing without over-taxing the solver.

import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt


N = 20 # time steps to look ahead
path = cvx.Variable((N, 2)) # initialize the y pos and y velocity
flap = cvx.Variable(N-1, boolean=True) # initialize the inputs, whether or not the bird should flap in each step
last_solution = [False, False, False] # seed last solution
last_path = [(0,0),(0,0)] # seed last path

PIPEGAPSIZE  = 100 # gap between upper and lower pipe
PIPEWIDTH = 52
BIRDWIDTH = 34
BIRDHEIGHT = 24
BIRDDIAMETER = np.sqrt(BIRDHEIGHT**2 + BIRDWIDTH**2) # the bird rotates in the game, so we use it's maximum extent
SKY = 0 # location of sky
GROUND = (512*0.79)-1 # location of ground
PLAYERX = 57 # location of bird


def getPipeConstraints(x, y, lowerPipes):
    constraints = [] # init pipe constraint list
    for pipe in lowerPipes:
        dist_from_front = pipe['x'] - x - BIRDDIAMETER
        dist_from_back = pipe['x'] - x + PIPEWIDTH
        if (dist_from_front < 0) and (dist_from_back > 0):
            constraints += [y <= (pipe['y'] - BIRDDIAMETER)] # y above lower pipe
            constraints += [y >= (pipe['y'] - PIPEGAPSIZE)] # y below upper pipe
    return constraints

def solve(playery, playerVelY, lowerPipes):

    pipeVelX = -4 # speed in x
    playerAccY    =   1   # players downward accleration
    playerFlapAcc =  -14   # players speed on flapping

    # unpack path variables
    y = path[:,0]
    vy = path[:,1]

    c = [] # init constraint list
    c += [y <= GROUND, y >= SKY] # constraints for sky and ground
    c += [y[0] == playery, vy[0] == playerVelY] # initial conditions

    x = PLAYERX
    xs = [x] # init x list
    for t in range(N-1): # look ahead
        dt = t//10 + 1 # let time get coarser further in the look ahead
        x -= dt * pipeVelX # update x
        xs += [x] # add to list
        c += [vy[t + 1] ==  vy[t] + playerAccY * dt + playerFlapAcc * flap[t] ] # add y velocity constraint, f=ma
        c += [y[t + 1] ==  y[t] + vy[t + 1]*dt ] # add y constraint, dy/dt = a
        c += getPipeConstraints(x, y[t+1], lowerPipes) # add pipe constraints

    objective = cvx.Minimize(cvx.sum(flap) + 10* cvx.sum(cvx.abs(vy))) # minimize total flaps and y velocity

    prob = cvx.Problem(objective, c) # init the problem
    try:
        #prob.solve(verbose = False) # use this line for open source solvers
        prob.solve(verbose = False, solver="GUROBI") # use this line if you have access to Gurobi, a faster solver

        last_path = list(zip(xs, y.value)) # store the path
        last_solution = np.round(flap.value).astype(bool) # store the solution
        return last_solution[0], last_path # return the next input and path for plotting
    except:
        try:
            last_solution = last_solution[1:] # if we didn't get a solution this round, use the last solution
            last_path = [((x-4), y) for (x,y) in last_path[1:]]
            return last_solution[0], last_path
        except:
            return False, [(0,0), (0,0)] # if we fail to solve many times in a row, do nothing

If you want to run it, you'll need to download Gurobi or switch to an open source solver.

Author | Ben Wiener

Background in physics. Also interested in computing, robotics, hiking, woodworking, and other things.