Flappy Bird Model Predictive Control

Phil came up with this idea. Play the 2013 smash-hit game Flappy Bird using a model predictive control approach where the model is phrased as a mixed integer program. You can see the result in the above video. The red line shows our controller's planned route, which changes as new obstacles come into view. I'll break this all down a little bit.

Model predictive control is a method of control where you model the system out to a finite horizon and optimize the trajectory with respect to the series of inputs. Then you act according to the first step in your optimal input and repeat the process at the next time step with a little bit of new information.

We expressed the model as a mixed integer linear program, an optimization problem with linear constraints and objective functions where all or some of the variables can be integers or booleans. In this case, the constraints impose both the physics of the game—ballistic motion with discrete impulses from flaps—and the objective of avoiding the pipes, floor, and ceiling. The input in this case are a series of booleans describing whether or not the bird jumps in each time step. We implemented the model in CVXPY, a Python package and domain-specific modeling language for convex optimization problems.

We forked Sourabh Verma's Pygame implementation of Flappy Bird and hacked it up a bit. At each time step, Flappy Bird calls our function for input. It passes the current state to our controller, which updates the initial conditions and pipe positions, solves for a trajectory, and returns the first action from its optimal input.

This technique works pretty well. It doesn't quite run in real time with the lookahead set to a distance that allows it to succeed. We used a neat trick to improve the speed and look ahead distance. The model's time step increases with look ahead time. In other words, the model is precise for its first few time steps, and gets less careful later in its prediction. The thinking is that this allows it to make approximate long term plans about jump timing without over-taxing the solver.

import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt


N = 20 # time steps to look ahead
path = cvx.Variable((N, 2)) # initialize the y pos and y velocity
flap = cvx.Variable(N-1, boolean=True) # initialize the inputs, whether or not the bird should flap in each step
last_solution = [False, False, False] # seed last solution
last_path = [(0,0),(0,0)] # seed last path

PIPEGAPSIZE  = 100 # gap between upper and lower pipe
PIPEWIDTH = 52
BIRDWIDTH = 34
BIRDHEIGHT = 24
BIRDDIAMETER = np.sqrt(BIRDHEIGHT**2 + BIRDWIDTH**2) # the bird rotates in the game, so we use it's maximum extent
SKY = 0 # location of sky
GROUND = (512*0.79)-1 # location of ground
PLAYERX = 57 # location of bird


def getPipeConstraints(x, y, lowerPipes):
    constraints = [] # init pipe constraint list
    for pipe in lowerPipes:
        dist_from_front = pipe['x'] - x - BIRDDIAMETER
        dist_from_back = pipe['x'] - x + PIPEWIDTH
        if (dist_from_front < 0) and (dist_from_back > 0):
            constraints += [y <= (pipe['y'] - BIRDDIAMETER)] # y above lower pipe
            constraints += [y >= (pipe['y'] - PIPEGAPSIZE)] # y below upper pipe
    return constraints

def solve(playery, playerVelY, lowerPipes):

    pipeVelX = -4 # speed in x
    playerAccY    =   1   # players downward accleration
    playerFlapAcc =  -14   # players speed on flapping

    # unpack path variables
    y = path[:,0]
    vy = path[:,1]

    c = [] # init constraint list
    c += [y <= GROUND, y >= SKY] # constraints for sky and ground
    c += [y[0] == playery, vy[0] == playerVelY] # initial conditions

    x = PLAYERX
    xs = [x] # init x list
    for t in range(N-1): # look ahead
        dt = t//10 + 1 # let time get coarser further in the look ahead
        x -= dt * pipeVelX # update x
        xs += [x] # add to list
        c += [vy[t + 1] ==  vy[t] + playerAccY * dt + playerFlapAcc * flap[t] ] # add y velocity constraint, f=ma
        c += [y[t + 1] ==  y[t] + vy[t + 1]*dt ] # add y constraint, dy/dt = a
        c += getPipeConstraints(x, y[t+1], lowerPipes) # add pipe constraints

    objective = cvx.Minimize(cvx.sum(flap) + 10* cvx.sum(cvx.abs(vy))) # minimize total flaps and y velocity

    prob = cvx.Problem(objective, c) # init the problem
    try:
        #prob.solve(verbose = False) # use this line for open source solvers
        prob.solve(verbose = False, solver="GUROBI") # use this line if you have access to Gurobi, a faster solver

        last_path = list(zip(xs, y.value)) # store the path
        last_solution = np.round(flap.value).astype(bool) # store the solution
        return last_solution[0], last_path # return the next input and path for plotting
    except:
        try:
            last_solution = last_solution[1:] # if we didn't get a solution this round, use the last solution
            last_path = [((x-4), y) for (x,y) in last_path[1:]]
            return last_solution[0], last_path
        except:
            return False, [(0,0), (0,0)] # if we fail to solve many times in a row, do nothing

If you want to run it, you'll need to download Gurobi or switch to an open source solver.

»


Solving O'Gear Brain Teaser with Haskell Graph Search

Declan showed me a brain teaser his friend sent him. Apparently, it's known as the Hanayama O'Gear Puzzle. It consists of a bronze colored metal cube with a captive gear which can roll from face to face on the cube. The gear can also rotate between two directions on each face. Some edges of the cube are made so that the gear can't roll over them. One tooth of the gear and one face of the cube have a special cut-out. Only when the special tooth is engaged with the special face can the gear be freed. The object of the puzzle is to manipulate the gear into this orientation. Here's a video of the puzzle in action.



I tried solving the puzzle for about fifteen minutes before deciding to use a computer. My basic strategy was to treat the problem as a graph search, where the nodes of the graph are the different states the cube and gear can be in. At any time, you can only do three things: rotate, roll forward, and roll backward. I figured that the limited set of moves and possible positions would prevent the search space from getting too big.

I marked up the cube with stickers, naming each face and defining its 'north' direction. I also marked the gear by labeling each tooth and setting a forward rotation direction.

I don't know Haskell very well, so I thought using it would be a fun challenge. Plus, I had Haskell guru Phil to ask for help and advice. A lot of the code is just defining the connectivity of the puzzle. I used a list to keep track of the links between different faces and a pattern-matched function to keep track of the possible rotations. I defined north on each face so that these rotations were always the same. A north-facing gear could only rotate to the east, and a south-facing one could only rotate west.

Beyond the trickiness of defining directions and connectivity, I just implemented a straightforward breadth-first search. The program creates a list of all states accessible from the starting state. Because the list is generated with successive expansions of the search frontier, it is naturally sorted by search depth. I take advantage of Haskell's lazy evaluation, creating an infinite list of accessible states but only ever calculating a few of them. Code below.

import Data.Maybe (catMaybes)
import Data.List (find)
import Data.Tuple (swap)

data Face = Top | Bot | Lef | Ri | Front | Back deriving (Show, Eq)
data Dir = North | South | East | West deriving (Show, Eq)

data GearState = GearState{ face :: Face, dir :: Dir, tooth :: Int, history :: [(Face, Dir)] } deriving Show

-- a list defining connections between faces
forward_connections = 
    [ ((Top, East),(Ri, South))
    , ((Top, West),(Lef, South))

    , ((Bot, North),(Front, West))
    , ((Bot, South),(Back, East))
    , ((Bot, West),(Ri, North)) 

    , ((Front, South),(Bot, North)) 
    , ((Front, West),(Back, West)) 

    , ((Back, South),(Bot, North)) 
    , ((Back, East),(Ri, East)) 
    , ((Back, West),(Lef, West)) 

    , ((Lef, North),(Top, East)) 
    , ((Lef, East),(Front, East))
    , ((Lef, West),(Back, West)) 
    
    , ((Ri, North),(Top, West)) 
    , ((Ri, South),(Bot, East)) 
    , ((Ri, East),(Back, East))  
    ]

-- mapping the reverse tuple function over the connections gives the connections for reverse gear moves
backward_connections = map swap forward_connections

-- a function which returns a gear moved forward or backward
-- takes a list of connections between faces and a function for finding the next tooth
-- forward_connections and incr for forward moves
-- backward_connections and decr for backward moves
move :: GearState -> [((Face, Dir),(Face, Dir))] -> (Int -> Int) -> Maybe GearState
move (GearState face dir tooth history) connections nextTooth = 
    case (lookup (face, dir) connections) of -- looks for a connection
        Nothing -> Nothing -- if no forward connection exists from the current state
        (Just (new_face, new_dir)) -> Just (GearState new_face new_dir (nextTooth tooth) ((new_face,new_dir):history)) -- returns a new state, incrementing the tooth and adding to the history

-- a function which applies a turn to the gear
turn :: GearState -> GearState
turn (GearState face North tooth history) = GearState face East tooth ((face,North):history)
turn (GearState face East tooth history) = GearState face North tooth ((face,East):history)

turn (GearState face South tooth history) = GearState face West tooth ((face,South):history)
turn (GearState face West tooth history) = GearState face South tooth ((face,West):history)

-- some mod 5 math for the teeth
mod5 :: Int -> Int
mod5 = (flip mod) 5

incr :: Int -> Int
incr x = mod5 (x+1)

decr :: Int -> Int
decr x = mod5 (x-1)

-- determines if the gear can leave the cube
isWinningState :: GearState -> Bool
isWinningState (GearState Top North 0 _) = True
isWinningState _ = False

-- creates a list of all the states accessible from the current state
nextStates :: GearState -> [GearState]
nextStates state = catMaybes [Just (turn state), (move state forward_connections incr), (move state backward_connections decr)] -- removes Nothings and strips the values out of Just

-- creates a new search frontier by replacing each state in the search frontier with all the states accessible from that state and flattening
newFrontier :: [GearState] -> [GearState]
newFrontier frontier = (concatMap nextStates frontier)

-- creates a list of all future frontiers
reachableStates :: GearState -> [GearState]
reachableStates start = concat (iterate newFrontier [start])

-- finds the winning state
solve :: GearState -> Maybe [(Face, Dir)]
solve start = fmap history (find isWinningState (reachableStates start))

You can run it with a command like, solve (GearState Top South 4 []). This code, and a different Python solver I wrote are available in my Github.

»


Walnut and Oak Coffee Grinder



In my experience, woodworking is all about finding an excuse to build something. Liz's cousin was getting married. We thought the couple, an extremely sweet pair of coffee-loving organic farmers, would especially appreciate a hand made gift. A coffee grinder seemed appropriate. I ordered some wood and we got designing. Here's what we came up with. The main body contains a ceramic burr grinder core I bought online and accepts beans in the top. It nests into a square box-shaped base that collects the grounds.



I'd never ordered wood online before but it worked well. The biggest difference is selection. I was able to buy us boards that were already cut to the 3/8" thickness we designed around without having to pay for custom milling. We started by cutting the pieces of the body. It has the shape of a truncated pyramid with a 9° half-angle, with miter joint edges.



The best way we found to cut these pieces was on the chop saw. This tool allowed us to simply set the angles we wanted without thinking too much. It was just shy of being able to make the cut we needed, leaving a tiny bridge of wood we needed to remove by hand.



We were quickly able to glue up the body. We tried a tip I saw online: lining the inner-corners-to-be with masking tape. The idea is that this would prevent glue residue from hiding in these corners, since removal of that glue is always tricky. The problem is that it left us with tiny pieces of blue tape in the corners instead. These pieces were not particularly easy to remove and were particularly easy to see. Maybe we needed to be a little more careful about the tape alignment.



Next we worked on the base. It is a simple square shape, again with miter joints all around. We were able to cut this very easily with a stop set on the table saw sled.



It was a little tricky to glue up. Liz was unphased.



The next part was a bit of an adventure. We needed a block to accept the grinder core and nest inside the body. It required a bore of almost 2" and a thickness of about 1.25". I bought us an adjustable circle cutter bit for the drill press to drill the hole, and we used a spare piece of 8/4 red oak. Cutting the hole was at the very edge of our limits. The cutter could barely reach, even attacking the wood from both faces. Oak is hard. On even the highest torque setting, the motor kept stalling. The drill press, circle cutter, and Liz were all unhappy. I was, as usual, extremely brave and also cool. Eventually we got the hole cut.



We used a router to chamfer the top edge of the hole and reduce the thickness near the hole to 1.25". I also chiseled out flats for the grinder core's clips to sit. You can see the chamfer and flats below. We also had to cut the wall angle into the edges.



We reinforced all these miter joints with oak splines. This was pretty easy on the table saw sled. For the body, we tilted the blade to that 9° angle so that the splines would be parallel to the top and bottom edges of the grinder.



After fitting the core block in place, we drilled holes for pins and glued them in place.



We planed and sanded it smooth.



Last was finishing. Since this was for food, we just wiped it down with walnut oil sold for cooking.



We kept wiping it down with more oil every day for about a week. I read a rule of thumb about oil finish on kitchen pieces. Once a day for a week, once a week for a month, then once a month forever. Yeah, that's gonna happen for sure. The coffee bean oil will keep it safe.



We were very pleased with how it came out. Also, this was my first woodworking project with Liz. It was nice!



At press time, we have not tested the coffee grinder. I hope it works.

»


Oak and Canarywood Desk



Matt and I just finished our greatest woodworking project yet! We designed and built a desk for Liz out of red oak with canarywood highlights. It took us about ten days, working most of each day. We started with a CAD design.



Here's what we came up with. The design featured a large edge joined top with two canarywood stripes in it. Showing through the top were through tenons from inward tapered legs. We also included a pair of small drawers, housed in a case which was suspended from the top with sliding dovetails. We hid the ends of the dovetail slot by gluing the top in two stages.

We used a long piece of canarywood for accents all over. Thanks to Bruce and Eileen for the canarywood board!



We loaded up the car with tools stored at Mom and Dad's and headed to my house in Providence, where we set up a temporary shop in my landlord's garage. On the way, we snapped up a $50 router table off of a Craigslist in White Plains.



First, we laid out the top from 5/4 red oak boards and canary wood ripped to 5/4 slices.



We edge joined the front part of the top, leaving the last board out for now. Once the glue had dried, we cut two long dovetail slots which ended a few inches from the front edge. These slots would receive the drawer case and would be hidden once we edge joined the final board.



Next, we built the case for the drawers. Our design called for miter joints on the bottom corners and sliding dovetails on top. A shelf to support the second drawer would slide into a dado. We ripped 1x12 red oak boards to size, and cut the sliding dovetail into the top edge using our new router table. We made the dado and miters on the table saw. We also cut 2" oak boards to length for the legs and tapered them on the table saw.



Our design had these dovetails running with the grain of the wood, and we started thinking about their durability. A few experiments showed them to be spookily fragile. Paranoid, we drilled holes and knocked in some dowels. Not sure if this even makes a difference. Maybe it made them weaker.



We strengthened each miter joint with a pair of dovetailed splines. A router table jig helped us cut slots into the corners. We used canarywood for the splines themselves.



After the glue dried, we cut off the excess and planed the splines flat. We were very happy with the results.



We slid the drawer case into the slots in the top and edge joined the last board in place.



We cut through mortises into the top and tenons into the legs. The ends of the tenons would be prominent, so we drove two canarywood wedges into each. We also put together the stretchers with mortises and tenons.



Drawer time. We used our router based planer to thin a bunch of 1" thick stock to 3/8". This was unpleasant and didn't work very well. We had trouble finding good ways to clamp the work, and the thickness varied as a result.

We designed the drawers with the same basic construction as the drawer case. Sliding dovetails on one end, splined miter joints on the other.





We hand carved some nice drawer pulls, too. Used the band saw to rough them out and then chisels and sand paper to get the shape we wanted. The pulls connected to the drawers with one large dovetail each.





We planed the top flat and sanded everything.


Looks pretty good, if you ask me!



We finished the project with several coats of danish oil.







»


Optimizing Steering Car Paths with PyTorch


Want to see another bizarre way to use PyTorch? If you've read some of my recent posts, you've seen the basic idea before. I'm using automatic differentiation and gradient descent, this time to optimize the path of a car. The model is similar to a Reeds-Shepp car, a simple steering car model. The state at time step \(i\) is defined by a 2D position and an angle, \((x_i, y_i, \theta_i)\). In each step of the simulation, the car's state is updated based on an input consisting of speed and steering angle \((v_i, \phi_i)\).

\(x_{i+1} = x_i + v_i \cos{\theta_i} dt\)
\(y_{i+1} = y_i + v_i \sin{\theta_i} dt\)
\(\theta_{i+1} = \theta_i + v_i \phi_i dt\)

I simulated the car for \(n_\text{steps}\) steps, with a target position and angle, \((x_t, y_t, \theta_t)\) and a cost function, \(c\), which depended on the distance from the target position and the distance from the target angle in the final step.

\(c= \left(\theta_t - \theta_n\right)^2 +\left(x_t - x_n\right)^2 + \left(y_t - y_n\right)^2 \)

I used automatic differentiation to find the derivates \(\frac{dc}{dv_i}\) and \(\frac{dc}{d\phi_i}\), which I used to update the inputs.

Check out some results. The animated plot below shows an example optimization, with the working solution evolving over time. The car starts at \((0,0)\), facing right and its target is \((3,3)\), facing down. The starting position and direction are represented by the red arrow and the target position and direction are represented by the green arrow.

In the first frame of the animation, the inputs are initialized to zero and the car does not move. In the ultimate solution, the car performs a wide left turn into its final position. The plots below shows the input that produced that trajectory and the evolution of the cost function.

Code below.

import torch
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize

n_steps = 200 # number of steps to test the car performance
speeds = torch.zeros(n_steps-1, requires_grad=True) # init speed control tensor
steers = torch.zeros(n_steps-1, requires_grad=True) # init steering control tensir
dt = .1 # time step for dynamics

target_pos = torch.tensor([3.,3.]) # target position for the car
target_angle = torch.tensor(3*np.pi/2.) # target angle for the car

# finds the trajectory from the speed and steering commands
def test_car(speeds, steers):
    angles =[torch.tensor(0.)]
    xs = [torch.tensor(0.)]
    ys = [torch.tensor(0.)]

    # simulate the car performance
    for i in range(1,n_steps):
        speed =  torch.clamp(speeds[i-1], -1., 1.) # limit the speed
        steer = torch.clamp(steers[i-1], -10., 10.) # limit the steering angle
        xs.append(xs[-1]+ dt * speed * torch.cos(angles[-1]))
        ys.append(ys[-1]+ dt * speed * torch.sin(angles[-1]))
        angles.append((angles[i-1] + dt * speed * steer)%(2*np.pi))
    return xs, ys, angles


# list to keep track of objective progress
costs = []
angle_costs = []
dist_costs = []

# cost plot
fig_cost = plt.figure()
ax_cost = fig_cost.add_subplot(111)
ax_cost.autoscale(enable=True, axis="y", tight=False)
ax_cost.set_xlabel("iteration")
ax_cost.set_ylabel("cost")

li_cost, = ax_cost.plot([],[])
li_cost_ang, = ax_cost.plot([],[])
li_cost_dist, = ax_cost.plot([],[])

# input plot
fig_in = plt.figure()
ax_in = fig_in.add_subplot(111)
ax_in.autoscale(enable=True, axis="y", tight=False)

li_speed, = ax_in.plot(speeds.detach().numpy())
li_steer, = ax_in.plot(steers.detach().numpy())

# trajectory plot
fig_traj = plt.figure()
ax_traj = fig_traj.add_subplot(111)

li_traj, = ax_traj.plot([])
l = 0.25
ax_traj.arrow(0,0,0,l, width=0.05, color="red")
ax_traj.arrow(*target_pos, l* torch.cos(target_angle), l*torch.sin(target_angle), width=0.05, color="green")
ax_traj.set_xlim(-1.,4.)
ax_traj.set_ylim(-1.,4.)

fig_cost.canvas.draw()
fig_traj.canvas.draw()
fig_in.canvas.draw()
plt.show(block=False)

optimizer =  torch.optim.SGD([speeds, steers], lr=0.1)
#optimizer =  torch.optim.Adam([speeds, steers])

for i in range(170):
    # run simulation
    xs, ys, angles = test_car(speeds, steers)

    # calculate costs
    angle_error = target_angle - angles[-1]
    angle_cost = torch.min(torch.min((angle_error+2*np.pi)**2, (angle_error-2*np.pi)**2), angle_error**2)

    dist_cost = (target_pos[0] - xs[-1])**2 + (target_pos[1] - ys[-1])**2   

    cost = dist_cost + angle_cost

    # save costs
    costs.append(cost)
    angle_costs.append(angle_cost)
    dist_costs.append(dist_cost)

    #calculate gradients
    optimizer.zero_grad()
    cost.backward()

    # update inputs
    optimizer.step()

    # update plots
    li_cost.set_data(range(len(costs)),costs)
    li_cost_ang.set_data(range(len(angle_costs)),angle_costs)
    li_cost_dist.set_data(range(len(dist_costs)),dist_costs)
    li_speed.set_ydata(speeds.detach().numpy())
    li_steer.set_ydata(steers.detach().numpy())
    li_traj.set_data(xs, ys)
    
    ax_cost.relim()
    ax_cost.autoscale_view()
    ax_cost.legend(["total", "angle", "dist"])

    ax_in.relim()
    ax_in.autoscale_view()
    ax_in.legend(["speed", "steer"])

    fig_traj.canvas.draw()
    fig_cost.canvas.draw()
    fig_in.canvas.draw()
»