Minkowski Sum Collision

10/30/24

Minecraft worlds are great. They are completely axis-aligned, which means we can slide along any part of the terrain along one axis at a time, which lets us use our simple collision detection. As soon as we introduce non-axis aligned planes to the terrain, this no longer holds true. Our simple collision test no longer works, and we need something that would allow us to slide along diagonal faces. So how do we handle arbitrary terrain?

Introducing the Minkowski Sum

The Minkowski sum of two convex shapes is the convex hull of the list of points formed by taking every vertex of shape A and adding every vertex of shape B to it. That is, the list is A_total_vertices * B_total_vertices long. The convex hull can be found any way you want. The quickhull algorithm is the best for arbitrary shapes.

The Minkowski sum represents everywhere the origin of shape B can be, such that the two shapes are touching but not overlapping. This means that we can cast a ray from the origin of shape B onto this shape to see where the two will collide.

So we know we can collide any convex shape with any other convex shape. If we want to be like Quake or its sequels, we will make entities be AABBs, but have the world be made out of arbitrary convex shapes.

Quake 1 baked expanded versions of these convex shapes into separate BSP trees for several different fixed size AABBs. There was one for human shaped entities, another for big monsters, maybe one for projectiles, I don't know. Quake 2 on the other hand kept the base convex shapes (they called them "brushes") in a list, and would apparently expand them at collision time for arbitrary AABBs.

Raycasting against a convex shape

All we need to know is the set of faces that make up the shape. I like to have a list of positions and a list of faces, where each face is a list of indices into the positions. Then we just cast our ray against each of those faces and pick the nearest hit (aka lowest t value). Importantly, faces must be specified with a consistent winding order.

Raycasting a face:

Get the normal of the face. If the ray is pointed in the direction of the normal (aka the dot product with the ray direction is positive), ignore the face.

Dot any point on the face with the normal to get the distance part of the plane equation for the face. Now you have a plane.

Get the distance from the plane to the origin of the ray. If it is less than zero, ignore the face.

Get the distance from the plane to the endpoint of the ray.

Divide the first distance by the sum of the absolute value of the two distances to get the t value for the raycast. The first distance is always >= 0, and the second is always <= 0, so just negate the second and add it to the first for the total distance.

Scale the ray by t and add it to the ray origin to get the potential hit point on the face.

Iterate through the face's edges in the winding order. For each edge, calculate the cross product A x B, where A is the edge vector and B is the vector from the first vertex to the potential hit point. Dot this cross product with the face normal. If the result is less than zero, the ray did not hit. Explanation: If the cross product agrees with the normal, the potential hit point is on the internal side of the edge. Otherwise, it's on the external side. If we come across an edge that the potential hit point is on the external side of, that point is not on the face, since all our faces are convex.

If you make it through all the edges without breaking out, congratulations, your ray hit.

Move and slide

Each tick, our entity has a displacement to consume, as before in the Minecraft example. We do the following in a loop while the displacement is not approximately zero:

Cast the origin of the entity AABB against the expanded shapes. Pick the nearest hit point (lowest t).

If we hit nothing, just move the entity there and exit the loop.

Otherwise, move the entity to the hitpoint and nudge it away from the hit surface by a small distance like 0.0001. This is to prevent tunneling. Since we support sliding along non-axis-aligned planes, we no longer have the assurance of stable floating point distances from those planes while sliding, like we did in the Minecraft example. That is, the Minecraft example relies on vector components normal to hitplanes staying exactly the same while sliding along them.

Scale the displacement by (1 - t) and project it onto the hitplane normal to get the new, remaining displacement.

Project the entity's velocity onto the hitplane normal to get the new velocity.

The problem

The process described above will work fine for environments with no acute crevices. However as soon as we hit near the seam of an acute crevice, the nudge away from the hitplane will push us past the opposite hitplane, leading to tunneling.

The solution

We need to detect when we hit near a seam, then see if that seam is acute. A seam is acute if the normals of its two planes are obtuse to each other (their dot product is negative). If the seam is acute, push the entity out along the vector bisecting the angle between the planes instead of along the hitplane normal. In the case of an acute crevice formed by multiple planes, the vector will be the normalized average of all the plane normals.

To get the faces "touching" the hitpoint, we iterate through all the faces in the scene. For each face:

Get the distance from the plane of the face to the hitpoint.

If the absolute value of that distance is greater than the epsilon we used earlier (0.0001), move on to the next face.

Otherwise, do the "in polygon" test that we did earlier for the raycast on the hitpoint, which tests if the hitpoint is in the region formed by extruding the face infinitely in both directions. If so, the face is "touching" our hitpoint. Add it to the list.

Once we have the faces, get the normalized average of their normal vectors as described above. Now, every unit traveled along this vector will move us only some fraction of a unit away from all the "touching" planes. So we have to move by something larger than our epsilon if we want to be at least 1 epsilon away from all the planes once all is said and done. You can play with the value. I found 4 * epsilon worked well in my block game, but I don't have super acute angles, so you might have to play with it.