How to Make an Unstable Physics Engine
Main demo here
physics.js is a physics engine I'm making. This page covers the main pieces someone might be interested in learning from. Lessons learned concerning stability and performance that are buried in larger engines.
Collisions
The engine is based around simulating everything with atoms and bonds, which means that collisions would form the core of the engine and that it was important I get them right. A bit of research turned up results to just use ready-made engines, and a bit more research turned up basic elastic collision equations, like wikipedia.
This is what the simulation looks like when implementing the equations as they're written:
Simulation grade. The problem is that the equations are modeled after reality, which effectively has an infinitely small timestep, and we can't simulate that in any reasonable timeframe. Our simulation will have to make do with the large timesteps we give it.
This is the pseudocode the simulation is using, implementing real-world equations:
The problem with the stack of atoms collapsing together occurs because the atoms never get the chance to fully separate. Our engine applies the collision equations whenever the atoms are colliding, but those equations only make sense if the atoms are traveling towards each other. Our stack is so dense that atoms can stay overlapping each other even though their velocities should separate them.
The fix, pretty simply, is to ignore the velocity component when the atoms are already traveling away from each other. The only modification needed is to the veldif equation.
With this we can see a pretty noticeable improvement:
veldif | max(veldif,0) |
The stack still looks a little too crushed at the bottom. Even if this is reflective of high pressures in the real world, the engine isn't really applying pressure to the bottom of the stack. The volatile movement is mostly being caused by the atoms pushing each other's positions away from each other. We need a way for that change in position to result in a change in velocity to allow the bottom to build up pressure and press up against gravity. The modification is once again to veldif by adding posdif to it
With this we finally have a stack that can support itself.
veldif | max(veldif,0) | max(veldif,0) + posdif |
The final corrected function is:
The choice to add posdif to the velocity came after a lot of trial and error. Other attempts involved having a multiplier increase every time an atom got hit, or increasing the relative mass of atoms lower to the ground. All of these felt hackish, especially since some simulations might have multiple fields of gravity. posdif was the simplest solution that offered decent results and didn't blow up the simulation, so it's what I stuck with.
Springs
Springs (called bonds) form the second half of creating solid objects in the engine, so it's equally important that we get them right too. The standard equations for springs follow Hooke's law. Unlike the early problems with atoms collapsing, applying Hooke's law results in meshes that explode:
This is the pseudocode that implements Hooke's law as it's written:
Once again, the infinitesimal timestep of the real world allows the traditional spring equations to work. The mesh explosion results from the engine accumulating all of the acceleration changes at once, and then applying them all at once in a separate step. When the atoms are perfectly aligned, forces stay in balance. But, because our simulation is a flawed world with large timesteps and floating-point rounding, as soon as that balance is upset a little, the corrective spring forces build up in one direction, and in the next step build up in the other direction and so on.
The fix is to change when we apply the forces so that the springs naturally moderate themselves. We'll modify the function to apply the acceleration when the spring is evaluated instead of waiting to do it all at once, like so:
Modifying the positions during evaluation prevents the forces from building up. This results in a much more stable mesh.
acc += force
pos += force * dt * dt * 0.5 |
pos += force * dt * dt * 0.5 vel += force * dt |
The final corrected spring function is:
Once meshes get large enough, however, forces can still build up regardless.
The problem isn't really on the engine this time. It has to do with how meshes are normally constructed. In the demo above, for example, we have two loops that build the meshes from upper-left to bottom-right.
When meshes are evaluated in an order like this, forces tend to build up in one direction like a wave. The oscillations in the demo are always biased to one corner if you watch closely.
We can fix this by randomizing the order we evaluate springs. Any randomization will work, even if it's low quality. As long as the forces are spread out then the mesh will stabilize.
Static ordering | Random ordering |
When figuring out how to stabilize the springs, updating the position quickly became the obvious choice. The only problem was that choosing the wrong multiplier created springs that were almost too stable. Boxes would barely rotate when a corner was hit. After a lot of trial and error, I eventually found that just unrolling the acceleration equations in the spring function was enough to make things stable without any weird hacks or multipliers.
Broadphase
If you're driving a car in the U.S., you likely won't be worried about a car making a left turn in China. Although humans can make this distinction easily, a computer won't know to. Anything can collide with anything, even if they're far away. So, if we have 1000 objects, then there will be about half a million pairs of collisions we'll need to test - half a million for every single step of the engine.
The broadphase step does the job of quickly determining which objects are likely to collide and which ones aren't. An easy way to do this is to draw a grid. If two objects overlap the same cell, check if they collide. This grid scheme runs into problems with differently scaled objects though: a mountain will take up millions of cells, while a mouse will take up one.
Our engine instead uses a bounding volume hierarchy (BVH), which looks like this:
This demo doesn't show the full hierarchy since it would be too messy, but the algorithm works by subdividing the world into smaller and smaller boxes. We first check if an object overlaps the top box. If it does then we check if it overlaps any of its children until we get all the way down to an actual, tangible object. The hard part of this scheme is deciding the best way to divide up everything.
I decided to compare the different construction methods for evaluation time and BVH quality. I've listed the performance values below, but without going into the exact differences, the method I decided to go with involved:
1. | Pick a point in each object's AABB. I went with the minimum vertex to keep things simple. |
2. | Compute a parent AABB containing all of these points. |
3. | Find the axis with the largest max[axis]-min[axis] in the parent AABB. |
4. | Find the midpoint (min[axis]+max[axis])/2 and split objects by whether their AABBs are above or below this value. |
5. | Continue subdividing each side. |
I call this the center split method. Other common methods involve splitting by surface area of the child AABBs, or sorting them by Morton codes and building the tree from the bottom up.
The next thing to look at was traversing the BVH during collision checks. In javascript, every array access is treated as an affront to god and punished just as harshly, so anything that could help accesses would pay dividends. To speed up traversal, we'll flatten the tree.
Flattening a tree involves rearranging it in memory so parents and siblings are nearby to each other in physical memory. In our case, the tree was arranged so the left child would immediately follow a parent. The right child would come after the left child and all its children. This is what it looks like compared to a tree arranged top to bottom in memory:
Notice that differences between child - parent values in the top-to-bottom tree sum to 15, while it's only 11 in the flattened tree.
In the end flattening the tree resulted in an amazing 43% speed up. Converting to webassembly was always slower no matter how I did it.
2D times
Scheme | Time (s) | Volume (u^2) | AABB Tests (k) |
Center (flat) | 0.106 | 13,467 | 3,755 |
Mean (flat) | 0.108 | 16,260 | 3,969 |
Morton (WASM) | 0.127 | 16,110 | 2,432 |
Grid | 0.137 | N/A | N/A |
Mean | 0.180 | 16,260 | 2,333 |
Center | 0.184 | 13,467 | 2,340 |
Median | 0.366 | 19,279 | 2,413 |
Median (WASM) | 0.421 | 19,279 | 2,413 |
SAH (fast) | 0.664 | 11,375 | 2,259 |
SAH (full) | 1.542 | 11,120 | 2,254 |
3D times
Scheme | Time (s) | Volume (u^3) | AABB Tests (k) |
Center (flat) | 0.137 | 446,394 | 5,672 |
Mean (flat) | 0.137 | 588,805 | 5,713 |
Morton (WASM) | 0.153 | 622,660 | 4,278 |
Center | 0.229 | 446,394 | 4,052 |
Mean | 0.229 | 588,805 | 3,963 |
Median | 0.420 | 970,161 | 4,056 |
Median (WASM) | 0.471 | 970,161 | 4,056 |
Grid | 0.502 | N/A | N/A |
SAH (fast) | 0.789 | 417,253 | 3,858 |
SAH (full) | 2.418 | 415,410 | 3,831 |
The important thing about the center split method is that it is fast. It beat my previous broadphase algorithm (grid partitioning) by 23% while being simpler, scaling to higher dimensions better, and being more useful for queries.
Drawing
Because I wanted to construct objects out of atoms, the problem arose with how to render them accurately while also looking good. In minecraft, everything is a block which ends up looking pleasing because they line up flush with each other. With circles, there end up being annoying gaps no matter how they're stacked.
The obvious solution is to keep the drawing separate from the physical object, which quickly becomes untenable when the underlying atoms are deformed - we would have to deform the drawing too. There are also fancy filling algorithms I looked at but abandoned because they would be too complicated and slow.
The solution was to change how objects were constructed to allow overlapping atoms if there was a bond between them. When rendering, if multiple atoms overlap, pixels from the closest atom overwrite others. This allows straight edges to form even when we're modeling everything with circles.
This is what the effect looks like in action:
Notes
I'm really happy with how the dynamics turned out. I've been working with physics engines for a long time, and it was only recently that everything came together to create an engine that's stable and also looks good.
The broadphase detection was also a long time coming. I used to use grid-based detection because of its speed, and before that an algorithm that would find the covariance of the points and split them along that vector. Only after sitting down to try a dozen or so different types of BVH algorithms did I find the center split method that is not only more functional than the grid, but also faster.
I thought about making an engine with only atoms and modeling everything with protons and electrons. That might come at another time. I don't know how electrons stay stuck to protons if they're neutral...