To simulate natural formations of fighters, we decided to implement boid swarms. While they look super fancy, the implementation is pretty straightforward, albeit fairly expensive. Each boid represents a flocking object that looks at every other boid and tries to follow a set of rules. The three core rules of a boid are cohesion, alignment, and separation, but more can be added to produce behaviors like destination seeking and object avoidance (shown above). Each rule is incredibly simple and usually involves looping through each boid and calculating an average of something, like position or velocity. Using that average, you can then calculate a vector that represents the desired velocity for that rule. For example, alignment finds the average velocity across all boids and attempts to "steer" the boid in that average direction.
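As a minimal sketch of what one of these rules can look like (the Boid struct and the weight parameter are illustrative, not our exact code):

```cpp
#include <glm/glm.hpp>
#include <vector>

// Illustrative boid: just a position and a velocity.
struct Boid {
    glm::vec3 position;
    glm::vec3 velocity;
};

// Alignment: steer toward the average velocity of the flock.
glm::vec3 alignment(const Boid& self, const std::vector<Boid>& flock, float weight) {
    if (flock.size() < 2) return glm::vec3(0.0f);   // nothing to align with

    glm::vec3 avgVelocity(0.0f);
    for (const Boid& other : flock) {
        if (&other == &self) continue;              // skip ourselves
        avgVelocity += other.velocity;
    }
    avgVelocity /= float(flock.size() - 1);

    // Steer from our current velocity toward the flock average.
    return (avgVelocity - self.velocity) * weight;
}
```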
To combine all of these rules, simply loop through each boid, calculate each rule, add the results up, and add that sum to the boid's current velocity. Once it's all coded and in place, it's just a matter of fiddling with properties until it behaves right!
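Continuing the sketch above (and assuming cohesion() and separation() are written in the same loop-and-average style as alignment()), the per-frame update might look roughly like this; the weights, dt, and speed clamp are exactly the kind of properties we end up fiddling with:

```cpp
// Assumed to follow the same pattern as alignment().
glm::vec3 cohesion(const Boid& self, const std::vector<Boid>& flock, float weight);
glm::vec3 separation(const Boid& self, const std::vector<Boid>& flock, float weight);

// Per-frame update: sum every rule's desired velocity and apply it.
void updateBoids(std::vector<Boid>& flock, float dt, float maxSpeed) {
    std::vector<glm::vec3> steering(flock.size(), glm::vec3(0.0f));

    for (size_t i = 0; i < flock.size(); ++i) {
        steering[i] += cohesion(flock[i], flock, 1.0f);
        steering[i] += alignment(flock[i], flock, 1.0f);
        steering[i] += separation(flock[i], flock, 1.5f);
    }

    for (size_t i = 0; i < flock.size(); ++i) {
        flock[i].velocity += steering[i] * dt;
        if (glm::length(flock[i].velocity) > maxSpeed)   // keep speeds sane
            flock[i].velocity = glm::normalize(flock[i].velocity) * maxSpeed;
        flock[i].position += flock[i].velocity * dt;
    }
}
```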
Adding custom rules isn't too hard. To travel to a destination, we just calculate a vector between the current boid position and the target, multiply it by some constant, and return it. Object avoidance, which sounds like a complex algorithm, is actually quite straightforward. We keep a separate list of "obstacles," each storing a position and radius (the two big spheres in the image above). Then, using an almost identical rule to separation, we calculate the distance between the boid and every obstacle, do the same math as separation, add the results up, and we're done.
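Sketched in the same style (again, the struct and names are illustrative):

```cpp
// Illustrative obstacle: a sphere the flock should steer around.
struct Obstacle {
    glm::vec3 position;
    float radius;
};

// Destination rule: a vector pointing from the boid toward the target.
glm::vec3 seekDestination(const Boid& self, const glm::vec3& target, float weight) {
    return (target - self.position) * weight;
}

// Obstacle avoidance: the same math as separation, but against obstacles.
glm::vec3 avoidObstacles(const Boid& self, const std::vector<Obstacle>& obstacles, float weight) {
    glm::vec3 push(0.0f);
    for (const Obstacle& obstacle : obstacles) {
        glm::vec3 away = self.position - obstacle.position;
        float dist = glm::length(away);
        if (dist < obstacle.radius && dist > 0.0f) {
            // The closer we are, the harder we push away.
            push += glm::normalize(away) * (obstacle.radius - dist);
        }
    }
    return push * weight;
}
```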
An unfortunate drawback of boids is the computational intensity. Computing a group of boids takes n(n - 1) operations, where n is the number of boids, and computing obstacles takes n*m, where m is the number of obstacles in the scene. Our solution is to use a lot of small boid groups (4-10 per group) rather than one large group, which also lets us have a LOT more obstacles. The quadratic cost, while small per boid, really starts to add up as you throw more boids into each group. While expensive, the effect is pretty easy to obtain and looks really good. Future optimizations could include computing boids in parallel or using octrees, similar to our collision detection.
We wanted a particle system that was fast, one that could dish out loads of great VFX without stealing CPU time from more expensive systems later on, such as boid swarms or collision detection. We ended up making a couple of shader permutations for burst and stream emitter particles. Each particle keeps track of its personal duration and a few other attributes, like which corner of the quad the current vertex belongs to, and a random seed, which I'll explain below.
Along with these 3-4 attributes comes a flood of uniforms, which is most likely where the most optimization could happen with this system. Each uniform represents an option or range for a property that the particle keeps track of, including velocity, acceleration, color over time, rotation, and more. Most of these properties have min and max uniforms, which define a range for the starting and ending states of that property. Using the seed passed in as an attribute, we can compute a random-looking particle over time while still maintaining predictability: passing the seed through a random function lets us generate a unique value for each of the property ranges.
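Sketching the idea in C++ for readability (in practice this math lives in the vertex shader, and the hash function here is just a stand-in for whatever random function you like):

```cpp
#include <glm/glm.hpp>

// A cheap hash mapping a seed to [0, 1). The exact function doesn't matter,
// only that the same seed always produces the same value.
float rand01(float seed) {
    return glm::fract(glm::sin(seed * 12.9898f) * 43758.5453f);
}

// Pick a per-particle value inside a [min, max] uniform range using the seed.
// Offsetting the seed per property keeps the different properties uncorrelated.
float randomInRange(float minValue, float maxValue, float seed, float propertyOffset) {
    return glm::mix(minValue, maxValue, rand01(seed + propertyOffset));
}
```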
To handle lifetime, we don't. The emitter never creates or destroys particles during an update, only during initialization and destruction. To prevent repetitive particle patterns, we also calculate the number of lifetime cycles the particle has gone through by dividing its elapsed time since creation by its duration. Round that down (or truncate to an int) and you've got the number of cycles the particle has had. Then all we do is add that to the seed, and the particle suddenly appears truly "random."
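Continuing the same sketch, the cycle trick is just a divide, a floor, and an add; elapsedTime, duration, and baseSeed would come from the attributes and uniforms described above:

```cpp
#include <glm/glm.hpp>

// Reuse the same particle slot forever: figure out which "lifetime cycle"
// we're in and fold that into the seed so every cycle looks different.
float particleSeedForCycle(float elapsedTime, float duration, float baseSeed) {
    float cycles = glm::floor(elapsedTime / duration);  // completed lifetimes
    return baseSeed + cycles;                           // new seed -> new "random" particle
}

// Time within the current cycle, used to evaluate position, color, etc.
float particleLocalTime(float elapsedTime, float duration) {
    return glm::mod(elapsedTime, duration);
}
```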
The result of this shader magic is the ability to render over 10 million particles at 60fps!
While the spheres were looking pretty good, they still felt a little off. The problem was that the SOIL cubemap loader downgraded the HDR images to LDR. After using stb_image to manually load the HDR images and upload them into the cubemap, everything looked dramatically different. Applying tonemapping made the resulting image look a lot better. For now, we're using standard Reinhard for the tonemapping, although we're thinking of switching to Uncharted since Reinhard feels a bit washed out.
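For reference, standard Reinhard is about as simple as tonemapping gets; the gist in C++/glm (in our case this happens in the shader):

```cpp
#include <glm/glm.hpp>

// Standard Reinhard tonemapping, applied per channel before gamma correction.
glm::vec3 tonemapReinhard(const glm::vec3& hdrColor) {
    return hdrColor / (hdrColor + 1.0f);
}
```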
However, we still had a problem: high-roughness spheres looked faceted. This turned out to be a problem with the automatic mipmapping. To fix it, we created functions to precompute the different mipmap levels using GGX multiple importance sampling. There were some difficulties: some edges had seams due to mismatched mappings, partly because the source images use a left-handed coordinate system. After resolving them, the result looked much closer to our goal, and it runs faster (only one cubemap sample inside the shader)!
With most of the basics of PBR done, the next target was adding more functionality by implementing Normal Mapping and Skeletal Animation.
While the idea of normal mapping isn't that complex - you essentially just replace the surface normal with the tangent-space values from a texture - the calculation of the tangent and bitangent can be difficult. Fortunately, Assimp has a flag to auto-generate the tangents and bitangents from the object's normals and UVs, which made that part significantly easier. All that was left was to load these vectors and use them as a basis to transform the texture's tangent-space normals into world-space normals, which we can then use for lighting & deferred rendering output.
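Here's a hedged sketch of the loading side, assuming a single mesh and a simple vertex layout (the Vertex struct and function are illustrative, not our exact code); aiProcess_CalcTangentSpace is the Assimp flag mentioned above:

```cpp
#include <assimp/Importer.hpp>
#include <assimp/postprocess.h>
#include <assimp/scene.h>
#include <glm/glm.hpp>
#include <string>
#include <vector>

struct Vertex {
    glm::vec3 position, normal, tangent, bitangent;
};

// Load a mesh and pull out the tangent basis Assimp generates for us.
std::vector<Vertex> loadVertices(const std::string& path) {
    Assimp::Importer importer;
    const aiScene* scene = importer.ReadFile(
        path, aiProcess_Triangulate | aiProcess_CalcTangentSpace);

    std::vector<Vertex> vertices;
    if (!scene || !scene->mNumMeshes) return vertices;   // real code would report the error

    const aiMesh* mesh = scene->mMeshes[0];              // just the first mesh for brevity
    for (unsigned int i = 0; i < mesh->mNumVertices; ++i) {
        Vertex v;
        v.position  = { mesh->mVertices[i].x,   mesh->mVertices[i].y,   mesh->mVertices[i].z };
        v.normal    = { mesh->mNormals[i].x,    mesh->mNormals[i].y,    mesh->mNormals[i].z };
        v.tangent   = { mesh->mTangents[i].x,   mesh->mTangents[i].y,   mesh->mTangents[i].z };
        v.bitangent = { mesh->mBitangents[i].x, mesh->mBitangents[i].y, mesh->mBitangents[i].z };
        vertices.push_back(v);
    }
    return vertices;
}
```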
Skeletal Animation was a bit trickier. Assimp spreads the animation information all over the place, which was a bit confusing at first, but actually works out pretty well. Each mesh contains vertex weight information, as well as the name of each bone, which allows us to construct a name->id mapping. The bones themselves are stored as nodes in the scene graph, which means we treat them like any other object; this works really well since it lets us attach other objects to a bone too. Finally, all the animation data is stored in the scene, with separate keyframes for position, rotation, etc., which are used to overwrite the transform of the given bone. So, at a high level, our engine just applies the current animation transform to each bone; then, when each skinned mesh renders, it gathers the current transform of each bone and puts it into the shader.
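At that high level, building the bone palette might look something like the sketch below; the Bone struct and names are illustrative, and meshGlobalInverse stands for the usual inverse of the mesh root's global transform:

```cpp
#include <glm/glm.hpp>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative bone: the scene-graph node's current world transform plus the
// offset matrix Assimp provides (mesh space -> bone space at bind pose).
struct Bone {
    glm::mat4 nodeWorldTransform;
    glm::mat4 offsetMatrix;
};

// Build the palette of skinning matrices the vertex shader consumes.
std::vector<glm::mat4> buildBonePalette(
        const std::vector<std::string>& boneNames,
        const std::unordered_map<std::string, Bone>& bones,
        const glm::mat4& meshGlobalInverse) {
    std::vector<glm::mat4> palette;
    palette.reserve(boneNames.size());
    for (const std::string& name : boneNames) {
        const Bone& bone = bones.at(name);   // name -> bone mapping built at load time
        palette.push_back(meshGlobalInverse * bone.nodeWorldTransform * bone.offsetMatrix);
    }
    return palette;
}
```

The resulting palette is what gets uploaded as a uniform array each frame; every vertex then blends the matrices of its bones by their vertex weights.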
When all the tech is finished, what will we do? We make a game! This library is meant to be as easy to use as possible, and it allows for custom keybinding across multiple input types, including mouse, keyboard, and gamepad. All we need to do is create input data including a name, positive/negative button strings ("space", "joystick button 1", "mouse 1", etc.), which axis to use, and a few other minor properties, and we're done! Just call getAxis() or getButton(), pass in the name of your input, and it'll automatically grab input from every type of device! For binary inputs like keyboard or mouse buttons, the library will even interpolate the state changes in order to smooth out the axis value.
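To give a feel for the API, here's an illustrative sketch; only getAxis() and getButton() are names from above, everything else (the struct, addAxis(), the field names) is a stand-in:

```cpp
#include <string>

// Illustrative description of one logical input; the real fields differ.
struct InputAxisDesc {
    std::string name;            // "Horizontal", "Jump", ...
    std::string positiveButton;  // e.g. "d" or "joystick button 1"
    std::string negativeButton;  // e.g. "a" or "mouse 1"
    int         gamepadAxis;     // analog axis to read, -1 for none
    float       sensitivity;     // how fast digital presses ramp the axis
};

// Stand-in for the library's interface.
class InputManager {
public:
    void  addAxis(const InputAxisDesc& desc);
    float getAxis(const std::string& name) const;   // -1..1, smoothed for digital inputs
    bool  getButton(const std::string& name) const;
};

void configureAndPoll(InputManager& input, float dt, float& playerX) {
    input.addAxis({ "Horizontal", "d", "a", /*gamepadAxis=*/0, /*sensitivity=*/3.0f });

    // Later, every frame: one call covers keyboard, mouse, and gamepad.
    playerX += input.getAxis("Horizontal") * 5.0f * dt;
}
```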
To make it, we used a lot of maps, a lot of interpolation, and a state machine. To catch edge inputs like key down or key up, we keep track of the previous state and compare it with the current one to determine whether the change warrants an edge state. We do it in such a way that each edge input happens for exactly one frame, no matter what! It's treated like a simple state machine: the initial node is IDLE, which moves to BUTTON_DOWN on a keypress. The next frame it moves to either PRESSED or BUTTON_UP depending on the input. If it's PRESSED and the button is released, it proceeds to BUTTON_UP, and exactly one frame later back to IDLE. This method makes it really easy to handle complex inputs that should only happen once, and it turns out to be really reliable.
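The transition function for that machine is tiny; a sketch:

```cpp
// Per-button edge detection: BUTTON_DOWN and BUTTON_UP each last exactly one frame.
enum class ButtonState { IDLE, BUTTON_DOWN, PRESSED, BUTTON_UP };

ButtonState advance(ButtonState previous, bool isHeldThisFrame) {
    switch (previous) {
        case ButtonState::IDLE:        return isHeldThisFrame ? ButtonState::BUTTON_DOWN : ButtonState::IDLE;
        case ButtonState::BUTTON_DOWN: return isHeldThisFrame ? ButtonState::PRESSED     : ButtonState::BUTTON_UP;
        case ButtonState::PRESSED:     return isHeldThisFrame ? ButtonState::PRESSED     : ButtonState::BUTTON_UP;
        case ButtonState::BUTTON_UP:   return isHeldThisFrame ? ButtonState::BUTTON_DOWN : ButtonState::IDLE;
    }
    return ButtonState::IDLE;
}
```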
After getting basic rendering up and running, I started work on adding components of physically based rendering (PBR) to our application. Physically based rendering is the idea of using more physically accurate models and values in order to get a more realistic image.
To incorporate these ideas, we decided to make the following additions:
1) Cook-Torrance BRDF, using the GGX (Trowbridge-Reitz) distribution and visibility terms, as well as Schlick's Fresnel approximation
2) Environment Mapping with multiple importance sampling
3) "Metalness" and "Roughness" inputs in order to allow the shader to render conductors and dielectrics of varying roughnesses.
4) Spherical lights (instead of punctual point lights)
The Cook-Torrance BRDF is the standard microfacet BRDF: it multiplies two terms that depend on the chosen microfacet distribution, one for the distribution itself and one for geometric self-shadowing, with the Fresnel term and a normalization factor.
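Written out (with $n$ the surface normal, $l$ the light direction, $v$ the view direction, and $h$ the half vector between $l$ and $v$), the specular term takes the usual form:

$$ f_{\text{spec}}(l, v) = \frac{D(h)\,G(l, v, h)\,F(v, h)}{4\,(n \cdot l)(n \cdot v)} $$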
We chose to go with the GGX distribution as it is reasonably fast and its long tail matches the look of highlights in the real world (for a great comparison, see http://www.neilblevins.com/cg_education/ggx/ggx.htm ). GGX is also the standard in many game engines, such as Unreal, as well as in 3D modeling packages like Blender. For the visibility term, we used the Schlick approximation of GGX instead of the full GGX term, as it produced visually similar results while requiring less work.
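With the common convention $\alpha = \text{roughness}^2$, the two terms look roughly like this ($k$ is derived from the roughness; Epic's notes, for example, suggest $k = (\text{roughness} + 1)^2 / 8$ for analytic lights):

$$ D_{\text{GGX}}(h) = \frac{\alpha^2}{\pi\left((n \cdot h)^2(\alpha^2 - 1) + 1\right)^2} $$

$$ G_{\text{Schlick}}(x) = \frac{n \cdot x}{(n \cdot x)(1 - k) + k}, \qquad G(l, v, h) = G_{\text{Schlick}}(l)\,G_{\text{Schlick}}(v) $$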
For the Fresnel effect, we used Schlick's approximation, and also used the trick of interpolating the initial reflectance F0 with the material's color based on metalness, discussed here: http://www.codinglabs.net/article_physically_based_rendering_cook_torrance.aspx Since metals have no diffuse component, we can even use the albedo color as the reflectance color, which gives us a simple pipeline that handles both dielectrics and conductors.
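Concretely (0.04 being the usual default reflectance for dielectrics):

$$ F(v, h) = F_0 + (1 - F_0)\,\big(1 - (v \cdot h)\big)^5, \qquad F_0 = \operatorname{lerp}\!\big(0.04,\ \text{albedo},\ \text{metalness}\big) $$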
For the environment map, we load in an HDR image and generate spherical harmonics for the irradiance based on this paper by Ramamoorthi et al.: https://cseweb.ucsd.edu/~ravir/papers/envmap/envmap.pdf This allows us to quickly apply the diffuse lighting from the environment map to objects. For the specular component, we take multiple samples using the GGX distribution and the Hammersley point set. Each sample direction can be treated as a light direction, which we then run through the regular Cook-Torrance BRDF to determine its lighting contribution. One caveat to note is that the importance sampling already takes the distribution into account, so that term needs to be removed from the BRDF.
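For reference, the $i$-th of $N$ samples uses the Hammersley pair $\xi_i = (i/N,\ \Phi_2(i))$, where $\Phi_2$ is the base-2 radical inverse, and each pair maps to a half vector around the normal via the GGX distribution (as in the Karis notes):

$$ \phi = 2\pi\,\xi_1, \qquad \cos\theta = \sqrt{\frac{1 - \xi_2}{1 + (\alpha^2 - 1)\,\xi_2}} $$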
Finally, we also implemented spherical lights based on the approximation from the Epic Games notes on PBR: http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_notes_v2.pdf This is especially helpful because, before, objects with roughness 0 would not reflect point lights at all, since those lights are infinitely small. With the approximation, objects with roughness 0 show highlights, and we can more accurately represent lights of different sizes. There's still a bit more work to do, such as precomputing the mipmaps of the environment map for different roughness values and other optimizations, but it's getting there.
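The core of the approximation is the "representative point" trick from those notes: light the surface from the point on the sphere closest to the reflection ray. A sketch in C++/glm (variable names are ours, not the paper's):

```cpp
#include <glm/glm.hpp>

// Representative-point approximation for a spherical light (after Karis 2013):
// pick the point on the sphere closest to the reflection ray and light from it.
glm::vec3 sphereLightDirection(const glm::vec3& toLightCenter,  // shading point -> light center
                               float lightRadius,
                               const glm::vec3& reflectionDir)  // reflect(-viewDir, normal)
{
    // Closest point on the reflection ray to the light center.
    glm::vec3 centerToRay = glm::dot(toLightCenter, reflectionDir) * reflectionDir - toLightCenter;
    // Clamp that point onto the sphere's surface.
    glm::vec3 closestPoint = toLightCenter +
        centerToRay * glm::clamp(lightRadius / glm::length(centerToRay), 0.0f, 1.0f);
    return glm::normalize(closestPoint);   // use this as the light direction l
}
```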