In this article we describe a technique that reduces the memory required for facet normals in backface culling. The technique performs precise backface culling in object space, is especially suitable for frontend culling, and requires only half as much memory as the standard facet normal technique.
The motivation behind frontend culling
Backface culling can be performed at several points in the 3D pipeline. Although we could simply sit back and let the rasterizer do the culling for us, it is advantageous to cull earlier: the earlier we get rid of irrelevant data, the less data has to be moved through the system (saving bandwidth) and the fewer calculations have to be performed (saving CPU load). Culling can be performed at one of three stages:
- Before transformation and lighting.
- After transformation, but before lighting.
- In the rasterizer (after transformation and lighting).
Culling at stages 2 and 3 is typically performed in screen space by checking whether the polygon's vertices appear in clockwise or counterclockwise order. Frontend culling (stage 1) is typically performed in object space by calculating the dot product of the viewing vector and the facet normal of the polygon. The facet normals can be calculated on the fly or precalculated and stored with the data set. In either case, frontend culling (stage 1) is typically faster than the other culling strategies (stage 2 or 3) because it saves bandwidth and requires fewer calculations.
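The object-space test described above can be sketched in a few lines. This is only an illustration, not the article's implementation: the helper names (`facet_normal`, `is_backfacing`) are hypothetical, and the sketch assumes a right-handed coordinate system in which front-facing triangles list their vertices counterclockwise. The normal does not need to be unit length, because only the sign of the dot product matters.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def facet_normal(v0, v1, v2):
    """Unnormalized facet normal of a triangle.

    Assumption: it points toward the side from which v0, v1, v2
    appear in counterclockwise order.
    """
    return cross(sub(v1, v0), sub(v2, v0))

def is_backfacing(normal, point_on_triangle, eye):
    """True if the triangle faces away from the eye (all in object space)."""
    view = sub(point_on_triangle, eye)  # vector from the eye to the triangle
    # Edge-on triangles (dot product of zero) are treated as culled here.
    return dot(normal, view) >= 0.0

# A triangle in the z = 0 plane, seen from in front and from behind.
triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
normal = facet_normal(*triangle)
seen_from_front = is_backfacing(normal, triangle[0], (0.0, 0.0, 5.0))
seen_from_behind = is_backfacing(normal, triangle[0], (0.0, 0.0, -5.0))
```

If the normals are precalculated, `facet_normal` runs once when the model is built and only `is_backfacing` runs per frame.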
The actual cost of frontend culling is that we either have to calculate the facet normals on the fly or store precalculated facet normals, which increases the size of our model database. However, this increase can be cut in half, and although precalculated facet normals enlarge the model, the culling procedure accesses them sequentially in memory, which works in our favor.
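To make the storage cost concrete, the sketch below compares two per-triangle layouts. The text here does not say how the halving is achieved; storing each component as a signed 16-bit integer instead of a 32-bit float is an assumption made purely for illustration, and the names `pack_normal_16` and `unpack_normal_16` are hypothetical.

```python
import struct

FLOAT_NORMAL = struct.Struct("<3f")  # three 32-bit floats: 12 bytes per triangle
SHORT_NORMAL = struct.Struct("<3h")  # three signed 16-bit integers: 6 bytes per triangle

def pack_normal_16(normal):
    """Quantize a unit-length normal to three signed 16-bit integers (assumed layout)."""
    return SHORT_NORMAL.pack(*(round(c * 32767) for c in normal))

def unpack_normal_16(data):
    """Recover an approximate normal from the 16-bit representation."""
    return tuple(c / 32767 for c in SHORT_NORMAL.unpack(data))

packed = pack_normal_16((0.6, 0.0, -0.8))
approx = unpack_normal_16(packed)  # close to, but not exactly, the input
```

For a model with 100,000 triangles, this assumed layout would need about 600 KB for the normals instead of about 1.2 MB. Quantization introduces a small error per component, so any scheme of this kind has to make sure that the sign of the dot product stays reliable for triangles that are almost edge-on.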
By contrast, if culling is based on vertex order tests (stages 2 and 3), the culling loop passes through the triangles and accesses the vertices of each triangle. The vertices of adjacent triangles (and even of the same triangle) can be scattered across the entire vertex pool, resulting in random memory accesses during culling. These random accesses are slow and lead to suboptimal cache usage.
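A vertex order test over an indexed triangle list might look like the following minimal sketch. The function names are hypothetical, and the sketch assumes screen coordinates with y pointing up and counterclockwise triangles as front-facing; with y pointing down, the sign of the test flips. Note the three indexed lookups per triangle, which is where the scattered memory accesses come from.

```python
def signed_area_2d(p0, p1, p2):
    """Twice the signed area of a 2D triangle; positive for counterclockwise order."""
    return ((p1[0] - p0[0]) * (p2[1] - p0[1])
            - (p2[0] - p0[0]) * (p1[1] - p0[1]))

def cull_by_winding(screen_vertices, triangles):
    """Return the indices of triangles that survive the vertex order test."""
    visible = []
    for t, (i0, i1, i2) in enumerate(triangles):
        # Each index may point anywhere in the vertex pool.
        p0 = screen_vertices[i0]
        p1 = screen_vertices[i1]
        p2 = screen_vertices[i2]
        if signed_area_2d(p0, p1, p2) > 0.0:
            visible.append(t)
    return visible

screen_vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 1), (1, 3, 2)]
surviving = cull_by_winding(screen_vertices, triangles)
```

Python itself will not show the cache behavior, but the access pattern is the same one a native implementation would have.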
The facet normal based culling technique also passes through the triangles, but accesses their facet normals instead of their vertices. Since the facet normals are stored per triangle, they are fetched sequentially in the culling loop. These sequential memory accesses are fast, use the cache effectively, and can be further accelerated by prefetching.
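The sequential access pattern can be sketched as a loop over one flat array of normals. This is an illustration under stated assumptions, not the article's code: the function name is hypothetical, the normals are assumed to be stored as consecutive `nx, ny, nz` triples in triangle order, and a single view direction is used for the whole object (as with an orthographic or distant viewer). A perspective camera would additionally need a point or plane offset per triangle.

```python
from array import array

def cull_by_facet_normals(normals, view_dir):
    """Return the indices of triangles that face the viewer.

    normals:  flat array of nx, ny, nz per triangle, in triangle order.
    view_dir: direction from the viewer into the scene, in object space.
    """
    vx, vy, vz = view_dir
    visible = []
    for t in range(len(normals) // 3):
        base = 3 * t  # consecutive triangles read consecutive memory
        d = normals[base] * vx + normals[base + 1] * vy + normals[base + 2] * vz
        if d < 0.0:  # the normal points against the view direction
            visible.append(t)
    return visible

# Three triangles facing +z, -z and +x, viewed along the negative z axis.
normals = array("f", [0.0, 0.0, 1.0,
                      0.0, 0.0, -1.0,
                      1.0, 0.0, 0.0])
facing_viewer = cull_by_facet_normals(normals, (0.0, 0.0, -1.0))
```

Unlike a vertex order test, this loop never follows an index into the vertex pool, so a native implementation reads memory strictly front to back.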
Our work with game developers has shown that frontend culling often leads to a significant increase in performance (10 to 20 percent increase in frame rates) when used correctly.
That concludes this admittedly very brief introduction to backface culling. If you have any questions or suggestions, please feel free to contact our experts in our forum.