
Original Link: https://www.anandtech.com/show/193
Fillrate and VSD (Visible Surface Determination)
by Ga'ash Soffer on December 10, 1998 7:43 PM EST - Posted in
- GPUs
It has been brought to our attention that some games have different fill-rate requirements than others. The Quake2 engine, for example, does two-pass rendering (multitexturing), and this lowers the FPS of games using the Quake2 engine. A major aspect of fill-rate that is somewhat neglected, however, is overdraw. Overdraw is largely dependent on the algorithm the 3D engine uses for something called Visible Surface Determination (VSD for short). The algorithm used by the 3D engine greatly affects overdraw, and thus greatly impacts fill-rate requirements.
Overview
VSD stands for Visible Surface Determination, as mentioned above. As the name implies, VSD is the process by which a 3D engine determines which surfaces are visible to the viewer. As you may have guessed, there are many different methods of approaching the problem of determining the surfaces (pixels, polygons, or scanlines, depending on the algorithm) which are visible, and many of them involve storing the world in an organized data structure. The algorithms which will be discussed are: BSP Trees, Portals, an algorithm I call "Sectors" [still haven't made up a cool acronym for it], and other approaches such as a Z-buffer alone (not really feasible for games, explained later) and depth sorting (not really used because it sucks, maybe on the Playstation). Read on to learn about BSP Trees and how they can be a pain for your graphics accelerator.
BSP Trees & Overdraw
As far as we are concerned, a BSP tree is a method of organizing polygon data. How does a BSP tree organize the polygon data? Despite the complicated name (BSP = Binary Space Partition), the *theory* behind the BSP tree is not that complicated: construct a binary tree (bold face terms explained at the end of the article) starting with a plane (the plane you would like to organize your data by) as the root node, and then classify each polygon as either behind the plane or in front of the plane. Store the behind polygons in one branch and the in-front polygons in another. Take the next polygon and place it in the tree using the same method (recurse through the tree, placing the polygon in either the left branch or the right branch, depending on its orientation relative to the splitting plane). While implementing this in actual code is far from simple, the basic idea is not that complicated. As you may have noticed, however, there is a flaw in the above: what happens if polygons pierce each other? How do we know which polygon is in front of or behind which? Unfortunately, there is no way to classify these polygons. What do we do then? We split the piercing polygon into two polygons, and then store each of those parts in the tree. Traditionally, rendering the polygons stored in a BSP Tree is done by performing a back-to-front traversal of the tree and rendering each polygon.
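The construction and traversal described above can be sketched in modern Python. This is a minimal illustration, not anything a real engine shipped: the names (`BSPNode`, `classify`, and so on) are mine, and spanning polygons are kept at the node instead of being split, which the text explains a real implementation must do.

```python
def plane_of(poly):
    """Plane through the polygon's first three vertices, as (normal, d)."""
    a, b, c = poly[0], poly[1], poly[2]
    u = tuple(b[i] - a[i] for i in range(3))
    v = tuple(c[i] - a[i] for i in range(3))
    n = (u[1]*v[2] - u[2]*v[1],          # cross product u x v
         u[2]*v[0] - u[0]*v[2],
         u[0]*v[1] - u[1]*v[0])
    d = -sum(n[i] * a[i] for i in range(3))
    return n, d

def classify(plane, poly, eps=1e-6):
    """'front', 'back', or 'spanning' relative to the splitting plane."""
    n, d = plane
    dists = [sum(n[i] * v[i] for i in range(3)) + d for v in poly]
    if all(dist >= -eps for dist in dists):
        return 'front'
    if all(dist <= eps for dist in dists):
        return 'back'
    return 'spanning'

class BSPNode:
    def __init__(self, poly):
        self.plane = plane_of(poly)
        self.polys = [poly]      # polygons stored at this node
        self.front = None        # subtree in front of the plane
        self.back = None         # subtree behind the plane

    def insert(self, poly):
        side = classify(self.plane, poly)
        if side == 'spanning':
            # A real engine splits the piercing polygon in two here;
            # for brevity this sketch just keeps it at the node.
            self.polys.append(poly)
        elif side == 'front':
            if self.front is None:
                self.front = BSPNode(poly)
            else:
                self.front.insert(poly)
        else:
            if self.back is None:
                self.back = BSPNode(poly)
            else:
                self.back.insert(poly)

    def back_to_front(self, eye):
        """Traditional painter's-order traversal for a given eye point."""
        n, d = self.plane
        in_front = sum(n[i] * eye[i] for i in range(3)) + d >= 0
        near, far = (self.front, self.back) if in_front else (self.back, self.front)
        if far:
            yield from far.back_to_front(eye)
        yield from self.polys        # the node's own plane polygons
        if near:
            yield from near.back_to_front(eye)
```

Note how `back_to_front` always descends into the subtree on the far side of the eye first: that is exactly the back-to-front ordering that causes the overdraw discussed next.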
Back to Front
Every time you see back to front, that probably means a lot of overdraw. The diagram below illustrates this.
In the example above, well over 200% of the necessary pixels would be drawn if the scene were stored in a BSP Tree.
Why back to front?
In actuality, no current game uses back-to-front rendering. One method of reducing the overdraw associated with BSP Trees is to render front to back using a Z-buffer, or, simpler still (if you don't need any moving characters), simply check whether a pixel has already been drawn before drawing it.
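The second idea, a per-pixel "already drawn" flag, can be sketched as follows (hypothetical names; polygons are reduced to pre-rasterized pixel lists so the bookkeeping is visible):

```python
def draw_front_to_back(polys, width, height):
    """polys: near-to-far list of (colour, [(x, y), ...]) pixel spans."""
    written = [[False] * width for _ in range(height)]
    frame = [[None] * width for _ in range(height)]
    skipped = 0
    for colour, pixels in polys:
        for x, y in pixels:
            if written[y][x]:
                skipped += 1          # already covered: no wasted fill
                continue
            written[y][x] = True
            frame[y][x] = colour
    return frame, skipped
```

Because the nearest polygons are drawn first, every later polygon skips the pixels it would have wastefully overdrawn in a back-to-front scheme.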
Quake2 takes BSP Trees a step further and uses an algorithm known as PVS (potentially visible set) to determine the, you guessed it, potentially visible set of polygons. This algorithm is extremely complicated; it involves constructing a tree of intersections/unions of primitives (tetrahedrons, cylinders, etc.) to form more complex shapes (these unions/intersections of primitives are subspaces). I don't believe any current game supports all of these primitives for constructing the world: Quake supports 3D planes, while Quake2 and Unreal support polyhedra only, I believe. (Speculation) Quake uses a ray-cast algorithm from sample points to determine which parts of the scene are visible (according to a reliable e-mail received earlier today) and then probably renders them front to back using a Z-buffer, since a Z-buffer is necessary to display the characters/items anyway. Unreal uses span buffering (speculation again) to help reduce the potentially visible set.
What else do BSP Trees mean?
BSP trees also mean static environments. If you have played Unreal or Quake2, you know what I mean: there are very few, if any, real changes in the environment. There are virtually no deformations (i.e. changes to the shapes of things, as opposed to just moving static objects around) in these games at all. Since BSP trees cannot currently be recalculated in real time, games using them are limited to static environments.
Why use BSP Trees?
The main reason BSP Trees are used is blazing-fast raycasting. Raycasting is used all the time in 3D games for tasks such as determining whether bullets hit targets or whether you are in a monster's line of sight. In many cases, BSP Trees' limitations are worth the speed advantages.
Read on to find out about an arrangement which allows for dynamic environments and reduces overdraw!
Portals
The portals I'm talking about aren't doorways which lead you to another world, even though they can be used rather efficiently to simulate such scenarios in 3D games. So what are portals? Well, the idea behind a portal rendering system is to divide the world into convex polyhedra, each containing a bunch of polygons. A 'door' then leads to another convex polyhedron which stores another bunch of polygons. This 'door' is a polygon located where the two convex polyhedra connect. The rendering scheme is as follows:
1. Render a polygon.
2. If the polygon is a "door" (portal), render the next convex polyhedron, clipping the results to the clipping window.
3. Otherwise, just render it plain.
Since portal rendering starts in the polyhedron the viewer is currently in, it generally renders the closest polygons first. This greatly reduces overdraw, because by the time we finally reach the farther polygons, chances are that something will already be covering most of their pixels, so we don't draw them. When it comes down to overdraw alone, portals are one of the most efficient methods of reducing it.
As you can see from the above example, portals work best in environments with many rooms that have little or no detail inside them. Descent is the perfect example of portals in action; that kind of environment is ideal for them.
When it comes to overdraw in indoor environments, portals are hard to beat. The average overdraw rate is about 50% for a portal-based engine, while overdraw can reach up to 400% (meaning four times the necessary pixels are drawn) using BSP Trees. Not only are portals faster at rendering, they allow for fully (well, almost) dynamic environments. Also, there are many hacks involving portals which allow for cool teleportation effects (something like DmRadikus in Unreal, even though Unreal doesn't use portals to render), as will be used in Prey. If portals are so great, why doesn't everyone use them? Well, it turns out that portals do have their downsides.
What can go wrong with portals?
The first problem with portals is that, if you want a fully dynamic environment, the polygons inside each portal cannot be arranged in any particular way (i.e. their orientations are completely arbitrary), so there is no way to speed up raycasting and collision detection (plus a bunch of other things) other than checking every polygon in the portal. If the portals are small, this isn't much of a problem; however, large portals can cause significant slowdowns. Speaking of large portals, remember when I mentioned that portals are best suited for enclosed environments (rooms)? Well, the reason for this is that outdoor environments require either very large portals or a bunch of portals with tons of "doors". Neither of these approaches is very efficient.
Are there work arounds?
Well, currently, I don't believe there is a workaround which does not sacrifice something, namely the fully dynamic aspect. I am currently exploring organization methods which would not place any restrictions on the game world, while still allowing for fast raycasting/collision detection.
Let's hear about it...
As I search for a better acronym, the current one is ECP, or Equilateral Cubic Partitioning. The idea is quite simple: divide the world into a 3-dimensional grid and then place each polygon in its appropriate location. (Some polygons, which are borderline, go in the lists of a few cubes.) Since the cubes should be made much smaller than portals, raycasting is much faster. Also, a fully dynamic world is possible, because placing polygons in their appropriate cubes is an easy task which should be doable in real time (haven't tried it yet, though... I'm too scared it won't work right :) The only restriction on this type of environment is that no polygon can have a radius which exceeds half the length of a cube's edge. (Polygons which can span more than 4 cubes would be a problem to classify speedily.)
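A minimal sketch of this kind of uniform grid, under my own assumptions (the article gives no code): polygons are filed under every cube their axis-aligned bounding box touches, which handles the borderline case the text mentions.

```python
from collections import defaultdict

def cell_of(point, edge):
    """Integer (x, y, z) cube coordinates for a 3D point."""
    return tuple(int(c // edge) for c in point)

def build_grid(polys, edge):
    """File each polygon under every cube its bounding box touches."""
    grid = defaultdict(list)
    for poly in polys:
        lo = cell_of([min(v[i] for v in poly) for i in range(3)], edge)
        hi = cell_of([max(v[i] for v in poly) for i in range(3)], edge)
        for x in range(lo[0], hi[0] + 1):
            for y in range(lo[1], hi[1] + 1):
                for z in range(lo[2], hi[2] + 1):
                    # Borderline polygons land in several cubes' lists.
                    grid[(x, y, z)].append(poly)
    return grid
```

A ray or moving object then only needs to test the polygons listed in the handful of cubes it passes through, rather than every polygon in a large portal.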
Other Stuff
Don't think that these are the only methods of VSD. Why are they the only methods really used, though? Well, the reason is that some of the other methods are very inefficient or have a serious flaw.
Depth Sorting
One of the earliest methods of VSD was depth sorting. This process involves sorting the polygons in order from farthest to closest and then rendering the list back to front (i.e. draw the farthest polygon first, and so on). The major flaw with depth sorting is that it doesn't always draw the polygons in the right order. Why doesn't it work? Well, first of all, what do we sort by? Do we sort by the farthest point in the polygon? The center of the polygon? Perhaps the closest point in the polygon? Each of these choices will cause errors in certain cases, and piercing polygons never render correctly, regardless of the method used to sort them. For these reasons, and because depth sorting is not very fast (sorting 10,000 or more polygons isn't quick, even using the fastest sorting algorithms), it has fallen out of use.
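The whole method fits in a few lines, which is part of its historical appeal. This sketch (hypothetical names) sorts by the farthest vertex, exactly the kind of arbitrary choice of key the text criticizes:

```python
def depth_sorted(polys, eye):
    """Back-to-front order, keyed on each polygon's farthest vertex."""
    def farthest(poly):
        # Squared distance from the eye to the most distant vertex.
        return max(sum((a - b) ** 2 for a, b in zip(v, eye)) for v in poly)
    # Farthest first, so nearer polygons overwrite it when drawn later.
    return sorted(polys, key=farthest, reverse=True)
```

Swap `max` for `min` (closest vertex) or an average (center) and certain overlapping configurations will sort differently, producing the wrong-order artifacts described above.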
Zbuffer alone (no organization)
Rendering polygons with a Z-buffer is what the hardware does anyway. So why do we need organization at all? Why can't we just go through all the polygons and render them using a Z-buffer? This allows a fully dynamic world and always renders correctly (provided enough Z-buffer accuracy). The reason no one does this is that it is SOOO SLOW. While the actual rendering is merely very slow, collision detection and raycasting are incredibly slow: many, many polygons must be tested for ray collisions or player/object collisions. If the polygons are in arbitrary order, all of them must be cycled through. The full collision test does not need to be performed on each one, since many polygons can be eliminated quickly; regardless, this method is very slow and inefficient. For this reason, no recent game goes without an organization method.
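For completeness, here is why the Z-buffer "always renders correctly": every fragment carries a depth, and a per-pixel test keeps only the nearest one, so submission order is irrelevant. This toy plotter (hypothetical names, pre-rasterized fragments rather than real polygons) shows the mechanism:

```python
def zbuffer_plot(fragments, width, height):
    """fragments: (x, y, z, colour) tuples; smaller z is closer."""
    zbuf = [[float('inf')] * width for _ in range(height)]
    frame = [[None] * width for _ in range(height)]
    for x, y, z, colour in fragments:
        if z < zbuf[y][x]:           # depth test on every fragment
            zbuf[y][x] = z
            frame[y][x] = colour
    return frame
```

Note that correctness costs a depth test per fragment, and says nothing about raycasting or collision detection, which is the part that actually forces engines to organize their polygons.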
Conclusion
While the VSD method isn't exactly what gamers care about, it turns out that it is one of the most important, if not the most important, aspects of a 3D engine. VSD determines the dynamism (is that a word?) of the world, CPU dependency, and graphics card dependency. I hope this article was informative and interesting to all of you.